[UPHPU] Web site scraping
justin
justin at justinhileman.info
Thu Sep 25 17:21:15 MDT 2008
On Thu, Sep 25, 2008 at 8:52 AM, Nathan Lane <nathamberlane at gmail.com> wrote:
> I want to make what in effect is a website scraper using PHP, but it isn't
> obvious how this would best be done. I've tried using DOMDocument and I'm
> not sure if that's the best option or not. I'd really like to use something
> where I could use XPath to get the elements out that I want. Recently I
> wrote a similar program in C# that I call HttpAnalyzer. Could I just use
> that with PHP (i.e. call it from PHP) to get what I'm looking for? Any
> suggestions?
>
I hate to sound like a heretic by mentioning it on a PHP mailing list,
but I always turn to BeautifulSoup (Python) for this sort of thing.
It's absolutely incredible.
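
For anyone who hasn't tried it, here is a rough sketch of what that looks like. It assumes the bs4 package (a newer release than what was around when this was posted) and a made-up HTML snippet; SAMPLE_HTML and extract_links are hypothetical names used only for illustration. Note that BeautifulSoup does not do XPath itself; it walks the tree with find/find_all instead.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; a real scraper would fetch the HTML over HTTP first.
SAMPLE_HTML = """
<html><body>
  <h1>Example listing</h1>
  <ul id="links">
    <li><a href="/first">First</a></li>
    <li><a href="/second">Second</a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Return (text, href) pairs for every anchor inside the element with id="links"."""
    # "html.parser" is the parser from Python's standard library,
    # so nothing beyond bs4 itself needs to be installed.
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find(id="links")
    if container is None:
        return []
    # href=True keeps only anchors that actually carry an href attribute.
    return [(a.get_text(strip=True), a["href"])
            for a in container.find_all("a", href=True)]


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

The same parser is forgiving about unclosed tags and other sloppy markup, which is most of the appeal for scraping.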
You may all now return to your PHP scraping :)
justin
--
http://justinhileman.com