
The part on how to avoid detection was particularly useful for me.

I use webscraping (https://code.google.com/p/webscraping/) + BeautifulSoup. What I like about webscraping is that it automatically keeps a local cache of the pages you access, so you don't end up needlessly hitting the site while you are testing the scraper.
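To show the caching idea in isolation, here is a rough sketch using only the Python standard library. It is not the webscraping library's API or its cache format; `cached_fetch`, `fetch_url`, and the on-disk layout are hypothetical names and choices made up for illustration.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_url(url):
    """Download a page over the network and return its body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def cached_fetch(url, cache_dir, fetcher=fetch_url):
    """Return the page body for url, downloading it only on a cache miss.

    Hypothetical helper: one file per URL, stored under cache_dir.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Hash the URL so any URL maps to a safe, fixed-length filename.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_file = cache_dir / (key + ".html")

    # Cache hit: reuse the saved copy and leave the site alone.
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")

    # Cache miss: download once, save the body, then return it.
    html = fetcher(url)
    cache_file.write_text(html, encoding="utf-8")
    return html
```

The first call for a URL goes to the network and every later call reads the saved file, so you can rerun the parsing step as often as you like while debugging. The returned HTML can be handed to BeautifulSoup as usual. This sketch never expires entries, so delete the cache directory when you want fresh pages.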


