
Yeah, that's a good idea. I need to add that to my suggestions for how to implement this.


If you're scraping any significant amount of data (>500K), then depending on the frequency you might also want to honor the server's ETag/Cache-Control headers (by sending conditional requests such as If-None-Match) and send Accept-Encoding, to save server bandwidth.
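
For what it's worth, here's a rough sketch of what that could look like with just the Python standard library. The function names (conditional_headers, fetch) are made up for illustration, and it assumes the server actually returns ETag or Last-Modified and answers a match with 304; plenty of servers don't.

```python
import gzip
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers for a polite repeat fetch (hypothetical helper)."""
    headers = {"Accept-Encoding": "gzip"}  # ask for a compressed body
    if etag:
        headers["If-None-Match"] = etag  # validator saved from the last response
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when nothing changed."""
    req = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            body = resp.read()
            # urllib does not decompress for us, so undo gzip by hand.
            if resp.headers.get("Content-Encoding") == "gzip":
                body = gzip.decompress(body)
            return body, resp.headers.get("ETag"), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: urllib reports this as an error; keep old validators.
            return None, etag, last_modified
        raise
```

Store the returned ETag / Last-Modified values next to the scraped data and pass them back in on the next run. A 304 response has no body, so an unchanged page costs the server only a few hundred bytes of headers.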

Collecting 1 kB every minute might not be a big deal, but collecting 1 MB every minute would cost an AWS-hosted service >$40/year in additional data transfer costs.
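
Back-of-the-envelope version of that number, as a sketch. The $0.09/GB rate is my assumption (a commonly quoted first-tier AWS internet egress price); actual pricing varies by region, tier, and service.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600 fetches at one per minute
ASSUMED_PRICE_PER_GB = 0.09       # assumption: USD per GB of egress


def yearly_transfer_cost(bytes_per_fetch, price_per_gb=ASSUMED_PRICE_PER_GB):
    """Estimated yearly egress cost in USD for one fetch per minute."""
    gb_per_year = bytes_per_fetch * MINUTES_PER_YEAR / 1e9  # decimal GB
    return gb_per_year * price_per_gb


print(round(yearly_transfer_cost(1_000), 2))      # 1 kB per minute
print(round(yearly_transfer_cost(1_000_000), 2))  # 1 MB per minute
```

At 1 MB per minute that's about 525.6 GB a year, which lands in the high $40s under the assumed rate. At 1 kB per minute it's around a nickel.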


It should definitely be optional. I can only imagine some busybody PM insisting they block harmless scrapes.



