
I would like to know what engine they use to fetch the article text from a URL. That technology is neat.


I think this[0] comes close to what is used to extract the text from an HTML document. Fetching can be done with any HTTP client. You will need jsdom to parse the fetched HTML into a DOM before feeding it to Readability.

[0]: https://github.com/mozilla/readability
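
A minimal sketch of that pipeline, assuming Node 18+ (for the built-in `fetch`) and the `jsdom` and `@mozilla/readability` npm packages. The function names `extractArticle` and `fetchArticle` are made up for illustration, and this is not necessarily how the site in question does it:

```typescript
import { JSDOM } from "jsdom";
import { Readability } from "@mozilla/readability";

// Parse an HTML string into a DOM and let Readability pick out the article.
// Passing the page URL to jsdom lets relative links resolve correctly.
// Returns null when Readability cannot find article content.
export function extractArticle(html: string, url: string) {
  const dom = new JSDOM(html, { url });
  return new Readability(dom.window.document).parse();
}

// Fetch a page with the built-in HTTP client, then extract the article.
// (Hypothetical helper; any HTTP client would do for the fetching step.)
export async function fetchArticle(url: string) {
  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`HTTP ${res.status} for ${url}`);
  }
  return extractArticle(await res.text(), url);
}
```

Calling `fetchArticle("https://example.com/some-post")` resolves to an object with fields such as `title`, `textContent`, and `excerpt`, or to `null` if nothing article-like was found.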


Thanks for sharing. Given the description, it could also be used as a back end for building link previews, like Facebook's.



