
HN has an Algolia-based API. It’s also very easy to crawl.
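For anyone curious what using that API looks like, here is a minimal sketch in Python. It assumes the public Algolia HN Search endpoint at `https://hn.algolia.com/api/v1/search_by_date`, its `tags=comment,author_<name>` filter, and a `comment_text` field on each hit. The function names (`build_comments_url`, `extract_comment_texts`, `fetch_comments`) are made up for illustration, and the response shape should be checked against the current API docs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Algolia-backed HN Search API (newest items first).
API_BASE = "https://hn.algolia.com/api/v1/search_by_date"


def build_comments_url(username, page=0, hits_per_page=100):
    """Build a query URL for one page of a user's comments (hypothetical helper)."""
    params = {
        "tags": f"comment,author_{username}",  # comma means AND between tags
        "page": page,
        "hitsPerPage": hits_per_page,
    }
    return f"{API_BASE}?{urlencode(params)}"


def extract_comment_texts(payload):
    """Pull the comment bodies out of a decoded JSON response.

    Assumes each hit carries its body (as HTML) under "comment_text".
    """
    return [
        hit["comment_text"]
        for hit in payload.get("hits", [])
        if hit.get("comment_text")
    ]


def fetch_comments(username, page=0):
    """Fetch one page of comments over the network."""
    with urlopen(build_comments_url(username, page), timeout=10) as resp:
        return extract_comment_texts(json.load(resp))
```

Calling `fetch_comments("some_user")` would return a list of comment bodies for that page; walking `page` upward until the list comes back empty is one simple way to collect a user's corpus.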

I wouldn’t call this evil, however: it’s merely demonstrating a technique that you should be aware of, if you’re a privacy-conscious person. It looks like they also provide some resources for avoiding stylometric detection.



I would bet my bottom dollar that the likes of Reddit and Google already have models to turn a corpus of text into probable demographic data and models to measure the similarity of users.
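To make "similarity of users" concrete, here is a toy sketch of one common stylometric baseline: cosine similarity over character trigram counts. This is only an illustration of the general idea, not a claim about what Reddit, Google, or the linked tool actually run, and the function names are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def user_similarity(comments_a, comments_b, n=3):
    """Compare two users by the n-gram profile of all their comments joined together."""
    profile_a = char_ngrams(" ".join(comments_a), n)
    profile_b = char_ngrams(" ".join(comments_b), n)
    return cosine_similarity(profile_a, profile_b)
```

Scores fall between 0 and 1, with higher values meaning more similar character-level habits. Real systems would presumably use far richer features and much more text per user; a handful of short comments will give noisy results with something this simple.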



