
Schema.org is a nice resource. If you can find that metadata on a site, you can be a little more confident the site doesn’t mind having that data scraped. It’s the instruction book for giving Google and other crawlers extra information and context about a page, so your scraper would be wise to parse it too.
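
For what it's worth, here is a rough sketch of what parsing that metadata could look like. It assumes the site embeds its Schema.org data as JSON-LD in a script tag of type application/ld+json (sites can also use Microdata or RDFa, which this does not handle). The names JsonLdExtractor, extract_jsonld and SAMPLE_HTML are made up for illustration, and the sample page is invented, not taken from a real site.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <script async>), hence the "or".
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_jsonld(html):
    """Return a list of JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Invented sample page, for illustration only.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body><h1>Example Widget</h1></body></html>
"""

if __name__ == "__main__":
    for item in extract_jsonld(SAMPLE_HTML):
        print(item.get("@type"), item.get("name"))
```

This only uses the standard library. A real scraper would probably also want to handle pages where the top-level JSON-LD value is a list or an @graph wrapper, which the sketch leaves as-is.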

