
> Absolutely nothing has to obey robots.txt

And absolutely no one needs to reply to every random request from an unknown source.

robots.txt is the POLITE way of telling a crawler, or other automated system, to get lost. And, as is so often the case, there is a much less polite way to do the same thing: blocking them outright.
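
For instance, a site can ask one bot to stay out entirely and everyone else to skip a directory. The bot name here is just an illustration:

    User-agent: ExampleBot
    Disallow: /

    User-agent: *
    Disallow: /private/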

So, the way I see it, crawlers and other automated systems have two options: they can honor the polite way of doing things, or they can get their packets dropped by the firewall.
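
And the polite option costs a crawler almost nothing to implement. A minimal sketch in Python using the standard library's urllib.robotparser (the bot name and URLs are placeholders, not any particular crawler or site):

    from urllib import robotparser
    from urllib.request import Request, urlopen

    USER_AGENT = "ExampleBot"  # hypothetical crawler name
    TARGET = "https://example.com/some/page"

    # Fetch and parse the site's robots.txt once per host.
    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()

    # Only request the page if robots.txt allows it for this user agent.
    if rp.can_fetch(USER_AGENT, TARGET):
        req = Request(TARGET, headers={"User-Agent": USER_AGENT})
        with urlopen(req) as resp:
            body = resp.read()
    # Otherwise: get lost politely, i.e. skip the URL and send nothing.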


