No, it doesn’t. It politely asks crawlers not to crawl, and if a given crawler chooses to honour that request, then that specific crawler will not crawl. That’s it. It can be, and is, ignored without penalty or enforcement.
It’s like suggesting that putting a sign in your front yard saying “please don’t rob my house” prevents burglaries.
> Robots.txt does absolutely apply to LLMs engines and search engines equally
No, it doesn’t, because again, it’s a request system. It applies only to crawlers that choose to read it and then choose to abide by the requests in it, which nothing requires them to do.
From Google themselves:
“The instructions in robots.txt files CANNOT ENFORCE crawler behavior to your site; it's up to the crawler to obey them.”
And as already pointed out, nothing requires a crawler to follow them, let alone any other kind of client.
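To make the "request system" point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` user agent, the URLs, and both function names are made up for illustration. The only thing it is meant to show is that the check runs on the crawler's side, so a crawler that skips the check is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_crawler_may_fetch(url: str, user_agent: str = "ExampleBot") -> bool:
    """A well-behaved crawler parses the rules and asks before fetching."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def rude_crawler_may_fetch(url: str) -> bool:
    """A crawler that ignores robots.txt never performs the check at all."""
    return True


if __name__ == "__main__":
    target = "https://example.com/private/data.html"
    print("polite crawler fetches:", polite_crawler_may_fetch(target))
    print("rude crawler fetches:", rude_crawler_may_fetch(target))
```

The server takes no part in either decision. Both outcomes are decided entirely in the client's own code, which is why the file cannot enforce anything.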
If you want to control access, and you’re using robots.txt, you’ve no idea what you’re doing and probably shouldn’t be in charge of doing it.