Yeah, it's hosted by Cloudflare. I'm currently IP-blocking them because they keep prompting my crawler with a captcha, presumably because it's made millions of requests to their CDN.
Aha, I was going to ask how you were coping with CDNs like Cloudflare blocking bots. It's sad we've got to this point where basically only the established search engines are grandfathered in to be able to crawl sites.
There's some rigmarole involved in getting recognized as a good bot by the CDNs. I've submitted a request fairly recently, but haven't heard back from them yet.
I would like to be on good terms with them, and with other websites that block small independent crawlers.
I can't blame them though, there are a lot of bad bots out there. But I'm doing my best not to be part of the problem.