I tried using this and it was too much of a straitjacket. It's really, really ha...

Denzel · on March 11, 2014

That's interesting. I'd love to hear more about your problems with it. My email is in my profile; feel free to drop me a line.

I used Scrapy to deploy a small but moderately complex crawler backed by a database. It also had a custom pipeline that deduplicated information (based on a hash) and backed off exponentially if a page hadn't changed in a while.
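
To make the dedup-plus-backoff idea concrete, here is a rough sketch in plain Python. It is not the actual crawler code: the names (`PageState`, `update_state`), the in-memory dict standing in for the database, and the interval values are all assumptions made for illustration. In a Scrapy project, logic along these lines would typically sit in an item pipeline's `process_item` method.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
BASE_INTERVAL = 3600        # recheck a fresh or changed page after one hour
MAX_INTERVAL = 30 * 86400   # never wait longer than 30 days


@dataclass
class PageState:
    content_hash: str  # hash of the page body seen on the last crawl
    interval: int      # seconds to wait before crawling the page again


def content_hash(body: bytes) -> str:
    """Hash the page body so unchanged content can be detected cheaply."""
    return hashlib.sha256(body).hexdigest()


def update_state(states: dict, url: str, body: bytes):
    """Record the latest crawl of url and return (changed, next_interval).

    `states` is a plain dict standing in for the database table.
    """
    digest = content_hash(body)
    previous = states.get(url)
    if previous is not None and previous.content_hash == digest:
        # Same hash as last time: treat as a duplicate and double the wait,
        # up to the cap.
        interval = min(previous.interval * 2, MAX_INTERVAL)
        states[url] = PageState(digest, interval)
        return False, interval
    # New or changed page: keep it and reset the wait to the base interval.
    states[url] = PageState(digest, BASE_INTERVAL)
    return True, BASE_INTERVAL
```

A caller would drop the item when `changed` is false and schedule the next visit `next_interval` seconds out. Hashing the raw body is the simplest choice; pages with timestamps or rotating ads would probably need the hash computed over extracted fields instead.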