
I tried using this and it was too much of a straitjacket. It's really, really hard to write a decent framework (for anything!) and I don't think Scrapy really succeeds.

In the end I dumped my Scrapy code and replaced it with a combination of celery, mechanize and pyquery. That worked much better, used less code and was much more flexible.



That's interesting. I'd love to hear more about your problems with it. My email is in my profile, feel free to drop me a line.

I deployed a small but moderately complex Scrapy crawler backed by a database. It also had a custom pipeline that deduplicated information (based on a hash) and backed off exponentially if a page hadn't changed in a while.
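For anyone curious what that kind of hash-based deduplication with exponential back-off might look like, here is a rough, framework-free sketch. It is not the code from the crawler described above: the class and field names, the SHA-256 choice, the one-hour base interval and the one-week cap are all assumptions made for illustration, and an in-memory dict stands in for the database.

```python
import hashlib
from dataclasses import dataclass

# Assumed values for illustration only; tune them for a real crawl.
BASE_INTERVAL = 3600          # first recheck after one hour (seconds)
MAX_INTERVAL = 7 * 86400      # never wait longer than one week


@dataclass
class PageState:
    content_hash: str         # hash of the last body we saw
    interval: int             # current wait between checks
    next_check: float         # timestamp of the next allowed fetch


class ChangeTracker:
    """Hypothetical helper: dedupe pages by hash, back off when unchanged."""

    def __init__(self):
        self.pages = {}       # url -> PageState (stand-in for a database table)

    def is_due(self, url, now):
        """True if the URL has never been seen or its wait has elapsed."""
        state = self.pages.get(url)
        return state is None or now >= state.next_check

    def record(self, url, body, now):
        """Store a fetch result. Returns True if the page is new or changed."""
        digest = hashlib.sha256(body).hexdigest()
        state = self.pages.get(url)
        if state is None or state.content_hash != digest:
            # New or changed content: reset to the base interval.
            self.pages[url] = PageState(digest, BASE_INTERVAL, now + BASE_INTERVAL)
            return True
        # Same hash as last time: double the wait, up to the cap.
        state.interval = min(state.interval * 2, MAX_INTERVAL)
        state.next_check = now + state.interval
        return False
```

The crawl loop would call `is_due(url, now)` before fetching and `record(url, body, now)` afterwards, passing along only the pages for which `record` returns True. In Scrapy, logic like this would presumably sit in an item pipeline's `process_item` method, but that wiring is left out here.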



