
For a browser that runs JS, the author mentions PhantomJS, but its Python support looks iffy. Mechanize is super-easy in Python: http://www.pythonforbeginners.com/cheatsheet/python-mechaniz...

Edit: so easy, in fact, that I prefer to just start with mechanize to fetch the pages. Why bother testing whether your downloader needs cookies, a reasonable user agent, etc.? Just start with them.
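(Since mechanize basically sits on top of urllib2, here's roughly what that "cookies plus a reasonable user agent" setup looks like in Python 3 stdlib terms -- just a sketch, no mechanize required, and the URL is only a placeholder:)

```python
import http.cookiejar
import urllib.request

# Keep cookies across requests, as mechanize does automatically.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# Send a browser-like User-Agent instead of the default "Python-urllib/x.y",
# which some sites block outright.
opener.addheaders = [("User-Agent", "Mozilla/5.0 (compatible; example-crawler)")]

# The actual fetch (a network call) would then be:
# html = opener.open("http://example.com").read()
```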



A hearty second for mechanize. It basically wraps urllib2, which is neat. That said, I've encountered situations where language support for distributed systems would have really saved some frustration. There's a version of mechanize for Erlang [1], which I intend to try out whenever I get around to learning Erlang :)

[1] https://github.com/tokenrove/mechanizerl


Thanks for the mechanize link, I'll add a note about it.

Mechanize is awfully slow, though: it isn't asynchronous, so if you need to crawl quickly I wouldn't want to use it for general crawling. I guess you could monkey-patch the stdlib with gevent and try to get something working that way.
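(Short of the gevent route, the stdlib thread pool is another way to overlap blocking fetches for I/O-bound crawling -- a minimal sketch, where the fetch body is just a stand-in for a real blocking mechanize/urllib call:)

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder: a real crawler would do a blocking mechanize/urllib
    # fetch here and return the page body.
    return "fetched " + url

urls = ["http://example.com/page/%d" % i for i in range(10)]

# Threads overlap the network waits, even though each individual
# fetch call blocks; results come back in input order.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(fetch, urls))
```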



