Hacker News
Mechanize 0.2.0 released (sourceforge.net)
48 points by iamelgringo on April 24, 2010 | 19 comments



It is such an awesome library. Totally rocks.

To get a quick glimpse of how easy it is to work with, look at this screencast: http://railscasts.com/episodes/191-mechanize


Unfortunately, Perl's WWW::Mechanize does not yet handle JavaScript. Does anybody (Perl, Python, Ruby) have a solution for that? DOM+JavaScript simulation comes to mind. Or maybe a fully controllable browser component (where DOM and JS are built in) with an API. Anyone with first-hand experience?


Aaron Patterson (the maintainer of Ruby's mechanize) and I have been working on this on and off for a few years. Take a look at http://github.com/jbarnette/johnson, http://github.com/mynyml/harmony, and http://github.com/mynyml/holygrail to see where things are heading.


Thanks, harmony looks interesting. Can actions like "set form field", "click button", etc. be programmed?


Have you looked at Selenium? It might have more of what you're looking for.


Things like Mechanize and HTMLUnit are what I call "HTTP-only protocol drivers" and are not a true-fidelity simulation of the HTML+JavaScript environment. This is exactly the reason I created Selenium in the first place. I had regression bugs in IE and Mozilla that I needed to verify were fixed in the actual browsers. An emulated browser environment like Mechanize or HTMLUnit isn't a real browser that users actually use, so it doesn't solve the regression testing problem I had.

The best of both worlds these days is driving a headless WebKit (a true browser driver that includes the JS interaction), which will be coming in Selenium 2.
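To illustrate the "HTTP-only protocol driver" point, here is a minimal sketch in stdlib-only Python of what such a driver does with a form: parse the HTML, fill in fields, and build the request. It is not how Mechanize or HtmlUnit is implemented; `FormParser`, `build_submission`, the sample page, and the example.com URL are all made up for illustration. The inline script on the page is never run, so the value it would set in a real browser never reaches the request.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormParser(HTMLParser):
    """Collect the first <form> on a page: action, method, and named inputs."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and self._in_form and "name" in attrs:
            # Keep whatever default value the markup carries (hidden tokens etc.).
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_submission(page_url, html, values):
    """Return (method, absolute URL, urlencoded body) for the first form."""
    parser = FormParser()
    parser.feed(html)
    fields = dict(parser.fields)
    fields.update(values)  # "set form field" is just a dict update here
    return parser.method, urljoin(page_url, parser.action), urlencode(fields)


# Hypothetical page: the script would rewrite the token in a real browser.
PAGE = """
<form action="/login" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="user">
  <input type="password" name="password">
</form>
<script>document.forms[0].token.value = "set-by-js";</script>
"""

method, url, body = build_submission(
    "http://example.com/start", PAGE, {"user": "alice", "password": "s3cret"}
)
print(method, url)
print(body)
```

The submitted token is still the one from the static markup, which is exactly the gap a real browser driver closes.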


Thanks. Selenium can replay user interaction sessions, but I need to set things like form fields programmatically. Is that possible? Maybe CPAN/Gtk2::WebKit is also an option.


Yes, it is possible, and fairly routine in Selenium. You can also execute arbitrary JavaScript for whatever else you may need.


You can even have your recorded sessions spit out code that you can then use as a template for your own.


Yes, along the same lines, you could use Watir.

http://www.layeredthoughts.com/automation/how-to-write-your-...


Not yet, thanks for your tip.


Internet Explorer.

No, seriously: IE is fully controllable via COM, and you can use pretty much any scripting language you want (the Python COM API is pretty nice, and so is the Ruby one).

Steve Yegge had a post on the subject some time ago: http://sites.google.com/site/steveyegge2/scripting-windows-a...

BTW, MS Office has a similar feature.


You can use the Java-based HtmlUnit. It emulates a headless browser, complete with JavaScript and DOM.


There's an awesome JRuby wrapper around it: Celerity.

I went the route Mechanize -> HtmlUnit (from inside Clojure, to be more scriptable) -> Celerity, and I'm happiest with where I ended up.


Also worth checking out is Scrapy: http://scrapy.org/

It's a Python scraping framework based on Twisted. It lets you use some of Mechanize's API to describe the scraping jobs.


This is a fantastic library. Great to see that it's still moving along.

Now, if it only had a way to execute javascript....


I imagine you could use python-spidermonkey to execute isolated bits of JS. But yes, getting the window object into context in Python would be super useful.


We use the mechanize Ruby gem for integration with other sites and web applications, doing things that are not exposed through APIs but only through the web interface. Really cool!

Way better than Net::HTTP.


Nifty. Always glad to see cross-pollination in the Ruby/Python/Perl world.



