Doesn't work at all with JS. This is a big thing on many sites now. Also, since ...

wrapapi · on April 21, 2017

For sites that load data using AJAX, we recommend you take a look at our Chrome extension (https://wrapapi.com/#/chromePlugin). Our philosophy isn't to run a full headless browser (similar to Phantom), but rather to make it really easy to find the AJAX requests that actually load the data you need.
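
To illustrate the general idea (this is not WrapAPI's own code): once you have spotted the AJAX request in the browser's network tab, you can often call that endpoint directly and skip rendering the page. A minimal Python sketch, assuming a made-up endpoint at example.com that returns JSON shaped like {"items": [{"title": ...}]}; the URL, the header values and the payload shape are all assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint, as it might appear in the browser's network tab.
AJAX_URL = "https://example.com/api/items"


def build_url(base, params):
    """Append query parameters to the endpoint URL."""
    return base + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Request the endpoint roughly the way the page's own script would."""
    req = urllib.request.Request(url, headers={
        "Accept": "application/json",
        # Some backends check this header; whether yours does is an assumption.
        "X-Requested-With": "XMLHttpRequest",
    })
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def extract_titles(payload):
    """Pull 'title' out of the assumed {"items": [...]} payload shape."""
    return [item["title"] for item in payload.get("items", [])]
```

Usage would look like extract_titles(fetch_json(build_url(AJAX_URL, {"page": 2}))). Real endpoints may also need cookies or tokens that the page sets, so treat this as a starting point only.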

RandomBookmarks · on April 21, 2017

If JS is a problem for you, try Kantu. It works with screenshots and uses OCR for scraping. The beauty is that it works with any kind of site. But clearly, the speed cannot match a Node.js- or Perl-based scraper (mechanize etc.), so it is not suitable for high volumes.

gardnr · on April 21, 2017

Do you find it better than Phantom?

Just reading about Kantu now. It reminds me of http://www.sikuli.org/

RandomBookmarks · on April 21, 2017

Yeah, the concept is the same as Sikuli, but all inside Chromium (and the OCR is better).

>Do you find it better than Phantom?

It depends. Once you have a working script, web scraping with Phantom is much faster and much more resource-efficient. But since Kantu works visually, you do not have to touch any page source code. That makes it much easier and faster to create the automation in the first place, especially for complex sites with date controls, drag & drop and other JavaScript.

supermdguy · on April 21, 2017

I've had issues scraping content generated by JS. I just ran the page through PhantomJS, then extracted the rendered HTML.
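
A rough sketch of that render-then-extract approach, not the exact script used here. It assumes a phantomjs binary is on the PATH and drives it from Python; the render and extract_links helpers are hypothetical names. The embedded script uses PhantomJS's webpage module to print page.content once the page has loaded. Content that arrives later via timers or slow requests may need an extra delay, which this sketch leaves out.

```python
import os
import subprocess
import tempfile
from html.parser import HTMLParser

# PhantomJS script: open the URL passed as the first argument and
# print the DOM as it stands after the load callback fires.
RENDER_JS = """
var system = require('system');
var page = require('webpage').create();
page.open(system.args[1], function (status) {
    if (status === 'success') {
        console.log(page.content);
        phantom.exit(0);
    } else {
        phantom.exit(1);
    }
});
"""


def render(url, phantomjs="phantomjs"):
    """Return the rendered HTML for url (assumes phantomjs is installed)."""
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(RENDER_JS)
        script = f.name
    try:
        result = subprocess.run(
            [phantomjs, script, url],
            capture_output=True, text=True, check=True, timeout=60,
        )
        return result.stdout
    finally:
        os.unlink(script)


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in the rendered HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(html):
    """Parse rendered HTML and return every link target found."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

With PhantomJS available, extract_links(render("https://example.com/")) would return the links present after the page's scripts have run. The extraction step is independent of the renderer, so the same parser works on HTML from any headless browser.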