Using a headless browser for scraping is a lot slower and resource intensive tha...

PeterisP · on Sept 9, 2020

I don't see this as a concern. In all the scraping I've done, the only bottleneck was the intentional throttling/rate limiting, not the speed or resources of the headless browser; a small, cheap machine could easily process many, many times more requests than it would be reasonable to crawl.
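
To illustrate the throttling point, here is a minimal sketch of a per-request rate limiter. The `Throttle` class, its `min_interval` parameter, and the injectable `clock`/`sleep` arguments are hypothetical names made up for this example, not part of any scraping library.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests (illustrative only)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed policy)
        self._clock = clock               # injectable so the class can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Block until min_interval has elapsed since the last call.

        Returns the number of seconds actually slept.
        """
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

Calling `throttle.wait()` before every page load caps the crawl rate at one request per `min_interval` seconds, whether the page is fetched by a plain HTTP client or a headless browser. Once that interval is longer than the time a page takes to load, the choice of client stops affecting throughput.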

sullyj3 · on Sept 9, 2020

Sure, but it might be the only way to get the data.

hansvm · on Sept 9, 2020

It might be, but _starting_ a scraping project with a headless browser might be excessively expensive if you don't need the additional features.

lyjackal · on Sept 10, 2020

"Only" is a bit of an overstatement. The data is always coming from somewhere; it just depends on how much effort is needed to reverse engineer the JavaScript code path to the data.
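
One low-effort case of this, as a rough sketch: some pages ship their data as JSON inside a `<script type="application/json">` tag, which can be read without executing any JavaScript. The helper names below (`EmbeddedJSONParser`, `extract_embedded_json`) are hypothetical, and whether a given site embeds its data this way is an assumption you would have to check in the page source.

```python
import json
from html.parser import HTMLParser


class EmbeddedJSONParser(HTMLParser):
    """Collect the contents of <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._chunks = []
        self.payloads = []  # decoded JSON objects, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._chunks = []

    def handle_data(self, data):
        # Script bodies arrive here as raw text, possibly in several pieces.
        if self._in_json_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self.payloads.append(json.loads("".join(self._chunks)))
            self._in_json_script = False


def extract_embedded_json(html):
    """Return every JSON payload embedded in the given HTML string."""
    parser = EmbeddedJSONParser()
    parser.feed(html)
    parser.close()
    return parser.payloads
```

When the data is instead fetched by an XHR/fetch call after load, the equivalent move is to find that request in the browser's network tab and call the endpoint directly; that is where the reverse-engineering effort tends to go.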