The last time this question was asked on HN was in 2017 (https://news.ycombinator.com/item?id=15694118). A lot has changed in the world of web scraping in the last 5 years (the legal landscape, antibot unblockers, data-type-specific APIs, etc.), so I thought it might be a good idea to refresh the question and see which tools are most popular with the HN community these days.
I'm really impressed by Playwright. It feels like it has learned all of the lessons from systems like Selenium that came before it - it's very well designed and easy to apply to problems.
I wrote my own CLI scraping tool on top of Playwright a few months ago, which has been a fun way to explore Playwright's capabilities: https://simonwillison.net/2022/Mar/14/scraping-web-pages-sho...