Hacker News

I like web scraping in Go. The support for parsing HTML in x/net/html is pretty good, and libraries like github.com/PuerkitoBio/goquery go a long way toward matching the ergonomics of other tools. This project uses both, but then goes on to use github.com/dop251/goja, a JavaScript VM, along with its accompanying Node.js compatibility layer and even esbuild, in order to interpret scraping instruction scripts.

I mean, at this point I am not sure Go is the right tool for the job (I am actually pretty confident that it is not).

A pretty neat stack of engineering, sure! This is cool, nicely done. But I can't help but feel disturbed.




Your comment was posted 4 minutes ago. That means you still have enough time to edit your comment to change it so it contains real URLs that link to the project repos for the packages mentioned:

<https://github.com/PuerkitoBio/goquery>

<https://github.com/dop251/goja>

(Please do not reply to this comment of mine—if you do, I won't be able to delete it once the previous post is fixed, because the existence of the replies will prevent that.)


Even if I saw this post in time, I wouldn't have edited it. They are all proper Go package names.


That's... not the point. But thanks to both[1] of you for reminding me how much this place sucks.

1. <https://news.ycombinator.com/item?id=38232101>



