I'm not sure I understand the criticism. What exactly is supposed to be the alternative? It's not like this is supposed to be used for high-availability purposes or stock trading or something. It scratches a particular itch for a handful of end users who just want to get all their news in their feed reader. It's expected that they'll maintain their own scraper parameters.



Pick any of a number of programming languages and, in many cases, you can scrape a page in 10-20 lines of code. The real barrier to entry is understanding the DOM layout of a particular website, which can change at any moment.
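For what it's worth, here is roughly what I mean, as a minimal sketch using only Python's standard library. The `headline` class name, the `scrape` helper and the sample markup are all made up for illustration; a real site will have its own layout, and you would fetch the HTML yourself first. A third-party parser would make this shorter still.

```python
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Collect (title, href) pairs for <a> tags carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class  # hypothetical class name, site-specific
        self.items = []
        self._inside = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._inside = True
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Accumulate text, including text inside nested tags such as <b>.
        if self._inside:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self.items.append(("".join(self._text).strip(), self._href))
            self._inside = False


def scrape(html, css_class="headline"):
    parser = LinkScraper(css_class)
    parser.feed(html)
    return parser.items


# Made-up sample page; stands in for whatever you downloaded.
SAMPLE = """
<ul>
  <li><a class="headline" href="/a">First story</a></li>
  <li><a class="nav" href="/about">About</a></li>
  <li><a class="headline big" href="/b">Second <b>story</b></a></li>
</ul>
"""

print(scrape(SAMPLE))
```

The parsing is the easy part. The `css_class` argument is the bit that breaks whenever the site changes its markup.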

Purely IMO, a friendlier way to abstract away the code and CSS knowledge is to run a UI that highlights elements and lets you select the title, description, link, etc., and there you go. Same thing, but without needing to know DOM/XPath/selectors.


> highlights elements, lets you select them,

And how would that software then remember your chosen elements if not by their CSS ids/classes?


You must have missed the point, or I'm missing the point of this thing. Select whatever you like; do I need an external tool to do that? Either I have knowledge of the DOM or I don't, and if I don't, then a UI selector would be the next best option.


By positional indexing, of course.

Just kidding, that's horrendous.
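To make "horrendous" concrete, here is a rough sketch of a positional path next to a class-based one. It leans on `xml.etree.ElementTree`, which only accepts well-formed markup, so take it as an illustration rather than something to point at real pages. Both sample pages, the `title` class and the `pick` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up page as it looked when the user picked the element.
BEFORE = (
    "<html><body>"
    "<div>nav</div>"
    "<div><a class='title' href='/a'>Story A</a></div>"
    "</body></html>"
)

# Same page after the site adds a banner above everything else.
AFTER = (
    "<html><body>"
    "<div>cookie banner</div>"
    "<div>nav</div>"
    "<div><a class='title' href='/a'>Story A</a></div>"
    "</body></html>"
)

POSITIONAL = "./body/div[2]/a"      # "the link in the second div"
BY_CLASS = ".//a[@class='title']"   # "the link with class title"


def pick(html, path):
    """Return the text of the first node matching path, or None."""
    node = ET.fromstring(html).find(path)
    return None if node is None else node.text


for label, path in (("positional", POSITIONAL), ("by class", BY_CLASS)):
    print(label, pick(BEFORE, path), pick(AFTER, path))
```

The positional path finds the story before the banner appears and nothing afterwards, while the class-based one survives the change. It would break just the same if the site renamed the class, so neither is immune.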



