
I’m curious. Scraping seems to come up a lot lately. What is everyone scraping? And why?



To add to others’ points, we can do two more things:

1. Pretrain models on any legal, scraped content. That includes updating existing models with recent data.

2. Have our own private collection of pages we’ve looked at. Then, we can search them with a local engine.
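For the second point, here is a rough sketch of what a local engine over saved pages could look like. It is only an illustration under my own assumptions: the `LocalPageIndex` class, its method names, and the example URLs are all made up, and a real setup would more likely use an existing full-text search tool than a hand-rolled in-memory index.

```python
import re
from collections import defaultdict


class LocalPageIndex:
    """Tiny in-memory inverted index over saved page text (hypothetical, illustrative only)."""

    def __init__(self):
        self.pages = {}                   # url -> saved page text
        self.postings = defaultdict(set)  # token -> set of urls containing it

    @staticmethod
    def tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, url, text):
        # Store the page and record every token it contains.
        # Note: re-adding a URL with new text does not drop its old tokens.
        self.pages[url] = text
        for token in self.tokenize(text):
            self.postings[token].add(url)

    def search(self, query):
        # Return the URLs that contain every token of the query (AND search).
        tokens = self.tokenize(query)
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(hits)


# Placeholder pages standing in for things we have already visited.
index = LocalPageIndex()
index.add("https://example.com/a", "Notes on scraping news sites politely")
index.add("https://example.com/b", "Local search engines for personal archives")
print(index.search("local search"))  # ['https://example.com/b']
```

This does no ranking and keeps everything in memory, so it is only meant to show the shape of the idea: save the text of pages as you visit them, then query your own copy instead of the live web.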


With people making LLMs act as agents in the world, the line between "scraping" and "ordinary web usage" is becoming very blurred.


Mostly context for LLMs, and use cases uniquely enabled by LLMs, I think.



