Hacker News

One of the questions I have is whether models are being trained on the SEO {spam|blogspam|adsense optimized|spun} websites.

Almost certainly. The web crawl data that GPT and similar LLMs are trained on is far too large to be curated in its entirety.

