
> my content was directly vacuumed

This happens with the HTML contents just as easily. Perhaps just include a unique per-client identifier in the text, and a link back to the source (see the sketch below).
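A minimal sketch of that idea in Python, assuming you render the feed yourself and key it per subscriber (the secret, the token scheme, and the subscriber ID are all hypothetical):

    import hashlib
    import html

    SECRET = b"feed-signing-secret"  # assumption: a key kept server-side

    def watermark(item_text: str, item_url: str, subscriber_id: str) -> str:
        # Derive a stable, subscriber-specific token (hypothetical scheme).
        token = hashlib.sha256(SECRET + subscriber_id.encode()).hexdigest()[:12]
        # Append the token and a canonical link to the item body, so a
        # republished copy identifies both the source and the leaking feed.
        return (f"{html.escape(item_text)}"
                f'<p><a href="{item_url}">Originally published here</a> '
                f"(ref: {token})</p>")

    print(watermark("Post body...", "https://example.com/post/1", "reader-42"))

If the text turns up on a spam site later, searching for the token tells you which client's feed it was lifted from.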




> This happens with the HTML contents just as easily.

On an individual scale, yes. But if you're a spammer who wants to vacuum up text from 1000 sites, you'll skip writing a scraper for each individual site (whose formatting may change later anyway) and just consume the reliably formatted RSS feeds.
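For illustration, bulk ingestion through a generic RSS library such as feedparser is roughly this much code; the feed URLs are placeholders:

    import feedparser  # pip install feedparser

    feeds = ["https://example.com/rss", "https://example.org/feed.xml"]

    for url in feeds:
        parsed = feedparser.parse(url)
        for entry in parsed.entries:
            # Every feed exposes the same fields, so one loop covers all sites.
            print(entry.get("title"), entry.get("link"))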


That was probably true back in 2004, but today there are numerous reader-view, full-text-extraction, and advanced web-spider projects that anybody can plug in to get the full text from at least 90% of websites, with no extra effort.
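For example, a generic full-text extractor like trafilatura pulls the main article body out of an arbitrary page without any site-specific code (the URL here is a placeholder):

    import trafilatura  # pip install trafilatura

    downloaded = trafilatura.fetch_url("https://example.com/some-article")
    text = trafilatura.extract(downloaded)  # main body text, boilerplate stripped
    print(text)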


Correct. To the point where most black-hat tools that spammers use have RSS feed ingestion as their primary or only data ingestion method.



