
> One nice thing I recognized is that the generated RSS feed contains full html text. I wish more blogging platforms are like this; save a lot of effort for feed readers.

Some blogs don't include the whole thing in the RSS because they want you to visit the site for the ad revenue. It's not an oversight.

Blogging is (was?) a job for some people. It wasn't completely supported by ads, but ads were part of it. It's like YouTubers before YouTube.




Maybe today, but I personally stopped providing full text in the feeds of my now-defunct, ad-free blog, because my content was directly vacuumed and inserted into a spammy full of advertising website.

This was around 2005, as far as I can remember. Time flies...


Unethical people are going to do whatever they can. Life is much more enjoyable if you just ignore them and whatever shit they are going to do with the content you willingly shared with the world for free.


> Life is much more enjoyable if …

That sentence has widely different endings for different people.


> my content was directly vacuumed and inserted into a spammy full of advertising website

For all its faults, this is something I have no shame invoking DMCA for. You still own the copyright on your writing even if you publish it for free online.


How do you "invoke DMCA" on a random spam site? I've only heard it mentioned WRT central authority sites like YouTube, etc.


Contact the host. I can't recall if I've done this before but I do recall that it often works.


> my content was directly vacuumed

This happens with the HTML contents just as easily. Perhaps just include a unique identifier in the text per client, and a link back to the source.
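
A minimal sketch of that idea, assuming the feed is generated per request and that you have some stable notion of a "client" (a subscriber key, a token in the feed URL, and so on). The names `client_token`, `watermark_item`, and the secret are hypothetical, not taken from any particular blogging platform:

```python
import hashlib
import hmac
from html import escape


def client_token(client_id: str, secret: bytes, length: int = 12) -> str:
    """Derive a short, stable identifier for one feed client.

    HMAC keeps the token unguessable without the secret, while the same
    client always gets the same token.
    """
    digest = hmac.new(secret, client_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def watermark_item(body_html: str, source_url: str, client_id: str, secret: bytes) -> str:
    """Append a link back to the source plus the per-client token."""
    token = client_token(client_id, secret)
    footer = (
        "<p>Originally published at "
        f'<a href="{escape(source_url, quote=True)}">{escape(source_url)}</a> '
        f"(ref {token})</p>"
    )
    return body_html + "\n" + footer
```

If a copy turns up elsewhere with the footer intact, the token tells you which client fetched it. A scraper can of course strip the footer, so this only catches the lazy ones.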


> This happens with the HTML contents just as easily.

On an individual scale, yes. But if you're a spammer and want to vacuum up text from 1000 sites, you'll skip writing a scraper for individual sites (which may change their formatting later anyway) and just use reliably-formatted RSS feeds.
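
To illustrate how uniform the format is, here is a rough sketch of what any feed reader does, using only the Python standard library. It assumes an RSS 2.0 feed where the full text lives in `content:encoded` or falls back to `description`; the function name `extract_items` is made up for this example:

```python
import xml.etree.ElementTree as ET

# Namespaced tag used by many feeds for the full article body.
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"


def extract_items(rss_xml: str) -> list[dict]:
    """Return title, link, and body for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        # Prefer the full-text element, fall back to the summary.
        body = item.findtext(CONTENT_ENCODED) or item.findtext("description") or ""
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "body": body,
            }
        )
    return items
```

The same dozen lines work for any site that publishes a conforming feed, with no per-site selectors to maintain.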


That was probably true back in 2004, but today there are numerous reader view, full text converter, and advanced web spider projects that anybody can plug in and get the full text from at least 90% of web sites, with no extra effort.


Correct. To the point where most black hat tools that spammers use will have RSS feed ingestion as the primary or only data ingestion method.


There are some downsides: interactive pages won’t work, and automated content stealers have an easier time. It is very easy to copy articles from such RSS feeds because you don’t need to search for the text in the HTML.



