There's a Reason RSSCloud Failed to Catch On

patio11 · on Sept 9, 2009

I think a more fundamental reason is that RSSCloud solves the problem "I used to use RSS to read articles, but the 15 minute delay between the article being posted and me being able to skim it in my RSS reader was unacceptable" and that this is a problem real people do not have.

wmf · on Sept 9, 2009

I think it was originally designed to reduce the server load caused by polling but now it has been dragged out of the attic to join the real-time hype wave. I agree about real people, though.

nir · on Sept 9, 2009

I think the problem it really solves is the Twitter problem rather than the RSS reader one. That is, if RSSCloud works as advertised, implementing a Twitter-scale infrastructure becomes pretty simple.

The article raises important issues re scaling, though I'd expect WordPress and Winer are aware of them. I guess it remains to be seen if they are (and if Twitter-like apps are something people actually want).

derefr · on Sept 9, 2009

Not quite. It's a problem they don't have any more. If RSSCloud had been supported from the start, I doubt Twitter would have been invented, because everyone would have just "microblogged" from their actual blogs in realtime.

EastSmith · on Sept 9, 2009

I think a more fundamental reason is that PubSubHubbub solves the problem "I used to use RSS to read articles, but the 15 minute delay between the article being posted and me being able to skim it in my RSS reader was unacceptable" and that this is a problem real people do not have.

igrigorik · on Sept 9, 2009

RSSCloud or not -- I do happen to think that we need a push RSS solution -- what really bugs me is the fact that once again, we're inventing different standards to do the exact same thing. As if Atom vs RSS wasn't enough, now we have RSSCloud + PubSubHubbub to worry about. PSHB already has Google behind it (all of Feedburner feeds support it), so I really fail to see what WordPress won by adopting RSSCloud.

Besides, while Dave is a brilliant guy, PSHB already has a lot more people working on it with open source hub and client implementations (heck, I wrote one for Ruby!).

blasdel · on Sept 9, 2009

Atom : RSS :: PSHB : <cloud>

Dave will never stand by and let a perfect solution replace an old poorly-specified ill-used mediocre one that he somewhat-falsely claimed to have invented. Instead he'll repeatedly change the canonical version of the spec live, without changing the version number or telling anyone (much less preserving the previous versions). It's super effective, at least for creating drama.

jonknee · on Sept 9, 2009

The proposal never made sense. Most people don't use desktop feed readers and even if they did this solution wouldn't be scalable (as Cadenhead mentioned). Google Reader knows when you update your feed because you already ping Google. In my experience they are grabbing the feed in a few seconds anyway--it's up to them to show this to users in real time if they want to but there is nothing stopping them.

It seems like Dave wants to continue using his outdated desktop software (the OPML Editor) to view feeds in a manner much better suited for a hosted product. That might be great for him, but I'd rather not change publishing on the internet so his decades-old software can keep pace with Google Reader.

udekaf · on Sept 9, 2009

The proposal has the potential to support pooling of multiple feeds. For example, you can get the updates of thousands of feeds in one request. That saves a lot of bandwidth and processing time. Does that make sense?

blasdel · on Sept 9, 2009

The client would still have to be publicly routable!

lurkinggrue · on Sept 9, 2009

I use a desktop feed reader and I don't see this as being useful.

If I am worried that nothing has come in for 5 minutes I can always hit refresh.

mcdtracy · on Sept 9, 2009

Here's my understanding of the potential for a scalability problem.

EXAMPLE: twitter.com has 24,650 twitter followers. If Dave gets 24,650 followers on an RSSCloud architecture then this is what happens when he posts.

1. Dave creates a 140 char post. His blogging software sends a notice to the cloud server that he has an updated RSS feed.

2. the cloud server sends update notices to the 24,650 subscribed "listeners" to Dave's "RSSCloud-twit-stream".

NOTE: It does not send Dave's new post text, just the alert event.

3. the 24,650 listeners then do an RSS GET from Dave's blogging server. This could create a "cattle stampede" (i.e. slashdot effect) and many users may not get service when Dave's server is overrun. The server would likely be swamped with this massive interest in Dave's blog in a few seconds from these real-time subscribers.
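The three steps above can be sketched as a toy simulation. This is only an illustration of the described flow, not RSSCloud's actual wire protocol; the class names `Publisher`, `CloudServer`, and `Subscriber` and their methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Hypothetical stand-in for the blogger's own server."""
    feed_requests: int = 0

    def get_feed(self) -> str:
        # Every subscriber's GET lands here, on the blogger's server.
        self.feed_requests += 1
        return "<rss>...full feed...</rss>"


@dataclass
class Subscriber:
    last_feed: str = ""

    def on_notify(self, publisher: Publisher) -> None:
        # Step 3: each listener fetches the feed itself.
        self.last_feed = publisher.get_feed()


@dataclass
class CloudServer:
    """Hypothetical stand-in for the server that relays update notices."""
    subscribers: list = field(default_factory=list)

    def ping(self, publisher: Publisher) -> None:
        # Step 2: notify every listener; the notice carries no post text.
        for subscriber in self.subscribers:
            subscriber.on_notify(publisher)


publisher = Publisher()
cloud = CloudServer(subscribers=[Subscriber() for _ in range(24_650)])
cloud.ping(publisher)  # Step 1: the blogging software reports an update.
print(publisher.feed_requests)  # 24650
```

One post produces one GET per listener against the publisher, which is the stampede described in step 3.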

At small levels of users the architecture is effective and elegant. At very large numbers it's missing an essential optimization. Only the "new blog" text should need to be sent... maybe with the RSSCloud event, for example.

An RSS GET will pull the whole string of recent blog posts for all 24,650 users. A lot of excess text that most users already have from being real-time listeners anyway.

The RSSCloud blogger's software needs to distinguish between an RSS GET for the recent blog text and an RSSCloud GET for the latest update text ONLY. That would reduce the amount of text being sent out, but it requires a change to the protocols as described, I think.
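A rough back-of-the-envelope version of that difference, with made-up sizes: the follower count comes from the example, but the number of items per feed and the bytes per item are assumptions chosen only to show the shape of the arithmetic.

```python
FOLLOWERS = 24_650      # from the example above
ITEMS_IN_FEED = 20      # assumption: the feed carries the 20 most recent posts
BYTES_PER_ITEM = 500    # assumption: a 140 char post plus XML overhead

# Every listener pulls the whole feed.
full_feed_bytes = FOLLOWERS * ITEMS_IN_FEED * BYTES_PER_ITEM
# Every listener pulls only the newest item.
delta_bytes = FOLLOWERS * 1 * BYTES_PER_ITEM

print(f"full feed:  {full_feed_bytes / 1e6:.1f} MB per post")
print(f"delta only: {delta_bytes / 1e6:.1f} MB per post")
print(f"ratio:      {full_feed_bytes // delta_bytes}x")
```

Under these assumptions the saving is simply the number of items in the feed. Note that sending only the delta shrinks the bytes, not the number of requests hitting the server.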

Of course, I could be way off base but I'm really trying to understand the overall architecture and the "realtime" problem this is intended to resolve for us all.

NOTE: If you federate the RSSCloud servers you just make the "GET" problem even worse. More demand on the blogger's RSS feed in a few seconds. It's like a user-driven "slashdot effect". Post a 140 char message and notify the cloud and boom... your server falls over.

I'll await corrections to my understanding.

NOTE: PubSubHubbub has an entirely different approach to the real-time optimization for bloggers. The Hub Server gets the blogger's new post text and the Hub Server forwards this delta to subscribed listeners. The Blogger's server never sees any excess traffic in or out. Of course, the PubSubHubbub service could require the resources of a Google, Amazon or Yahoo. A centralized service that could potentially have a "fail whale". Dave's RSS Cloud has a million "fail fishies".
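For contrast, a toy sketch of the hub approach as described in this note: the hub obtains the new content once and forwards it to each listener. The names `BlogServer`, `Hub`, `subscribe`, and `on_ping` are hypothetical and do not mirror any real PubSubHubbub library.

```python
class BlogServer:
    """Hypothetical stand-in for the blogger's server."""

    def __init__(self):
        self.hits = 0

    def new_entries(self):
        # Called by the hub, not by individual subscribers.
        self.hits += 1
        return ["Dave's new 140 char post"]


class Hub:
    """Hypothetical hub that relays new entries to subscriber callbacks."""

    def __init__(self):
        self.callbacks = []

    def subscribe(self, callback):
        self.callbacks.append(callback)

    def on_ping(self, blog):
        # Fetch the new content once from the blog ...
        entries = blog.new_entries()
        # ... then push the same delta to every subscriber.
        for deliver in self.callbacks:
            deliver(entries)


blog = BlogServer()
hub = Hub()
inboxes = [[] for _ in range(24_650)]
for inbox in inboxes:
    hub.subscribe(inbox.extend)

hub.on_ping(blog)
print(blog.hits)        # 1
print(len(inboxes[0]))  # 1
```

The blog server is hit once regardless of the number of listeners; the fan-out cost moves to the hub, which is the tradeoff the note describes.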

Life as always is rife with tradeoffs. Go figure. YMMV.

easp · on Sept 9, 2009

A good deal of your critique is a rehash of concerns people expressed about RSS in the first place. They generally did not come to pass because either they weren't real problems in the first place, they were easily addressed, or general advances in technology moved faster than their onset.

You are concerned about the inefficiency of fetching all the items in a feed when just one item changes. Is that a real issue? Consider your use case. How much data is really being requested? If it is a real issue, the server might want to limit the # of entries returned based on the If-Modified-Since header. As for the load of all that traffic hitting the server in the space of a few seconds, nginx can push a lot of requests on modest hardware and the load on whatever application logic is involved in generating the feed can be knocked way down by having it cache all feed requests for a second or two.
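Both mitigations can be sketched independently in a few lines. This is a minimal illustration, not how any particular blog engine does it: `entries_since`, `MicroCache`, and the `(published, text)` entry shape are all hypothetical, and a fake clock is used so the example is deterministic.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def entries_since(entries, if_modified_since):
    """Return only entries published after the If-Modified-Since value.

    `entries` is a list of (published_datetime, text) pairs (assumed shape).
    """
    if not if_modified_since:
        return entries
    cutoff = parsedate_to_datetime(if_modified_since)
    return [entry for entry in entries if entry[0] > cutoff]


class MicroCache:
    """Reuse one rendered feed for `ttl` seconds (hypothetical helper)."""

    def __init__(self, render, ttl=2.0, clock=time.monotonic):
        self.render = render
        self.ttl = ttl
        self.clock = clock
        self._body = None
        self._expires = 0.0

    def get(self):
        now = self.clock()
        if self._body is None or now >= self._expires:
            # Only this branch touches the application logic.
            self._body = self.render()
            self._expires = now + self.ttl
        return self._body


entries = [
    (datetime(2009, 9, 9, 11, 0, tzinfo=timezone.utc), "older post"),
    (datetime(2009, 9, 9, 13, 0, tzinfo=timezone.utc), "new post"),
]
fresh = entries_since(entries, "Wed, 09 Sep 2009 12:00:00 GMT")
print([text for _, text in fresh])  # ['new post']

renders = []
fake_now = [0.0]  # fake clock so the burst below is deterministic


def render_feed():
    renders.append(1)
    return "<rss>...</rss>"


cache = MicroCache(render_feed, ttl=2.0, clock=lambda: fake_now[0])
for _ in range(24_650):
    cache.get()
print(len(renders))  # 1
```

A burst of 24,650 requests inside the cache window triggers a single render. The two ideas are shown separately on purpose: a shared cached body is the same for everyone, while filtering by header varies per request.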

blasdel · on Sept 9, 2009

RSSCloud is useless, for increasingly ridiculous reasons:

  10e2  It's idiotically-designed
          (he thinks traditional SOAP posted to a resourcey URL is REST!)
  10e4  It doesn't help centralized aggregators scale at all
  10e8  It doesn't work with NATed clients
  10e16 It was specced/never-implemented/forgotten by Dave Winer 8 years ago

EastSmith · on Sept 9, 2009

You seem to like that word "idiot", don't you? In two days you used it for Dave Winer and Matt Mullenweg (http://news.ycombinator.com/item?id=810288). Who's next?

blasdel · on Sept 9, 2009

a) I'll take that back about Matt -- he's not an idiot -- he just writes code naively, and accidentally became a prominent BDFL after SixApart repeatedly threw away their own accidental prominence.

b) Dave isn't an idiot -- he's a poisonous egomaniac asshole that successfully wheedles his way into anything tangentially related to any of his pet projects, and proceeds to do whatever he can to be credited for other people's work, impede progress, and fuck people over. It's his designs that are idiotic.

jaymon · on Sept 9, 2009

The thing that really gets me with all this real-time hype is that people attribute the rise of Twitter to all the awesome desktop clients built for it so you can receive your tweets in real-time. The problem is they aren't actually real-time: all those desktop clients use polling, just with a really short interval.

If you really wanted your desktop RSS reader to be just as awesome and real-time as Twhirl then just set all the feed refresh intervals to 5 minutes, no RSSCloud needed.
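To put numbers on what a shorter interval buys, here is a small sketch that computes the wait between a post and the next poll. It assumes polls fire at exact multiples of the interval and that posts arrive at whole-second offsets; `delivery_delay` is a hypothetical helper.

```python
def delivery_delay(post_time, interval):
    """Seconds from a post until the next poll, for polls at 0, interval, 2*interval, ..."""
    return (-post_time) % interval


for minutes in (60, 15, 5):
    interval = minutes * 60
    # Try every whole-second posting time within one polling cycle.
    delays = [delivery_delay(t, interval) for t in range(interval)]
    worst = max(delays)
    average = sum(delays) / len(delays)
    print(f"{minutes:>2} min interval: worst {worst}s, average {average}s")
```

Under these assumptions the average wait is about half the interval, so a 5 minute refresh delivers most posts within a couple of minutes with no push protocol involved.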

bmelton · on Sept 9, 2009

Having done almost this exact thing, the issue is that 4 out of 5 large websites will have blocked you by your fifth poll for abusing their service.

I always thought that it was somewhat ironic that nobody had a problem with me refreshing slashdot.org every 10 seconds trying to get a first post, but that they had a very large problem when my RSS poll hit them more than once an hour (which is at least what their recommended interval used to be).

jaymon · on Sept 9, 2009

I was going to mention this but I decided it wasn't really necessary to make my point. I was just trying to show that no matter how hard certain people push real-time it usually still comes down to polling in a desktop situation. I think people have gotten it into their minds that 15-60 minutes delayed isn't real-time while 1-15 minutes is.

But I agree that most places will throttle you if you request their feed too often, so how are they going to feel when every one of their subscribers requests their feed within the same minute every time they make an update? I guess that's the big question.