I can't believe nobody has mentioned naive Bayesian text classification yet. It sounds like it could work wonders for Twitter. I'm much more likely to be interested in tweets with words like "hylomorphism" than tweets with words like "omglol", and a text classification algorithm could learn that if you trained it up some. It doesn't have to be perfect; it just has to improve the signal-to-noise ratio significantly.
I've looked into this a bit, albeit more in a spam-filtering context; tweets have very little text for naive Bayes to latch onto. 140 characters would be 20-30 words, tops. That is so few words that it is hard to move the prior very much, unless there are blockbuster words that almost always indicate a bad tweet; as the article suggested, "breakfast", "beer", etc.
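For anyone who wants to poke at the idea, here is a minimal sketch of a naive Bayes tweet scorer. Everything in it is an assumption for illustration: the class name `NaiveBayesTweetFilter`, the "good"/"bad" labels, whitespace tokenization, and add-one smoothing. It is not how any real Twitter client works. It returns log-odds, so you can see how far a handful of words moves the score away from the prior.

```python
import math
from collections import Counter


class NaiveBayesTweetFilter:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative only)."""

    LABELS = ("good", "bad")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    @staticmethod
    def tokenize(text):
        # Naive whitespace tokenizer; punctuation handling is left out.
        return text.lower().split()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def log_odds_good(self, text):
        """Return log P(good | text) - log P(bad | text)."""
        vocab = set(self.word_counts["good"]) | set(self.word_counts["bad"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.LABELS:
            # Smoothed prior, so an untrained label does not cause log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                if word not in vocab:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            scores[label] = score
        return scores["good"] - scores["bad"]
```

A positive score leans "good" and a negative one leans "bad". With only a few known words per tweet, the score is the prior plus a few small terms, which is the concern raised above.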
The article's whole premise was that tweet quality does not correlate well within an account; e.g., some marvelous twitter streams include breakfast tweets.
I think stat/learning classifiers could work, even with the brevity of tweets. However, as the author says, one person's gold is another's garbage. You'd need a platform/system that would allow for personal classifiers for each/every tweet consumer.
I've always thought of this problem in reverse for both Twitter and Facebook. I have certain followers/friends that are interested in my thoughts about programming and business and others who would care more about where I'm going this afternoon. It would be nice if there were different publishing channels I could use to publish to different friends.
You can do that on Facebook: when posting a status update, the drop-down box under the lock icon has a "Customize" option, which will pop up a dialog where you can select specific people or pre-created lists of friends for it to be visible to (the lists function essentially as channels, like "colleagues" or "family"). You can also set one of those channels as the default. I know a few people who do that fairly regularly to separate friends vs. colleagues vs. family updates. But it's a bit of a hassle and sort of buried.
For Twitter, I know a few people who have two Twitter accounts, a "work" and a "personal" one. Also somewhat of a hassle, though maybe actually less of one.
That's close, but not exactly the same thing as selecting people to publish to. If I set a post to go to the "Tech People" friend list on Facebook, none of my other friends can see it, nor can the general public. I'm okay with everyone seeing it; I just don't want to pollute people's news feeds with things I know they won't be interested in.
I don't think that's a solution because it assumes everyone would agree that X is for personal posts, Y for professional, etc. I don't think that kind of agreement will ever come about. I believe in filters but all the ones currently out there aren't at the right level of granularity. It's not about the service, the person, the list, or the search result. The update itself is the right granularity to consider for relevance (as this blog post shows).
Sure, that's one way to look at it. But the more people you follow (seriously, who in their right mind would follow 50,000 people!), even noise gets noisier.
I think a quick and neat workaround would be to assign "importance" levels to folks you follow. For example, @billgates might be a '1' whereas @amanda9875 might be a '5', so while you get every tweet @billgates sends out, you might only get every fifth message @amanda9875 tweets.
Sometimes I really miss Jaiku. There you could add multiple streams to your feed, but when following, could also decide to follow certain streams, or not. So I could follow someone's post, but not their flickr pics or delicious bookmarks. Comments on a post were separate from posts as well. Twitter makes this all one big mess.
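A rough sketch of that importance-level idea, assuming a level of N means "show every Nth tweet" and that unlisted accounts default to level 1. The class and method names are made up for illustration.

```python
from collections import defaultdict


class ImportanceSampler:
    """Show every Nth tweet from an account, where N is its importance level."""

    def __init__(self, levels):
        self.levels = levels            # handle -> N (1 means show everything)
        self.seen = defaultdict(int)    # handle -> tweets seen so far

    def should_show(self, handle):
        level = self.levels.get(handle, 1)  # assumption: unlisted handles are level 1
        self.seen[handle] += 1
        return self.seen[handle] % level == 0


sampler = ImportanceSampler({"billgates": 1, "amanda9875": 5})
```

Note that a plain counter like this drops tweets blindly, so the one tweet in five that gets through from a level-5 account is not necessarily the best one.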
Let's hope annotations will be used to add this filtering (although I haven't seen post client, which is an annotation as well, used for this).
That's about what I envisioned while reading the article. All you'd need is different sections on Twitter, and to choose which to subscribe to when following someone. All the clients would need to know is that you're subscribed to X's stream 0 but not their stream 1.
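Something like the following is all a client would need, as far as I can tell. The `stream` field on a tweet is purely hypothetical (Twitter has nothing like it), as are the dict shapes and the function name.

```python
def visible_tweets(tweets, subscriptions):
    """Keep tweets whose (author, stream) pair the reader subscribed to.

    tweets: list of dicts with hypothetical keys "author", "stream", "text".
    subscriptions: dict mapping author -> set of subscribed stream numbers.
    """
    return [
        t for t in tweets
        if t["stream"] in subscriptions.get(t["author"], set())
    ]


# Subscribed to X's stream 0 but not their stream 1.
subscriptions = {"X": {0}}
tweets = [
    {"author": "X", "stream": 0, "text": "Shipped a new release"},
    {"author": "X", "stream": 1, "text": "Eggs for breakfast"},
]
print(visible_tweets(tweets, subscriptions))
```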
I've never met the author of the article, but I assume he's the type of person who is bothered when his Reader unread tally switches over to the plus mark.
I mean, honestly, tweets are 140 characters or less. The average tweet takes a handful of seconds to read. Is my time so important that I can't spend a few minutes of my day learning what my friends thought was important to share with me? Must I tailor their interests down to only those that I deem relevant? Am I so bad at skimming content or choosing which content is worthy of in-depth inspection that I must have a computer do the editing for me?
Obviously, the answers to those questions are highly subjective and use-dependent. Personally, the only editing I need is the unfollow button. If I respect a person enough to want to hear what he has to say, I gladly take the risk that sometimes his output won't be immediately relevant. (As an aside, a year ago I met one of my favorite journalists. I asked how his dog was, since he had been tweeting about his new puppy. It was a nice ice breaker. I didn't follow him for dog-training updates.)
I should make a disclaimer: I'm not a heavy Twitter user. The ratio of feeds:twitters followed for me is something like 10:1.
I respect your sympathy for and willingness to read what those you follow deem interesting. But I think the author of the blog post is speaking of a problem that emerges at a larger scale when you follow more & more people. Your stream becomes bigger while the amount of time you have stays the same so your options are 1) unfollow people, 2) don't read everything, or 3) put in more time. Personally, I don't like 1 or 2 because I feel like I'll miss out on timely, relevant information and 3 isn't an option for most of us. Thus, the need for a solution like the one the blog post describes. Shameless plug: here's how I'm trying to fix the problem - http://slipstre.am/
I agree with all of that, except I've found 2 to be the most useful, especially with regards to how I manage Reader. My time spent in feeds has been a lot less hurried since I learned to stop worrying and love the Mark as Read button.
I just feel like I personally get information in so many different ways that if something is big enough, I'll see it in many places and am pretty much bound to read one of them. I can count on seeing any big tech story here on five different feeds, Google News (go-to time-killer on my phone on the rare occasion my feeds are empty), several twitter accounts, and probably here on HN (more like 15 feeds if it has Apple in the title.)
I went to your site but you lost me with the lack of content--maybe at this stage that's important to weed out the laziest of the "testers." Have you thought of doing just a quick before/after screenshot to show default twitter vs. your application, or is it too early for that?
Edited to add: er, my fault. My brain skips over flash almost automatically. I didn't watch your video.
If you are looking for harbingers and early insights they typically have a weak signal, so you have to sift through quite a lot of content / message traffic to find them.
Your use case is fine, it's just not the author's.
The author misses a trick (or I missed a mention of it). Filtering out by the client used by the third party helps a lot. You can immediately filter out tweets coming automatically or semi-automatically from systems like Gowalla, Foursquare, blip.fm, Sharefeed, last.fm, auto news posting services, or even just through the Twitter API. Anyone else who posts crap in a manual, deliberate way should just be unfollowed.
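A minimal sketch of that client-based filter, assuming each tweet arrives as a dict with a `source` key holding the plain name of the posting client. The key name, the dict shape, and the blocklist contents are assumptions for illustration, and a real feed may format the client name differently.

```python
# Clients named in the comment above; extend to taste.
BLOCKED_CLIENTS = {"gowalla", "foursquare", "blip.fm", "last.fm"}


def drop_automated(tweets, blocked=BLOCKED_CLIENTS):
    """Remove tweets posted by clients on the blocklist (case-insensitive)."""
    return [
        t for t in tweets
        if t.get("source", "").strip().lower() not in blocked
    ]


timeline = [
    {"text": "I'm at Joe's Diner", "source": "Foursquare"},
    {"text": "Reading about type theory", "source": "web"},
]
print(drop_automated(timeline))
```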
This isn't Twitter's garbage problem. It's the garbage problem of the people this guy follows. Seems to me that building out a complicated system for channeling different tweets would hardly be worth the resultant complexity to Twitter and their users.
Is it so much to ask to employ a little restraint in publishing, and on the other hand, a little taste in following?
I disagree. Some of the tweets I subjectively label as garbage might be legit. Hey, maybe some guy's mom is on twitter and wants to know what he had for lunch. It's not a people problem, it's a drawback of the platform that all of this different subject matter has to be broadcast in the same stream.
It's a people problem, but the problem is you. If you don't care about what he's having for lunch, then I'm not so sure what's so hard about ignoring his tweet. It's only going to be on your screen for a split second as you're scrolling through other stuff. Presumably anyone you follow would post more things that are worthwhile than not; otherwise, you should just find new people to follow.
One of my friends tweets quite a bit about Magic: The Gathering. I don't care about that. Somehow I get along just fine by not paying much attention to the posts I don't care about. I don't quite understand your dilemma.
What I intended to convey was that your mind already possesses a filter for this type of thing that you could never hope to reproduce in software. I just don't understand what's preventing you from unconsciously applying it, since plenty of other people seem to be doing perfectly fine at doing that.
I read all my Twitter on my iPhone, and if I could filter on the server, solving this problem could cut my daily reading from 20 minutes to 10, so to me that's a big deal.
I would post much more "garbage" if there was a decent filtering system. I only post substantive tweets as I'm worried I'll annoy people with useless stuff. Why would someone care about where I'm eating or what movie I'm watching if they don't even know me? So I keep it focused on purely technical topics.
I honestly don't see this as a problem. I don't read every Tweet that comes through my stream; it's kind of random access information. I look at Twitter every now and then, and if something interesting strikes me I look into it.
While filtering would be a good feature, I think saying it's a killer feature is going a bit far. I don't think the reasoning that you could follow twice as many people makes much sense from an ROI perspective. If you're interested in ROI, you wouldn't even worry about following people; you would create custom searches about topics you are interested in and deal with those. You would certainly get more info for your time spent on Twitter that way, and it would be much more focused. You can do this with basic tools like Tweetdeck search columns or even saved searches in the standard Twitter web interface.
"I honestly don't see this as a problem. I don't read every Tweet that comes through my stream its kind of random access information, I look at twitter every now and then and if something interesting strikes me I look into it."
Don't you feel like you miss out on interesting and relevant tweets with that kind of random access? Imagine if the proposed filters were in place. They would give you the best and most relevant of everything. Isn't that better than random access?
No. Because I follow like 350+ people, I don't have time to read through every tweet. Also, sometimes that tweet spam is relevant: you may notice someone is at some location close by via 4sq and meet up with that person, or maybe I care if my close friends had eggs this morning for breakfast (maybe not, but you get the idea).
Custom searches don't capture all of the good information that's out there because I may not specify the right keywords. It's ANTI-search that I'm personally looking for.
You are not going to catch all the good information that is out there with your follow list, so searching fills in the gaps (if you're interested in a subject area or something). Twitter lists help too (i.e., gather up all the people who are known to talk about subject X and put them in a list).
Exactly. I read every tweet I'm subscribed to, and therefore if someone has too much noise, I have to delete them, but as a result I miss the real gems these people post now and then.
I totally agree and was just contemplating the same topic. Spooky.
What gets me is even some of Twitter's own employees don't know how to use the service in a 100% useful manner. And you would be surprised who... Tweets like: "He said that?" or "Can't wait to see it" without references are GARBAGE. I also don't care that you are currently eating or thinking, in general, without learning SOMETHING in the process. They don't teach that at the company? In this age of videos, location-based tagging and pics, we have the power to do so much more than just talk.
Finally, I think this is an author-controlled issue. We have the power to dictate what we write/tweet and should respect our audience for listening/following us. Unless we don't understand the tool, in which case it's a training problem.
I, @super74, try to direct message any personal or conversational tweets to only those who know what I'm talking about. My public messages tend to lean towards disseminating information, sharing creative ideas and offering my opinion on public topics.
Although I'm not perfect and may err from time to time, I have occasionally looked back and have been somewhat pleased with the bulk of my tweets.
Excellent post! I came to the same conclusions in February and am working on fixing the Twitter garbage problem (what I see more broadly as information overload) with Slipstream: http://slipstre.am/
Would love your thoughts here or over email: arthur@slipstre.am
>A first approach is to simply filter out tweets by keyword. I think of this as anti-search: specify a keyword, and never see any updates containing that word.
>Keyword filtering alone could probably solve 1/3 of the Twitter garbage problem
If you are interested in particular subject matter, as opposed to genuinely interested in the people you follow, just run a search or advanced search. Both auto-update and provide filtering functionality.
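For reference, the keyword "anti-search" quoted above is only a few lines. This is a sketch that assumes tweets are plain strings and that matching is on whole words, case-insensitive; the function name is made up.

```python
import re


def anti_search(tweets, muted_keywords):
    """Return only the tweets that contain none of the muted keywords."""
    muted = {k.lower() for k in muted_keywords}
    kept = []
    for text in tweets:
        # Whole-word matching: "beer" mutes "Beer!" but not "beers".
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        if words.isdisjoint(muted):
            kept.append(text)
    return kept


print(anti_search(
    ["Eggs for breakfast!", "New post on hylomorphisms", "Beer o'clock"],
    ["breakfast", "beer"],
))
```

Whole-word matching avoids muting unrelated words that merely contain a keyword, at the cost of missing plurals and hashtag variants unless you list them.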
Would be nice too if I only saw an RT once, not each time someone else on my list RT's it.
Heck, why not update the RT'd by info to include several people - to make it clear. But please, only show me the thing once!
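Collapsing repeated RTs could look roughly like this. The item shape (`original_id`, `text`, `retweeted_by`) is an assumption for illustration, not Twitter's actual data format.

```python
def collapse_retweets(timeline):
    """Show each original tweet once and list everyone who retweeted it.

    timeline: list of dicts with hypothetical keys "original_id", "text",
    and "retweeted_by" (a handle, or None if it is not a retweet).
    """
    collapsed = {}  # dicts keep insertion order, so first-seen order is preserved
    for item in timeline:
        entry = collapsed.setdefault(
            item["original_id"], {"text": item["text"], "retweeted_by": []}
        )
        who = item.get("retweeted_by")
        if who and who not in entry["retweeted_by"]:
            entry["retweeted_by"].append(who)
    return list(collapsed.values())
```

Each tweet keeps the position where it first appeared, and the `retweeted_by` list gives the "RT'd by several people" display.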
On a slightly different note, it would be nice if Twitter would store the "read" status, so that my different clients know what I've read and what I haven't. If it were core Twitter functionality, then different clients could all sync together nicely.
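One simple way this could work is a single "last read" marker per user that every client reads and updates. The sketch below assumes tweet IDs are integers that increase over time and that everything at or below the marker counts as read; the class and its methods are hypothetical, not an existing Twitter feature.

```python
class ReadMarkerStore:
    """Server-side store of one 'last read tweet ID' marker per user."""

    def __init__(self):
        self._last_read = {}

    def mark_read(self, user, tweet_id):
        # Never move the marker backwards, so a stale client cannot
        # undo progress made on another device.
        self._last_read[user] = max(tweet_id, self._last_read.get(user, 0))

    def unread(self, user, timeline_ids):
        marker = self._last_read.get(user, 0)
        return [i for i in timeline_ids if i > marker]


store = ReadMarkerStore()
store.mark_read("me", 105)   # read up to 105 on the phone
store.mark_read("me", 102)   # an older update from the desktop is ignored
print(store.unread("me", [101, 105, 108]))
```

A single marker cannot represent "read this one but skipped that one"; that would need a per-tweet read set instead.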