Funny this is being posted the same day as Matt Cutts' HN post which is currently at #1.
Also, per exhibit one of the article: the first hit I get for "nfl jerseys", even with pws=0, is nflshop.com, the website that nfl.com links you to when you click on the "shop" link.
More bandwagon jumping about google spam being out of control? I like to think so.
Note that Cutts is talking about a different problem: sites that 'syndicate' content and end up ranking better than the original site. People have been complaining about this for years (usually third-tier bloggers who don't have much ranking power), but the perception is that it became a crisis in the last few months.
The morals are different too. Some people might not like eFreedom, but the fact is that StackOverflow is CC-BY-SA. Anybody who wants to repackage StackOverflow content in a different way is free to do that. I do think that StackOverflow should generally outrank eFreedom, but a site like eFreedom can potentially add a lot of value.
On the other hand, other spam sites are generating original crap content with their own crap content generation system... And if they aren't, they can switch to some other content generation method to get around duplicate content filtering.
(And speaking of which, duplicate content filtering of some kind is absolutely essential for a workable web search engine... It's not even a matter of spam. Building a search engine for one of the largest units of a large Uni, we found that there were many documents that were duplicated all over the place for all sorts of reasons, and that since the on-page factors are the same, these tend to form 'plugs' of search results that displace other results.)
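To illustrate the kind of filtering described above, here is a minimal sketch using word shingles and Jaccard similarity. The comment does not say which method was actually used, so the technique, the function names (`shingles`, `jaccard`, `drop_near_duplicates`), and the 0.8 threshold are all assumptions chosen for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or no shingle if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(results, threshold=0.8):
    """Keep the first of each group of near-identical documents, in rank order.

    `results` is a ranked list of (url, text) pairs (a hypothetical input shape).
    `threshold` is an assumed cutoff that would need tuning on real data.
    """
    kept = []  # (url, shingle_set) for every document kept so far
    for url, text in results:
        current = shingles(text)
        # Keep the document only if it is not too similar to anything already kept.
        if all(jaccard(current, seen) < threshold for _, seen in kept):
            kept.append((url, current))
    return [url for url, _ in kept]
```

Given a ranked list where the second document is a copy of the first, only the first survives, so a 'plug' of identical pages collapses to one result. The pairwise comparison is quadratic in the number of kept documents, which is fine for a result page but would need something like hashing-based signatures at index scale.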
At least they are impartial: today I saw osdir outrank Google on Google for a result from the archives of a googlegroups-hosted mailing list. But it is annoying.
"...but a site like eFreedom can potentially add a lot of value."
Genuine question here...are you talking specifically about eFreedom, and if so exactly what value does it add? When I've inadvertently stumbled in there, the questions and answers are an exact ripoff of SO, and I (and I suspect everyone else) just immediately click on the "from StackOverflow" link so all the responses in the original can be read.
"Simplification of the user interface. We show only the accepted answer (or highest voted answer if no answer has been accepted yet). We removed the sidebar, comments and vote counts in order to minimize distraction. This gets you to your answer and on to your project quickly."
There are a few more points listed there such as translation and displaying snippets of related questions that seem to show a genuine effort to help answer questions.
I think eFreedom nicely illustrates the problem Matt Cutts and the folks at Google face. The site appears to be playing by Stackoverflow's and Google's rules, possibly doing SEO better than Stackoverflow. If Google also ranks based on page load times, eFreedom might be helped even more, since the site lacks the majority of Stackoverflow's features and might load faster. So a programmer who has never heard of or cared about Stackoverflow, and is interested only in finding a specific answer that includes their Google query, might see nothing wrong with the eFreedom response. Suppose the majority doing Google searches preferred eFreedom based on measured clicks. Should the fact that Stackoverflow was the originator of the content guarantee them a higher page rank? What if Stackoverflow was slow? And stepping back from these two specific sites, how do you deal with that across all sites and their clones?
I avoid eFreedom links because I enjoy participating in Stackoverflow and use the other features, and despite the assurances of the folks at eFreedom the site still seems shady. It seems perfectly reasonable to me that Google would take steps to ensure Stackoverflow ranks higher. But that's just one case out of many. It seems like a tough problem for Google to solve across the Internet.
Thanks for the link, but sorry, I don't get it. The accepted answer on SO is right at the top, and if I thought eFreedom had a better interface we wouldn't be clicking through to SO each time.
To each his own, I suppose, but I'm not impressed by what appears to be their "value" - SEO - and agree the site seems a bit shady.
For one thing, eFreedom.com actually answers your question. This is different from ExpertsExchange (which promises you might get an answer if you fork over $, yeah right) or eHow which only sometimes answers your question, and if it does, does the worst possible job that could possibly be done.
Community sites, at least in their early phases, need to focus on getting people to put content in more than they need to focus on making it easy for people to get it out. Delicious is the classic example: it's a roach motel which makes it very easy to put your bookmarks in, but doesn't provide a useful browsing interface for your and other people's bookmarks (other than having a list of recently hot for various tags.)
Particularly in the semantic age I think there's a lot of room for remixing CC content to improve browsing and discoverability.
>For one thing, eFreedom.com actually answers your question. This is different from ExpertsExchange (which promises you might get an answer if you fork over $, yeah right)
Not sure if I'm taking you too literally, but Experts Exchange does answer your question without having to pay (scroll down).
I don't understand this. Are you saying that efreedom adds original content to that which they scrape? I avoid them like the plague, but my exposure has taught me that they are merely reprinting SO content with crappy formatting. Not much of a value-add in my eyes.
No, I'm not really defending eFreedom. However, I think that sites that are ~like~ eFreedom in some ways could be useful. For instance, large scale text mining could create things that are more than the sum of their parts.
For example, I think within 10-20 years at the most we'll have systems that can decompose text into facts and then reassemble it into 'original' text.
Actually there are link farms that are doing exactly that in order to appear to robots to be original text. However, it's just chunks of text "mined" into a mass of subject-focussed sentence fragments. That's the thing, the race to the bottom is: original content is scraped without improvement in order to pay someone else via ads, and original content is generated without regard to coherence in order to pay someone via ads.
Two ways to do this: you have good content either left intact (no value-add) or rearranged or otherwise structurally corrupted in order to appear to be a different/better answer (value-minus), or you have advertisers being led to believe their ads are showing on relevant content, when it's really just a jumble of random words loosely oriented around a concept. "The dog was dog walking. Dog food always is in the grocery store. RALEY's. It dogged him for years..." so on and so forth.
On one hand users are being defrauded and on the other, the advertisers/affiliates. There is no defense for eFreedom, nabble, mail-archive, and their ilk. They are bad people, bad for business and bad for the internet. I sincerely believe this.
Well, I'm not trying to belabor the point and maybe I haven't looked enough on eFreedom, but I don't see any original content there, it's just a scrape. Even the "related links" are scraped.
I still don't get the value in that. Where is the "content in" (which I take to mean content generation rather than duplication) that you mention?