It's the crisis of the hyperlink: it's diminished to the point where Google is afraid of losing its previously dominant ranking signal for good.
Personal publishing has moved almost entirely to platforms that `nofollow` links. And commercial sites do whatever it takes to avoid linking anywhere but back, deeper into their own site.
There have been some cases where I think people are behaving somewhat irresponsibly with "nofollow": the German Wikipedia, for example, applies it to all outbound links.
That's understandable from a spam-fighting perspective. But if search engines actually honor it strictly, it deprives the search index of vast amounts of collaborative filtering.
I actually just checked, and couldn't find nofollows on either HN or Reddit links. Are they somehow setting them via some method other than the actual link tag?
In any case, I was going to suggest that nofollow makes sense for the "new" queue, but it wouldn't hurt and possibly help to remove it once user generated content has reached certain milestones, e.g. user karma, or hitting the front page.
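Checking a page for nofollow links programmatically is straightforward, for anyone who wants to repeat the experiment. A sketch using only Python's standard library; the sample HTML below is made up for illustration:

```python
# Hedged sketch: audit how a page marks its outbound links,
# using only the Python standard library.
from html.parser import HTMLParser

class RelAuditor(HTMLParser):
    """Collects (href, rel) pairs for every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            d = dict(attrs)
            self.links.append((d.get("href"), d.get("rel", "")))

# Made-up sample markup standing in for a fetched comment page.
sample = """
<a href="https://example.com/a">followed</a>
<a href="https://example.com/b" rel="nofollow">not followed</a>
"""

auditor = RelAuditor()
auditor.feed(sample)
missing = [href for href, rel in auditor.links if "nofollow" not in rel.split()]
print(missing)  # → ['https://example.com/a']
```

Pointing this at a real HN or Reddit page (fetched with `urllib.request`, say) would show which links carry the attribute in the served markup, though as noted below, some sites vary the markup per user agent.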
A related source of great frustration: you put some good stuff on the web. Lots of people link to it, but almost entirely from social media platforms where every link is nofollow. Somebody else gaming the system with SEO makes a less valuable and useful thing, but obtains (by subterfuge or by diligently asking directly) some non-nofollow links. Your high-quality content then gets outranked in search results by the lower-quality content.
A reasonable improvement to this, to reinvigorate the hyperlink: social media platforms could stop using nofollow for links put in by users who themselves have built up a degree of reputation.
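A sketch of that reputation-gated policy. The karma threshold of 50 is an arbitrary illustration, not anything a real platform uses, and whether to keep the "ugc" hint for trusted users is an assumption:

```python
# Hedged sketch: drop "nofollow" once a user has earned reputation.
# KARMA_THRESHOLD is a made-up number for illustration only.
KARMA_THRESHOLD = 50

def link_attributes(author_karma: int) -> str:
    """Return the rel attribute a platform might emit for a user's link."""
    if author_karma >= KARMA_THRESHOLD:
        return 'rel="ugc"'           # vouch for established users' links
    return 'rel="ugc nofollow"'      # new accounts stay unvouched

print(link_attributes(10))   # → rel="ugc nofollow"
print(link_attributes(120))  # → rel="ugc"
```

The interesting design question is the one raised further down the thread: any visible threshold like this also becomes a target for people trying to game it.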
It really does seem hard to get visitors these days, even for original niche content. The first page of Google search results is half social/video/news carousels instead of genuine long-term content, and a lot of the other links point to repetitive articles published by generic big-name websites that never really add anything new.
The English-language Wikipedia used to take an... interesting approach to nofollow, not sure if they still do. External links to most websites were nofollow with the exception of cross-wiki links to approved wikis including links to Jimmy Wales' commercial website Wikia. So the net result was that the trusted sources which Wikipedia heavily relied on got no boost but Wikia got a substantial boost in ranking.
> couldn't find nofollows on either HN or Reddit links
HN nofollows job ads and posts under ~10 points (it's 5 in the open-source code, but HN seems to have been modified there).
Old Reddit includes rel="nofollow" but then removes it via JavaScript for some reason. New Reddit serves the old Reddit markup when it detects that Googlebot is asking.
> It's diminished to the point where Google is afraid of losing its previously dominant ranking signal for good.
Is Google afraid of losing that signal? I don't have any data to back this up, but my guess is that Google's reliance on PageRank is greatly diminished now, and they probably use search result click-through rate and website engagement data (from Google Analytics and ads) as the primary signals for website relevance and quality.
> but it wouldn't hurt and possibly help to remove it once user generated content has reached certain milestones, e.g. user karma, or hitting the front page.
Wouldn't that also give a lot more incentive to reach those milestones illegitimately?
I don't plan on implementing these, and neither should you, unless they get added as part of an official HTML specification. Maybe they are, but I don't see them on this list of link types [0]. Until they get formally specified, to me this is just Google doing Google things and fragmenting the web.
where "may" is defined in the usual way (the new values aren't in there, but the page hasn't been updated for over a year, so I'm not sure anyone still cares about it after so many extension values have already been added).
It's common to have user agent (in this case googlebot) specific rel values.
That's a fair point, thanks. I do still feel that the attribute is woefully underspecified (evidenced by the lack of the microformats list being updated, and the fact that it's a wiki) so in my opinion the least Google could do here is to shore that up. If they contributed to the wiki or formalized it a little bit (like the HSTS list or public suffix list) it would go a long way as far as how I feel about their additions to the attribute. I realize that's quite a lot to ask for "just" a few new link types though.
> To be fair, the WHATWG HTML "Standard" is also nothing more than a Wiki, or even just a collaborative space where everything can change at any time.
Well, unless you're waiting for the second coming, or for someone official to come along and bless an RFC, all standards are born of a collaborative space where everything can change at any time. Versions are just snapshots, and good luck if you want to stick with one forever (how's that TLS 1.0 server going to work out after March 2020?).
It feels like yet another way for Google to offload the hard work of indexing the web onto other people.
See also: microdata, and the monthly alerts I get from Google about the content of my sites being malformed, even though it validates just fine in a dozen other tools.
If this keeps up, eventually we'll have to log in to Google to provide it with the URLs of new content, and fill in all of the metadata about each page. All in the name of whatever buzzword Big G comes up with that month.
> If this keeps up, eventually we'll have to log in to Google to provide it with the URLs of new content, and fill in all of the metadata about each page
FWIW, most people who care about their search ranking would absolutely love a tool like this. The point of the automated crawl isn't just to find the information, it's because if you let people submit their own content to the index they'll lie about what it is.
IMO, if you mean that Google should check the linked-to content rather than relying on link qualifiers: Google has been doing a bad job of that in recent years, as most of the time I can't find useful material among an ocean of low-effort clickbait. OTOH, relying on publisher-supplied metadata won't solve this problem either.
Dare I say that a better indexed web is a public good that we all benefit from immensely.
Logging into Google and providing it with the URLs isn't analogous, because that only benefits Google; it raises a barrier to entry for competitors, which isn't good for us users.
You're confusing forward-compatibility (ignore markup with tags you don't recognize, which is awesome) with accepting invalid markup (which is stupid).
Complete agreement. Furthermore, bad actors out there will abuse "ugc": they will mark terabytes of robot-generated content as "user-generated". It will not be possible to rely on it as an indication of organic content or anything of the sort. Search engines won't be able to use that for ranking, for instance.
If I know that I'm traversing and processing facebook/twitter/reddit links, doesn't "nofollow" already indicate user generated content? (Even if some "nofollow" links do not do that, I can probably separate those based on their position in the documents.)
If you're going to treat specific sites specially, then go all the way in and be prepared to have your logic understand any and all aspects of their structure. Or else, don't bother.
I dunno, seems legit to me. Adding "nofollow" just means "I'm not accountable for this" -- it's overloaded to mean both "my site is linking here but I didn't vet it first" (the ugc version) and "this isn't actually me, it's an advertiser borrowing space on my site" (the sponsored version).
Your site/forum might want "credit" for being a sort of attention aggregation hub without necessarily taking responsibility for all the content your users post -- you may want a softer middle ground if such a thing becomes available. But in the advertiser version you're just renting out page real estate; your relationship with those links ends the moment the ad is displayed.
Giving you a way to differentiate between the two could let you benefit from ranking and placement that more appropriately takes your user content into consideration while better filtering out noise from ad content.
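That differentiation can be sketched as a small helper. The token names "ugc" and "sponsored" come from Google's announcement; the policy logic here, including keeping plain "nofollow" as a fallback for crawlers that only understand the old token, is an assumption, not anything Google prescribes:

```python
# Hedged sketch: pick rel tokens based on a link's provenance.
def rel_for_link(is_user_generated: bool, is_paid: bool) -> str:
    """Return a space-separated rel value for an outbound link."""
    tokens = []
    if is_paid:
        tokens.append("sponsored")   # advertiser renting page real estate
    if is_user_generated:
        tokens.append("ugc")         # posted by a user, not vetted by us
    if tokens:
        # Fallback so crawlers that only know the old token still
        # treat the link conservatively (rel accepts multiple tokens).
        tokens.append("nofollow")
    return " ".join(tokens)

print(rel_for_link(is_user_generated=True, is_paid=False))   # → ugc nofollow
print(rel_for_link(is_user_generated=False, is_paid=True))   # → sponsored nofollow
print(rel_for_link(is_user_generated=False, is_paid=False))  # empty: a normal, endorsed link
```

Since rel takes a space-separated list, this also answers the combined case raised below: a link that is both user-generated and possibly paid can simply carry both tokens.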
We are not here concerned with so-called computer 'languages', which resemble human languages (English, Sanskrit, Malayalam, Mandarin Chinese, Twi or Shoshone etc.) in some ways but are forever totally unlike human languages in that they do not grow out of the unconscious but directly out of consciousness. Computer language rules ('grammar') are stated first and thereafter used. The 'rules' of grammar in natural human languages are used first and can be abstracted from usage and stated explicitly in words only with difficulty and never completely.
1. I only recognise the Oxford English Dictionary, not those upstart Americans ;-)
2. Even the Americans say (from your link):
"Irregardless was popularized in dialectal American speech in the early 20th century. Its increasingly widespread spoken use called it to the attention of usage commentators as early as 1927. The most frequently repeated remark about it is that "there is no such word." There is such a word, however. It is still used primarily in speech, although it can be found from time to time in edited prose. Its reputation has not risen over the years, and it is still a long way from general acceptance. Use regardless instead."
That doesn't read as acceptance of the 'word' beyond the most technical noting that some people use it.
Dictionaries are historians of usage not legislators of language. At least in English, where we have no equivalent to the "Académie française" (suck it, Jonathan Swift).
rel="nofollow": Use this attribute for cases where you want to link to a page but don’t want to imply any type of endorsement, including passing along ranking credit to another page.
This means the meaning of 'nofollow' is changing? That seems a horrible idea. Previously 'nofollow' meant exactly that: "don't follow this link please, googlebot". Now it will mean "follow this link, but don't pass my site's ranking along to the destination." That's a VERY different use case, and I can't see all the millions of existing 'nofollow' tags being changed by site owners to any of these new tags. Surely a 'nogrant' or somesuch would be a better option, leaving 'nofollow' alone.
Question that wasn't answered: Should user generated content links, that may or may not be sponsored (suppose the site owner neither knows nor cares), be marked "ugc" or "ugc sponsored" or "ugc nofollow"?
Nofollow and its interpretation have always been interesting. Professional SEOs analyze sites' "link profiles": the mix of nofollow, follow, anchored, raw, etc. URLs that point to a site.
Consider a super-spammy site whose inbound links are all plain "follow" links and nothing else -- well, that's weird, and may be an indication they're a spammer.
So while this (quite likely correctly) states that "nofollow" links carry no search-rank weight, that's not the entire story.
I appreciate nofollow and ugc. I run a site with user-generated content, and worry about spammers overrunning the site with junk content. Having a way to indicate that a link is from someone else is useful for eliminating the incentives for spammers.