> Does anyone have any actual statistics or quantitative data on the quality of Google search results?
Google has. They use this data expertly to improve search. Common sense and technological advancement tell us that, quantitatively, Google search has become better year over year, on all their relevant metrics/cost functions.
And likely, exactly because it has become better for all its users in aggregate, it has had to become a bit worse for a certain group of power users. There we can only rely on anecdotes and personal experience, but those tell us it has indeed gotten worse.
Similarly, the web can become both worse and better. The really useful articles today are better researched, multi-modal, woven into a solid web of links, internet-first. Spam has also evolved. And "top 10 ways to do X" McContent outranks better articles, because that is what the majority of Google users want to see and click on. They truly have a better experience, while others' experiences suffer. It depends on what you measure.
> Common sense and technological advancement tell us that, quantitatively, Google search has become better year over year, on all their relevant metrics/cost functions.
lmfao. so you're telling me "quantitatively" that google search results have gotten better, without citing any data at all, but with an appeal to common sense and "technological advancement"?
what if i told you that search is an adversarial problem, and that it's possible for google's tech to be getting better slower than the aggregate tech power used to game google search is getting better? is this not a patently obvious possibility? it's not some kind of gotcha impossibility for google's tech to get much worse over time, even if they weren't hamstringing themselves by lots and lots of user-hostile changes which benefit google's interests rather than their users.
> lmfao. so you're telling me "quantitatively" that google search results have gotten better, without citing any data at all, but with an appeal to common sense and "technological advancement"?
Yes. If that sounds so unacceptable or strange to you, I suggest you try it. It works really well when reasoning about unavailable data, or when researching a field with a slow peer-review process.
> what if i told you that search is an adversarial problem
Then I get an adversarial reaction and a downvote from you.
> it's possible for google's tech to be getting better slower than the aggregate tech power used to game google search is getting better? is this not a patently obvious possibility?
Yes, that's plausible. It should be measurable quantitatively too. Can you cite some data on this? :)
We could compare to the available data on the quality of (HTML) e-mail spam filtering over the years, which has largely kept up. As pg said: spam is solved when skilled spammers start creating content which does not look, feel, or talk like spam. So web-content spammers are still on the first two pages of Google, but with content not classifiable as spam/content farm.
> which benefit google's interests rather than their users.
One of Google's interests is their user. But perhaps not the type of user you are. Studies have shown the value that Google's tech delivers its users per year. This value was in the thousands of dollars, and it has risen. Meanwhile, Google makes about tens of dollars per user per year, less for technical users who don't click ads or who block them.
> it's not some kind of gotcha impossibility for google's tech to get much worse over time
It really is, no way to mince it. Google search is funded by Google ad tech. Google ad tech has improved ML by a ton. To say Google tech is getting worse is to totally overlook the deep learning revolution, word2vec, transformers, BERT, etc. etc. etc. To state that is to reveal that you are ignorant of major technological advances of the past decade, looking at the issue only from the viewpoint of a single atypical Google-search user. What would you even do with quantitative search-engine-quality data?
Do you really think it is possible that Google runs an implementation test of a new ranking model and deploys it while all measurements, human labeling, and user tests show it is doing worse? Of course not! Only if you think you are smarter could you think that Google search changed and has gotten worse.
If Google does ML like the rest of the industry, model changes move their designed levers up, or those changes are not committed. So if Google were unable to improve search, then Google would look exactly like 2008 Google. The fact that it does not shows that either you or Google is wrong. If I had to make a bet...
> Yes, that's plausible. Should be measurable quantitively too. Can you cite some data on this? :)
It's a bit bloody rich for you to come out with that attitude when your entire point revolves around your opinion of 'common sense'
> Do you really think it is possible that Google runs an implementation test of a new ranking model and deploys it while all measurements, human labeling, and user tests show it is doing worse? Of course not! Only if you think you are smarter could you think that Google search changed and has gotten worse.
You seem to be confusing the concepts of 'Google have made their tech more profitable' and 'Google have made their search capability better'.
Of all things, I am sure this discussion thread did not add value to the community conversation, and as such serves as spam. Apologies, and let's hope the Google bot finds ways to ignore low-information content in a threaded forum. Maybe they could even locate the exact post which caused the derail, and apply some authority penalty to its author.
No, I used common sense, but I actually have the quantitative data that poster was asking about. Right now, I am doing exact keyword matches and trying to find myself. Will get back to you when my analysis is done.
> You seem to be confusing the concepts of 'Google have made their tech more profitable' and 'Google have made their search capability better'.
No, you are confused. Try Googling the article I was talking about. It talks of perceived value to the user. What value would you lose without access to Google Maps, Search, YouTube, Gmail, etc.?
Google made their tech more valuable. That sounds like an improvement to me.
> Google have made their search capability better
If you want an explanation for this obvious statement (rather than a counter-demand to explain how capitalism works), I suggest you first try to code a simple search engine. I think a 100-line Python script with some imports would do. Only then would talking about capabilities become possible.
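In that spirit, here is what the core of such a script might look like: a toy inverted index with TF-IDF ranking. This is a sketch of the textbook technique, not anything resembling Google's stack, and all document and query strings in it are made-up examples.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return [w for w in text.lower().split() if w.isalnum()]


class TinySearchEngine:
    """Inverted index with TF-IDF scoring -- a toy, not production."""

    def __init__(self):
        self.index = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_count = 0

    def add(self, doc_id, text):
        self.doc_count += 1
        for term, freq in Counter(tokenize(text)).items():
            self.index[term][doc_id] = freq

    def search(self, query, k=3):
        scores = Counter()
        for term in tokenize(query):
            postings = self.index.get(term, {})
            if not postings:
                continue
            # Rare terms (few postings) weigh more than common ones.
            idf = math.log(self.doc_count / len(postings))
            for doc_id, tf in postings.items():
                scores[doc_id] += tf * idf
        return [doc_id for doc_id, _ in scores.most_common(k)]
```

Everything this leaves out (stemming, phrase queries, link authority, freshness, spam resistance) is exactly where the hard capability questions live, which is rather the point of the exercise.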
I'm not buying the tribal argument, that I'm the wrong user for Google. The results are shite across the board regardless of tribe.
I think they've accepted that SEO has killed past ranking algorithms and are rebuilding rank from the ground up using ML, with the entire internet as guinea pig. All of the crap results we're weeding through now are grist for the ML mill.
I am literally typing exact phrases for content I know is there and not getting the results I should.
I think they've cut the cord with past algorithms, not an incremental update but a major one.
My thinking so doesn't make it so, however, so just my two cents.
> We could compare to the available data on the quality of (HTML) e-mail spam filtering over the years, which has largely kept up. As pg said: spam is solved when skilled spammers start creating content which does not look, feel, or talk like spam. So web-content spammers are still on the first two pages of Google, but with content not classifiable as spam/content farm.
I've actually noticed Gmail spam detection doesn't work as well in the last year or two. I get an obvious spam message maybe every other day. Image-only ads and everything.
This presumes that the metrics they optimize for are intended to represent usefulness to actual users and not, say, ad revenue. Even if they do intend to optimize for usefulness, this doesn't mean that they have metrics that accurately represent that.
I also think you're underestimating average users. Anecdotally I've heard my parents complain repeatedly about the incoherent, auto-generated, affiliate link spam that plagues product searches.
> This presumes that the metrics they optimize for are intended to represent usefulness to actual users and not, say, ad revenue. Even if they do intend to optimize for usefulness, this doesn't mean that they have metrics that accurately represent that.
They have multiple levers, of which user search quality is a big set. There are always trade-offs and a balance that must be found, one which aligns with company vision and strategy. Having these levers allows business decision-makers to direct focus top-down (on a certain set of users, on producing great ad numbers, etc.).
It is clearly hard and important to design these levers and find the right balance, given a rapidly changing company and user base. So a lot of expertise and power is invested in measuring the right things and finding the right balance (an incorrect/risky balance should also be adjustable with other levers).
So for me: either Google is trying really hard, but essentially failing. Or they have the best people in the world, with all the right context, designing these levers. While it is hard and they are sometimes wrong, I do not expect to contribute anything which may improve their lever settings. If someone does know how, Google would like to hire them.
So while it is true that accurately measuring things with proxies is really hard, and sometimes done wrong at companies, I do not think Google gets this wrong; at least, they are best of breed at it. If their metrics still caused long-term search-engine quality loss, that would show they do not know what they are doing. I think they know very well, better than me at least.
I would agree too that the balance of levers right now is in line with Google's strong market position. Search engine quality could take a small hit, if justified by extra AdSense income. But when search engine quality noticeably starts going down, all other metrics will suffer. You should have teams with a sole focus on improving quality. Other teams will have to realize that favoring their lever over the search-engine-quality lever must lead to worse outcomes for Google in general.
About product searches: I myself was not able to do this satisfactorily 15 years back. It improved. But we need to stop viewing things as a single lever, a single metric. To say search has become "worse" in general is to fall exactly into the trap of not accurately measuring, losing too much nuance/detail among competing objectives.
> So for me: either Google is trying really hard, but essentially failing.
If Google really, really wanted, they could find out why they more often than not include results that don't contain my keywords, even after I have put double quotes around them and hunted down and applied their verbatim setting!
After that they could think really hard about how relevant the text:
> and something someone said xyz.
>
>abc is next up and something something
is for someone searching for "xyz.abc"
Or maybe they could dig out an old cheat sheet with all the operators they used to support, and invite some old Googlers to secretly come in and teach about it, but that can wait until they get the basics working again.
I don't think this is necessarily true. Your standard desktop computer has gotten much easier for a casual user to navigate over time, but not any less robust for power users. This is because powerful tools for customization and building are still exposed to power users. Google has slowly stripped away many of these tools. One has to wonder why, and my guess would be that it's because they would expose either the unethical ways Google deals with results, or the failure of its search model.
- There is way more content to sift through, including video.
- There are way more Google users, including grandmas.
- Conversations have moved from discussion boards to walled gardens and chats.
- Google relies more on neural network embeddings, so it does a better job with full sentences and semantic similarity.
- Google relies on authority signals and incoming links to a website, so non-commercial, hobbyist, or controversial content ranks way lower.
- Websites rely on Google for income, so they start producing what Google and its readers want to see.
- Spammers rely on Google for income, so those surviving after decades have created massively successful link rings and spam production pipelines built on keyword search statistics.
- You were really good at Google searching years ago, and have a harder time updating and letting go of what worked for you. Easier to blame Google for this.
As for tips: anything academic, search on specific websites or Google Scholar. Anything technical/coding, search on StackOverflow. Anything cultural/commercial where you want a peer answer, instead of a salesman answer, search on Reddit. Try to join like-minded communities where you can ask expert questions, and research new things in your field. Exact keyword match still works by enclosing keywords in double quotes.
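The embeddings point above can be sketched crudely. Real systems use learned neural embeddings; plain word-overlap (Jaccard) stands in here only to show why a full-sentence query can score against a document containing no exact phrase hit. The query and document strings are invented examples.

```python
def exact_match(query, doc):
    # Classic double-quote behavior: the phrase must occur verbatim.
    return query.lower() in doc.lower()


def jaccard(query, doc):
    # Word-overlap similarity; a crude stand-in for embedding similarity.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)


doc = "how to fix a flat bicycle tire at home"

exact_match("repair flat tire bicycle", doc)  # False: no verbatim phrase
jaccard("repair flat tire bicycle", doc)      # 0.3: shared terms still score
```

An engine ranking on similarity instead of verbatim occurrence will surface this document for the reworded query, which is exactly the behavior you see when quoting keywords no longer guarantees exact matches.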
>Anything academic, search on specific websites or Google Scholar. Anything technical/coding, search on StackOverflow. Anything cultural/commercial where you want a peer answer, instead of a salesman answer, search on Reddit
This is a completely miserable experience, and walls off useful information into classes of people who "are in the know" about where the most relevant information exists.
And if you're that grandma searching for a birthday present for your grandson? Good luck. She's likely to be devoured by ads, if not an outright scam.
They asked for searching tips, not how to solve the problem of internet search. I have a few ideas for that too though.
Agreed on the miserable experience. Do you have any ideas on how to attack this? Perhaps Google started out with the right experience, but ads eventually toppled it. Perhaps Google never hit on the right experience. What gives?
>Anything academic, search on specific websites or Google Scholar. Anything technical/coding, search on StackOverflow. Anything cultural/commercial where you want a peer answer, instead of a salesman answer, search on Reddit. Try to join like-minded communities where you can ask expert questions, and research new things in your field
This is much more like what Ye Olde Webbe was like. Sites competed to build communities that were repositories of information. Things like Reddit tried to build a generic silo so that they could silo information there, which I think is a bad thing long-term.
The biggest problem, as I see it, is sites just give up on doing their own search. Not surprising, as search is a hard problem, but it plays merry hell with the democratization of the Internet to foist the problem off onto Big Corporation Inc. to do the heavy lifting.
A related problem is that many sites simply don't have what could be called a "webmaster" anymore. Everything is contracted out, or part of a subscription service, or otherwise disconnected from the owner of the site having full control. If you're a small business that sells locally produced products, you're never going to appear in Google or Amazon searches, even if you have an Amazon store. You can't afford a full-time webmaster just for your site, and all of the various platforms, like Wordpress/Shopify/etc, deal in such volume that these small businesses will be largely ignored.
The ISV model for products like AutoCAD is possibly a good route. A team of well-versed engineers and designers can build things, but you need a direct customer representative to get at the juicy meat of what the end-user needs. Apply this sort of model to search, and you can aggregate over larger swathes of customers.
> Anything technical/coding, search on StackOverflow. Anything cultural/commercial where you want a peer answer, instead of a salesman answer, search on Reddit.
Do you find this better? In my experience it’s nicer to just put stackoverflow, reddit, or (often, in my case) seriouseats in my google query. Reddit search in particular is pretty miserable.
There's an interesting juxtaposition between your last point (people aren't keeping up with the times and instead blaming Google) and the commonality in your tips (don't use Google Search, use special communities).
Zillow lost money because they were hit really hard during the pandemic.
This article does not mention that. Instead, the rest of the article deals in LinkedIn wisdom and hard platitudes, such as that it is not possible to build a good model on someone else's data (as if Zillow's even was).
Data scientists remarking on the Zillow fold are like psychiatrists remarking on non-clients, or engineers remarking on bridges built by others. They know nothing about the business, about the constraints, about how the estimates are consumed. They end up sounding silly, but without good information coming from Zillow, we assign value to their analysis purely on Twitter soundbite-ability and internet authority.
How exactly were they hit hard during pandemic? Their main income stream is the funnel of services they provide (or get a cut of) through lead generation (agents, title, insurance, etc...).
During the pandemic, house prices rose and so did the volume. If anything, they made out like bandits.
The Arxiv One Billion Paper Benchmark was released in 2011, and is commonly used as a benchmark for writing academic papers. Analysis of this dataset shows that it contains several examples of sarcastic papers, as well as outdated references to current events, such as Support Vector Machines. We suggest that the temporal nature of science makes this benchmark poorly suited for writing academic papers, and discuss the potential impact and considerations for researchers building language models and evaluation datasets.
Conclusions
Papers written on top of other papers snapshotted in time will display the inherent social bias and structural issues of that time. Therefore, people creating and using benchmarks should realize that such a thing as drift exists, and we suggest they find ways around it. We encourage other paper writers to actively avoid using benchmarks where the training samples are always the same. This is a poor way to measure the perplexity of language models and science. For better comparison, we suggest that the training samples always change to reflect the current anti-bias Zeitgeist, and that you cite our paper when doing so.
Experts are overfitting to the territory. It is the territory which changes (or expands), not their knowledge of it. There is a necessary reason for the decay in pheromone trails: adaptivity.
Compare an army scout on a Starcraft map. First they make the entire map visible. Then they start tracking details. They detect patterns: if it rained last week, this area will be flooded, and other routes are faster.
The army scout operationalized their expertise, made it valuable to others, and hence has something of value to exchange. Anything which attacks their operationalized expertise thus attacks their livelihood. They invested all this energy into materialization, and are prone to sunk-cost fallacy and economically motivated thinking. Don't patch their exploit!
If the hive has enough energy for another scout, then too much communication, and the current expert's history of failures, may negatively affect the new scout's own diverse map construction. There are multiple ways to victory, and it is better to know all of them in case one is flooded. But it is very risky for one agent to explore all these roads.
Imagine that any time the new scout wants to veer off the map, or when it has rained, the expert loudly exclaims: I already checked that road last year, you cannot veer off the map there. Also, take these roads, they are always shorter when it rains. Now last month the roads stopped flooding after rain, but the expert did not waste a failed exploration on discovering that, and the new scout never gets to try (they too do not want to waste energy on failures, but now contribute to enforcing the wasted/stale energy materialization of the outdated expert).
For the expert to admit their ideas are outdated is to admit their own loss of value. Like retiring a fighter who practiced a defense against a kick people stopped using in the 90s. Hardly ever do they step down themselves. They want to keep "continuous learning" and adding even more detail to their expert map. When the hive veers off the map, the expert knows they are at a disadvantage, their value no more than a newbie scout's. Ossification is in their best interest (but not in the interest of the hive).
The size of scientific fields may impede the rise of new ideas. Examining 1.8 billion citations among 90 million papers across 241 subjects, we find a deluge of papers does not lead to turnover of central ideas in a field, but rather to ossification of canon. Scholars in fields where many papers are published annually face difficulty getting published, read, and cited unless their work references already widely cited articles. New papers containing potentially important contributions cannot garner field-wide attention through gradual processes of diffusion. These findings suggest fundamental progress may be stymied if quantitative growth of scientific endeavors—in number of scientists, institutes, and papers—is not balanced by structures fostering disruptive scholarship and focusing attention on novel ideas.
A crafts teacher divvied up his class into two parts. One part he taught his expertise at vase making, down to the finest detail, in the context of art history. They were to be graded on a single vase, so failure was frustrating and a frightening loss of energy/hurt ego. The other part of the class was to be graded on the number of vases they made. The crafts teacher focused on teaching them to learn from mistakes, cut losses and start over, and to get better at the mechanics instead of the art. In the end, the part of the class graded on the number of vases made also created the highest-graded single vases. Failure was a necessary part of the initial exploration phase, allowing them to exploit unclaimed ground instead of trial-error-mimicking already-existing expertise.

There is a lesson there, I think. Deep Learning comes to mind, before and after its hype. Where the statisticians correctly calculated the flooded roads of overparametrization, engineers still charged ahead, and some actually came out alive on the other side, establishing a shortcut/conquered obstacle. Some statisticians still don't want to get their feet wet. And this is ok too! There may be more elegant ways which keep boots dry with similar outcomes. Let them find these.
Exactly. Trump weaponized this principle, making very rash and sudden moves during negotiations. This disadvantaged others, because they had trouble predicting his reaction to their own actions, and so moderated them in his favor, or "You're fired!". I think a US general took it upon himself to inform China they would not be attacked, no matter what Trump threatened, because he himself was unable to model Trump's mind. "My task at the time was to de-escalate."
Also why MAD does not work well against information warfare. Is the current polarization of culture and politics a natural outgrowth of American culture, the result of unwitting civilians being targeted by military black and grey propaganda, or an unentangleable combination of the two? Did the opponent push the button? Did we push it back in the 80s, and did they notice? Where exactly do we stand, and where do we draw the line, allowing countries to defend their (cultural) borders and feel safe, without the constant threat and fallout from offenders who act like children pushing their parents to see how far they can go?
Sometimes I suspect these larger-than-life scientists working on top-secret projects (Turing, Feynman, Shannon, von Neumann, Kolmogorov, Tesla, Satoshi) were actually collections of people working undercover, an Alan Smithee catch-all to launder intelligence and take credit while keeping it in the shadows. Like the unnamed people supporting Bobby Fischer in his match against Russia. Modern-day equivalents would be companies like Google, Dell, IBM, and Microsoft.
The moment the bomb became a button was the moment the physicists had to step aside and let the decision theorists step in. The bomb effectively became about how to make winning decisions. Decision science itself was weaponized, opponents worrying about the analysts on the other side, not the aviators: you knew those would follow orders and drop the bomb if instructed. You never fully knew what the instructions were going to be, but you wanted to find out. Strategic Cold War espionage and misinformation must have run wild.
Here is a public health official (the Surgeon General of the United States) saying much the same as OP:
Seriously people- STOP BUYING MASKS!
They are NOT effective in preventing general public from catching #Coronavirus, but if healthcare providers can't get them to care for sick patients, it puts them and our community at risk!
> There is little doubt they know what they're doing. The bigger question is whether it's in your best interest.
They knew what they were doing. But they did not know what they were doing. By not being honest and showing leadership, by politicizing science into scientism, by covering their asses for not buying enough PPE for the medics and soldiers, they heavily reduced trust in public institutions, and had to deal with the PR backlash of reversing an official stance mid-Pandemic, while whining about lab-leak "disinformation". Definitely not in our best interest.
The same general was made aware by email (released under FOIA) two weeks before taking to Twitter to complain about people wearing masks on an airplane, claiming the real risk was the flu. He probably did not read it. The ones who did read it got sidetracked by Trump, and warned their friends to wear masks and avoid cities days before even notifying the public of the "potential" for community spread... but let's not assume evil where no-skin-in-the-game suffices (who ever got real consequences for wrong or dumb stuff on pandemics before 2020? You could grow solely through PowerPoints and grants and Nature articles).
Maybe I wasn't clear. I'm aware public health officials flipped the script on masks.
Can you find ANY example of a public health official publicly stating how effective cloth masks are? Not that they ARE effective, but HOW effective? As in how much disease spread is prevented from wearing cloth masks.
I believe you will not be able to find any record of a public health official making a claim like "Wearing cloth masks reduces COVID spread by approximately X%."
My hypothesis for why you won't find this is because the % is extremely low, in the < 5% range.
I don't think a public health official averaged out all the research on masks and came up with a percentage. A lot of public policy was binary: masks prevent spread, but COVID can also spread through eyes and ears. We do not want the general public to think they are fully protected with masks, and they will never understand nuance, only clear directions. So: do not wear masks, they are not effective for you. Ivermectin, Vitamin D, and HCQ are 100% ineffective, so get your shot.
You should test your hypothesis. Masks clearly help against spread. But they are better at preventing spread to others than at preventing you from catching COVID. So a larger percentage of people has to wear masks for them to be effective. I protect you, you protect me. There is a ton of research from before the pandemic (if you allow me to extrapolate from influenza to a novel coronavirus with common sense, rather than wait for the randomized trial to finish), and now also a lot of reviews and aggregations of mask effectiveness.
I agree there is a lot of research pre-pandemic on the effectiveness of various types of masks in preventing the spread of virus. The consensus before the pandemic was that the spread prevention was so low that it could not justifiably be recommended for the purpose of preventing spread of virus.
If you believe that wasn't/isn't the scientific consensus, then I welcome you again to find any public health official willing to state on the record how much viral transmission they believe masks are preventing. I'll expand the challenge to include heads of top tier medical universities. And I'll even accept a range, like "between x and y%." You will not find such a thing.
> The consensus before the pandemic was that the spread prevention was so low that it could not justifiably be recommended for the purpose of preventing spread of virus.
Heh. This was the CYA they put out. Maybe it pays to publish only in the authoritative media channels, if that's all a large percentage of people read.
This was the public health consensus they themselves agreed on. Anyone who was in that room justifying not having people wear masks has not ridden public transit in the last 20 years. Wear a mask and try spitting on the floor or picking your nose.
Scientific consensus was that masks clearly work to combat pathogen spread; that's why they have been in use in hospitals for at least 100 years. There was even scientific research on SARS-CoV-1, comparing how many nurses got sick under different PPE policies, and how mask usage affected their recovery / long-SARS decline.
For a while, it was possible to tell the (government) status of a person by the level of PPE recommended to them. I think they ran their global pandemic surveillance systems searching for their own names and "N95 mask", and did not like the communities they were being discussed in. But not 100% certain, of course.
I bet you won't find more than a few, if any, public health officials willing to draw attention to their failures; many were fired or retired in 2020 after spending their lives in pandemic control. Though it must hurt to leave with such a legacy, I can live with that. If you still need them to decide whether your kid should wear a mask at school or not, I sincerely feel for you. Information gathering and judging what is true absolutely sucked these past years, and it would have been nice if at least authority could have been trusted more than, say, 4chan.
>Despite common use of cloth masks in many countries in Asia, existing infection control guidelines do not mention their use (13).
>Rates of infection were consistently higher among those in the cloth mask group than in the medical mask and control groups. This finding suggests that risk for infection was higher for those wearing cloth masks.
The contempt for the general public from these people bordered on the perverse. How can they sleep at night? "We acted with the best information available at the time, and followed the scientific consensus to make our policy". Not even close to the truth, but maybe it helps a bit.
The hypothesis in the article was that the cloth captures droplets/water vapor carrying virus, which is then held there for you to continually suck back out of the cloth while breathing. Without a mask it would have drifted away.
>If you believe that wasn't/isn't the scientific consensus, then I welcome you again to find any public health official willing to state on the record how much viral transmission they believe masks are preventing. I'll expand the challenge to include heads of top tier medical universities. And I'll even accept a range, like "between x and y%." You will not find such a thing.
>A range of new research on face coverings shows that the risk of infection to the wearer is decreased by 65 percent, said Dean Blumberg, chief of pediatric infectious diseases at UC Davis Children’s Hospital.
>The consensus before the pandemic was that the spread prevention was so low that it could not justifiably be recommended for the purpose of preventing spread of virus
Source? Mask wearing was quite prevalent in Japan and some other Asian countries.
>Despite common use of cloth masks in many countries in Asia, existing infection control guidelines do not mention their use (13).
>Rates of infection were consistently higher among those in the cloth mask group than in the medical mask and control groups. This finding suggests that risk for infection was higher for those wearing cloth masks.
These shenanigans show that the approval can't have been based on all the information. They should have delayed the approval, at least until the Phase III trial officially ended.
How many lives would that have cost, though, and for what benefit? The size of the sample that had already received the vaccine without issue made it basically impossible [0] for there to have been an unexpected danger. And without approval, the govt couldn't mandate federal employees take the vaccine, and that mandate has saved tens of thousands of lives.
I am not going to correct this statement; I'll just ask you please not to use the word "intellectual-yet-idiot" to describe other people ever again. IYI is a great new concept, and can be enlightening when correctly applied. Let us (pseudo)intellectuals cherish that value.
Yeah, after reading that article, I totally agree.
It might also be fitting to describe NN Taleb. The only one it does not fit, I think, would be Edward Snowden, the last cultural bastion of free speech and human rights protection.
I like how "memes" got laundered through these sites. Taking memes from adversarial communities and making them your own. Or blaming your own memes and campaigns on other communities.
There was, and still is, an admiration of the "hidden hand": manipulating a newspaper poll without its readers finding out about it, or with the blame hitting eBaumsWorld.
Memes started, and still start, in small ICQ channels. They were collectively dumped online to seed them. Other communities would adapt, share, replicate, up until the point that the real origins were obfuscated and the meme was from "the internet".
Small hacker groups on ICQ had applicants write troll scripts for acceptance. One of these was DuckRoll. DuckRoll consisted of switching out Wikipedia titles and article contents, and sometimes adding a picture of a duck on wheels. Bonus points for having someone else run the script (thinking it was a PHP guest-book script, a MySpace theme, or an innocent link to an answer to their vampire help questions [1]).
DuckRoll morphed into RickRoll. RickRoll was seen as more acceptable, since it contained the damage to a single individual. The principle was the same as with linking shock images such as Goatse: either you knew about the image and someone got you, or the internet really confused and upset you that day. You were probably taking it too seriously, and the prank did you a favor. The best pranks had a lesson.
99.9% of those Rickrolled won't trace it back, and will think it started on Reddit or Digg, or simply "the internet".