The author appears to have gotten the slide exactly backwards. She said the slide showed a query of “children’s clothing” that Google rewrites to be a “Nikolai kidswear” query so that it can sell more ads. But in reality, the slide is describing a fuzzy keyword matching system that takes a query of “Nikolai kidswear” and allows it to match ads with “children's clothing” keywords.
I’m surprised WIRED allowed such an obviously incorrect article to be published in the first place, particularly when it was by a known partisan (the article discloses that the author is a former Duck Duck Go executive with an obvious bias).
More precisely, the slide shows, for an advertiser who is bidding on the keywords “+kids +clothing” and has this sort of broad match enabled, three columns of examples of searches that would also match:
1. (because of [kids → children]) ads with keywords “+kids +clothing” would also match searches like “clothing for young child” and “newborn children's clothing”
2. (because of [kids clothing → kidswear]) ads with keywords “+kids +clothing” would also match searches like “nikolai kidswear” and “kidswear outlet”
3. (because of [clothing → apparel / outlet]) ads with keywords “+kids +clothing” would also match searches like “creative apparel for kids” and “kids outfits”
That is what the slide's title (“Advertisers benefit via closing recall gaps”) refers to: the gaps in recall (matching) are being closed, by being broader.
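To make the direction of the matching concrete, here is a minimal sketch of that kind of broad match. It is only an illustration: the synonym table, the compound table, and the function names are hypothetical and are not Google's actual system. The point it shows is that the user's query is left untouched; it is the advertiser's keyword that gets matched more loosely.

```python
import re

# Hypothetical synonym table, loosely following the three columns described above.
SYNONYMS = {
    "kids": {"kids", "kid", "child", "children", "children's"},
    "clothing": {"clothing", "clothes", "apparel", "outfits"},
}

# Hypothetical compounds: a single query word that covers several keyword terms.
COMPOUNDS = {"kidswear": {"kids", "clothing"}}


def tokenize(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower().replace("’", "'"))


def covered_terms(query):
    """Return the set of keyword terms that the query's words stand in for."""
    covered = set()
    for token in tokenize(query):
        covered |= COMPOUNDS.get(token, set())
        for term, variants in SYNONYMS.items():
            if token in variants:
                covered.add(term)
    return covered


def broad_match(keyword, query):
    """True if every '+term' in the advertiser's keyword is covered by the query."""
    required = {term.lstrip("+") for term in keyword.lower().split()}
    return required <= covered_terms(query)


for q in ["nikolai kidswear", "newborn children's clothing", "kids outfits", "kids toys"]:
    print(q, "->", broad_match("+kids +clothing", q))
```

Nothing in this sketch changes what the user searched for or what organic results they would see; it only decides whether an ad bidding on “+kids +clothing” is eligible to show for that query.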
The WIRED article misunderstood the slide, and was entirely based on the premise that if you searched for “children’s clothing” you'd get results for “NIKOLAI-brand kidswear” which is not true (and would indeed have been “startling”, not to mention obvious, if it were true). In fact, the organic (non-ads) part of the search results in Google are always completely independent of anything in ads, something that the Search team in Google have maintained for several decades as a fundamental principle.
> In fact, the organic (non-ads) part of the search results in Google are always completely independent of anything in ads, something that the Search team in Google have maintained for several decades as a fundamental principle.
Are you sure? There's an email from the ads team proposing multiple measures to increase the number of search queries so they can reach their target revenue. One of the mentioned measures includes "ranking tweaks."
Well, it's a line I've heard a few times over the years, and it's also being stated externally (https://twitter.com/searchliaison/status/1709726778170786297 says “The organic (IE: non-sponsored) results you see in Search are not affected by our ads systems.”) so I'm fairly sure it's still true.
That email thread you linked is between the Ads and Chrome (not Search) teams, is about the number of search queries (not the results of search queries), and “ranking tweaks” there refers to the ranking that Chrome uses to show the suggestions in the omnibox (address bar). (To get a sense of these “ranking tweaks”, try this experiment in a (new?) Chrome profile with default settings: type "flowers" in the Chrome address bar and don't hit Enter, and look at the suggestions: what mix of search suggestions, entities, and bookmarks/history do you see? Try again with other commercial queries like “insurance” and “mortgage”, and also some less commercial queries like, I don't know, “Minnesota” or “economics”.)
(And FWIW, I think that whole email thread actually shows Google in a “good” light relative to the popular impression here on HN as a company whose every action is some Machiavellian scheme to increase ads revenue: it shows that Chrome actually launched something to production before its negative impact on revenue became a concern, that Ads leads had to work hard to persuade them to either roll back or find some other way to undo the decrease in search query volume, that starting to include search query volume as a launch criterion would be a “cultural shift” for Chrome, etc: that Ads having an influence on Chrome is a rare occurrence.)
I disagree that it shows Google in a "good" light. Here is an excerpt from the emails from Ads to the Google Chrome team. From Jerry Dischler (Ads) to Anil Sabharwal (Google Chrome)
>"Thanks Anil (Google Chrome Lead) for pushing your team and being open to this whole line of thinking... We are short REDACTED% queries and are ahead on ads launches so are short REDACTED% vs. plan...The Search team is working together with us (ADS) to accelerate a launch out of a new mobile layout by the end of May that will be very revenue positive (exact numbers still moving) but that still won't be enough. Our best shot at making the quarter is if we get an injection of at least REDACTED%, ideally REDACTED%, queries ASAP from Chrome... I also don't want the message to be "we're doing this thing because the Ads team needs revenue." That's a very negative message. But my question to all of you is - based on above - what do we think is the best decision for Google overall?
>In that spirit, do we think it's worth reconsidering a rollback? Or are there very scrappy tactical tweaks we can launch with holdback that we know will increase queries? (For example, can we increase vertical space between the search box/icons/feed on new tab to make search more prominent? are there other ranking tweaks we can push out very quickly? Are there other entry points we haven't focused on that we could push on soon?) Just to be clear, the reason I haven't pushed harder on a rollback so far is because I don't want the message to be..."
That's the same as the link I was responding to, and that's why I wrote “relative to” — the Ads team pushing on Chrome for revenue shows Google in a poor light relative to an imagined world where Chrome never cares about Google revenue, but a good light relative to an imagined world where everything that Chrome does is for revenue or some short-term profit to Google, rather than what's good for users.
Why are people not talking about this document more?
“I also don’t want the message to be we’re doing this because the Ads team needs more revenue…but what is the best for Google overall?”
Clearly, the ads team has influence over search to the point of saying more-or-less screw culture and team morale, let’s do what’s best for Google overall which is hitting our quarterly targets.
reading through the email chain, it seems ads did indeed get its way, and the product was indeed made worse to drive revenue numbers – chrome team was unable to say "no" when pressured by ad team
And I looked at the article at the time, and then showed my righteous indignation on HN. An example of why it’s important to dig deeper before commenting.
Unfortunately par for the course these days. At least they did remove the article. Citing "editorial standards" is weak sauce though, the point of having editorial standards is so they can be applied by the editor before publication...
I have always found this to be a weird take from a forum largely populated by software people. Every piece of software I have ever used has been riddled with bugs and security defects. This despite the fact that we have dedicated sub-fields for both (security engineering / test engineering). So when a journalist notices a bug in their article and has to "retract a release", how is that not what we do every single day?
I think you might just be disappointed in humanity, rather than any particular agency or person. Humans come together to get the right answer eventually; not multiple times a day, every day, without fail. Sometimes the software doesn't do what you want. Sometimes that article has a factual error. Eventually we figure it out. It's not the end of the world.
In this case, it's probably justified to have gripes with the editors.
They got an opinion piece with an extraordinary claim about something the writer saw in a slide presented at a trial. She wrote the entire article about it creating theories and then how it needs to be stopped. Not to mention, she had previously worked with a competitive search engine. The central piece of fact checking was seeing that slide, that would have been the second question after "Are you sure you saw something like that?". The entire article hinges on that one slide she saw and what she understood from it.
This is not a case of missing a boundary condition. This is missing the central premise. At any level, it would be inexcusable.
PS: In a sense, as the artifacts are becoming public, I think more such confusions will surface in the coming weeks, where merely misunderstanding what is on a slide leads to a lot of rumors. Probably best that we have an initial case and some intuition that this can happen.
I don't think you should derive a universal QA standard from specific release practices borne out of a specific risk/reward situation that depends on things like your particular business model or lack thereof.
There's a wide spectrum of how fucked your software can be when you release it without it being a huge deal, like if I push a broken release for my hypothetical build system that jumbles the error messages a little that's probably fine, if I push a broken build of the google dot com landing page that's raising a few more sirens and I really shouldn't make a habit of it, if I deploy critically broken firmware to your pacemaker or your crewed rocket ship I should probably be exiled from the field. I imagine there's a similar spectrum for journalists and we can't really figure out where on the spectrum the article from the OP should be with just analogies to a different field. Complaining about a trend of publications slipping towards the yolo'ier end of the spectrum doesn't seem on its face hypocritical.
It's also an understandable mistake. The slide the GP mentioned doesn't provide a clear explanation and can be interpreted the way the original article did. It's no wonder multiple people didn't catch the mistake.
An article should not be published based on one person's interpretation of one slide. Basic journalistic standards would require having other sources that corroborated what the writer thought the slide meant.
Isn't one of the key elements of ethical journalism that you need to ask the subject of your article for comments? I have a hard time believing they asked Google to comment on this piece and didn't get "that is categorically false and if you publish it you will be hearing from our attorney" as an answer. So I don't really see how this one got through unless they didn't ask for comments.
This would be like the SWE equivalent of pushing to production without code review. It's not an excusable mistake like when a bug gets through.
I don't know about "need", but it's certainly common to reach out to them... But that doesn't mean that they respond quickly. Google does not have a rep for reliably getting back to people quickly or even at all, and when they do get back to people, they very often just say they can't comment, because there's no way for the people handling the press contact to reach a person who would know what was going on in a predictable manner. (Yeah, that sounds stupid. It is. And yet, nonetheless, you can regularly find that no one at Google can figure out who owns something or knows why it's the way it is.)
A difference being software engineering is a laughingstock no one expects much of, while worldclass publishing has a past excellence against which it's still frequently judged.
People expect their software to work perfectly nearly all of the time. And for the most part, it does. So not really sure who you think these people are who think it's a laughingstock.
We also have the notion that fixing bugs at design time is easier than in dev, which is in turn easier than in prod.
Publications can simultaneously be praised for “rolling back a bad build” while being criticised for letting that build roll out to prod in the first place.
Journalists suck. They pretend to be bearers of the truth, and yet go forth with horribly biased and inaccurate claims. Nothing like software which does not wield anywhere close to the same cultural power to control narratives as the news media.
The same way we expect "QA" to pick up those bugs before release?
Think about the opposite. Imagine we pick up a bug in a live application and update it citing "our QA standards". This magical word does not absolve us of the fact that we let a bug through to production. And it's implied that we'll do better next time to pick up this kind of bug.
The post you are responding to is making the point that even though we have QA, dedicated jobs or even departments for these things … We, as a profession, still fail in releasing bug-free software. And that’s 100% correct.
Is that really something you are challenging? Which pieces of popular software can you think of that are bug-free? And are you thinking of more than a handful?
They said "So when a journalist notices a bug in their article and has to "retract a release", how is that not what we do every single day?"
And they themselves were responding to someone criticizing media for using "editorial standards" as an excuse when they "retract a release".
And I explained how we are the same and supposedly strive to not release bad content/software, we just don't get to magically absolve fault with an excuse that "it doesn't meet our QA standards" as an analogy to "editorial standards".
Anywho, the whole thing breaks down when we have to beat it with a stick. It's a discussion, assume a charitable interpretation.
Retracting an article due to "editorial standards" is, to extend your analogy, like attributing the need for an emergency hotfix to "QA practices".
Editorial standards should stop bad stories from being published. Saying "we retracted this story because it doesn't meet our editorial standards" begs the question "why are you publishing things that don't meet your editorial standards in the first place?"
It doesn't take responsibility for or explain the mistakes in the article. It doesn't state that the article had factual errors. It's a frustrating cop-out. I sincerely hope this is a temporary measure while Wired gets a more comprehensive retraction put together.
Wouldn't it be better to add a correction at the top of the article instead of deleting it?
Deleting it I'd think would just fuel (current or budding) conspiracy theorists, as they can point to something that existed but got removed, rather than something that got corrected.
But in the end, it's probably a losing battle anyway...
I don't know if you've ever had the misfortune of interacting much with conspiracy theorists, but literally whatever happens it always proves their beliefs are correct.
I recall an anecdote by a mathematician about his approach to peer-reviewing proofs of theorems. He said he glosses over most of it, but very thoroughly checks anything the author claims to be "obvious" because it's the things that seem obvious that screw people up.
Had a discussion about Assange the other day (contentious I know but bear with me). I brought up how he had a show on RT, the Russian state media channel, mentioning that it’s pretty bad optics regardless of what one thinks of the dude. They immediately went “no they didn’t that’s got to be a CIA myth.” Hand to god that’s what they said.
I showed him the IMDb and Wikipedia pages for it and a clip. Without missing a beat - remember they literally had never heard this before and have already decided it’s misinformation planted by the CIA - he said “Well it’s an independent show he put on that happened to be shown on RT.”
Seriously! Like I said, I get it’s a very contentious issue and I’m sure people have varied feelings on this, but to walk into a conversation literally knowing nothing about the topic at hand and then hand waving it away so quickly…I mean there’s no point in even continuing the conversation at that point.
Assange knew every western government was likely to do literally nothing to defend him against the US. In his shoes you'd also probably be drip-feeding information and money to the Russians. Info which should be out in the open anyway.
Whether they removed the article or not, I think it’s ridiculous that WIRED didn’t explain why the article was retracted.
They should have mentioned 1) what the article got wrong and 2) that the article was written by a former Duck Duck Go executive who should know better.
Ya, I'd prefer if they just corrected the article, made a note at the top of the article about what was corrected, and have a disclaimer that it was written by Google's competitor.
Adding a correction at the top would just fuel (current or budding) conspiracy theorists, as they can point to something that was, in their eyes, forcibly corrected against WIRED's will.
Whereas the sources spoon-feeding them this attitude are, of course, unimpeachable. (They don't redact, or admit fault.)
I generally find that anyone expecting error-free perfection in any subjective field from an entire industry can be safely dismissed as a complete idiot. Intelligent people look at the processes that produce outcomes, not the outcomes.
(In this case, the process that produced the outcome was a former DDG employee mailing in an oped of incredibly questionable quality that criticized a DDG competitor, that was eventually pulled. Wired should look into reading op-eds before publishing them.)
Because journalism is not the way to solve this, and we all think that it will.
The way to solve it is for "investigators" to get access to said source code and give us an honest assessment. Get someone neutral, get NDAs in place, have them look at the source in an air-gapped clean room, anonymize DB access, etc etc.
Instead we go around in circles. Journalism at the end of the day then becomes "we got some info that points to X, but we have no proof, and we won't look for it. We'll just wait till someone, somewhere, at some point, who knows when, finds some proof and puts it out in the public, only then will we say it's conclusive. Until then, you all go nuts, especially the conspiracy theorists. Don't worry, we'll ride this gravy train and report on those nut jobs too!".
It's a mess, and I'm not at all surprised that conspiracy-theorists have a field day with it.
Hey, let's get a bounty up so that a whistle-blower from Google comes forward. Maybe Mozilla foundation can take some of that sweet sweet endowment they use for "Social Justice" and pony up to fund this investigative effort. After all they claim to be fighting for a free internet. I'd say this is a much better use of money than paying random DEI consultancies huge speaking fees.
That would be showing too much respect for or deference to conspiracy theorists, who aren’t really worth the trouble. Copies probably exist in the Internet Archive, archive.is and in PDFs people could have saved when the article was up so it’s not like it truly disappeared. WIRED just took down a bad article that shouldn’t have been published.
Transparency is the key to trust and journalistic integrity. Even an article that was mostly wrong/written from a failed premise should remain accessible with a timestamped correction statement explaining the problem.
Making your mistakes suddenly disappear as if they never happened is tempting, but it's going to leave users feeling gaslit, distrustful, or even just misinformed.
> the point of having editorial standards is so they can be applied by the editor before publication...
“Ain't nobody got time for that”.
The role of journalists is to provide facts, but the business model of media in the internet age is about “creating content” and “engagement” and they don't work well together …
Even providing facts is a subjective, opinionated process. Which facts are provided, which are left out, which are contextualized and which are not, and whether active or passive voice is used[1] can completely invert the meaning of a story.
[1] 'The gun went off and the shot hit Mr. Smith's head' versus 'Officer Sloan shot Mr. Smith in the head.' [2]
[2] Both of these are 'just the facts', but one of them blames the gun for going off, the other blames the person who made the gun go off. Passive voice versus active voice, very different presentation of the same facts, both are correct, and both are biased.
Was this piece marked as opinion or an editorial? Allowing 3rd party writers to publish isn't new, but back when we pretended we had standards, they were placed in a part of the publication that made it obvious it wasn't staff writers working within the normal flow of the editorial process. I must be old, because a lot of my comments are now "used to be", "back when", "remember back when" type stories.
Honestly, it just seems like something where somebody looked at it too quickly, got excited, and then didn’t look a second time. They are probably an experienced writer so everyone approving it went “sounds great.“ Honest mistake, but a bad one, and very unprofessional.
>I’m surprised WIRED allowed such an obviously incorrect article to be published in the first place
Is wired supposed to be a very accurate news source? I'm not surprised to hear bullshit from the media. This statement implies wired is supposed to be better than normal?
>EDITOR’S NOTE 10/6/2023: After careful review of the op-ed, "How Google Alters Search Queries to Get at Your Wallet," and relevant material provided to us following its publication, WIRED editorial leadership has determined that the story does not meet our editorial standards. It has been removed.
Sure would be nice if the material was provided to us too. Or write an article explaining clearly how the original info was misconstrued. Simply deleting it (edit: instead of replacing it with a clarification so anyone who goes to read the wrong info gets the corrected one) really sends a wrong message, both about Wired and Google.
That's a very fair point, and I've complained about that degradation on Google many, many times over the years on HN, but I also don't think that's what Google was responding to. The Wired article has been taken down, but I'm assuming it was arguing that Google would change organic results to get more ad clicks. Google is saying here that ad results and organic results are 2 independent, orthogonal systems.
Again, your point is one I strongly agree with, but it's also taken out-of-context with respect to the article Google was responding to.
I think there's also a case for neglecting the quality of organics, while investing highly in the quality of ad targeting. So while it's not quite as twirling-mustachio villain feeling, it has roughly the same effect.
Yep. It's obvious that they design for you to click ads, but it was fairly rocky suggesting that the backend reaches out to the ad system. This wouldn't just destroy results, but also run afoul of FTC ad disclosure requirements.
So is no one questioning that maybe the search subsystem talks to the "website score" (or whatever it's called in Google-speak), to determine if it's relevant? And perhaps that score is influenced by how many valid ads there are.
Maybe the score is simply highly-correlated with a site that shows many google ads? Think about it, someone that shows Google ads happens to also be very very keen on optimizing everything that affects their rankings in Google searches.
Hey Google, perhaps "displaying many Google ads" should be a negative on page-rank?
Regardless, the Wired piece had no evidence and didn't make a claim on the same basis as what you're suggesting. The opinion author mistook this feature by thinking it applied to organic results: https://support.google.com/google-ads/answer/10286719?hl=en
I'm not sure what point you're making. Is it just that I didn't say "currently visible portion of the page, the part you see without deliberate scrolling"?
That's the nice way to cheat with AI. It takes a probabilistic set of inquiries to discern the truth.
And right now, I'm in my state's capital, looking for a restaurant... Using Google Maps. And when I searched for restaurants, I was given an advertised 'optimized' curated set of restaurants. And within walking distance, they're missing easily half of the restaurants.
Now, which half are they missing? The small one-off places just aren't here on a search. HOWEVER, if you zoom in all the way, then those less-indexed restaurants FINALLY show up.
I'm sitting in 1 of those deindexed eateries right now. So yeah, the article's true and has been for a while now.
For context, the WIRED op-ed was written by a DuckDuckGo executive. I can certainly imagine the incentive there for DDG to misrepresent Google's behavior (and as I was reading the article, most of it struck me as speculation more than anything).
ETA: Okay, weird. The op-ed itself made it sound like she was a current DDG executive, but her LinkedIn profile states that she no longer works there (and is more or less self-employed now). No idea how to interpret that.
In any case, she was DDG's legal counsel and VP of policy for three years.
Mere lack of current employment does not rule out that the person might still have financial incentives tied to DDG. If anything, those incentives might be better concealed, and it's MORE likely they can publish with a conflict of interest, without legal risk.
You're right.
And on the other hand, someone with passion and knowledge about the domain will likely be working for one of its major players at some point.
Okay, but where is its correction? If you, as a professional publication that considers itself to publish news, unintentionally spread misinformation, you shouldn't just take down the old links, you should put out a new piece with the correct information (and give it just as much attention, don't hide it away).
The news came out, everybody got mad, and not half the people who read the original article (or headline) will revisit the articles and see the retraction.
Maybe the correction will come later, but I'll stay sceptical of anything Wired has to say until the correction comes out.
Everyone knows Google has been swirling the tank for over a decade for actual web search and AI will likely finish the job. The main vector I look at is how much control does the user have over the results they see? That has been in decline at least since Google+ took away the + operator.
Remember "did you mean" results? Now we get "let's assume you meant" with fewer and fewer ways to force "actually I did mean..." This is the obvious trajectory for any corp that needs to show quarter over quarter growth for 20+ years, at some point you hollow out from the middle until the whole thing collapses.
I remember when you could query Google for x and it would simply say it didn't have any results for x. It was such a great feeling to get a frank, honest reply instead of being bullshitted with results that didn't have x in them.
>I, and most other people using the Internet on a crappy mobile keyboard wouldn't.
Sounds like your problem is crappy UI enforced through attempts to minimize production cost at the expense of enabling the use of the human processing medium to effectively integrate with an electronic device.
My error rate typing on mobile skyrocketed with the loss of haptic feedback.
The real world problem was created by the removal of haptic feedback from the physical keyboard. No problem you had was solved. The only people whose problem was solved were handset manufacturers.
Don't knock ideals. Most of the world has a vested interest in convincing you they aren't possible. They most certainly are.
The current behavior shows results for hurricane katrina, with a reasonably prominent link to search hurricane karina. Given what (I’m assuming) is empirical evidence that the former was the intent a huge majority of the time, what’s unreasonable about this?
Though Google PR brings up "relevant ads" every time they're criticized for cyberstalking, they seem to care less about relevant search results. They insist on showing me pages upon pages of links that completely disregard my search queries.
In that case, have the fuzzy search be the default, and have an option to opt out of fuzzy search. The current state of Google search is always use fuzzy search - it cannot be disabled anymore. Even clicking the "No results found for X, did you mean Y" link submits a new fuzzy search query with Y.
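A rough sketch of what "fuzzy by default, with a real opt-out" could look like, using Python's standard `difflib`. The tiny `INDEX` and the `search` function are made up for illustration and say nothing about how Google's search actually works; the only point is that `fuzzy=False` has to mean "search exactly what I typed, even if that returns nothing".

```python
from difflib import get_close_matches

# Hypothetical toy index: known query phrases mapped to result titles.
INDEX = {
    "hurricane katrina": ["Hurricane Katrina overview", "Katrina timeline"],
    "hurricane season": ["Atlantic hurricane season"],
}


def search(query, fuzzy=True):
    """Return (query_actually_used, results).

    With fuzzy=False the query is used verbatim, even if nothing matches.
    With fuzzy=True a close known phrase is substituted only when the
    exact query has no results.
    """
    q = query.lower().strip()
    exact = INDEX.get(q, [])
    if exact or not fuzzy:
        return q, exact
    close = get_close_matches(q, list(INDEX), n=1, cutoff=0.8)
    if close:
        return close[0], INDEX[close[0]]
    return q, []


print(search("hurricane karina"))               # fuzzy default: falls back to a close phrase
print(search("hurricane karina", fuzzy=False))  # opt-out: honest "no results"
```

Returning the query that was actually used lets the interface say "showing results for X" while still offering a working "search instead for what I typed" link.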
It's a computer, that's what they're supposed to do. Precision and determinism is what makes them great.
This is like saying if you had say, a pocket calculator but hit a key at an incidental angle and the calculator then presumed you meant the next number over and gave you that answer instead. It's incorrect - that's not what computers are supposed to do.
Yes and no -- yes, theoretically the output should only be based on the input - if you input 1+3 instead of 1+2, the answer given should be 4, not 3.
Optimization exists though, and an interface and search algorithm isn't a simple calculator. Suggesting the correct term when you misspell or mistype something is precision -- it's both identifying the lack of results for your erroneous input, and suggesting the correct input to get the result you're most likely searching for.
That's literally the point of optimization. If Search was still the same as it was in the late 90's, Google wouldn't be able to do half the things it does.
Are you going to make similar gripes about autocomplete, or GPS that reroutes when you fail to make the planned/"correct" turn?
Comparing an intelligent and contextual search interface and result, with simple arithmetic, is a patently false analogy.
> That's literally the point of optimization. If Search was still the same as it was in the late 90's, Google wouldn't be able to do half the things it does.
That'd be great. The newer half of it is terrible.
> Are you going to make similar gripes about autocomplete, or GPS that reroutes when you fail to make the planned/"correct" turn?
Absolutely valid. I never use autocomplete as it is vapid and incorrect. Also I don't use gps routing because it does this.
The "smart" rotates and "smart" zooms around the screen ignoring my input isn't desired.
These systems presume the user is profoundly, unbelievably stupid and can't, for instance, understand cardinal directions.
It's why when you enter a url like "http://somesite.com.:80/" it will be like "well I think you meant https://somesite.com" and then just ignore all your very explicit protocol and port instructions and whisk you off to an https, even if it's broken and doesn't work or similarly if you explicitly select a subsection of a url that starts at the first character, it will invisibly tack on the protocol to the beginning of your selection to be helpful ... as if the user is helplessly befuddled and perplexed by the protocol syntax.
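For what it's worth, every piece of that URL is explicit and machine-readable. A small sketch with Python's standard `urllib.parse` (the URL is just the example from above; the `explicit_parts` helper is only for illustration) shows that the scheme, the trailing-dot hostname, and the port all parse out exactly as typed:

```python
from urllib.parse import urlsplit


def explicit_parts(url):
    """Split a URL into the pieces the user spelled out, without 'fixing' anything."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol exactly as typed
        "hostname": parts.hostname,  # trailing dot (fully qualified name) is kept
        "port": parts.port,          # explicit port as an int, or None if absent
        "path": parts.path,
    }


print(explicit_parts("http://somesite.com.:80/"))
```

So a client that silently swaps in https and drops the port is making a choice, not working around ambiguity in the input.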
These aren't optimizations or improvements. They're diffusive and reductive interfaces that disempower the user, they're everywhere now and it's why everything sucks.
Here's what you're advocating for in the physical world - a smart flathead screwdriver that can't be used to pry or wedge anything. In fact, if you try to do that it will have special built in motors and then work against your intentions, wobbling around looking for a flathead screw and then refusing to work if it can't find any.
Presuming the user is a completely incompetent clumsy dumbfuck and ONLY working under that modality is not an improvement. This has somehow become a core design assumption in the SV and it needs to die.
> It's a computer, that's what they're supposed to do. Precision and determinism is what makes them great.
> This is like saying if you had say, a pocket calculator but hit a key at an incidental angle and the calculator then presumed you meant the next number over and gave you that answer instead. It's incorrect - that's not what computers are supposed to do.
This deserves wide consideration. (Replying because hidden upvotes can't convey that.)
Not that I disagree with your calculator example, but "what computers are supposed to do" changed radically when AlphaGo hit the scene, changed some more with ChatGPT, and will rapidly become a meaningless notion going forward.
We don't have to like it, but we do have to accept it.
Disagree. These are ultimately still "do what I say" interfaces. You need to be specific and intentional, and they respond in pretty exact alignment to what you ask them to do. They're basically unconventional programming languages.
You can ask GPT absurd things and it will try to respond to your absurd request.
For instance, I asked it "What year in the 1900s had the most tuesdays" and it spit back a Python program that tried to figure it out. On Google, to compare things, I get "1900s" crossed out and the Wikipedia entry for the Ruby Tuesday restaurant chain as the first result, maybe because it's around dinner time.
The difference here is between that and the "guess what I mean based on crude demographic information and popularity" interfaces that ignore the user's clear intentions in favor of gross statistical markers. Think of how Google changes the position of images, shopping, maps, etc. in the results, presuming your intention from crude, vague guesses about the query, or of the search systems that seem to return only 50% of what you asked for, with the other 50% being simply what they think you want to see instead.
Those are not tools, they are broken trash. It'd be like if you had a knife that randomly turned into a spoon or a fork based on what time of day it is and what room you're using it in.
ChatGPT tried to do exactly, precisely what I said regardless of the fact that there is no answer since it's a list of years and not a single one. It's the "you told me to do X and I did exactly X" interface, the kind you get with a good tool.
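For the curious, the Tuesday question is easy to brute-force. This is a minimal sketch, not the program ChatGPT produced; it assumes "the 1900s" means the years 1900 through 1999, and the function name is made up for illustration:

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday() counts Monday as 0, so Tuesday is 1


def tuesdays_in(year: int) -> int:
    """Count the Tuesdays that fall in the given calendar year."""
    day = date(year, 1, 1)
    # Advance to the first Tuesday of the year.
    while day.weekday() != TUESDAY:
        day += timedelta(days=1)
    count = 0
    # Step one week at a time until we leave the year.
    while day.year == year:
        count += 1
        day += timedelta(weeks=1)
    return count


# Assumption: "the 1900s" is read as 1900..1999 inclusive.
counts = {year: tuesdays_in(year) for year in range(1900, 2000)}
most = max(counts.values())
winners = [year for year, n in counts.items() if n == most]
print(most, winners)
```

As noted above, the result is a list of tied years rather than a single year.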
Too much software is the opposite - basically as if someone walked up to a dinner machine and said they're vegan or kosher and the machine was like "American male. Eats cheeseburgers. Here's cheeseburger"
These modern systems just straight up ignore you in favor of some "big data" approach. It's trash.
Recently I slept through a silenced alarm on a Saturday because my phone decided that people don't want wakeup alarms on Saturday and Sunday without extra configuration. I had set it Friday night for the next day and it was off by default because of some grand assumption. Asinine... This stuff is everywhere.
I'd rather get "there are 10 results for bicycle clowns NYC" than a bunch of hits for clowns on motorcycles in NYC.
Google has been steadily increasing its "fuzziness" to the point that it considers words like "motorcycle" and "bicycle" synonyms. It's made it more and more difficult to get the results you're looking for.
They even do it to search terms you put in quotes.
This fuzziness is slowly getting worse in my opinion. I notice it seems like Google has gotten more and more willing to assume unrelated words/concepts are sufficiently interchangeable that it can happily return both in a search query for either … and I’ll be honest here… this single behaviour is the number one reason I’m on the edge of leaving Google search forever… doing a takeout export of my Gmail and fucking the hell off to only using iCloud for personal and Microsoft 365 for work and professional…
Google’s continued erosion of their core user-facing product “search” (the real core product is advertising, but that’s not what the majority of people interacting with Google are interacting with it for… and an argument could be made that Google’s only real product customers care about anymore is YouTube, but the quality of that experience fluctuates wildly depending on how stupid they are being any particular day due to asinine policies and abusive relationships where they seem to desperately want to destroy the goodwill of the creators that upload content like it’s some kind of fetish and they just have to know the creators hate them otherwise the job of working at YouTube isn’t satisfying…) is indicative of a complete failure to care about the core competency of the company… and companies that fail to care about their core competency are rotting hulks doomed to die…
When I search for "Java dispose HttpClient" I would definitely rather get no results (because you can't) than results on how to dispose an HttpClient in .NET, which is what I got when I did that on Friday.
This happened to me yesterday, I don't remember the query but I remember I was quite surprised because it never happens. Turns out I misspelled a brand/product name by one letter - Funny enough, I would have wanted them to correct my query in this case.
My favorite is when you fight and fight and fight, looking for something and then you finally manage to narrow the results down to what you wanted, and it says (right above the correct result):
"It looks like there aren't many great matches for your search"
Good job, Goog!
I remember when it was new, and one of the killer features was the dead-simple "innovation" of having AND searches instead of OR. To this day, I think that any search engine that queried based upon the exact provided search terms would eat their lunch. Probably not, but I would like it.
I'm not sure how the search engine you suggest would filter out noise. There is just an ocean of generated content that exists out there filled with just about every string you could imagine that attempts to get you to click on a link so they can show an ad.
Any page with ads, remove it from the results. I'm not saying this would in any way be in Google's interest, but I'd love to see a search engine that did this.
> I'm not sure how the search engine you suggest would filter out noise.
This seems orthogonal to the suggestion. Presumably the search engine would still filter out noise.
However, even if the other poster means they want the raw internet as well, there are no technical limitations preventing search engines from offering a 'filtered' or 'raw' option. They already offer filters based on what you're looking for (books, news) or to exclude explicit results.
This discussion is partly about how Google alters search results to increase revenue. It would stand to reason that they might also improve their stock price and search revenue by doing this, while also degrading search quality.
As for why their market share has not degraded as a result, it is likely that the folks currently prosecuting Google for antitrust violations have the best argument for why this might be so.
A stock price does not mean that everything the company is doing is perfect. It can reasonably mean « their search is getting worse but they’ve found very good ways to make more money from worse search so I’m very happy and will buy more stock».
1. GOOGL is the voting share stock ticker, not GOOG. GOOG is nothing more than Class C shares.
2. Search as a line item is more than 50%, but "search" there is not searching: searching itself provides little value, and ads are where Google makes its money.
3. A stock being near its all-time high is no indication that it's a company that is "swirling the tank". Often a stock does not decrease until after a series of bad earnings or a negative outlook. But with that said, there is no evidence at the moment that Google is not doing great.
The SEO attack surface is functionally infinite for AI, and I'm not sure how any search engine is supposed to be good and also have revenue. The only way to make money is to get people to give it to you, and it's hard to imagine a world in which either a) people will pay for the right to use a search engine or b) companies will pay you to rank them fairly.
> Everyone knows Google has been swirling the tank for over a decade for actual web search and AI will likely finish the job
I think that's their hypothesis too. I don't think it is a coincidence that we are seeing more lines of ads in search results than ever. They are squeezing the cash cow. I don't think this observation is subjective, but it could depend on your location. Do you observe the same?
+ was simply replaced with "". A very minor syntax change. The other stuff about wildly assuming you meant some unrelated, usually commercial, stuff though I agree with.
> The main vector I look at is how much control does the user have over the results they see?
This has been the trend in so much technology and I think that's why there's such a slow, grinding, complete distrust of Silicon Valley going on. When all this stuff was new and interesting, you did a Google search to find stuff. Sometimes, by law of probability, that stuff was indeed something you wanted to buy. Now, all queries are subject to change so they can put a product that paid them for exposure in front of you, completely irrespective of whether that product is relevant to your search or will solve a problem for you. Social media feeds used to be a chronological timeline of things your friends posted; now they are a selection, made by an algorithm you cannot interrogate, of the "most interesting" (judged on metrics you do not set and have no control over) things your friends have posted. Some of them today, some six weeks ago. And, between nearly all of them... is a product that paid for exposure, that is probably at least tangentially related to things you're interested in, but could not be, and more importantly, was not requested.
And, if you make the terrible, awful mistake of browsing any of these sites without an account, prepare for an absolute FIREHOSE of the worst, crummiest, most exploitative, total and complete bottom-of-the-barrel content, the widest possible net designed to ensnare anyone who passes by to watch a marketing grad in his 30's give 500 people in the developing world dental care, or whatever the fuck, set to copyright free music with as many ad placements as they can stuff into the thing.
Increasingly the Internet is not for us, it is certainly not by us, it is simply where you go when you are bored, the only remaining third place that people reliably have access to, and in true free market fashion, it is wall-to-wall exploitation. People selling their bodies because they can't get access to enough money to live, people desperately trying to sell things they've made because time cannot be utilized anymore without a financial benefit if you want to remain solvent, and of course, massive corporations posting billboards large enough to cover the sky, in every direction, every place. A new spot springs up, a new gathering spot that promises to be better, and it gains traction because everyone is so sick of the rest of it, and then in short order once enough people frequent it, the ads go up, the beggars appear, your friends are hidden behind a cylinder of recommended bullshit that surrounds you, and it joins the cavalcade of endless irrelevant nonsense that makes up the spaces you fled, just in time for the next one to pop up nextdoor promising it won't do the same.
> I've never seen a set of search results that felt like a different search.
They'll certainly ignore search terms, modifiers like quotation marks, and so on. If the result looks like what you meant to search for, that means they're doing their job right I suppose, but they are definitely only using your literal query as a suggestion.
(I did not get to read the original article, I have no idea if this comment of mine is off the mark with respect to the claims of that article)
Since I don't use Google search much anymore, and have my search history turned off anyway, I can't recall any specific examples from my own life. So, I searched and found internet threads like this one:
Where the complaint is that searching for "map accuracy" (in quotes) results in pages without that literal string in them, implying that Google is ignoring the quotes to give the user what it thinks is a better list of results.
But, when I try to duplicate this three year old problem, I can't replicate it, at least not on the first page of results.
And when I try to contrive my own example, searching for a literal phrase that would not exist in any web document (an example: "He beheld nonlinear radish-scented vestments") there are 0 results, which is exactly what I'd expect if Google was obeying the quotation marks.
So, this is probably not true anymore, and you've got me doubting my memory now, but I am still fairly certain that it has happened to many on multiple occasions. /shrug
I believe it's always been the case that quotes work. Here are 3 possible explanations for what you remember:
Google search will return results that match the quoted query only through invisible text. So you might get a result that seems to not contain your text, but if you check the source code or the DOM, the text will be there, hidden.
When checking for equality, search will ignore certain punctuation and HTML. So your text might not be there exactly, but once punctuation is stripped, it's there.
In the time between Google crawling the page and you viewing it, the page may have been edited.
That SO example may be a case of the 2nd reason. For example a page saying
<title>Map Accuracy</title>
<p>Map accuracy is a measure of...
technically contains the text "accuracy map" once you strip out HTML and normalize whitespace and case.
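A minimal sketch of that second explanation, using the fragment above. The regex tag stripping and the `visible_text` helper are stand-ins for illustration only; how Google actually normalizes page text is not public, so treat this as one plausible reading rather than their implementation:

```python
import re


def visible_text(html: str) -> str:
    """Crudely strip tags, then collapse whitespace and lowercase."""
    without_tags = re.sub(r"<[^>]+>", " ", html)
    return " ".join(without_tags.split()).lower()


# The example page from the comment above.
page = "<title>Map Accuracy</title>\n<p>Map accuracy is a measure of..."
text = visible_text(page)

print(text)
# The title's last word runs straight into the paragraph's first word.
print("accuracy map" in text)
```

The phrase only appears because the title and the paragraph end up adjacent once the markup is gone, which is why the match looks wrong when you view the rendered page.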
So why did they take the article down as opposed to only appending the note at the end of the article like they've done elsewhere?
It's long been rumored that Google coerces news outlets to publish (or not publish) certain topics lest they find themselves downranked or missing from search queries for a while or get penalized with AdSense.
With the removal of an article besmirching Google, I begin to wonder if there's any truth to those rumors.
Because in this case, the _entire premise_ of the article was wrong. You're welcome to wonder all you like, but in this case the obvious answer is the correct one. The author (Megan Gray) wrote a bad piece because she misunderstood how a technology worked, leaving Wired no choice but to take it down. Read the article, come to your own conclusion.
When I read that I was floored - not because I expected even remotely that it might be true, but because I couldn't believe just how far Wired's quality bar had fallen. It doesn't read like a piece written by someone who's familiar with how the internet works.
The article seemed very strange to me, so I'm not surprised it's been pulled.
(my overriding thought as I read it being "I'm sure google does assorted morally dubious things in this area but this really doesn't seem like their style of evil, and I suspect the article is either a misrepresentation or mistaken")
My point is that it's highly unusual for a news outlet to outright remove an article as opposed to editing it and giving a note explaining why. Or a note saying the article is not accurate but leaving it up for historical purposes.
My point is that it isn't unheard of for something sufficiently badly wrong, and that article cleared the bar hard enough that I was honestly surprised that wired published it at all.
I find it highly likely that Google -do- try to influence coverage, mind, but I don't think that was the primary factor in play here.
The article was obviously horseshit -- and I hate Google. But if your search for "running clothes" was getting replaced with "adidas running clothes" you would see it clearly in the results. Degraded as it is, I still get results that include the specific words I searched in most situations. That type of replacement would be immediately and obviously apparent.
Here’s a quick summary of the main points, I copied from the deleted Wired article [0].
Something important about it: this article, by Megan Gray, was in Wired’s Opinion section, which usually means the writer has more freedom to express their own opinions and Wired does not claim that it is accurate reporting. But still it was removed.
How Google Alters Search Queries to Get at Your Wallet
Testimony during Google’s antitrust case revealed that the company may be altering billions of queries a day to generate results that will get you to buy more stuff.
RECENTLY, A STARTLING piece of information came to light in the ongoing antitrust case against Google. During one employee’s testimony, a key exhibit momentarily flashed on a projector … [1]
This onscreen Google slide had to do with a “semantic matching” overhaul to its SERP algorithm. When you enter a query, you might expect a search engine to incorporate synonyms into the algorithm as well as text phrase pairings in natural language processing. But this overhaul went further, actually altering queries to generate more commercial results …
The “10 blue links,” or organic results, which Google has always claimed to be sacrosanct, are just another vector for Google greediness …
Google likely alters queries billions of times a day in trillions of different variations. Here’s how it works. Say you search for “children’s clothing.” Google converts it, without your knowledge, to a search for “NIKOLAI-brand kidswear,” making a behind-the-scenes substitution of your actual query with a different query that just happens to generate more money for the company, and will generate results you weren’t searching for at all …
Edit: my opinion is Google should respond to the accusations. Removal of the article without a detailed explanation looks real bad for Google and also for Wired
And not only is it incorrect, it is obviously incorrect. The website owners do not pay Google for clicks on the "10 blue [organic] links"; so it gives Google no business advantage to make them more commercial.
Since they have an ad monopoly they have an incentive to link you to stuff that shows ads, which will in all likelihood be theirs. Spammy medical sites with Google-run pill ads vs the Wikipedia page for the thing you searched for (often not even on the first page anymore for medical terms).
The author is not saying that. They're saying google manipulates what those 10 links are to generate more money (bc of reasons like those sites serve more google ads).
X as a website needs to go. It's wild that readers can "add context" that's patently false or misinformative, and then X will portray this as meaningful. If I was Google, I'd just delete all index links to X: "Did you mean 'Threads'?"
The implication is that you’re being unwittingly directly channeled to certain sellers sites, instead of to sites that link to various sellers, I guess.
So the path to a potential sale is shorter and you’re more likely to buy (less time to get decision fatigue), and certain vendors might be prioritized.
That sounds like an accurate summary of what she's implying, but her implication makes no sense. Levi's doesn't pay Google for directing people to their site (unless it's through an ad, of course).
(1) Why? Ads are charged per click/impression, not per sale.
(2) How would you feasibly create a link between every brand who advertises with you and every brand whose site you're trying to uprank? What happens when two different brands who advertise with you appear in the same results?
(3) Most importantly, is there any proof at all that Google is upranking organic links on behalf of brands who advertise with them? (I don't think there is.)
Isn't this the keyword rewrite that has existed for decades? I remember them showing us similar slides during orientation back when I first interned more than a decade ago. The transformation from "kids" to "children" or "clothing" to "apparel/outfit" improves search results by adding more potential hits.
I am willing to believe that, in this particular instance, there is no evil intent behind the rewrite of search terms.
Having said that, the fact that Google goes back and forth on whether they respect the use of quotes to search for literal results doesn't do them any favors.
If I wanted to do what the article claims Google did, I would do precisely what Google is doing. So no surprises there.
I think the only way to know this is to understand whether google’s rewrites also impact what ads are eligible to show.
E.g. if 50 advertisers are bidding on “children apparel” and only 10 advertisers are bidding on “kids clothes”, then rewriting to the query with more eligible ad impressions is misleading to both the user and the advertiser.
Certain keywords are more profitable than others. Google knows this. One sure-fire guaranteed way to increase revenue is to reduce the % of long tail searches (since long tail searches typically have no ads at all) by modifying queries and swapping words around.
If I were a desperate exec at Google, that’s the first place I would look if I was in a crunch to boost revenue short-term.
Rewriting makes sense if they do the same thing to the material being searched. It has been obvious to me that the ads do more generic matching than the search results but I do not consider that to be evil.
I just wish they would leave alone the modifiers we can apply to actual search results. (And, for Google Translate, allow specifying that adult results are acceptable. Normally, with a term that can be adult or not you only get non-adult results but it's easy enough to throw in an explicitly adult term to fix that. Throwing in extra terms doesn't work so well with translate--that means it's basically impossible to get the adult result for an ambiguous word. What's the dirty word for the male reproductive organ? English has no unambiguous word for this, thus the translation is impossible.)
>I think the only way to know this is to understand whether google’s rewrites also impact what ads are eligible to show.
The other way to know is to ask Google directly (i.e. ask the company, not the search engine) and for them to explain what they're doing and why, in the name of transparency. Google could do that. They won't, but they could.
It's become more and more aggressive and dumber - it'll take "bicycle" and happily return results about motorcycles, for example - which is fucking useless. If I'm searching for "metal bicycle fender", I don't fucking want results about motorcycles.
At some point they went even further and started doing it even when the words were quoted.
No one knows exactly how Google alters search queries except Google. That's arguably the issue. It's not transparent.
But the point of the Op-Ed, if I understand it correctly, was not to describe the precise mechanism of alteration. It was to raise awareness that Google is altering search to boost ad revenue and the changes do not necessarily lead to better search, but they lead to better ad revenue.
>EDITOR’S NOTE 10/6/2023: After careful review of the op-ed, "How Google Alters Search Queries to Get at Your Wallet," and relevant material provided to us following its publication, WIRED editorial leadership has determined that the story does not meet our editorial standards. It has been removed.
Kudos to everyone in the original thread on this article who immediately pointed out how confused and wrong this article was. It heartens me that Wired recognized this and retracted the article.
The transformation is {kids, clothing} -> [{kids, clothing}, {children, clothing}, {kidswear}, {kids, apparel}, {kids, outfit}] for the _advertiser_ trying to buy ads on a query.
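A rough sketch of that matching direction, assuming the simplest possible rule: a search matches the ad if it contains every token of at least one expanded variant. The variant table is copied from the transformation above; the tokenizer and the `ad_matches` function are hypothetical and far cruder than whatever the real ads system does:

```python
import re

# Expanded variants of the advertiser's keyword {kids, clothing},
# taken from the transformation described above.
VARIANTS = [
    {"kids", "clothing"},
    {"children", "clothing"},
    {"kidswear"},
    {"kids", "apparel"},
    {"kids", "outfit"},
]


def tokens(query: str) -> set[str]:
    """Lowercase the query and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", query.lower()))


def ad_matches(query: str) -> bool:
    """True if the query contains every token of at least one variant."""
    words = tokens(query)
    return any(variant <= words for variant in VARIANTS)


print(ad_matches("tj maxx kidswear"))
print(ad_matches("tj maxx"))
print(ad_matches("newborn children's clothing"))
```

Note that the user's query is never rewritten in this sketch. Only the advertiser's keyword is broadened, which is the opposite of what the op-ed described.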
"tj maxx" isn't the part being matched, it's the "kidswear" in "tj max kidswear" that is causing it to match to {kids, clothing} because of the aforementioned transform
Would certainly have been smart to say why it got retracted instead of highlighting non-existent editorial standards back when it got published. "This article has been retracted because it was full of inaccuracies, with the author misunderstanding the very basics of ad matching".
“Google demanded Wired take down my op-ed, pointing to a single page I saw in court for only a few seconds. Wired sent it to me, and I said I needed to see all the pages shown in open court. Google didn't send. Without speaking with me, Wired deleted my op-ed.” https://twitter.com/megangra/status/1711035362326298795
That story is extremely thinly sourced and shouldn’t have been published. But once published I really disagree with the decision to delete it.
If you publish an article you have an obligation to archive it. Put a huge flashing disclaimer at the top if you must, but people should be able to read it on your site.
Removal is going to needlessly fuel conspiracy theories. It increases the power of the original story. You’d think wired would understand that.
Personally, I dislike this approach by Google (from their response):
> "It’s no secret that Google Search looks beyond the specific words in a query to better understand their meaning, in order to show relevant organic results. This is a helpful process that we’ve written about many times."
This is patronizing, manipulative, and condescending - and 'organic'? They've also been boosting low-quality corporate media outlets to the top of their search rankings for some years now (Youtube search is far, far worse incidentally), as well - part of this 'daddy knows best' mentality, I'm sure.
Google does provide a 'verbatim' option under its tools heading but this is incompatible with time-restricted searching, although perhaps not on the advanced search page. Practically I find that to get what I want from Google (search results from a broad range of sources) you have to jump through many hoops and write complicated queries for no reason other than to avoid their enshittification defaults, and that's a time-consuming process.
It makes Kagi look more and more attractive, certainly - the time-saving alone might be worth the monthly fee.
The most likely explanation for the original article's false premise is that the author had a sketchy extension installed, similar to the ones that inject an affiliate code into every Amazon link.
The article didn't make sense to me at all on first read. It was not clear what Google was actually doing as the author tried to claim. I just gave up and moved on.
Labeling anti-bigtech leftists as conspiracy theorists is not a good look either. I bet the HN darling Cory Doctorow will write another tweet-essay about this [0]
Seems really strange to me to reject the notion of 'reading between the lines' when we know the motives of the company's leadership. We know they work with Google, we know they are interested in driving views and profits. Any interpretation that is consistent with their known motivations is worth considering.
Some things just aren't feasible to collect evidence for. You would need access to their private communications to find evidence. So you're basically giving a pass to any maleficence done in the dark.
What you are saying is that you are biased against Google. I doubt that you demand proof every time you see a retraction, and that you consider the publisher liars if they don’t show the proof. Right?
Also, Wired didn’t say the “proof” came from Google
The default position when commenting something on the internet is that it is opinion (this comment included). I don't think it has to be stated explicitly.
Ah, so when I tell you the Chicago Beers defeated the Miami Saints 37-12 in last year’s Super Bowl, it’s not lying or even incorrect, it’s obviously just an opinion?
"It is my opinion that Wired is not a good source of news" - opinion, expression of subjective preference
"It is my opinion that Wired is intentionally concealing a conspiracy behind closed doors for their own gain and the harm of the public" - baseless conjecture about objective reality and conspiracy theory-spinning, that just happens to start with "it's my opinion that..."
Conspiracy-mindedness (leading to viral spread of disinformation, and even to acts of violence and terrorism) has taken too much root in the modern psyche, hiding behind the stolen shield of "all opinions are valid", when in fact they are not statements of subjective opinion, but instead baseless assertions about objective reality.
I think many googlers will also confirm that google can and does edit and filter the search results returned. Heck just see DMCA takedown requests amongst other areas.
This is nothing more than google legal taking action to shore up their defense.