Hacker News | binkHN's comments

> Firefox evolved over years from a single-process to a multi-process browser and while this brought nice improvements to the processing and latency, creating new processes has a cost both in time and in memory. This talk will introduce the solution that is coming to Linux and has already been enabled by default on Nightly for a few months: ForkServer, a process dedicated to making fork().
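
The pattern the abstract describes can be sketched in a few lines (a conceptual toy, not Firefox's actual C++ implementation; the function names here are invented): one long-lived process pays the expensive initialization once, and every subsequent fork() inherits that state copy-on-write instead of starting cold.

```python
import os

def expensive_init():
    # Stand-in for the costly startup work (loading libraries, warmup, etc.)
    return {"config": "loaded once, inherited by every child"}

def fork_server(requests):
    state = expensive_init()             # paid once, before any fork
    statuses = []
    for req in requests:
        pid = os.fork()
        if pid == 0:
            # Child: inherits `state` copy-on-write, no re-initialization.
            os._exit(0 if (req and state["config"]) else 1)
        _, status = os.waitpid(pid, 0)   # parent: reap the child
        statuses.append(status)
    return statuses
```

Each call to `fork_server(["tab-1", "tab-2"])` spawns one pre-initialized child per request; the win is that `expensive_init` runs exactly once no matter how many children are created.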

> I plan to use this opportunity to install Linux on my old PC.

Use this opportunity to install Linux on your NEW PC, and then buckle in!


Give Linux a try. After seeing how ad-centered Windows 11 has become, I made the decision to wipe my drive and go full Linux, and I couldn't be happier. Is it perfect? No. Is it better for my workflow and caters to my more advanced usage? A big resounding yes.

Linux cannot replace Microsoft Office, but it's getting close. Most people don't use the full functionality of Microsoft Office, so LibreOffice and Google's online suite are good enough, but I still keep a remote Windows virtual machine (VM) around for those times I need Windows-specific stuff, and RDP into the VM. I look forward to the day Microsoft finally wakes up and ports Microsoft Office to Linux.


> Ctrl+Tab to switch back and forth between tabs instead of cycling them in order (not the default...

I recently started using Firefox, after giving up on Windows, and recently found out about this feature. It's a godsend for productivity, especially in a world where a lot of things are done in a browser.


> Alternatively, can the community come up with some interesting uses for all the machines?

Google has targeted this model with ChromeOS Flex.


> If they aren’t trying things, they would also then be accused of languishing in obscurity.

They are languishing in obscurity not because they aren't trying things, but because their browser's functionality is lagging behind the others.


Firefox doesn’t have a profit model to sell ads or user browsing behaviour like other browsers as far as I know.

I appreciate the languishing comment; at the same time, Firefox has features that seem to be a little unique to it out of the box. Spaces comes to mind.

Getting really good at one thing might be beneficial.

AI summarization seems to be more and more common in browsers. Maybe they’ll add it as a local feature once a model can comfortably run locally.


> Enter 2024 with AI. The top 20% of search results are a wall of text from AI...

I'll be the contrarian here and say I actually like Google's AI Overview? For the first time in a long time, I can search for an answer to a question and, instead of getting annoying ads and SEO-optimized uselessness, I actually get an answer.

Google is finally useful again. That said, once Google screws with this and starts making search challenging again, as it has been for years, I'll go elsewhere.


But "search" and "getting an answer to a question" are two different things, aren't they? I realize that the trend has been going this way for a long time - probably since Ask Jeeves started blurring the line - and this is indeed how a lot of people try / want to use search engines, but still... I wish that Google (and competitors) would have separate pages for something like "Ask Google" vs. traditional search (where I want to find a particular document or quality content on a certain topic instead of just getting a specific answer).

May I ask how old you are? I'm 38 and I've been trying hard to break my 10 year-old of the habit of just typing questions into search engines (or telling me to "Ask Google" whenever she asks me a question and I say, "Oh, I don't know").


Yes, they very much are two different things.

I loathe that products like Facebook, Messenger, Google Photos, etc. are turning their traditional "search" page/feature into a one-stop AI slop shop.

All I want to do is find a specific photo album by name.


They're perfectly capable of implementing all the same search operators as 1990s Yahoo and 2000s Google. It's a solved problem.

The issue is that they don't want to. They'd rather be a middleman offering you "useful recommendations" (that they may or may not sell to the highest bidder) instead of offering you value.


Agreed. So many times I have to append Wikipedia or Reddit to my search to get anything useful out of Google. So it can work. Google is clearly prioritizing junk over value.


Why don't you use Kagi?


I've been using it for a while now. It is marginally better, but not exactly night and day. It seems to struggle with intent at times, and others I just get the same bland results as the free engines. The privacy is a big plus however.


Probably because it costs money, and it also will likely succumb quickly to sloppification by experimenting with their own AI and having an unstable founder…


I've been using it for about two years and I haven't seen any sloppification. I see it as a feature that it's a paid service, because I hope that will be a sustainable model for them to keep it as it is. I think it's a no-brainer to pay for it instead of all the suffering people describe here. The founder remark I don't get.


Not to discourage you, but note it took a while before Google succumbed. Hopefully Kagi will hold out.


Right now Kagi is a better search engine than Google. Why should some eventual demise in the future discourage anyone? There is no cost of switching and you can start using it right away


> having an unstable founder

I can assure you that force is strong with me.


I'd never heard of it until I just Googled it... Is it a better experience compared to DuckDuckGo with bang operators?


Would recommend to just try their 100 free searches. Their results are good, but it’s hard to have an objective measure. For me, it’s the little features that make it worth it (and that they have a forum for feature requests, and a proper changelog).


Yes, it's great. I've been using it for about two years already and never had any problem. I've gone to search for something on Google maybe twice during that time.


I've been disappointed with pretty much all recent SEs (DDG being among the very worst). Having been an early Scroogle user, ixquick (startpage) and a few other ones I dearly miss, I've been using https://freespoke.com/ lately and find it tolerable.

I was using searx instances with reasonable results but many of them started failing recently.

Anyway, I hope everyone finds a good one. I fear things will only get worse though.


Are you suggesting that the 2000s Google codebase would do a decent job against today's SEO?


The biggest reason SEO is profitable is because low quality sites run display ads. That is the lifeblood, and intrinsic motivation, for these sites to even exist.

Google operates the largest display ads network. They literally *pay* websites for SEO spam, and take a very healthy cut off the top.

I wish people would stop acting like Google has been in a noble battle against the spam sites, when those sites generate Google billions of dollars a year in revenue.

The obvious question is, why would they ruin search for display? The answer is greed combined with hubris. They were able to double dip for years, but they killed the golden goose.

Everybody with a brain knew this would happen when they bought Doubleclick, and it took longer than expected, but here we are.


Today's SEO isn't the reason; it's simply more profitable for Google to give you terrible search results.

It makes no financial sense for Google to give you good search results and get you off Google as soon as possible.

Instead, if they make you spend more time on Google due to having to go through more crappy results, they can sell more ads.

Most people won't change search engine and will stomach it.

Until ChatGPT happened, which can save you the pain of having to use Google's search engine.


I think the search and ad codebases may not be explicitly commingled, but they are implicitly commingled. And they return worse search results than the early-2000s codebase did.


Both are explicitly related to the web at large. Google sells ads on more than two million sites, and those mostly aim to be the kind of sites that feature in search results. I'd say that the two code bases are related, by virtue of operating on the same data structure.

You remember the PageRank paper? It described how Google classifies each page on a scale from "something that links to good pages" to "informative page". Google and other search engines produce links to the latter. And since then, web site operators have had strong incentives to be on the "informative page" end of the range. Today I don't think the 2000s code could find a lot of pages of the former kind (well, outside Facebook).

It produced very nice results back then. It was/is good code, but good results need more than just good code, it needs good input too.


If they provided what you're asking for you'd leave the site and look at fewer ads.


I get very wrong and dangerous answers from AI frequently.

I just searched "what's the ld50 of caffeine" and it says:

> 367.7 mg/kg bw

This is the ld50 of rats from this paper: https://pubmed.ncbi.nlm.nih.gov/27461039/

This is higher than the ld50 estimated for humans: https://en.wikipedia.org/wiki/Caffeinism

> The LD50 of caffeine in humans is dependent on individual sensitivity, but is estimated to be 150–200 milligrams per kilogram of body mass (75–100 cups of coffee for a 70 kilogram adult).

Good stuff, Google.
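
To see why the rat/human mix-up matters, here is the unit arithmetic behind the figures above as a small sketch (assumptions: a 70 kg adult, ~100 mg of caffeine per cup of coffee, 200 mg per caffeine tablet; the function name is invented):

```python
def lethal_dose_totals(ld50_mg_per_kg, body_kg=70, mg_per_cup=100, mg_per_pill=200):
    """Convert an LD50 in mg/kg into a total dose, cups of coffee, and tablets."""
    total_mg = ld50_mg_per_kg * body_kg
    return total_mg, total_mg / mg_per_cup, total_mg / mg_per_pill

# Human estimate, low end (150 mg/kg): 10,500 mg total, ~105 cups, ~53 tablets.
# Rat oral figure Google quoted (367.7 mg/kg): ~25,700 mg, roughly 2.5x higher.
```

Under these assumptions, quoting the rat figure inflates the "lethal" amount by a factor of about 2.5 over the low-end human estimate, which is exactly the wrong direction for a safety question.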


Perhaps a more common question: "How many calories do men need to lose weight?"

Google AI responded: "To lose weight, men typically need to reduce their daily calorie intake by 1,500 - 1,800 calories"

Which is obviously dangerous advice.

IMO Google AI overviews should not show up for anything (a) medical or (b) numerical. LLMs just aren't safe enough yet.


I think even when the answer is "right" in some sense, it should probably come within the context of a bunch of caveats, explanations, etc.

But maybe I'm just weird. Oftentimes when my wife or kids ask me a question, I take a deep breath and start to say something like "I know what you're asking, but there's not a simple or straightforward answer; it's important to first understand ____ or define ____..." by which time they get frustrated with me.


> I think even when the answer is "right" in some sense, it should probably come within the context of a bunch of caveats, explanations, etc.

Funnily enough, this is exactly what the LLM does with these questions. So well that people usually try to tweak their prompts so they don't have to wade through additional info, context, hedging, and caveats.


So you are saying that Google should provide responses that are more likely to frustrate its users? ;)


If you’re aiming to lose weight safely, the rule of thumb is 3 lbs a week. At roughly 3,500 kcal per pound, that works out to an average deficit of about 1,500 calories per day. Max.

Obese people can lose a bit more under doctor supervision. My understanding is that it’s tied partially to % of body weight lost per week and partly to what your organs can process, which does not increase with body mass.


I don’t think absolute numbers are very useful here. You need around 5–10% reduction in calorie intake to get any weight-loss effect going, and I wouldn’t reduce by more than 20% (relative to weight-maintaining intake — it’s different if you’ve been seriously overeating) if you want it to be sustainable longer-term.

So for example if your weight is stable at 2500 kcal per day, I would start by reducing the intake by 250–500 kcal, but not more. If this works well for a month or two and then you want to lose weight faster, you can still reduce your intake further. You generally have to do that anyway even just to maintain the velocity, because weight loss also tends to reduce calorie expenditure.

First and foremost, you need to monitor your calorie intake against weight. Here is a useful text about that: https://www.fourmilab.ch/hackdiet/
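
The starting rule above can be written down directly (a trivial sketch of the 5–20% guidance; the function name and the 15% default are purely illustrative):

```python
def initial_cut(maintenance_kcal, fraction=0.15):
    """Suggested starting intake: reduce a weight-stable intake by 5-20%."""
    assert 0.05 <= fraction <= 0.20, "outside the suggested 5-20% band"
    return round(maintenance_kcal * (1 - fraction))

# Stable at 2500 kcal/day: a 10-20% cut means starting at 2000-2250 kcal/day,
# i.e. a reduction of 250-500 kcal, matching the example above.
```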


Your body will get more efficient at whatever exercise you do to make the calories work out. So over time you’ll either have to increase your exercise or rein in the calories a bit more to achieve a sustained result.


That is assuming that you do any exercise. But yes, and the method I linked to explains how to handle any such variation in calorie expenditure regardless of its cause.


I don't see the problem with the answer, and the question is already garbage. Plus, the LLM hedges its advice with precautions.

I get a pretty good summary when I paste the question into Google. It comes up with a ballpark but also gives precautions and info on how to estimate what caloric restriction makes sense for you within the first 3 sentences.

And all in a format someone is likely to read instead of clicking on some verbose search result that only answers the question if they read a whole article which they aren't going to do.

This seems like really lame nit picking. And I don't think it passes the "compared to what?" test.


The basic problem is it says reduce "by 1,500 - 1,800" rather than "to 1,500 - 1,800" (not that that answer is much better). Yes, it's a garbage question, but the first answer is unsafe in all circumstances. The simplest solution here is to show nothing.


The question is garbage. But people will ask it with their best intentions and not know it’s garbage.


If your calorie intake is just 1500 today, it is bad advice. If your calorie surplus is 1800, it is good advice.

But I wonder, were those few words the full response? Information hiding to prove a point is too easy.


A calorie surplus of 1800/day is ~190 lbs/yr. Is that something people actually do?


Yes. You can't just say that someone eating 1800 over the recommended 2000 will perpetually gain weight. Weight maintenance calories will depend on the weight of the person.

A 500lbs man will need to consume 4000kcals/day to not lose weight. Cutting 1800 of that is realistic and might be good advice on the LLM's part, so it really depends on how GP asked the question.


Funny thing is you can train a small BERT model to detect queries that are in categories that aren’t ready for AI “answers” with like .00000001% of the energy of an LLM.


That's (obviously) a bit of an exaggeration. BERT is just another transformer architecture. Cut down from ~100 layers to 1, ~1k dimensions to ~10, and ~10k tokens to 100, and you're only 1e6 faster / more efficient, still a factor of 10k greater than your estimate and also too small to handle the detection you're describing with any reasonable degree of accuracy.


I literally have DistilBERT models that can do this exact task in ~14ms on an NVIDIA A6000. I don’t know the precise performance per watt, but it’s really fucking low.

I use LLMs to help with training data as they are great at zero-shot, but after the training corpus is built, a small, well-trained model will smoke an LLM in classification accuracy and is way faster - which means you can get scale and low carbon cost.

In my personal opinion there is a moral imperative to use the most efficient models possible at every step in a system design. LLM are one type of architecture and while they do a lot well, you can use a variety of energy efficient techniques to do discrete tasks much better.
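
The gating idea can be shown with a toy stand-in (a hand-weighted bag-of-words scorer; the real thing would be a fine-tuned DistilBERT, and every term and weight below is made up purely for illustration):

```python
import math

# Hypothetical term weights; a trained classifier would learn these.
RISKY_TERMS = {"ld50": 2.0, "dose": 1.5, "lethal": 2.0, "overdose": 2.0,
               "mg": 1.0, "calories": 1.0, "weight": 1.0, "medication": 1.5}
BIAS = -1.5

def risky_query(query: str) -> bool:
    """Return True if the query should NOT get an AI 'answer'."""
    score = BIAS + sum(w for term, w in RISKY_TERMS.items()
                       if term in query.lower())
    return 1 / (1 + math.exp(-score)) > 0.5   # logistic score + threshold

def respond(query: str) -> str:
    return "search results only" if risky_query(query) else "AI overview + results"
```

The point is the shape of the system, not the accuracy of this toy: a tiny, cheap model runs on every query and only the ones it clears ever reach the expensive generative step.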


Thanks for providing a concrete model to work with. Compared to GPT3.5, the number you're looking for is ~0.04%. I pointed out the napkin math because 0.00000001% was so obviously wrong even at a glance that it was hurting your claim.

And, yes, purpose-built models definitely have their place even with the advent of LLMs. I'm happy to see more people working on that sort of thing.


I applaud you doing the math! Proves you aren’t an LLM :-D


> Which is obviously dangerous advice.

Same advice as my trainer gives me.


Did they say:

  reduce their daily calorie intake to 1,500 - 1,800 calories
  or
  reduce their daily calorie intake by 1,500 - 1,800 calories
These are very different answers, unless you’re consuming ~3,300 calories per day. These kinds of ‘subtle’ phrasing issues often result in AI mistakes, as both words are commonly used in advice but the context is really important.


Oh yeah! No, reduce to not reduce by. Though at the time I was eating a few things that had high calories that I didn’t realize so it would have been the same.


Your trainer advises you to reduce your calorie intake to between 200 and 500 calories per day? [0] That sounds very, very hazardous for anything other than very short term use, and (given the body's inbuilt "starvation mode") probably counterproductive, even then.

[0] Note that the robot suggested to reduce calorie intake by 1,500–1,800 calories, and the recommended calorie intake is 2,000.


People losing weight are probably eating more than 2000 per day to begin with. But if you go from 2800 down to 1500 you’re already likely to exceed 3 lbs of weight loss per week that is recommended without doctor supervision. If you need to lose more than 150 lbs in a year because you’re well past morbid obesity then you need staff, not just a food plan.


If you’re eating out at Chilies, you could easily be eating 3000 calories per meal.


I recall when Krispy Kreme came out with a donut shake that was 1800 calories for the large size. It’s crazy out there.


It’s not even advice, and it’s not wrong.


That explains why I haven't been losing weight!


I think that would explain why you’re starving, not how you’re not losing weight.


For whatever it’s worth, in response to the same question posed by me (“what is the ld50 of caffeine”), Google’s AI properly reported it as 150-200 mg/kg.

I asked this about 1 minute after you posted your comment. Perhaps it learned of and corrected its mistake in that short span of time, perhaps it reports differently on every occasion, or perhaps it thought you were a rat :)


Perhaps Google AI reads HN at work just like us.


    The median lethal dose (LD50) of caffeine in humans is estimated to be 150–200 milligrams per kilogram of body mass. However, the lethal dose can vary depending on a person's sensitivity to caffeine, and can be as low as 57 milligrams per kilogram. 

    Route of administration 
    Oral 367.7 mg/kg bw
    Dermal >2000 mg/kg bw
    Inhalation LC50 combined: ca. 4.94 mg/L
ref: https://i.imghippo.com/files/yeKK3113pE.png 13:25EST (by a Kagi shill ftr)


That’s the danger with thinking in terms of LD50.

That’s half the people in a caffeine chugging contest falling over dead. The first 911 call would be much, much earlier. I doubt you’d get to 57 mg/kg before someone thought they were having a heart attack (angina).


I also got similar and just tried, we are posting within minutes.

--

The median lethal dose (LD50) of caffeine in humans is estimated to be 150–200 milligrams per kilogram of body mass. However, the lethal dose can vary depending on a person's sensitivity to caffeine, and can be as low as 57 milligrams per kilogram.

    Route of administration  LD50
    Oral        367.7 mg/kg bw
    Dermal      2000 mg/kg bw
    Inhalation  LC50 combined: ca. 4.94 mg/L

The FDA estimates that toxic effects, such as seizures, can occur after consuming around 1,200 milligrams of caffeine.

There was a table in the middle there.


LLMs are non-deterministic by nature.


Is this really true? The linear algebra is deterministic, although maybe there is some chaotic behavior with floating-point handling. The non-deterministic part mostly comes from intentionally added randomness, which can be turned off, right?

Maybe the argument is that if you turn off the randomness you don’t have an LLM-like result any more?


Floats are deterministic too (this winds up being helpful if you want to do something like test an algorithm on every single float); you just might get different deterministic outcomes on different compilation targets or with threaded intermediate values.

The argument is, as you suggest, that without randomness you don't have an LLM-like result any more. You _can_ use the most likely token every time, or beam search, or any number of other strategies to try to tease out an answer. Doing so gives you a completely different result distribution, and it's not even guaranteed to give a "likely" output (imagine, e.g., a string of tokens that are all 10% likely for any greedy choice, vs a different string where the first is 9% and the remainder are 90% -- with a 10-token answer the second option is roughly 350 million times more likely with random sampling but will never happen with a simple deterministic strategy, and you can tweak the example slightly to keep beam search and similar from finding good results).
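
Checking the arithmetic in that example (10-token answers; one string at 10% per token, the other at 9% for the first token and 90% for the rest):

```python
p_a = 0.10 ** 10            # string A: every token 10% likely
p_b = 0.09 * 0.90 ** 9      # string B: first token 9%, remaining nine 90% each

ratio = p_b / p_a           # random (ancestral) sampling favors B ~3.5e8-to-1
# ...yet greedy decoding compares token-by-token and picks the 10% token over
# the 9% token at step one, so the far more probable string B is never produced.
```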

That brings up an interesting UI/UX question.

Suppose (as a simplified example) that you have a simple yes/no question and only know the answer probabilistically, something like "will it rain tomorrow" with an appropriate answer being "yes" 60% of the time and "no" 40%. Do you try to lengthen the answer to include that uncertainty? Do you respond "yes" always? 60% of the time? To 60% of the users and then deterministically for a period of time for each user to prevent flip-flopping answers?

The LD50 question is just a more complicated version of that conundrum. The model isn't quite sure. The question forces its hand a bit in terms of the classes of answers. What should its result distribution be?


Yes, that’s the main issue as ideally they wouldn’t be non-deterministic on well-established quantitative facts.


But they can never be. RAG gets you somewhere, but it’s still a pile of RNGs under a trenchcoat.


>> ideally


It’s just not possible. You can do a lot with nondeterministic systems, they have value - but oranges and apples. They need to coexist.


ideal (def. #2) = Existing only in the mind; conceptual, imaginary

https://en.m.wiktionary.org/wiki/ideal

(We’re allowed to imagine the impossible.)


Fair, I am loath to take away your dreams!


I get that your point wasn't this specific example, but it's perhaps not a very good example of being dangerous: getting that much caffeine into your bloodstream takes quite a commitment, and someone who knows the term LD50 is perhaps not very likely to think it indicates what is safe to consume. It's also not something you're likely to do accidentally because you've looked it up online and decided to test it.

In the most concentrated form, typical commercial caffeine tablets, it's half to one fistful. In high-caffeine pre-workout supplements, it's still a quantity that you'd find almost impossible to get down and keep down... e.g. a large tumbler full of my powder with just enough water to make it a thick slurry; you'd likely vomit it up long before much made it into your bloodstream...

I'm not saying it's impossible to overdose on caffeinated drinks, because some do, and you can run into health problems before that, but I don't think that error is likely to be very high on the list of dangerous advice.


Hmm my search returns “between 150 to 200 mg per kilogram”, which is maybe more correct?

Also, in what context is this dangerous? To reach dangerous levels one would have to drink well over 100 cups of coffee in a sitting, something remarkably hard to do.


> Also, in what context is this dangerous? To reach dangerous levels one would have to drink well over 100 cups of coffee in a sitting

some people use caffeine powder / pills for gym stuff apparently.

someone overdosed and died after incorrectly weighing a bunch of powder.

doubt it is a big leap to someone dying because they were told the wrong limits by google.

https://www.bbc.co.uk/news/uk-wales-60570470

as ever, machine learning is not really suitable for safety/security critical systems / use cases without additional non-ML measures. it hasn’t been in the past, and i’ve seen zero evidence recently to back up any claim that it is.


I don't doubt the news article on this, but even with caffeine pills/powder it's near half a fistful to get to LD50 judging by my caffeine tablets. It's not impossible to consume, but it'd be distinctly unpleasant long before you get even anywhere close to dangerous levels.

For my high-caffeine pre-workout powder, I suspect I'd vomit long before I'd get anywhere near. Pure caffeine is less unpleasant, but still pretty awful, which I guess is why we don't see more deaths from it despite the widespread use.

I agree with you that there really ought to be caution around giving advice on safety-critical things, but this one really is right up there in freak accident territory, in the intersection of somewhat dangerous substances sold in a poorly regulated form (e.g. there's little reason for these to be sold as bulk powders instead of pressed into pills other than making people feel more macho downing awful tasting drinks instead of taking pills).


I wonder if they’re thinking 200mg per kilo to trigger cell death. I have trouble believing a human heart surviving a dose of 50mg/kg. Half of them surviving four times that much? No. I don’t believe it.

Found an article about a teenager who died after three strong beverages. The coroner is careful to point out that this was likely an underlying medical condition not the caffeine. The health professional they interviewed claims 10g is lethal for “most” people, which would be 100-150mg/kg. That still seems like something an ER doctor would roll their eyes at.


Your example doesn't interact with the chicken littling in this thread.

> The hearing was told the scales Mr Mansfield had used to measure the powder had a weighing range from two to 5,000 grams, whereas he was attempting to weigh a recommended dose of 60-300mg.

Nothing to do with an LLM nor with someone not knowing the exact LD50 of caffeine. Just "this article contains someone dying of caffeine overdose, and we're talking about caffeine overdose here, therefore LLM is dangerous."


> some people use caffeine powder / pills for gym stuff apparently.

At 200mg per pill, which is the strongest I had, I'd still have to down some 70+ pills in one go. Not strictly impossible, but not something you could possibly do by accident, and even for the purpose of early check-out, it wouldn't be my first choice.


An accident with it in powdered form is possible - people who use them are often used to pre-workout supplements tasting awful, and so might be prepared to down it as fast as possible - but it's a big enough volume of powder that it really is a freak accident.

And if on purpose, using caffeine would just be staggeringly awful...


the problem isn’t someone’s intent (on purpose/by accident).

it’s intent (want to improve my gym performance so down a bunch of caffeine) combined with incorrect information gained from what is supposedly a trustworthy source (the limit presented is much higher than it actually is for humans).


If they're searching for LD50, they're already setting themselves up for errors, even with the right information. The LD50 isn't a safe dose, after all, but the mean lethal dose. While it's not great if its wrong, if people search for an LD50 thinking it indicates what they can safely take, it's already going to be hard to protect them against themselves.


This is why we let the pros do compounding. Slip a decimal point and you can kill yourself with many substances.


Even that seems high. I don’t feel good with 200mg per human, not per kilo. I can’t imagine drinking ten times as much and not being in the ER. A hundred times that much? No fucking way.


Yes, Google's AI chatbot confidently claimed yesterday that US passports have a fingerprinting requirement, which is absolutely not true. These things can't be trusted to emit even basic facts without somehow screwing it up and it's frankly depressing how they are worming their way into almost everything. I hope this particular hype train is derailed as soon as possible.


It won't stay long. You are now in the pre-ad phase. As soon as the ads are integrated, the answers will become worse.


They have something Google never had: a paid tier.

There are plenty of revenue models aside from ads.


The excluded middle here is a paid tier that nevertheless serves you ads :(


Google was originally fairly egalitarian, OpenAI never was, and never will be. For better or worse.


Streaming services have been introducing ads in their lower paid tiers. It will come eventually.


MS already talks about ads for Copilot


Obviously the future is to train the model with the ads, so that they're indistinguishable from the core of the answer.

I kid, but also hope I'm wrong.


Hm, m$ also runs a few giant adtech platforms, maybe they can just inject tracking code at the source.


> I've been trying hard to break my 10 year-old of the habit of just typing questions into search engines

Honest question: why?

I understand not wanting to use Google (the search engine) or not wanting to support Google (the company). But I don't see the issue with just looking up questions.

I'm 10 years younger than you, and I've been reaching for search engines first since I was 7, I think. Basically since I learned how to turn the computer on and open a web browser.


Because I want her to find authoritative sources, read, learn, understand, think critically, etc. rather than taking a given answer at face value.


For me: because that‘s exactly what Google and/or SEO optimizes for, but with no regard for accuracy and quality.


Right. A lot of times I'm searching for a filing, or a site link. I do not ask questions when I'm doing so; that's ridiculous. I don't ask questions if I'm searching for a recipe, or something in my local area either. Actually, I very rarely do this.


> But "search" and "getting an answer to a question" are two different things, aren't they?

Google exists, as both a successful enterprise and as a verb, precisely because to most people they are exactly the same thing.

No, this is wrong. People ask what they want to know. Sometimes the best answer is a link. Sometimes it's just an answer. The ability to intuit which is best is what makes products in this space worth making.


Like you, I thought typing questions into Google was wrong for a long time. The times have changed; this is how most people interact with Google, and it really does convey intent to the system better now that we have sufficiently powerful NLP.


That’s okay if your goal is to get an answer to a straightforward question. If, however, your goal is to research a topic, or to find sources for something, or any other scenario where your aim is to read actual web pages, then you want web search, not AI answers. These are two different use cases.


I absolutely agree that it handles natural language questions much better now than when I started using search engines in the late 1990s - in fact it's optimized for this task now, meeting demand where it's at - but a direct answer to a question is often not what I want. For example, I often want to find a page that I remember reading in the past, so that I can re-read or cite it. Or I want more reading material to get a deeper, more nuanced understanding of some topic; things that will provide more context around answers or lead me to generating new questions.


>But "search" and "getting an answer to a question" are two different things, aren't they?

The first conceptualization of "search" was web directories; then AltaVista and Google drove the complexity down for users by providing the actual system that crawls, indexes, and ranks web information. Now the cycle will repeat and we will get Answer Machines, aka chat bots, which drive the UX complexity for users down even further.

Why would I skim search-result links and websites if the "AI" can do it for me? The only reason would be if you don't trust the "AI" and want the actual links to websites so you can look for useful information yourself, but the majority of people want an instant answer/result to their search query -- hence Google's old-school button "I'm feeling lucky".


Kagi has better search and you can tweak it however you like. So the product you are wishing for exists.


Getting an answer to a question is a superset - the answer can be a page.

Sometimes the answer we want is a specific page containing some term, but for most people, most of the time, I'd argue that getting a narrower piece of information is more likely to be valuable and helpful.


The answers come from the same websites. They just get stripped of their traffic. As someone who puts a ton of work into writing accurate, helpful guides, it's devastating to have my work plundered like that.

Once these monopolies have successfully established themselves, they will become indistinguishable from the ad-infested websites they replace. The only difference is that they will create no new information of their own, and they will destroy the indieweb that once provided it.


What value does the traffic have for you? Is it lost revenue from ads? Or are you selling something? If you're selling something, then the AIs could very well be giving you more sales than they take away.


I guide immigrants who settle in Germany.


Have you noticed a decline in sales from AIs? I'd think that for such a service, people who don't want to pay wouldn't pay you anyway even if they went to your website first for the information, and people who do want to pay will find your business through the AI.


I don’t sell my services. My income comes from affiliate links on some pages (e.g. for choosing a first bank). A vast majority of the content is not monetised.

I noticed a drop in traffic, despite having added a lot of valuable guides this year. The traffic per page is way down, and only for Google.

You are correct that people don’t pay for this, even when they email me and I personally assist them. There’s just an expectation that things on the internet are free and that’s fine.


People pay thousands of dollars for immigration assistance, even if you're just filling out paperwork. If you want to make money from your webpage, you could simply offer to help them with these things for a fee. And leave the information up if you want, for the people who don't want to pay. Also, the more information you have up, the more comfortable people will be hiring you. E-mails from people who want free advice simply get the delete button. They can read your free information or hire you.

As for affiliate links, I think they are a thing of the past. It's exactly these that web users want to avoid. Better to make a deal where you charge the bank a fee for each person you sign up. There's a lot of banks who offer these referral deals to all their customers. For example N26 gives you €70 per referral and Wise gives £25 per referral.

You can probably make good money from your website from doing what you love. But ads and affiliate links and getting money from traffic is a thing of the past.


Affiliate income is tied to the size of my audience. Income from relocation assistance is tied to the number of hours in a day. So far I have made more money doing less work, and it helped far more people.

I’d much rather help a greater number of people for free than the well-off minority that can afford my time. I’d sooner run my business into the ground than change this.


Since you're an expert in the field, the time you spend on an individual client is much less than what the person would spend figuring it out for themselves. And time is money. Whether a person is rich or poor.

I think you are thinking and operating in the old paradigm, thinking that somebody spending a few hundred dollars on getting expert help on the Internet is outrageous, even if it's a life changing spend. Would you consider somebody spending a few hundred dollars on expert advice in a fancy office to be outrageous? Then consider that your expertise is way higher. There's nothing wrong with you charging for your immigration services and expertise. And as I said, you can still offer as much information for free as you please. It's only better for sales.

Google does not owe you any traffic and they don't owe you any business income from affiliate links or ads. Zero. If you don't like Google using the information on your website that you are giving away for free, then it only takes a few minutes to remove your domain from Google.

You can combine giving information for free with getting paid for your work. But you can't demand that Google pays for that.
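
For reference, the opt-out described above is a couple of lines of robots.txt. Googlebot and Google-Extended are Google's documented crawler tokens for search and AI training respectively; this sketch (assuming you really want out of both) blocks them site-wide:

```
# Keep Google's search crawler out entirely
User-agent: Googlebot
Disallow: /

# Opt out of use for Google's AI training
User-agent: Google-Extended
Disallow: /
```

Note that Google-Extended only affects AI training; blocking Googlebot is what actually removes you from search, and Search Console also has a removal tool for faster takedowns.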


I just checked out your website -- what a beautiful labor of love!

Thank You For Making And Sharing :)


How are you monetizing your website? Is it with ads?


Who says they need to monetize it? Is that the only value we ascribe to traffic, now?


Indeed, this is what I likened to a "Dark Forest":

https://news.ycombinator.com/item?id=42459246


Either I’m monetizing my site and I care about traffic or why else would I care if people visit my site as long as information gets out there?


This whole website's raison d'être was to provide neutral and accurate information about German immigration.

> as long as information gets out there

A possibly incorrect summary of the information gets out there. Given how much nuance I weave into my content, and how much effort I put into getting the phrasing just right, it frustrates me to no end. There's a very high likelihood that AI could give someone an invalid answer _and_ put my name under it, surrounded by their ads.


And ChatGPT 4o (at least the paid version) and the AI overview in Google both give real-time links to sources. Well, at least you can ask the paid version of ChatGPT to give you sources and it will do a web search.


I use Perplexity and it routinely confabulates while linking to the source it confabulates from. Parent has a valid gripe that AI is essentially damaging their reputation by pretending to cite its information with a credible source, I wish there were some legal avenue to sue but it's not quite libel is it?


And you have the source as proof. But that says a lot about Perplexity.

Does Perplexity actually give you a clickable link like ChatGPT does?


It gives you clickable links to a half dozen or so sources. It's not clear which of the information comes from source 1 vs source 2, etc.


And it’s too much to verify the sources? When you use Google and search for something do you not have to go to multiple sources?


Building a professional reputation? Letting people contact you with feedback and improvement suggestions? Pure personal pride? Plenty of reasons to want your work to be attributed to you regardless of whether you're directly monetising people reading it.


And who is going to find or even care about these websites except for people going to them specifically because of a link to your profile on social media sites, through public talks or otherwise through word of mouth?


I don't understand what you're getting at. This thread concerns how we used to be able to find good information with these contraptions called search engines, so that word of mouth was not the only way information was found.


What I’m getting at is simple: no one is going to find a random person’s obscure blog where they are trying to build a “brand” or be a “thought leader” that is not on the first page of search results.

I subscribe to Ben Thompson’s writing and make it a habit to go to a few other websites because they have earned my trust.

The only method that most people have ever had of gaining traction is via word of mouth and not through search engines.

No one owes you traffic or discoverability any more than they owed HuffPost or the other clickbait, SEO-optimized websites before the algorithm changes.


I don't know how old you are, or whether you ever really knew the web in the prior era that we're talking about. Forgive me if I'm making flawed guesses about where you're coming from.

Back in the day, if I wanted the answer to some specific question about, say, restaurants in Chicago, I'd search for it on Google. Even if I didn't know enough about the topic to recognize the highest quality sites, it was okay, because the sorts of people who spent time writing websites about the Chicago restaurant scene did know enough, and they mostly linked to the high-quality sites, and that was the basis of how Google formed its rankings. Word of mouth only had to spread among deeply-invested experts (which happens quite naturally), and that was enough to allow search engines to send the broader public to the best resources. So yeah, once upon a time, search engines were pretty darn good at pointing people to high quality sites, and a lot of those quality sites became well-known in exactly that way.


I’m old enough that my first paid project was making modifications to a home grown Gopher server built using XCMDs for HyperCard.

My first post was on Usenet in 1994 using the “nn” newsreader

The web has gotten much larger; it didn’t even exist when I started.

But web rings on GeoCities weren’t exactly places to do “high quality research”. You still had to go to trusted sites you knew about or start at Wikipedia and go to citations.

Before two years ago I would go to Yelp. Now I use the paid version of ChatGPT that searches the internet and returns sources with links

https://imgur.com/a/hZwrjJS


I've had numerous people contact me directly with follow up questions about various info I've put on my website. Many of those have turned into further conversations and collaborations.

You can't have that if Google is plagiarizing your site and delivering the info.


How many of those people found you by randomly searching Google versus via links via your profile on social media?


All of them, because I'm not on any social media. I also mostly put obscure things on my website that aren't easily found elsewhere online, so very specific searches tend to end up on my site.

Also probably why I get email from people visiting as it's one of the few places people can reference said info.


So you aren’t willing to put in the work necessary to get your site recognized in 2024 and you don’t see that as a problem?


Eh? My site is recognized and found on google searches. People find the info they are looking for and sometimes email me asking follow up questions. The site is working as intended so I'm not sure what you're talking about.


That's the real question, because younger generations use the Open Web less and less.

That was actually one of the main concerns of Larry Page back in the day: that the majority of the Web's information might end up locked behind walled gardens, paywalls, or whatever else.


The web is just as “open” as it ever was. It is just as easy if not more so to create and host your own content.

You’re complaining about “discoverability” which hasn’t been easy since 2000.

The most successful independent writer today is probably Ben Thompson’s “Stratechery”

https://stratechery.com/about/ https://blockbuster.thoughtleader.school/p/how-ben-thompson-...

Through organic search, you probably won’t find any of his free articles when searching for a topic on the first page. He had to put in the work over years and couldn’t depend on Google.


Walled gardens like Facebook, Instagram, LinkedIn, and others are the missed opportunities for the Open Web. Neither Google nor any other search engine can crawl their information, so Web users who are not on the aforementioned sites are missing a lot of useful information and social dynamics that would otherwise take place on the Open Web.


At least for Facebook, if the information is not publicly available via Google it’s because the content creator has decided not to make their content public.

Google very much can crawl information on Facebook and Instagram that people have made “public”

As far as “social dynamics”, do you remember Cambridge Analytica? Why would I want my social graph to be publicly available?

It’s bad enough that people have their contacts synced with Facebook.


If most information on Facebook is private, it’s because everything else gets spammed to hell. Same with discord. They are not a replacement to public, curated information put out by relatively knowledgeable people.


If I had a site (no time lately to maintain one) it would be because I wanted to inform people and contribute to the world’s accessible knowledge. I would want my information presented in context, accurately, the way I intended, not digested and reworded (often inaccurately) by Google.


And how likely is someone to find your site through search instead of word of mouth?

I bet you if you had insightful posts on HN (not saying you don’t) and people knew you, you would get more traffic by putting a link in your profile here than people searching on Google.


I can answer that question with actual numbers: 90% of my traffic comes from search engines. The remaining 10% is much more time-consuming to acquire. It doesn’t help that external links are downranked by most social media sites.


It doesn't matter anymore. There won't be monetary reward, citation, personal brand building, or anything. Google just rips off the information, presents it as fact, and a visitor will never visit an author's original website again.

Websites are training data and will become an anachronism.


If I care about my “personal brand”, what are the chances that people are going to find me organically on the web?

If I want to get my name out there - which I don’t - I’m going to post to LinkedIn, give in-person talks at conferences, try to get on popular podcasts that have guests, etc.


Presumably, because no other method than ads or affiliate links works...


Where are you publishing your guides? Would love to add another bookmark to my collection.


Unless you are thinking of moving to Germany, it might not be helpful to you.

https://allaboutberlin.com


You’re correct but this is pretty interesting and I’m sure helpful for people in Berlin!


Do you just like collecting links to online "guides" to anything? No preference for any subject matter, just a collection of random "guides"? Interesting, you could make a guide for that!


You could say that. I have found that a lot of guides produced by folks on Hacker News to be generally interesting. Probably too much free time? Either way a guide of guides does seem like a good use of that free time.


It would be great if it wasn't completely wrong 50% of the time.


Describes my general experience with AI across the board. Copilot, ChatGPT, Claude, etc. It’s like I’m talking to a genius toddler. With OpenAI losing $5 billion on $3.7 billion in revenue, this is unsustainable. It feels like the dotcom bubble all over again.


This is true, but fairly or unfairly, asking a question to a chat bot feels like “opting in” to the possibility that the answers you get will be hallucinated garbage, in a way that doing a Google search does not. It’s a tough problem for Google to overcome— the fact that they will be held to a higher standard—- but that’s what it is: we have already learned to accept bullshit from LLMs as a fact of life, whereas on the top of Google results it feels like an outrage.


I have been a paying ChatGPT user for a while. It’s simply a matter of saying “verify that” and it will give you web citations.


Aren’t those citations sometimes entirely made up? Like the lawyers who used it for a case and it cited ones that never happened?


I really do think hallucinated references are a thing of the past. Models will still make things up, but they won't make up references.

ChatGPT with web search does a good job of summarizing content.


No, ChatGPT has had a web search tool for paid users forever. It actually searches the web and you can click on the links


It invents citations too, constantly. You could look up the things it cites, although at that point, what are you actually gaining?

And I’m not saying this makes them useless: I pay for Claude and am a reasonably happy customer, despite the occasional bullshit. But none of that is relevant to my point that the bots get held to a different standard than Google search and I don’t see an easy way for Google to deal with that.


Do you pay for ChatGPT? The paid version of ChatGPT has had a web search tool for ages. It will search the web and give you live links.


ChatGPT has had web search for exactly 58 days. I guess our definitions of 'ages' differ by several orders of magnitude.


The paid version has had web access for at least a year

March 23rd 2023

https://openai.com/index/chatgpt-plugins/

That’s 666 days.

So you are off by over “one order of magnitude”


A plugin? You’re joking.


It’s a “plug in” built into the paid version of ChatGPT, run by default and created by OpenAI.

This isn’t a third party obscure plug in.

All “tools” use a plug in architecture


You're a troll, and I'm done feeding you.


What part is “trolling”? Paid users have been able to use ChatGPT using the built in web browsing plug in for over a year just by saying “please provide citations” or “verify that”.

What you say has been around for a few weeks has literally been around for paid users for over a year


> we have already learned to accept bullshit from LLMs as a fact of life, whereas on the top of Google results it feels like an outrage.

Sort of. Top results for any kind of question that applies to general population - health, lifestyle, etc. - are usually complete bullshit too. It's all pre-AI slop known as content marketing.


> genius toddler

I think it's closer to a well spoken idiot.


A cat who can talk.


What are you using it for?


That's a very pessimistic take. It's right about 50% of the time!


Both of your requirements for correctness are just 50% too high.


The mark of a great product/feature is always when they feel the need to force it on users, because they know that a significant portion of users would switch it off if they could.


The difficulty of verifying the answer isn't-wrong is another important factor. Bad search results are often obvious, but LLM nonsense can have tricky falsehoods.

If a process gives false results half the time, and verifying any result takes half as long as deriving a correct solution yourself... Well, I don't know the limiting sum of the infinite series offhand, but it's a terrible tool.
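
Half-and-half turns out to be exactly the break-even point. If attempts are independent, the number of verifications until a correct result is geometric, so the infinite series collapses to a one-liner; a back-of-the-envelope sketch (assuming independence, and that verification reliably catches wrong answers):

```python
# Expected total verification cost, relative to just deriving the answer
# yourself (cost 1): each attempt is correct with probability p, each
# verification costs fraction f of a full derivation, and attempts are
# independent, so the number of attempts is geometric with mean 1/p.
def expected_relative_cost(p, f):
    return f / p  # closed form of f * sum_{k>=1} k * p * (1 - p)**(k - 1)

# The commenter's numbers: wrong half the time (p = 0.5), verifying takes
# half as long as solving it yourself (f = 0.5).
print(expected_relative_cost(0.5, 0.5))  # 1.0 -- no faster than doing it yourself
```

So at 50% accuracy and half-cost verification you expect to spend exactly as long verifying as solving from scratch; any extra overhead per attempt tips it into "terrible tool" territory.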


I find it mostly right 70% of the time.


Which would be great, except that I found the top Google result to be more than 70% relevant to my searches in the past; it's a clear downgrade in relevancy.


60% of the time, it works every time.


Yeah the AI summaries are garbage still


Compared to 0% of relevant results in first 10 pages it's an enormous improvement.


Have you seen an example where the AI hits on something that isn't in the first 10 pages of results?


Wait till the monetizing by ads starts


The AI answers are nowhere near good enough to always be at the top, without any clear indication that they are just a rough guess. Especially for critical things like visa requirements or medical information. When you search Google for these sort of things, you want the link to the authoritative source, not a best guess. It’s very different for queries like say “movies like blade runner”.


It seems damning enough that Google itself doesn't know what is a more authoritative source or they would have weighted their AI output appropriately.

What does that say about their traditional search results?


I doubt that was the decision process. It’s much more likely that there is a directive coming down from the top that “we need to go all in on AI”, which then gets amplified down by middle management, and the result is AI smeared over all results regardless of whether it’s helpful. That then drives up some vanity metric like “searches answered by AI summaries”, while metrics like “bad AI summaries shown” don’t get attention. As a result the organization is happy, people can get promoted, etc.


Not all queries are the same but I agree with you that the authority of source is crucial. That's why for example .gov sites rank high and should rank high because government is usually the most trusted source.

But when you are looking for new shoes to buy or food recipes then .gov sites can't help you and that's where things get ugly....SEO spam ugly.


An example: I was looking up what a good diet is to follow after a child has been vomiting. The AI said to avoid giving fruit juice … yet the authoritative sources said the opposite. I already knew not to trust the AI, but this was nail in the coffin for me.


Agreed, and frankly I am a big fan of LLMs in general… it just doesn’t seem like the one behind Google search is all that smart.


Google's AI summary of search results hallucinates. You might like it, but you may also end up seeing, and believing in, something that just doesn't exist.

For example, it says there's a sequel to a Disney film called Encanto, and there just isn't. https://bsky.app/profile/jasonschreier.bsky.social/post/3lee...


It also misidentified an article by Bernard Vonnegut about ice crystals as written by Kurt Vonnegut


Most if not all of the times it "hallucinated" it was paraphrasing the web.


And in doing that it's endorsing some random website that makes things up and elevating that garbage to be something users trust 'because it's on Google.'

Users will need to learn that Google is only as trustworthy as the crappy websites it uses for data that drives its AI. I'll leave it up to you to work out how that might impact Google's brand.


For quick simple steps like how to get a Bluetooth keyboard into pairing mode, it seems to work really well. I hated the prior world where everyone attempted to hide the real answer 3/4ths of the way through a useless blog post or YouTube video.


Which, we should note, didn't happen 10 years ago before the accountants took over search at Google. Those good, lean, helpful pages still exist. Google incentivizes websites to have pages of slop on everything now because they track how long you spend on a site as a "metric of a good match". Forest for the trees...


Baking recipes are the fucking worst.

I shouldn’t have to read 2000 words to make a cheesecake. And I shouldn’t have to read it three times before starting to make sure I combine the ingredients in the right order.

Even the good ones are often subtly wrong. For example, never add baking powder or especially cinnamon to wet ingredients. Stir them into the dry ingredients first, then combine. Otherwise they clump. With cinnamon it makes it look bad. With chemically reactive ingredients it can lead to insufficient rise. Who taught you people to cook? Obviously not grandma or PBS.

I see a lot of people blame “stale” baking powder and while that is a thing, mixing it in wrong or subbing oil for butter or not chilling (eg cookie dough) is just as likely a culprit.

My friend made two sheets of cookies from the same batch and the second ones were terrible. She left the dough on the counter while the first batch was in the oven. Rookie mistake. And she has adult children.


Stack Overflow launched 16 years ago, when for many years most Google results were already expertexchange-type sites with the answers obfuscated and hidden pages deep behind links.


> expertexchange

This reminded me that, rather hilariously, it used to be called expertsexchange.com before adding a dash (experts-exchange.com).


Expert sex change will long be remembered. I recall there was an Italian site that had a very spicy “Le Tits Now” reading, but the name escapes me.


I think, if anything, Google actively penalizes long slop articles with lots of affiliate links.


Were you not googling before? They had a bullet-point summary that was actually more accurate because it scraped direct quotes from the website. Now I am getting wrong info from the AI summary. It's a huge step back from what was there previously, but it's sold as some advancement.


It definitely didn't seem more accurate to me. It quite frequently either scraped quotes that weren't actually an answer to my search (the webpage was correct, but the link between my search and the webpage was not), or it was an answer but the answer was wrong (because the webpage was wrong).

The AI summary now isn't perfect because it can still regurgitate wrong information from the Internet, or hallucinate information when there isn't any -- but it seems to actually understand what I want now, so it doesn't suffer from the incorrect matching problem.

Also, there are way more AI answers now than there ever were snippet answers.


Agreed.

My friend and I used to paste pre-AI Google search snippets to each other when they were so bad, especially when it quoted a comment on Reddit.


I’m no fan of the Google AI feature, but it is way more accurate than the scraped bullet-point predecessor, which would often scrape things while missing something key, like a “here is the opposite of what we are talking about:” in the webpage.


They are also wrong just slightly too often. After the fifth time I was twenty minutes in to trying to use command line options that just don’t exist before realizing that I was being led down the winding path by an ai hallucination that I mistook for a stack overflow quote, I broke and paid for Kagi. Which then immediately added an AI drek feature, fml.


> I can search for an answer to a question and, instead of getting annoying ads and SEO-optimized uselessness, I actually get an answer.

You get the average of the seo optimized answers


An answer is only as good as the expertise behind it. When searching I always pay attention to the source, and will skip ones that look less trustworthy.

One major advantage of Google's original PageRank was that it originally worked well: the number of links to a page was a good proxy for trustworthiness and authority on a subject. It used to be that you'd find what you were looking for in the top few Google search results, which was a massive improvement over Alta Vista, the existing competition, where you'd have to wade through pages of keyword-match sites listed in no particular order.

Anyway, the source is critically important, and if I'm looking to find something authoritative then the output of an LLM, even if RAG-based, is not what I'm looking for! Increasingly, people may be looking to search to verify stuff suggested by an LLM, which makes a search engine that puts LLM output as its top result rather unhelpful!

It doesn't help that with Google in particular their AI output is all heavily DEI biased, and who knows what else ... I just don't trust it as objective.
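
The link-as-vote mechanism described above is simple enough to sketch in a few lines. This is a toy power-iteration version of the classic PageRank recurrence (damping factor 0.85 as in the original paper; the three-page graph and its names are made up for illustration):

```python
# A page's score is the stationary probability of a "random surfer" who
# follows links, with damping d. Links from well-ranked pages count more.
def pagerank(links, d=0.85, iters=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - d) / n for p in pages}  # teleport share
        for p, outs in links.items():
            if outs:
                share = d * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:  # dangling page: spread its rank evenly
                for q in pages:
                    new[q] += d * rank[p] / n
        rank = new
    return rank

# Toy web: everyone links to "hub", so it outranks the rest.
toy = {"a": ["hub"], "b": ["hub"], "hub": ["a"]}
ranks = pagerank(toy)
print(max(ranks, key=ranks.get))  # hub -- the most linked-to page wins
```

The point the parent makes is visible even in this toy: "b" gets no inbound links and ends up with only the teleport baseline, while the heavily linked "hub" dominates.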


I totally agree, I really appreciate them. Half the time they give me the answer straight away.

And when they're not helpful, it's no different from the first search result not being helpful and going to the second. Plus, they do a pretty good job of only showing them for the types of searches where they're appropriate.

Are they right 100% of the time? Of course not. But snippets weren't right 100% of the time, and not infrequently clicking on the top search result will surface information that's wrong as well. Because the Internet isn't 100% right.

The idea that a "wall of text from AI" is somehow bad doesn't make any sense to me. And it's not a "wall", it's basically paragraph-sized. Where the context is really helpful in determining whether the answer seems correct/reasonable.


They're strictly worse.

They're just a summary, so any information is in the results or hallucinated.

If the AI could accurately point to the correct information, they would just order the results as such, but instead it's just a paragraph of spaghetti on a wall to look cutting edge.


I just want to be able to turn the stupid overview off. That's all. One simple toggle.

I don't get why a Google Workspaces account can have Gemini forcibly disabled across the entire enterprise yet still have these AI features seep in with no way to manage it at the enterprise level.
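
There's no official toggle, but one workaround that works as of this writing: Google's `udm=14` URL parameter jumps straight to the "Web" results tab, which carries no AI overview. Adding it as a custom search engine (with `%s` as the browser's query placeholder) makes it the default:

```
https://www.google.com/search?udm=14&q=%s
```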


Why don't you use Kagi? It has better search and you can customize a lot of things, even turning off their LLM


Because it gives money to people I don’t want to have that money, even if I turn off any features related to the people that I don’t want to have money.


Is Google supposed to give you an answer, or help you find something you're looking for?

Back when they only tried to help you find something, they were good at that. Really good. Then the ads and meta-slop came in and you couldn't find things anymore.

Then they decided they also wanted to answer questions, which is hard enough (they're often wrong). So they have to focus harder on answering questions.

And since they're trying to do both in one page/place, the question-answering has taken center stage, and finding things is now next to impossible.

So they're no longer a search engine. They're a crap version of OpenAI.


It's wrong enough and unsourced enough that it's more cognitive load to vet the result than not having it.

Google is barely more useful because of this.


It's hit or miss for me. This week I was googling how to use libarchive, and the AI-generated responses at the top of each query were either incorrect or hallucinations of methods that don't exist.

I don’t mind playing with AI to help scratch together some code, but I do that using better models. Whatever model google is using for search results is too crappy for me to consider trusting.


I sometimes like it, but I've gotten very skeptical of it. One day a friend and I searched the exact same question in Google and got opposing answers for the identical search string. This wasn't in the "AI" widget, but one of their usual widgets that give answers to questions. I assume both use some form of AI anyway.


I think when it’s good it’s pretty good.

But knowing when it is good is still hard, as I can’t trust it more than an LLM. But with an LLM I have a simple chat window, not a bag of rabid SVPs fighting to be on the SERP.


I don't find it useful for the things I search.

I still have to check the sources and then add “reddit” to the end of my search query.

So for me it's actually an additional third step, or remembering not to trust the AI overview.


I use google.com for search and Gemini for Q&A. Two sites for two modes. I also use uBlock to remove the AI response from my search results to keep them clean and separate.
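
For anyone wanting to replicate the uBlock part: a cosmetic filter under "My filters" does it. The selector below is purely illustrative, since Google's markup for the AI overview container changes frequently; use uBlock's element picker to grab the current one:

```
! Hide Google's AI overview (illustrative selector; verify with the element picker)
www.google.com##div.ai-overview-container
```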


Same with Brave search. The AI answer is often good enough for me to not need to go further. I’m with you, except I don’t use Google.


Search for anything with Google that has high ad monetization potential. You will find that Google turns off the AI overview.


Arc Search is what Gemini dreams of being. I’ve found it to be incredibly useful tool to cut through a lot of the crap.


The few times I gave it a try it was dead wrong. The dream is nice but the execution is lacking.


Google search had gotten so bad that the AI overview is passable in comparison. They don't deserve credit for that! Search was better at getting useful information fifteen years ago than it is now. (And yes, the internet is way more full of garbage now, but they did that; they are responsible for that too!)

... unless you want anything like a perspective or an opinion on something, instead of a factual answer to a question, in which case it's totally useless.


Why do we think the AI is any better? Isn't it based on the same dataset as search? How can it be anything but strictly worse for any given query?


Er.. I used it and that's what I thought?

Search goes out of its way to hide what I want and show me bullshit shopping ads and influencer videos. The overview at least tries to help. For now.

But I will emphasize: it's still not that helpful, it's just less corrupted than the main body results are... so far.


If Google’s AI overview were as smart as 4o, I would like it a lot more.


"Be Socially destructively evil via diffusion of content"


You're definitely a contrarian.

Google search is awfully bad these days.


I hate Flatpaks; they're bloated monstrosities and I only run them when I have no other choice. Outside of that, distribution package maintainers tend to do a good job and that is my preferred way of running programs.


You couldn't be more wrong. Windows 11 is the reason I left Windows for Linux, after MANY years, and I was running the Pro version.


> The same is probably not true for physical things such as ... YubiKeys stored in your pocket.

There's nothing preventing you from password-protecting your YubiKeys.


Right, they can seize my YubiKey, but the Fifth Amendment protects me from being compelled to hand over the PIN protecting it.

