Two new Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more (googleblog.com)
193 points by meetpateltech 3 months ago | 137 comments



This price drop is significant. For prompts under 128,000 tokens, input drops from $3.50/million to $1.25/million, and output from $10.50/million to $2.50/million.

For comparison, GPT-4o is currently $5/million input and $15/million output and Claude 3.5 Sonnet is $3/million input and $15/million output.

Gemini 1.5 Pro was already the cheapest of the frontier models and now it's even cheaper.


What's confusing is that they have different pricing for output. Here [1] it's $5/million output tokens (starting October 1st), while on Vertex AI [2] it's $2.50/million (starting October 7th) - but per character, not per token - so it works out more expensive once you convert to the equivalent of 1 million tokens (see the rough conversion after the links). It's also unclear what kind of characters they mean: bytes? UTF-8?

[1] https://ai.google.dev/pricing

[2] https://cloud.google.com/vertex-ai/generative-ai/pricing
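
A rough back-of-the-envelope conversion (a sketch that assumes ~4 characters per token, a common rule of thumb for English text, and takes the $2.50 per 1M characters figure above at face value):

    vertex_output_per_million_chars = 2.50   # Vertex AI figure quoted above, taken at face value
    chars_per_token = 4                      # assumed rough ratio for English text

    # 1M tokens is roughly 4M characters, so the token-equivalent price is:
    equivalent_per_million_tokens = vertex_output_per_million_chars * chars_per_token
    print(equivalent_per_million_tokens)     # ~10.0, i.e. ~$10 per 1M output tokens vs $5 at [1]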


They do mention how characters are counted in the Vertex AI pricing docs: "Characters are counted by UTF-8 code points and white space is excluded from the count"


They CHANGED the pricing at the first link; originally the output price was listed as dropping from $10.50 to $2.50, but now it says $10.50 to $5.00.


I wonder if they're pulling the Walmart model: ruthlessly cut costs and sell at or below cost until your competitors go out of business, then ratchet up prices once you have market dominance.


Probably not. Do they really believe they are going to knock OpenAI out of business, when the OpenAI models are better?

Instead I think they are going after the "Android model". Recognize they might not be able to dethrone the leader who invented the space. Define yourself in the marketplace as the cheaper alternative. "Less good but almost as good." In the end, they hope to be one of a small number of surviving members of a valuable oligopoly.


Cheapness has a quality all its own.

Gemini is substantially cheaper to run (in consumer prices, and likely internally as well) than OpenAI's models. You might wonder, what's the value in this, if the model isn't leading? But cheaper inference could potentially be a killer edge when you can scale test-time compute for reasoning. Scaling test-time compute is, after all, what makes o1 so powerful. And this new Gemini doesn't expose that capability at all to the user, so it's comparing apples and oranges anyway.

DeepMind researchers have never been primarily about LLMs, but RL. If DM's (and OAI's) theory is correct--that you can use test-time compute to generate better results, and train on that--this is potentially a substantial edge for Google.


Google still has an unbelievable training infrastructure advantage. The second they can figure out how to convert that directly to model performance without worrying about data (as the o1 blog post seemed to imply OAI had) they’ll be kings.


This is why Sam Altman keeps releasing things a few days before DeepMind. He is worried Google will overtake them, more so than any other company.


In Home Assistant you can use LLMs to control your home with your voice. Gemini performs similarly to the GPT models, and with the cost difference there is little reason to choose OpenAI.


Using either frontier model for basic edge device problems is wasteful. Use something cheap. We're asking "is there a profitable niche between the best & runner-up models?" I believe so.


Android is more popular than iOS by a large margin, and it's neither less good nor cheaper; it really depends on the smartphone.


The latest Google Pixel phone (you know, the one that Google actually set the price for) appears to cost the exact same as the latest iPhone ($999 for pro, $799 for non-pro). And I would argue against the "less good" bit too.

I think this analysis is not in keeping with reality, and I doubt if that's their strategy.


I doubt anyone buys Pixel phones at full price. They are discounted almost right out of the gate.


> Probably not. Do they really believe they are going to knock OpenAI out of business, when the OpenAI models are better?

Would OpenAI even exist without Google publishing their research? The idea that Google is some kind of also-ran playing catch up here feels kind of wrong to me.

Sure OpenAI gave us the first productized chatbots, so in that sense they "invented the space," but it's not like Google were over there twiddling their thumbs - they just weren't exposing their models directly outside of Google.

I think we're past the point where any of these tech giants have some kind of moat (other than hardware, but you have to assume that Google is at least at parity with OpenAI/MS there).


Isn't Walmart still incredibly cheap? They have a net margin of 2.3%

I think that's one of those things competitors complain about that never actually happens (the raising prices part).

https://www.macrotrends.net/stocks/charts/WMT/walmart/net-pr...


There's a lot of room to cut margins in the AI stack right now (see Nvidia's latest report); low prices are not a sure indication of predatory pricing. Which company do you think is most likely to have the lowest training and inference costs between Anthropic, OpenAI and Google? My bet goes to the one designing, producing and using their own TPUs.


Yes, and it's the exact same thing OpenAI/Microsoft and Facebook are doing. In Facebook's case, they are giving it away for free.


You think Google would engage in monopolistic practices like that?

Because I do


I have no idea if this is dumping or not. At Microsoft/Google scale, what does it cost to serve a million LLM tokens?

Tough to disentangle the capex vs. opex costs for them. If they did not have so many other revenue streams, this would be potentially dicey, as there are probably still many untapped performance optimizations.


GPT-4o is $2.50/$10, unless you look at an old checkpoint. GPT-4o was the cheapest frontier model before this.


I can’t see that price on https://openai.com/api/pricing/ - it’s listing $5/m input and $15/m output for GPT-4o right now.

No wait, correction - that's confusing: it lists 4o first, and then lists gpt-4o-2024-08-06 at $2.50/$10.


apologies: it's taken us a minute to switch the default `gpt-4o` pointer to the newest snapshot

we're planning on doing that default change next week (October 2nd). And you can get the lower prices now (and the structured outputs feature) by manually specifying `gpt-4o-2024-08-06`
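
For anyone who wants the lower price before the default pointer moves, a minimal sketch of pinning the snapshot explicitly (assumes the standard OpenAI Python SDK and an OPENAI_API_KEY in the environment):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Pin the dated snapshot to get the lower pricing (and structured outputs)
    # instead of whatever the floating `gpt-4o` alias currently points to.
    response = client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(response.choices[0].message.content)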


> “You can”

No, “I” can’t.

OpenAI has always trickled out model access, putting their customers into “tiers” of access. I’m not sufficiently blessed by the great Sam to have immediate access.

Oh, and Azure OpenAI especially likes to drag their feet, both consistently and also on a per-region basis.

I live in a “no model for you” region.

OpenAI says: “Wait your turn, peasant” while claiming to be about democratising access.

Google and everyone else just gives access, no gatekeeping.


> Google and everyone else just gives access, no gatekeeping.

Well, Gemini Pro was delayed in Europe for many months. Same for Claude.


For legal/regulatory reasons, not for the arbitrary favouritism reasons of OpenAI.


> For comparison, GPT-4o is currently $5/million input and $15/million output and Claude 3.5 Sonnet is $3/million input and $15/million output.

Google is the only one of the three that has its own data centers and custom inference hardware (TPU).


It doesn't matter if it's cheap, it's unusable.


Can you expound on why you find it unusable?


They CHANGED the pricing from $2.50 to $5.00, stealthily and unannounced. Look at the web site again; it says $5 per million now, and this comment on this website might be the ONLY evidence in the world that I wasn't gaslighting myself or hallucinating!


This sounds interesting:

"We will continue to offer a suite of safety filters that developers may apply to Google’s models. For the models released today, the filters will not be applied by default so that developers can determine the configuration best suited for their use case."


This is the most important update.

Pricing and speed don't matter when your call fails because of "safety".


Also Google's safety filters are absolutely awful. Beyond parody levels of bad.

This is a query I did recently that got rejected for "safety" reasons:

Who are the current NFL starting QBs?

Controversial, I know - I'm surprised I was willing to take the risk of submitting such a dangerous query to the model.


You know what, I just ran this query in Gemini and it spit out “I'm not able to help with that, as I'm only a language model.” but just before that I got a glimpse of a real answer:

NFL Starting Quarterbacks for the 2024 Season Note: Quarterback situations can change throughout the season due to injuries, trades, or poor performance. AFC • Baltimore Ravens: Lamar Jackson Buffalo Bills: Josh Allen

But then it gets wiped and you cannot see it even in the drafts. The text above is from the screenshot I managed to make before the response vanished.

This non-deterministic, unpredictable behavior blended with poor “safety” policies is one of those major “dealbreakers” that keeps me from trusting any existing LLMs.


Not stranger than my experience with OpenAI. I got banned from DALL-E 3 access when it first came out because I asked in the prompt about generating a particle moving in a magnetic field in the forward direction that decays into two other particles, with a kink angle between the parent particle and the charged daughter.

I don't recall the exact prompt, but it was something close to that. I really wonder what filters they had about kink tracks, and why. Do they have a problem with Beyond Standard Model searches? /s


For what it's worth I run every query I make through all the major models and Google's censorship is the only one I consistently hit.

I think I bumped into Anthropic's once? And I know I hit ChatGPT's a few months back, but I don't even remember what the issue was.

I hit Google's safety blocks at least a few times a week during the course of my regular work. It's actually crazy to me that they allowed someone to ship these restrictions.

They must either think they will win the market no matter the product quality or just not care about winning it.


Microsoft's OpenAI models seem even worse. They run their own "safety" filter.

Using their models in a medical setting is impossible. It refuses to describe scientific photos and will not summarize HCP discussions (texts) that it misinterprets.


I'm reminded of this short story about a government bureaucracy banning research in certain areas to prevent dangerous technology being discovered: https://en.wikipedia.org/wiki/The_Dead_Past


There are still basic filters even if you turn off all the ones you can from the UI. It's still not capable of summarizing some YA novels I tried to feed it because of those filters.


The "safety" filters used to make Gemini models nearly unusable.

For example, this prompt was apparently unsafe: "Summarize the conclusions of reputable econometric models that estimate the portion of import tariffs that are absorbed by the exporting nation or company, and what portion of import tariffs are passed through to the importing company or consumers in the importing nation. Distinguish between industrial commodities like steel and concrete from consumer products like apparel and electronics. Based on the evidence, estimate the portion of tariffs passed through to the importing company or nation for each type of product."

I can confirm that this prompt is no longer being filtered which is a huge win given these new lower token prices!


Any opinions on pro-002 vs pro-exp-0827?

Unlike others here I really appreciate the Gemini API: it's free and it works. I haven't done too many complicated things with it, but I made a chatbot for the terminal, a forecasting agent (for the Metaculus challenge) and a yt-dlp auto-namer of songs. The point for me isn't really how it compares to OpenAI/Anthropic; it's a free API key, and I wouldn't have made the above if I had to pay just to play around.


I’ve used it. The API is incredibly buggy and flaky. A particular pain point is the “recitation error” fiasco. If you’re developing a real-world app, this basically makes the Gemini API unusable. It strikes me as a kind of “Potemkin” service.

Google is aware of the issue and it has been open on google's bug tracker since March 2024: https://issuetracker.google.com/issues/331677495

There is also discussion on GitHub: https://github.com/google-gemini/generative-ai-js/issues/138

It stems from something Google added intentionally to prevent copyrighted material being returned verbatim (à la the NYT/OpenAI fiasco), so they dialled up the "recitation" control (the act of repeating training data, and maybe data they should not legally have trained on).

Here are some quotes from the bug tracker page:

> I got this error by just asking "Who is Google?"

> We're encountering recitation errors even with basic tutorials on application development. When bootstrapping a Spring Boot app, we're flagged for the pom.xml being too similar to some blog posts.

> This error is a deal breaker... It occurs hundreds of times a day for our users and massively degrades their UX.
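
There's no official mitigation published, but the workaround people describe is detecting the RECITATION finish reason and retrying with a tweaked prompt or temperature. A rough sketch with the Python SDK (the retry strategy is my own guess, not Google guidance):

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder
    model = genai.GenerativeModel("gemini-1.5-pro-002")

    def generate_with_recitation_retry(prompt, max_attempts=3):
        """Retry when generation is cut off with a RECITATION finish reason."""
        for attempt in range(max_attempts):
            response = model.generate_content(
                prompt,
                generation_config={"temperature": 0.7 + 0.1 * attempt},  # nudge each retry
            )
            candidate = response.candidates[0]
            # finish_reason is an enum; RECITATION marks the verbatim-training-data block.
            if candidate.finish_reason.name != "RECITATION":
                return response.text
            prompt += "\n\nPlease answer in your own words."  # crude rephrasing nudge
        raise RuntimeError("Still blocked by the recitation filter after retries")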


The recitation error is a big deal.

I was ready to champion gemini use across my organization, and the recitation issue curbed any enthusiasm I had. It's opaque and Google has yet to suggest a mitigation.

Your comment is not hyperbole. It's a genuine expression of how angry many customers are.


It seems the recitation problem has been fixed with the latest models. In my tests, the answer generation no longer stops prematurely. Before I resume a project that was on hold due to this issue, I'm gathering feedback from other users about their experiences with this and the new models. Are you still having this problem?


Looks like they are more focused on the economic aspect of these large models? Like 90-95% of the performance of other frontier models at 50-70% of the price.


They are going for large corporate customers. They are a brand name with deep pockets and a pretty risk-averse model.

So even if Gemini sucks, they'll still win over execs being pushed to make a decision.


Not even trying to be snarky, but their inability to offer products for more than a handful of years does not lend Google towards being chosen by large corporate customers. I know a guy who works in cloud sales, and his government customers are PISSED they are sunsetting one of their PDF products and are being forced to migrate that process. The customer was expecting that to work for 10+ years, and after a ~3 year onboarding process, they have 6 months to migrate. If my neck was on the line after buying the Google PDF product, I wouldn't even shortlist them for an AI product.


What Google Cloud pdf product is that? I thought my knowledge of discontinued Google products was near-encyclopedic, but this is the first I've heard of that.

But as an enterprise customer, if you expect X, don't you get X into the contract?


That doesn't seem like much of a plan given their trailing position in the cloud space and the fact that Microsoft and AWS both have their own offerings.


Maybe Google is holding back its far superior, truly sentient AI until other companies have broken the ice. Not long ago there was a Google AI engineer who rage-quit over Google's treatment of sentient AI.


I do like the trend.

Imagine if Anthropic or someone eventually releases a Claude 3.5 but at a whopping 10x its current speed.

That would be far more useful and game-changing than a slow o1 model that may or may not be some percent smarter.


Sonnet 3.5 is fast for its quality. But yeah, it's nowhere near Google's Flash models. I assume that is largely just because Flash is a much smaller model.


We might see that with the inference ASICs later this year I guess?


Ooh, what are these ASICs you're talking about? My understanding was that we'll see AMD/Nvidia GPUs continue to be pushed and remain very competitive, as well as new system architectures like Cerebras or Groq. I haven't heard about new compute platforms framed as ASICs.


Cerebras has ridiculously large LLM ASICs that can hit crazy speeds. You can try it with Llama 8B and 70B:

https://inference.cerebras.ai/

It's pretty fast, but my understanding is that it is still too expensive even accounting for the speed-up.


Is Cerebras an integrated circuit or more an integrated wafer? :-)

And yeah, their cost is ridiculous, on the order of high 6 to low 7 figures per wafer. The rack alone looks several times more expensive than the 8x NVIDIA pods [1]

[1] https://web.archive.org/web/20230812020202/https://www.youtu...


https://www.etched.com/announcing-etched

I think there's another one but I can't remember the name of it.

Also a bit further out is https://spectrum.ieee.org/superconducting-computer

"Instead of the transistor, the basic element in superconducting logic is the Josephson-junction."


The new Gemini models perform basically the same as the previous versions on aider's code editing benchmark. The differences seem within the margin of error.

https://aider.chat/docs/leaderboards/


Has anyone used Gemini Code Assist? I'm curious how it compares with Github Copilot and Cursor.


I have used Github Copilot extensively within VS Code for several months. The autocomplete - fast and often surprisingly accurate - is very useful. My only complaint is when writing comments, I find the completions distracting to my thought process.

I tried Gemini Code Assist and it was so bad by comparison that I turned it off within literally minutes. Too slow and inaccurate.

I also tried Codestral via the Continue extension and found it also to be slower and less useful than Copilot.

So I still haven't found anything better for completion than Copilot. I find long completions, e.g. writing complete functions, less useful in general, and get the most benefit from short, fast, accurate completions that save me typing, without trying to go too far in terms of predicting what I'm going to write next. Fast is the key - I'm a 185 wpm on Monkeytype, so the completion had better be super low latency otherwise I'll already have typed what I want by the time the suggestion appears. Copilot wins on the speed front by far.

I've also tried pretty much everything out there for writing algorithms and doing larger code refactorings, and answering questions, and find myself using Continue with Claude Sonnet, or just Sonnet or o1-preview via their native web interfaces, most of the time.


I see, perhaps with Gemini because the model is larger it takes longer to generate the completions. I would expect with a larger model it would perform better on larger codebases. It sounds like for you, it's faster to work on a smaller model with shorter more accurate completions rather than letting the model guess what you're trying to write.


Have you tried Gitlab Duo and if so, what are your thoughts on that?


Not yet, hadn't heard of it. Thanks for the suggestion.


The Aider leaderboards seem like a good practical test of coding usefulness: https://aider.chat/docs/leaderboards/. I haven't tried Cursor personally, but I am finding Aider with Sonnet more useful than GitHub Copilot, and it's nice to be able to pick any model API. Eventually even a local model may be viable. This new Gemini model does not rank very high, unfortunately.


Thanks for the link. That's unfortunate, though perhaps the benchmarks will be updated after this latest Gemini release. Cursor with Sonnet is great, I'll have to give Aider a try as well.


It's on the leaderboard; it's tied with Qwen 2.5 72B and far below the SOTA of o1, Claude Sonnet, and DeepSeek. (also below very old models like gpt-4-0314 lol)


It is updated, actually; gemini-1.5-pro-002 is this new model.


That was fast, I missed it!


I know you aren't necessarily talking about in-editor code assist but something about in-editor AI cloud code assist makes me super uncomfortable.

It makes sense that I need to be careful not to commit secrets to public repositories, but now I have to avoid not only saving credentials into a file but even pasting them by accident into my editor?


I tried it briefly and didn't like it. On the other hand, I found Gemini pro better than sonnet or 4o at some more complex coding tasks (using continue.dev)


I use it and find it very helpful. Never tried cursor or copilot though


I tried Cursor the other day. It was actually pretty cool. My thought was, I'll open this open source project and use it to grok my way around the codebase. It was very helpful. After that I accidentally pasted an API secret into the document; I had to consider it compromised and re-issue the credential.


God, it sucks



They put a 42-minute video on Twitter? That's brave.



> "What makes the new model so unique?"

>> "Yeah, it's a good question. I think it's maybe less so what makes it unique and more so the general trajectory of the trend that we're on."

Disappointing.


Gemini feels like an abusive relationship — every few months, they announce something exciting, and I’m hopeful that this time will be different, that they’ve finally changed for the better, but every time, I’m left regretting having spent any time with them.

Their docs are awful, they have multiple unusable SDKs, and the API is flaky.

For example, I started bumping into "Recitation" errors - i.e., they issue a flat-out refusal if your response resembles anything in the training data. There's a GitHub issue with hundreds of upvotes and they still haven't published formal guidance on preventing this. Good luck trying to use the 1M context window.

Everything is built the "Google" way. It's genuinely unusable unless you're a total masochist and want to completely lock yourself into the Google ecosystem.

The only thing they can compete on is price.


I think it's unusable if you are trying to use it via GCP. Using it via AI Studio is a decent experience.


Google should just offer Llama 3 405B, maybe slightly fine-tuned. The Geminis are unusable.


Pretty sure Google's got more than 700 million active users.

In fact, Google is most likely _the_ target for that clause in the license.



AI companies should not pick up the habit of naming models after astrological signs; after a while it will be hard to tell model reviews apart from horoscopes.


Llama seems to be tripped up by simple puzzles that Gemini does not struggle with, in my experience.


I've found Gemini Pro to be the most reliable for function calling.
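
For reference, function calling with the Gemini SDK can be as simple as passing Python callables as tools. A minimal sketch (automatic function calling as I understand the google-generativeai SDK; the tool itself is a hypothetical stand-in, so verify against the current docs):

    import google.generativeai as genai

    def get_order_status(order_id: str) -> str:
        """Toy stand-in for a real backend lookup."""
        return f"Order {order_id} has shipped."

    genai.configure(api_key="YOUR_API_KEY")  # placeholder
    model = genai.GenerativeModel("gemini-1.5-pro-002", tools=[get_order_status])

    # enable_automatic_function_calling lets the SDK run the tool and feed the
    # result back to the model before returning the final answer.
    chat = model.start_chat(enable_automatic_function_calling=True)
    response = chat.send_message("Where is order 12345?")
    print(response.text)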


> Geminis are unusable

how so?


In my experience, Gemini models are far worse than any other frontier model when it comes to hallucinations. They are also pretty bad about getting caught in loops where pointing out a mistake makes them flap between two broken solutions. And obviously the overzealous safety stuff that other people have mentioned.


Yep, idk why anyone would use Gemini instead of ChatGPT or Claude.


I have found that as well


Can someone explain to me why there is COMPLETELY different pricing for models on Vertex AI and Google AI Studio - and OpenRouter has yet another price ...


https://www.businessinsider.com/big-tech-org-charts-2011-6

Google is now looking more like the Microsoft chart.


Anyone want to take bets on how long it takes for this to hit the Google Graveyard?


I think the sentiment here is not fully objective; there are nice improvements in benchmarks (and even more so when accounting for the price): https://imgur.com/a/K3tVPEw

Also, this model shouldn't be compared to the CoT o1, I think. That is something different (also in price and speed).


One company is buying expensive NVIDIA hardware while the other is using in-house chips. Google has a huge advantage here. They could really undercut OpenAI.


People have said that for many years. Very few companies are choosing Google's TPUs. Everyone wants H100s.


I only use regular Gemini and the main feature I care about is absolutely terrible: summarizing YouTube videos. I'll ask for a breakdown or analysis of the video, and it'll give me a very high level overview. If I ask for timestamps or key points, it begins to hallucinate and make stuff up. It's incredibly disappointing that such a huge company with effectively unlimited access to both money and intellectual resources can't seem to implement a video analysis feature that doesn't suck. Part of me wonders if they're legitimately this incompetent or if they're deliberately not implementing good analysis features because it could eat into their views and advertisement opportunities.


I've had it fail when a video does not have subtitles - I'm guessing that's what it uses. I have had good success having it answer clickbait video titles like "is this the new best thing?"

It's not watching the video as far as I can tell.


I imagine you'd be paying more in ML costs than YouTube makes off your views.


They have to drop the price because the model is bad. People will pay almost any cost for a model that is much better than the rest. How this company carries on the facade of competence is laughable. All the money on the planet, and they still cannot win on their core "competency".


Is there a good benchmark comparing multilingual and/or translation abilities of most recent LLMs? GPT-4o struggles for some tasks in my language learning app.


Has anyone tried Google's context caching feature? The minimum caching window being 32k tokens seems crazy to me.
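
For anyone who hasn't tried it, the flow looks roughly like this with the Python SDK (a sketch from memory; the filename and the exact parameters are assumptions, and the 32k-token minimum means the cached content has to be genuinely large before the cache is accepted):

    import datetime
    import google.generativeai as genai
    from google.generativeai import caching

    genai.configure(api_key="YOUR_API_KEY")        # placeholder
    long_document = open("big_corpus.txt").read()  # placeholder; must total ~32k+ tokens

    # The cached contents must clear the 32k-token minimum mentioned above.
    cache = caching.CachedContent.create(
        model="models/gemini-1.5-pro-002",
        contents=[long_document],
        ttl=datetime.timedelta(hours=1),           # cache storage is billed per hour
    )

    model = genai.GenerativeModel.from_cached_content(cached_content=cache)
    response = model.generate_content("Summarize the key points of the cached document.")
    print(response.text)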


Why the ** do they have to use an 8.9 MiB PNG as the hero image? I guess it was generated by AI?


It's just not as smart as ChatGPT or Llama; it's mind-boggling that Google fell so far behind.


Has it changed much since March when this was written? Gemini won, 6-4. And it matches my experiences of just using Gemini instead of ChatGPT because it gives me more useful responses, when I feel like I want an AI answer.

https://www.tomsguide.com/ai/google-gemini-vs-openai-chatgpt


"For this initial test I’ll be comparing the free version of ChatGPT to the free version of Google Gemini, that is GPT-3.5 to Gemini Pro 1.0."

The free version of ChatGPT is 4o now, isn't it? So maybe Gemini has not gotten worse, but the free alternatives are now better? When I compare ChatGPT-4o with Gemini Advanced (which is 1.5 Pro, I believe), the latter is just so much worse.


As far as I can tell there's still no option for keeping data private?


If you are on the pay-as-you-go model, your data is exempt from training.

> When you're using Paid Services, Google doesn't use your prompts (including associated system instructions, cached content, and files such as images, videos, or documents) or responses to improve our products, and will process your prompts and responses in accordance with the Data Processing Addendum for Products Where Google is a Data Processor. This data may be stored transiently or cached in any country in which Google or its agents maintain facilities.

https://ai.google.dev/gemini-api/terms


Interesting. In essence, paying for Gemini Advanced or Pro could be seen as a way to avoid having our data and prompts used.

https://ai.google.dev/gemini-api/terms#data-use-paid


Makes sense, as soon as your data leaves your computer, it's safe to assume it's no longer private, no matter what promises a service gives you.

You want guaranteed private data that won't be used for anything? Keep it on your own computer.


Not sure why you're getting downvoted. Anything sent to a cloud-hosted LLM is subject to being publicly released or used in training.

Setting up a local LLM isn't that hard, although I'd probably air-gap anything truly sensitive. I like Ollama, but it wouldn't surprise me if it's phoning home.


This is just incorrect. The OpenAI models hosted through Azure are HIPAA-compliant, and Anthropic will also sign a BAA.


I'm open to being wrong. However, for many industries you're still running the risk of leaking data via a 3rd party service.

You can run Llama 3 on-prem, which eliminates that risk. I try to reduce reliance on 3rd party services when possible. I still have PTSD from Sauce Labs constantly going down and my manager berating me over it.


You are not technically wrong because a statement "there is a risk of leaking data" is not falsifiable. But your comment is performative cynicism to display your own high standards. For the very vast majority of people and companies, privacy standards-compliant services (like HIPAA-compliant) are private enough.


I know my company outright warns us to not share any sensitive information with LLMs, including ones that claim to not use customer data for training.

I can flip your statement around. For the vast majority of use cases, Llama 3 can be hosted on-prem and will have similar performance.


This is not true. Both OpenAI's and Google's LLM APIs have a policy of not using the data sent over them. It's no different than trusting Microsoft's or Google's cloud to store private data.


Can you link to documentation for Google's LLMs? I searched long and hard when Gemma 2 came out, and all of the LLM offerings seemed specifically exempted. I'd love to know if that has changed.



Thanks very much! I think before, I looked at docs for Google AI Studio and also Google Workspace, and both made no guarantees.

From the linked document, to save someone else a click:

     > The terms in this "Paid Services" section apply solely to your use of paid Services ("Paid Services"), as opposed to any Services that are offered free of charge like direct interactions with Google AI Studio or unpaid quota in Gemini API ("Unpaid Services").


There's some possible confusion because of the Copilot problem where everything in the product stack is called Gemini.

The Gemini API (or Generative Language API) as documented on https://ai.google.dev uses https://ai.google.dev/gemini-api/terms for its terms. Paid usage, or usage from a UK/CH/EEA geolocated IP address will not be used for training.

Then there's Google Cloud's Vertex AI Generative AI offering, which has https://cloud.google.com/vertex-ai/generative-ai/docs/data-g.... Data is not used for training, and you can opt out of the 24 hour prompt cache to effectively be zero retention.

And then there's all the different consumer facing Gemini things. The chatbot at https://gemini.google.com/ (and the Gemini app) uses data for training by default: https://support.google.com/gemini/answer/13594961, unless you pay for Gemini Enterprise as part of Gemini for Workspace.

Gemini in Chrome DevTools uses data for training (https://developer.chrome.com/docs/devtools/console/understan...).

Enterprise features like Gemini for Workspace (generative AI features in the office suite), Gemini for Google Cloud (generative AI features in GCP), Gemini Code Assist, Gemini in BigQuery/SecOps/etc do not use data for training.


No HumanEval benchmark result?


Cool, now all Google has to do is make it easier to onboard new GCP customers and more people will probably use it... it's comical how hard it is to create a new GCP organization & billing account. Also I think more Workspace customers would probably try Gemini if it was a usage-based trial as opposed to clicking a "Try for 14 days" CTA to activate a new subscription.


As someone who actually had to build on Gemini, it was so indefensibly broken that I couldn't believe Google really went to production with it. Model performance changes from day to day and production is completely unstable as Google will randomly decide to tweak things like safety filtering with no notice. It's also just plain buggy, as the agent scaffolding on top of Gemini will randomly fail or break their own internal parsing, generating garbage output for API consumers.

Trying to build an actual product on top of it was an exercise in futility. Docs are flatly wrong, supposed features are vaporware (discovery engine querying, anybody?), and support is nonexistent. The only thing Google came back with was throwing more vendors at us and promising that bug fixes were "coming soon".

With all the funded engagements and credits they've handed out, it's at the point where Google is paying us to use Gemini and it's _still_ not worth the money.


> Docs are flatly wrong

This +999; I couldn't believe how inconsistent and wrong the docs were. Not only that, but once I got something successfully integrated, it worked for a few weeks then the API was changed, so I was back to square one. I gave it a half-hearted try to fix it but ultimately said 'never again'! Their offering would have to be overwhelmingly better than Anthropic and OpenAI for me to consider using Gemini again.


The engineers at Google are bad; they keep hiring via pure leetcode. They can't ship working products.


Same experience here.

I had hopes of Google being able to compete with Claude and OpenAI, but I don't think that's the case. Unless they come out with a product that's 10x better in the next year or so, I think they've lost the AI race.


TLDR - 2x cheaper, slightly smarter, and they only compare these new models to their own old ones. Does Google have a moat?


The math score exceeds o1-preview (though not mini or o1 full) fwiw.


Moat could be things like direct integration into Gmail (ask it to find your last 5 receipts from Amazon), Drive (chat with PDF), Slides (create images / flow charts), etc.

Not sure if their models are the moat. But they definitely have an opportunity from the productization perspective.

But so does Microsoft.


Have you tried the Gemini Gmail integration? I have that enabled in my GSuite account.

It's incredible how bad it is. I've seen it claim I've never received mail from a certain person, while the email was open right next to the chat widget. I've seen it tell me to use the standard search tool, when that wasn't suitable for the query. I've literally never had it find anything that wouldn't have been easier to find with the regular search.

I mean, it's a really obvious thing for them to do, I'm genuinely confused why they released it like that.


> I'm genuinely confused why they released it like that.

I agree. Right now it's not very useful, but has the potential to be if they keep investing in it. Maybe.

I think Google, Microsoft, etc are all pressured to release something for fear of appearing to be behind the curve.

Apple is clearly taking the opposite approach re: speed to market.


Yeah - the thing is, though, you could build the same thing better in a day's work by using OpenAI's API, or Gemini's for that matter.

I wonder if there isn't a deeper, more worrying (for Google) reason behind that - that AI is killing their margin.

Google has always been about delivering top notch services, and winning by being able to do that cheaper than the competition.

It's "in their DNA" - everyone knows that using links to a website as a quality signal was a really good idea in the early days of Google, but what's a little less well known is that the true stroke of genius was the algorithmic efficiency of PageRank.

Similarly for GMail. Remember when it launched, 1 GB of free storage was just completely out of every competitor's league?

It may just be that this recipe of being smarter than everyone on algorithms and on datacenter operations might just not work anymore in the age of modern machine learning.


The problem with the current crop of LLMs is that they make for a great demo. I am also confident that you can build a working prototype for Gmail, Outlook or any other surface. But I am equally confident it will be a massively different ballgame to roll it out to a billion users. You'll run into a lot of edge cases and have to take care of a lot of adversarial scenarios as well. Pretty sure that's the same issue Apple is running into as well, and why they have had to postpone rollouts.


I don't buy that at all. They've literally shipped a broken, useless product that this amateur could do better (yes, as a demo).

All the hard scalability stuff, they've already done before. Gmail exists, the Gemini API exists.

If they're not getting it to work, there must be another reason. They just can't afford to provide it at a price point that users accept.


Doesn't Microsoft also get OpenAI IP, if they run out of money?


> Does google have moat?

Potentially (depends if the EU cares)...

E.g. integration with Google search (instead of ChatGPT's Bing search), providing map data, android integration, etc...


Their Android integration certainly isn't on track to earn them any moats... https://hachyderm.io/@ianbicking/113099247306589777


Google does not miss one single opportunity to miss an opportunity.

They announced a price reduction but it "won't be available for a few days". By then, the initial hype will be over and the consumer-use side of the opportunity to get new users will be lost in other news.



