AMD's MI300X Outperforms Nvidia's H100 for LLM Inference (tensorwave.com)
280 points by fvv 15 days ago | 264 comments



"TensorWave is a cloud provider specializing in AI workloads. Their platform leverages AMD’s Instinct™ MI300X accelerators, designed to deliver high performance for generative AI workloads and HPC applications."

I suggest taking the report with a grain of salt.


The salt is in plain sight.

They do the standard AMD comparison:

  8x AMD MI300X (192GB, 750W) GPU  
  8x H100 SXM5 (80GB, 700W) GPU
The fair comparison would be against

  8x H100 NVL (188GB, <800W) GPU
Price tells a story. If AMD's performance were on par with Nvidia's, they would not sell their cards for 1/4 the price.


                 MTr
  ------------------
  H100 SXM5   80,000 
  MI300X     153,000
  H100 NVL   160,000


H100 SXM5 has 52% of the transistors MI300X has and half of the RAM, yet MI300X achieves *ONLY* 33% higher throughput compared to the H100. MI300X was launched 6 months ago, H100 20 months ago.
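
A rough back-of-the-envelope check of those numbers (a sketch in Python; the 1.33x figure is the best case quoted above, and the MTr counts are from the table):

  # perf-per-transistor check using the numbers above (rough, not a rigorous metric)
  h100_sxm5_mtr  = 80_000    # million transistors
  mi300x_mtr     = 153_000
  mi300x_speedup = 1.33      # ~33% higher throughput vs the H100

  print(f"H100 SXM5 transistor budget vs MI300X: {h100_sxm5_mtr / mi300x_mtr:.0%}")                       # ~52%
  print(f"MI300X throughput per transistor vs H100: {mi300x_speedup * h100_sxm5_mtr / mi300x_mtr:.2f}x")  # ~0.70x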

AMD has work to do.


On the other hand, 33% better performance for a 7% increase in power consumption is an appealing bullet point. Lots for AMD to do though, as you said.


Maybe I'm a naive fanboy, but I would put my money on Apple catching Nvidia before AMD or Intel.


Apple doesn't have any hardware SIMD technology that I'm aware of.

At best, Apple has the Metal API, which iOS video games use. I guess there's a level of SIMD-compute expertise here, but it'd take a lot of investment to turn that into a full-scale GPU that tangos with supercomputers. Software is a big piece of the puzzle for sure, but Metal isn't ready for prime time.

I'd say Apple is ahead of Intel (Intel keeps wasting its time and collapsing its own progress: Xeon Phi, Battlemage, etc. Intel cannot keep investing in its own stuff long enough to reach critical mass). Intel does have OneAPI, but given how many times Intel collapses everything and starts over again, I'm not sure how long OneAPI will last.

But Apple vs AMD? AMD 100% understands SIMD compute and has decades worth of investments in it. The only problem with AMD is that they don't have the raw cash to build out their expertise to cover software, so AMD has to rely upon Microsoft (DirectX), Vulkan, or whatever. ROCm may have its warts, but it does represent over a decade of software development too (especially when we consider that ROCm was "Boltzmann", which had several years of use before it came out as ROCm).

-------

AMD ain't perfect. They had a little diversion with C++ AMP with Microsoft (and this served as the API for Boltzmann / early ROCm). But the overall path AMD is taking at least makes sense, if a bit suboptimal compared to NVidia's huge investment in CUDA.

I'd definitely rate AMD's efforts above Apple's Metal.


Apple makes a better consumer GPU than AMD does.

M3 Max's GPU is significantly more efficient in perf/watt than RDNA3, already has better ray tracing performance, and is even faster than a 7900XT desktop GPU in Blender.[0]

[0]https://opendata.blender.org/benchmarks/query/?compute_type=...


Couple of things: Blender uses HIP for AMD which is nerfed in RDNA3 because of product segmentation, so really this is comparing against something which is deliberately mediocre in the 7900 XT.

The M3 Max is also in a sense a generation ahead in terms of perf/watt of the 7900 XT as it uses a newer manufacturing node.

I suppose it's also worth highlighting that if you enable Optix in the comparison above, you can see Nvidia parts stomping all over both AMD and Apple parts alike.


Why does AMD nerf RDNA3 when they're so far behind Nvidia and Apple in Blender performance? Do you have benchmarks for when AMD doesn't nerf Blender performance?

The M3 Max GPU uses at most 60-70W. Meanwhile, the 7900XT uses up to 412W in burst mode.[0] TSMC N3 (M3 Max) uses 25-30% less power than TSMC N5 (7900XT).[1] So if the 7900XT used N3 and was optimized for the same performance, it would burst to ~300W instead, which is still 5-6x more than the M3 Max. In other words, the perf/watt advantage of the M3 Max is mostly not related to the node used. It's the design.

[0]https://www.techpowerup.com/review/amd-radeon-rx-7900-xt/37....

[1]https://www.anandtech.com/show/18833/tsmc-details-3nm-evolut...
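
A quick sketch of the node-normalization arithmetic above (the wattages and the N3-vs-N5 saving are the rough figures quoted in this comment and its links, not measurements):

  # hypothetical 7900 XT power draw if it were built on N3 (rough figures from the comment above)
  m3_max_gpu_w          = 60     # lower end of the quoted 60-70 W
  rx7900xt_burst_w      = 412    # burst power from the TechPowerUp review
  n3_vs_n5_power_saving = 0.275  # N3 quoted as ~25-30% lower power than N5 at the same performance

  rx7900xt_on_n3_w = rx7900xt_burst_w * (1 - n3_vs_n5_power_saving)
  print(f"Hypothetical 7900 XT on N3: ~{rx7900xt_on_n3_w:.0f} W")        # ~300 W
  print(f"Still ~{rx7900xt_on_n3_w / m3_max_gpu_w:.1f}x the M3 Max GPU") # ~5x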


It's weird that you're choosing a nerfed part and sticking with it as a comparison point.

The article is about the MI300X, which is beating NVidia's H100.

> Do you have benchmarks for when AMD doesn't nerf Blender performance?

Go read the article above.

> Notably, our results show that MI300X running MK1 Flywheel outperforms H100 running vLLM for every batch size, with an increase in performance ranging from 1.22x to 2.94x.

-------

> Why does AMD nerf RDNA3 when they're so far behind Nvidia and Apple in Blender performance?

Nerf is a weird word.

AMD has focused on 32-bit and 64-bit FLOPs until now. AMD never put much effort into raytracing. They reach acceptable levels on Xbox / PS5, but it was always NVidia pushing raytracing (not AMD).

Similarly: Blender is a raytracer that uses those raytracing cores. So any chip with substantial on-chip ray-tracing / ray-marching / ray-intersection hardware will perform faster.

Blender isn't what most people do with GPUs. The #1 thing they do is play video games like Baldur's Gate 3.

-------

It'd be like me asking why Apple's M3 can't run Baldur's Gate 3. It's not a "nerf", it's a purposeful engineering decision.


I was responding to the person above me, who used the word "nerf" to describe RDNA and Blender.

Their GPUs are literally “hardware SIMD”, and Metal is conceptually very close to CUDA. Apple just chooses to focus on consumer hardware instead.


But Apple doesn't produce servers or server hardware.


Currently no, but the Xserve was a product for a decade. And they have built an internal ML cloud, presumably with rack-mountable hardware. The bigger issue for Apple IMO is that they ditched the server features of their OS, and they're not going to sell a hypothetical M4 Ultra Xserve with Linux.


Good point; if the Mx architecture does prove to be a viable competitor to Nvidia/AMD for training and/or inference, do you think Apple would enter the server market? They continue to diversify on the consumer side, I wonder if they have their eye on the business market; I am not sure if their general strategy of “prettier is better” would work well there though.


They should buy Groq (the hardware inference company, not the lame Twitter bot).


Clearly they are building their own Apple Silicon powered servers for their Private Cloud Compute, even if they are not sold to outsiders like the Xserve used to be.

I don’t think that’s true anymore: see WWDC presentation about Private Cloud


AMD's deep learning libraries were very bad the last time I checked; nobody uses AMD in that space for that reason. Nvidia has a quasi-monopoly, and that's the main reason for the price difference IMHO.


this...

nearly 95% of deeplearning github repos are "tested using cuda gpu - others, not so sure"

the only way to out-run nvidia is to have 3~10x better bang-for-buck.

Or AMD can just provide a "DIY unlimited gpu RAM upgrade" kit -- a lot of people are buying macstudio 128gb ram because of its "bigger ram-for-buck" than nvidia gpus


I heard the Apple M4 Ultra will use 256GB of HBM for the Studio and Pro, but I don't buy it. The 256GB, maybe. But an HBM memory controller that would go unused on laptops doesn't pass the smell test.


I think their best option might be more/better prosumer options at the higher end of consumer pricing. Getting more hobbyists into play just on the value proposition.


> that's the main reason for the price difference IMHO.

Explain why the performance difference does not matter?

AMD does only 33% better with a chip that has 2X transistors and 2X memory.


Isn't SXM5 higher bandwidth? It's 900 GB/s of bidirectional bandwidth per GPU across 18 NVLink 4 channels. The NVL's are on PCIe 5, and even w/ NVLink only get to 600 GB/s of bandwidth across 3 NVLink bridges (across only pairs of cards)?

I haven't done a head to head and I suppose it depends on whether tensor parallelism actually scales linearly or not, but my understanding is since the NVL's are just PCIe/NVLink paired H100s, you're not really getting much if any benefit on something like vLLM.

I think the more interesting critique might be the slightly odd choice of Mixtral 8x7B vs, say, a more standard Llama 2/3 70B (or just test multiple models, including some big ones like 8x22B or DBRX).

Also, while I don't have a problem w/ vLLM, as TensorRT gets easier to set up, it might become a factor in comparisons (since they punted on FP8/AMP in these tests). Inferless published a shootout a couple months ago comparing a few different inference engines: https://www.inferless.com/learn/exploring-llms-speed-benchma...

Price/perf does tell a story, but I think it's one that's mostly about Nvidia's platform dominance and profit margins more than intrinsic hardware advantages. On the spec sheet MI300X has a memory bandwidth and even raw FLOPS advantage but so far it has lacked proper software optimization/support and wide availability (has anyone besides hyperscalers and select partners been able to get them?)


> but I think it's one that's mostly about Nvidia's platform dominance and profit margins more

Profit margins and dominance result from performance, not the other way around.

It does not matter whether Nvidia's tools are better when you deploy a large number of chips for inference and the hardware does more FLOPS per watt or per second. It's a seller's market, and if AMD can't ask a high price, their chips don't perform.

----

Question:

People here seem to think that Nvidia has absolutely no advantage in their microarchitecture design skills. It's all in software or monopoly.

Is this right?


> People here seem to think that Nvidia has absolutely no advantage in their microarchitecture design skills. It's all in software or monopoly.

That's an extrapolation. Microarchitecture design skills are not theoretical numbers you manage to put on a spec sheet. You cannot decouple the software driving the hardware - that's not a trivial problem.


> Microarchitecture design skills are not theoretical numbers you manage to put on a spec sheet

not only can you measure this, not only do they measure this, but it's literally the first component of the Rayleigh resolution equation and everyone is constantly optimizing for it all the time.

https://youtu.be/HxyM2Chu9Vc?t=196

https://www.lithoguru.com/scientist/CHE323/Lecture48.pdf
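
For reference, the Rayleigh resolution criterion referred to here is usually written as (the standard lithography form, stated from memory rather than taken from the linked slides):

  R = k_1 \, \frac{\lambda}{NA}

where R is the minimum printable feature size, λ is the exposure wavelength, NA is the numerical aperture of the projection optics, and k1 is a process-dependent factor that fabs push as low as possible.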

in the abstract, why does it surprise you that the semiconductor industry would have a way to quantify that?

like, realize that NVIDIA being on a tear with their design has specifically coincided with the point in time when they decided to go all-in on AI (2014-2015 era). Maxwell was the first architecture that showed what a stripped-down architecture could do with neural nets, and it is pretty clear that NVIDIA has been working on this ML-assisted computational lithography and computational design stuff for a while - since then, I would say - and they've been public about it for several years now (it might be longer, I'd have to look back).

https://www.newyorker.com/magazine/2023/12/04/how-jensen-hua...

https://www.youtube.com/watch?v=JXb1n0OrdeI&t=1383s

Since that "mid 2010s" moment, it's been Pascal vs Vega, Turing (significant redesign and explicit focus on AI/tensor) vs RDNA1 (significant focus on crashing to desktop), Ampere vs RDNA2, etc. Since then, NVIDIA has almost continuously done more with less: beaten custom advanced tech like HBM with commodity products and small evolutions thereupon (like GDDR5X/6X), matched or beaten the efficiency of extremely expensive TSMC nodes with junk samsung crap they got for a song, etc. Quantitatively by any metric they have done much better than AMD. Like Vega is your example of AMD design? Or RDNA1, the architecture that never quite ran stable? RDNA3, the architecture that still doesn't idle right, and whose MCM still uses so much silicon it raises costs instead of lowering them? Literally the sole generation that's not been a total disaster from The Competition has been RDNA2, so yeah, solid wins and iteration is all it takes to say they are doing quantitatively better, especially considering NVIDIA was overcoming a node disadvantage for most of that. They were focused on bringing costs down, and frankly they were so successful despite that that AMD kinda gave up on trying to outprice them.

Contrast to the POSCAP/MLCC problem in 2020: despite a lot of hype from tech media that it was gonna be a huge scandal and cost performance, NVIDIA patched it dead within a week with basically no perf cost. Gosh, do you think they might have done some GPGPU-accelerated simulations to help them figure that out so quickly - how the chip was going to boost, what the transient surges were going to be, etc.?

literally they do have better design skills, and some of it is their systems thinking, and some of it is their engineers (they pay better/have better QOL and working conditions, and get the cream of the crop), and some of it is their better design+computational lithography techniques that they have been dogfooding for 3-4 generations now.

people don't get it: startup mentality, founder-led, with a $3t market cap. Jensen is built different. Why wouldn’t they have been using this stuff internally? That’s an extremely Jensen move.


NVidia doesn't have a general advantage in hardware design skills, but they have been focused on AI workloads for quite a while, while AMD spent a long time focusing on HPC priorities like 64-bit floating point performance.


But the price should be a factor. Your fair comparison would match a ~$60k setup to a ~$20k one, according to prices we can find online.

I don't think it should be ignored, especially when the power consumption is similar.


Fair? The H100 NVL is two H100s in a single package, which probably costs as much as 2x H100 or more.

If so, OK, it's fair to compare 1 MI300X with 1 H100 NVL, but then price (and TCO) should be added to the metrics in the conclusion. Also, the NVL is a 2x PCIe 5.0 quad-slot card, so not the same thing.

I am not sure about system compatibility, or if and how you can stack 8 of those in one system (like you can with the non-NVL and MI300X), so it's a bit of a different (and more niche) beast.


> Price tells a story. If AMD performance would be in par with Nvidia they would not sell their cards for 1/4 price

What were your thoughts on Zen (1) vs Intel's offerings then? AMD offered more bang for the buck then too.


Price tells the story. Yes, but for electricity prices, not card price, and there they're much closer to each other!

Thx! Anyone who says Nvidia isn't king needs a reality check.


I love AMD, but my Nvidia stock position currently is much higher than AMD.


AMD is at a much higher PE ratio. Is the market expecting AMD to up its game in the GPU sector? Or is the market expecting a pullback in GPU demand due to possibility for non-GPU AI solutions becoming the frontier or for AI investment to slow down?


AMD doesn't have a higher PE ratio. You should use non-GAAP numbers, because the GAAP numbers include Xilinx goodwill amortization, which skews the PE.

AMD's PE is ~55. Nvidia's PE is above 70.


I think that the expectation is that NVIDIA is in somewhat of an unreasonable position right now (and for the immediate future) where they're getting about 80% gross margins on their datacenter GPUs. This is an extremely juicy target for competitors, and even if competitors manage to produce a product that's half as good as NVIDIA, NVIDIA will have to cut prices to compete.


Why not both?


This topic is just about whether this changes or not.


Well, that's the beauty of specifying exactly how you ran your benchmark: it is easy to reproduce and confirm or disprove (assuming you've got the hardware).


As easy as getting yourself 8 H100 and 8 MI300X.

Fun weekend project for anybody.


You can rent them online for ~$4-5 per hour per GPU. Not cheap, but definitely feasible as a weekend project.


Where can I rent an H100 for 4-5 dollars an hour?

AWS doesn't let you use p5 instances (you don't get a quota as a private person), and Lambda Cloud is sold out.


It looks like Runpod currently (checked right now) has "Low" availability of 8x MI300 SXM (8x$4.89/h), H100 NVL (8x$4.39/h), and H100 (8x$4.69/h) nodes for anyone w/ some time to kill that wants to give the shootout a try.


We'd be happy to provide access to MI300X at TensorWave so you can validate our results! Just shoot us an email or fill out the form on our website


If you're able to advertise available GPU compute in public forums, that tells us enough about the demand for the MI300X in the cloud ...

You're joking/trolling, right? There are literally tens of thousands of H100s available on gpulist right now; does that mean there's no cloud demand for Nvidia GPUs? (I notice from your comment history that you seem to be some sort of bizarre NVDA stan account, but come on, be serious.)

If they had used Nvidia's chips, would this somehow make the blog post better?


For one, they didn't use TensorRT in the test.

Also, stuff like this makes it hard to take the results seriously:

  * To make an accurate comparison between the systems with different settings of tensor parallelism, we extrapolate throughput for the MI300X by 2.

  * All inference frameworks are configured to use FP16 compute paths. Enabling FP8 compute is left for future work.

They did everything they could to make sure AMD came out faster.


You need 2x H100 to have enough VRAM for the model, whereas you need only 1x MI300X. Doubling the total throughput (for all completions) of 1x MI300X to simulate the numbers for a duplicated system is reasonable.

They should probably also show the throughput per completion separately, as tensor parallelism is often used for that purpose in addition to doubling the VRAM.
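
A rough sketch of the memory math behind that (the Mixtral 8x7B parameter count is approximate, and the throughput number is purely illustrative):

  # why Mixtral 8x7B in FP16 needs 2x H100 (80 GB) but fits on 1x MI300X (192 GB)
  params_b   = 46.7              # Mixtral 8x7B total parameters, in billions (approximate)
  weights_gb = params_b * 2      # FP16 = 2 bytes/param -> ~93 GB, before KV cache and activations

  h100_gb, mi300x_gb = 80, 192
  print(f"Weights ~{weights_gb:.0f} GB; fits on one MI300X: {weights_gb < mi300x_gb}")  # True
  print(f"Fits on one H100: {weights_gb < h100_gb}")                                    # False -> TP=2 over two H100s

  # the article's extrapolation: double the 1x MI300X throughput to compare against the 2x H100 (TP=2) setup
  mi300x_tput_1gpu = 100.0                 # hypothetical tokens/s, just to illustrate
  mi300x_tput_2gpu = 2 * mi300x_tput_1gpu  # assumes perfectly linear scaling across 2 GPUs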


What's the cost to run 2x H100 and 1x MI300X?

I think that'd give us a better idea of perf/cost and whether multiplying MI300X results by 2 is justified.


I don't understand why they should use TensorRT. vLLM is much more popular and it was actually written for Nvidia. It also supports AMD hardware, so it's the appropriate tool to compare.

So they just multiplied their results by 2 ^^?


I see it as they did everything they could to compare the specific code path. If your workload scales with FP16 but not with tensor cores, then this is the correct way to test. What do you need for LLM inference?


Couldn't they find a real workload that does this?


vLLM inference of Mixtral in FP16 is a real workload. I guess the details are there because of the different inference engines used. You need the most similar compute tasks to be run, but the compute kernels can't be the same, as in the end they need to be run by different hardware.


Why the hell are we doing 128-input-token benchmarks in 2024? This is not representative of most workloads, and prefill perf is incredibly important.


For understanding:

What would be a suitable input length in your opinion?

And why isn't this one good: are real-life queries shorter? Or longer?

If I count one word as one token, then in my case most of the queries are less than 128 words.


I think today 512 tokens is a minimum.

It's not just the query (if you're running a chatbot, which many of us are not). It's the entire context window. It's not uncommon to have a system prompt that is > 512 tokens alone.

I would like to see benchmarks for 512, 1024, 4096 and 8192 token inputs.
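
A minimal sketch of what such a sweep could look like with vLLM's offline API (the model name, batch size, and prompt construction are placeholders; a real benchmark would build exact-length token ids and report prefill and decode separately):

  # sweep input lengths with vLLM (assumes vLLM is installed and the model fits on the configured GPUs)
  import time
  from vllm import LLM, SamplingParams

  llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1", tensor_parallel_size=2)  # placeholder config
  params = SamplingParams(temperature=0.0, max_tokens=128)

  for input_len in (512, 1024, 4096, 8192):
      prompt = " hello" * input_len                    # crude: roughly one token per repetition
      start = time.time()
      outputs = llm.generate([prompt] * 8, params)     # batch of 8 identical requests
      elapsed = time.time() - start
      generated = sum(len(o.outputs[0].token_ids) for o in outputs)
      print(f"input~{input_len}: {generated / elapsed:.1f} output tok/s")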


Including the initialization prompt and your history if you have one? I use ChatGPT for a very simple task, to map chat messages to one of 5 supported function calls, and the function definitions alone already take up 200 tokens I think
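
If you want to sanity-check that number, here is a quick way to count tokens (a sketch using tiktoken; the encoding choice and the example schema are illustrative, and the API formats tool definitions slightly differently internally):

  # count how many tokens a function/tool definition costs (approximate)
  import json
  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")  # encoding used by gpt-3.5/gpt-4 era models
  tool_def = {
      "name": "set_reminder",
      "description": "Create a reminder for the user at a given time.",
      "parameters": {
          "type": "object",
          "properties": {
              "text": {"type": "string", "description": "Reminder text"},
              "when": {"type": "string", "description": "ISO 8601 timestamp"},
          },
          "required": ["text", "when"],
      },
  }
  print(len(enc.encode(json.dumps(tool_def))), "tokens (rough)")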


It's not just the current prompt, but the whole conversation, if possible. Or, if you want the AI to summarise an article, the article has to fit in.

If I understood that correctly, context length is something like session storage or short term memory. If it's too small the AI starts to forget what it's talking about.


IMO the relevant benchmark for now is a mixed stream of requests with 50 (20%), 500 (50%), 2000 (10%) and 50k (20%) input tokens, ignore EOS and decode until you get around 300 output tokens.
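
A minimal sketch of generating that kind of mixed request stream (the percentages are the ones proposed above; everything else, including the helper names, is illustrative):

  # build a request mix of 50/500/2000/50k input tokens at 20/50/10/20 percent
  import random

  MIX = [(50, 0.20), (500, 0.50), (2000, 0.10), (50_000, 0.20)]  # (input_tokens, share)
  OUTPUT_TOKENS = 300   # decode ~300 tokens per request, ignoring EOS

  def sample_requests(n, seed=0):
      rng = random.Random(seed)
      lengths, weights = zip(*MIX)
      return [{"input_tokens": rng.choices(lengths, weights)[0],
               "max_output_tokens": OUTPUT_TOKENS,
               "ignore_eos": True}
              for _ in range(n)]

  if __name__ == "__main__":
      reqs = sample_requests(1000)
      avg = sum(r["input_tokens"] for r in reqs) / len(reqs)
      print(f"average input length ~{avg:.0f} tokens")  # dominated by the 50k tail (~10k average)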


I'm really interested: do you have a source for those percentages?

I tried to find a service provider that publishes this kind of metrics, but haven't found any.


Sorry, I can't. My employer doesn't publish this kind of metrics either. What I posted was definitely just a very rough number off the top of my head.


In most cases that's not enough.


I try to be optimistic about this. Competition is absolutely needed in this space - $NVDA market cap is insane right now, about $0.6 trillion more than the entire Frankfurt Stock Exchange.


It's more about how little the Frankfurt Stock Exchange is worth. And European devs keep wondering why our wages are lower than in the US for the same work. That's why.


The DAX is only 40 companies, most of which make real products rather than advertising mechanisms. Making real physical things just doesn't scale, and never will.

While I would enjoy a US tech salary, I'm not sure we want a world where all manufacturing is set aside to focus on the attention economy.

Nvidia's value deserves to be much higher than any company on the DAX (maybe all of them together, as it currently is) - but how much of that current value is real rather than an AI speculation bubble?


> Making real physical things just doesn't scale,

Nvidia sells chips ...


Investors don't even know what NVIDIA is selling. I was listening to a random investor podcast where they were talking about Intel, AMD and NVIDIA, but no one knew what exactly they are selling; they only knew they are part of this AI bubble, so that's why you should invest in them.


What about serious analysts, say Morningstar - don't they understand the business they recommend relatively well?


How much of Nvidia's value do you think comes from the chips? Chips are a commodity; AMD's chips have been better and cheaper than Nvidia's for many years.

The reason Nvidia's value has been so inflated is the software stack and the lock-in they offer. CUDA, CuDNN, that's where Nvidia's value lies.

And obviously, now that all relevant ML frameworks are designed for Nvidia's software stack, Nvidia has a monopoly on the supply. That's why their value is being inflated so much.

And Nvidia doesn't have to produce the chips themselves; that's all contracted out as well.


Nvidia is a fabless chip designer. They design chips and software to go with the chips, but contract out the actual manufacturing.


Yes, they buy these chips from fabs (made to their design), but it is the chips they then sell on, at about a 4x markup over what they bought them for. Good business model! Buy low, sell high. Whether it's good enough to justify the current valuation is another question.


They can't make them fast enough to satisfy hype-driven demand; they are not scaling like a tech company, but their market cap is being inflated like one.


What is the meaning of "tech" in this context? "Software"? I mean, if Nvidia isn't a technology company...


Sorry should have said "big tech", as commonly used to refer to FAANG companies who focus more on software and/or services.


> Making real physical things just doesn't scale, and never will.

Nvidia sells physical things, and yet they are bigger than those 40 companies that are selling physical things?

I am not arguing hardware scales better than software but this is a strange argument in this context.


> but how much of that current value is real rather than an AI speculation bubble?

I mean, by definition, given that it trades freely, their market cap is real. Your market cap today is what the market thinks your future cash flows are worth. The bubble and the bubble popping should in theory both be priced into Nvidia's market cap.

What isn't is events the market doesn't anticipate; AMD coming out with a current-generation chip that can do inference as well as the H100 is something the market hasn't priced in.

And your manufacturing example is very poor, as NVidia is certainly part of the manufacturing pipeline by designing physical products that people buy.


> Your market cap today is what the market thinks your future cash flows are worth. The bubble and the bubble popping should in theory both be priced into insert_financial_product_here

This is a bit of a tired viewpoint, and is evidently proven not true time after time. The collective despair/euphoria of market participants is extremely powerful and well documented, at least as far back as dutch tulips.

Stock valuations are relative, and they are relatively misvalued most of the time. That's why there are (albeit rare) funds that are capable of outperforming the market for decades - Berkshire, and Medallion for example.

It's certainly possible that AMD is valued (almost) fairly. It's just as likely that it's relatively misvalued for no reason other than emotions (lack of hype).


> can do inference as well as the H100 is something the market hasn't priced in.

I think the probability of that would still be priced in. Not sure what the exact probability is, though.

But if, say, it were clear that AMD could come up with a competitive option, then NVDA stock would drop. And if it were clear the other way, that AMD can't do it, the NVDA price would increase.


> The DAX is only 40 companies, most of which make real products rather than advertising mechanisms

This, as the kids say, is just cope. American big tech makes real products. Google is not just ads. Apple is not. Amazon is not. Tesla is not. NVidia is not. Netflix is not.

NVidia might be overvalued because of the current AI hype but that does not diminish their real accomplishments!

Europe has almost no real tech companies. There is one exception, founded in 1984. Not exactly a spring chicken. How can a wealthy continent with 750 million people produce no big tech companies? It's a big problem.


> Google is not just ads.

Bad example given how aggressively they terminate products which don't generate the same revenue as ads.

> Apple is not.

Best example, they have done a fantastic job of being both a tech company and pseudo-fashion company.

> Amazon is not.

They don't make anything (at least nothing people want to buy) and have ad revenue as an increasing slice of their pie.

> Tesla is not.

Even bigger hype/speculation vehicle than Nvidia.

> NVidia is not.

Nvidia of 5 years ago would not have appeared on this list, being too much of a niche tech company. Good at what they do, but hugely hype-fuelled.

> Netflix is not.

Running out of growth potential with their current business model, starting to introduce ads!


Nvidia's valuation is in part driven by the fact that their chips potentially disrupt the vehicle for ad revenue that is search.

For all his insanity, the one thing I respect Musk for, is that he actually started successful companies that make stuff. Creating a new car manufacturer of the scale of BMW out of nothing was widely considered impossible before.

Of course he did this from a position of extreme wealth, but none of his peers managed to do that. Everyone else is just seeking rent by trying to be first to implement some tech transition that is coming anyway. And that might be a lot more valuable to society if it was managed differently...


Amazon doesn’t make anything anyone wants to buy?

Not to be snarky, but if AWS counts as “nothing” I’d sure like a slice of nothing please.


AWS is a (collection of) service(s) which can scale and have low marginal cost of reproduction. Fire Tablets are a real thing you can buy, but they are shit so nobody does.


Sorry, are you suggesting AWS Services "aren't real things"?

If I pay for a database server in Virginia, how is that not real?


I get that you meant “physical” but this blur between “advertising isn’t a useful thing for an economy to focus on” and “services are not real” is a bit of a jump!

Cloud services don’t just exist on their own accord. Datacenters are physical and real!


Fortunately, it's not for one person to decide what an economy needs or not. The demand is there, and that's why we have "ad companies".

Why is producing big companies a goal? High standards of living for all seems to be a much better goal. And that can be done with small or big companies - so long as economic production is high enough and distributed well enough.


Because tech innovation requires tons of R&D and you can't afford to do that otherwise. Europeans use American laptops running an American operating system to watch American movies in an American browser.

European economic production is nowhere near high enough and now Europe is struggling to provide for its aging population and doesn't have enough good jobs for younger people. I support redistribution generally, but the wealth has to be created first or there won't be anything to redistribute.


To be fair, more than half of that laptop hardware is made in China and Taiwan, including a lot of the IP that goes into it. If you look at phones and tablets, there are a bunch of components/IP from European companies also, such as ARM, Bosch, STMicroelectronics, Infineon, NXP etc. Intel has famously struggled and failed multiple times to get into that market. European semiconductor companies are also strong in automotive and other industries that use modern embedded systems. In another consumer electronics niche, a majority of wireless mice and keyboards are built on Nordic Semiconductor chips - from tiny Norway.


>If you look at phones and tablets, there is a bunch of components/IP from European companies also, such as ARM, Bosch, STmicroelectronics, Infineon, NXP etc.

But those are all low-margin chips. Qualcomm, Nvidia, Intel, AMD and Apple have much higher margins on their chips. They don't bother competing with the EU chip companies.


That's all true. However, the US is capable of producing almost everything domestically, albeit in lower volume. The US military doesn't like to depend on Chinese chips for obvious reasons.


And yet standards of living in Europe are comparable to those in the US, and preferable at the median. Our attention is captured by speculative valuations of unicorns, and yet people actually need real stuff made, drugs developed and made etc. Europe does perfectly well in many non winner takes all sectors where English language and network effects are less relevant. The political instability created by the US neoliberal experiment is something I hope we can avoid over here too.


Europe is much much poorer than the US, actually.

https://pbs.twimg.com/media/F3PGpsrWEAEiplB?format=jpg&name=...


EU GDP per capita 2022 is the same as US GDP per capita 2017.

Unless you want to say that the US was much poorer in 2017 than it was in 2022 that's a fairly ridiculous statement.

Also, the highest productivity places in the EU have much lower hours worked per capita than the US, with Germans on average working 25% less than Americans and the EU as a whole working 13% less than the US.

https://data.oecd.org/emp/hours-worked.htm


World Bank:

European Union gdp per capita for 2022 was $37,433, a 3.33% decline from 2021.

U.S. gdp per capita for 2022 was $76,330, a 8.7% increase from 2021.

It's not even close?


Sources please? Are these nominal dollars?

World bank data in PPP dollars is reported as 64,600 vs 45,900 here:

https://ourworldindata.org/grapher/gdp-per-capita-worldbank?...

Germany is at 53,900 there, but a good chunk of the difference is simply that the US works more per capita. GDP per hour worked is $74 in the US vs $69 in Germany and $53 in Canada. Sweden is ahead of the US. And the EU also includes countries like Bulgaria, which at $29 is barely ahead of Russia's $28.

https://data.oecd.org/lprdty/gdp-per-hour-worked.htm

France is at $65 per hour worked, but Germany and France also have significantly lower poverty and inequality rates by any measure you choose, with France more equal than Germany.

The US, of course, remains the dominant economy of the world by any measure. There is no question of that. But the exponential nature of economics, and the structural differences between these different economies, means that GDP numbers compared directly are fairly meaningless.

Edit: That last sentence is too strong as stated. GDP obviously matters a big deal in the grand scheme of things, especially as you jump from lower or middle income to high income countries. But it's all logscale. A factor of 2 is a big deal, a factor of 1.2 might not be.


GDP per capita as a stand alone metric doesn't mean much.


All rich countries have high GDP per capita and all poor countries have low GDP per capita. Zero exceptions. Despite the shortcomings of GDP as a metric it still tracks prosperity very accurately.


You can't get any more obvious than that, but my point wasn't that higher GDP doesn't make you richer than lower GDP; it was that GDP/capita as a number alone is not a measure of wealth, income or prosperity between countries, even within the EU.

For example, Ireland has by a long margin the highest GDP/capita in the whole EU, and it would make you think the average Irish worker earns more than any other worker in the EU and drives a Lambo, but that's not what's happening. It's because most US corporations funnel their EU money through their Irish holding companies, skewing the statistic.


> For example Ireland has by a long margin the highest GDP/capita in the whole EU

No, Luxembourg does.


But in both cases it's because of weird corporate tax loopholes rather than the underlying productivity or utility to the median individual.

> EU GDP per capita 2022 is the same as US GDP per capita 2017.

Can't edit anymore, but: those were nominal numbers, and thus useless. See here:

https://news.ycombinator.com/item?id=40673552


Yes, if you had a 2017 salary in the US today, you would be much poorer. Inflation was no joke.


These numbers are PPP, so inflation adjusted.


These are mean figures, not median.

.... while all of the content is hosted on Linux servers, and you're listening to music through a Swedish app. That is running on silicon made in Taiwan. Using equipment that can currently only be manufactured in Belgium and Germany. On a Mac you are using a British instruction set.

Also, in terms of tech innovation: what part of US-based tech innovation couldn't have been (and actually was) achieved with open-source solutions many, many years earlier for a fraction of the cost, if we didn't have copyright?

Honestly, a significant chunk of the "innovation" seems to relate directly to maximizing advertisement opportunities and inducing increased consumption. Who cares if a website takes a second to load rather than 0.1 seconds? If it has content I want, 1 second isn't a big deal. If I don't care about the content, I lose nothing by being distracted by something else in that 1 second.

---

More importantly:

European Economic production isn't high enough... by what standard?

https://data.oecd.org/lprdty/gdp-per-hour-worked.htm

GDP per hour worked is 74 in the US vs 69 in Germany and 54 in the EU. And the EU includes many large countries that emerged from communist dictatorship only 35 years ago, and are very much still in the process of catching up. Incidentally, the German economy is the result of the West German economy with 63 million people absorbing a failing economy hosting 16 million people in 1990.

The idea that the US is some promised land of economic prosperity while Europe is falling is entirely absurd. It's a narrative built on small relative differences and a US system that pressures people into working a lot more than Europeans do.

More importantly, even GDP per Capita wise:

https://data.oecd.org/gdp/gross-domestic-product-gdp.htm

EU per Capita GDP in 2022 is the same as USA 2016. Was the USA in 2016 struggling but now isn't?

This is all bullshit. Economic output is more than high enough and rising steadily. The problem remains solely in the distribution of the output.


>EU per Capita GDP in 2022 is the same as USA 2016. Was the USA in 2016 struggling but now isn't?

Absolutely, the US would be struggling if GDP were still at 2016 values with today's costs.


Pretty much. I'd also love it (as a European) if we still had 2017's rent prices and consumer prices, but we don't. Housing, energy, food, healthcare and everything has gone up like crazy. We're definitely poorer than before, just sweeping it under the rug pretending everything's fine.

So it boggles my mind that your parent tried to make a point by equating USA 2016 with EU 2022 GDP/capita as if nothing's wrong with that. Are some people that oblivious?


The 2017 number was a mistake by me, I wanted to look at inflation adjusted/PPP, and that was nominal. You can clearly see the massive effect of the Russian invasion of Ukraine and the resulting inflation in the data. If we were looking at nominal values, the opposite is the case: Nominally GDP is rising rapidly:

Nominal: https://data.oecd.org/gdp/gross-domestic-product-gdp.htm

PPP/Inflation adjusted not so much: https://ourworldindata.org/grapher/gdp-per-capita-worldbank?...

But pretty much all countries are still well ahead of where we were in 2017. If you feel poorer than in 2017 it's because you're getting less of a larger pie, not because the economy is producing less than it did then.


Can't edit anymore, but this little factoid:

> More importantly, even GDP per Capita wise:
>
> https://data.oecd.org/gdp/gross-domestic-product-gdp.htm
>
> EU per Capita GDP in 2022 is the same as USA 2016. Was the USA in 2016 struggling but now isn't?
>
> This is all bullshit. Economic output is more than high enough and rising steadily. The problem remains solely in the distribution of the output.

The conclusion that economic output has been rising steadily, even in the last couple of years, is true. But the 2016 vs 2022 numbers are nominal, thus useless. There is a much more significant difference over time when working in PPP/Inflation adjusted numbers:

https://ourworldindata.org/grapher/gdp-per-capita-worldbank?...

The overall point holds though: The economic output of the EU is at 45K per capita today, the level of the US in 1997. The US was not a poor country in 1997. Germany is at the economic output per capita of 2009.

Did the US in 1997 suffer from the problem that it didn't produce enough economic output? Of course not.

And given that, adjusting for inflation, GDP per capita is at an all-time high, the conclusion that you're poorer because economic production is distributed to others is necessarily true. And it tracks, too. Corporate profits and the Dow Jones are not down. The already extremely wealthy have accumulated nearly two thirds of the new wealth being created since 2020:

https://www.oxfam.org/en/press-releases/richest-1-bag-nearly...

> Billionaire wealth surged in 2022 with rapidly rising food and energy profits. The report shows that 95 food and energy corporations have more than doubled their profits in 2022. They made $306 billion in windfall profits, and paid out $257 billion (84 percent) of that to rich shareholders. The Walton dynasty, which owns half of Walmart, received $8.5 billion over the last year. Indian billionaire Gautam Adani, owner of major energy corporations, has seen this wealth soar by $42 billion (46 percent) in 2022 alone.

Given these facts, if we have to accept lower economic production in the name of a fairer distribution of economic production, that seems more than acceptable to me.


Big companies means efficiencies of scale. Small companies that succeed and grow inevitably become big companies. If they don't, it means the qualities that make them effective don't scale to the rest of the economy.


They become big but not necessarily huge - like the 10 biggest tech companies that people here put as the benchmark. For that one needs organizations that also continuously increase their scope - going into new markets, consolidating existing markets, buying up existing players. And if they are to continue being "European" then they must resist being bought up by the huge US or global tech companies. The latter is a big problem at least here in Norway. We have some companies that grow quite big and successful - but it is generally just a question of time before they get swallowed by a huge corp, often with US headquarters. Example: Atmel, now Microchip.


Yep, that's a huge problem, and one that isn't easy to fix. The European market is vastly more fragmented than the North American one, so even without Europe's penchant for taxes and regulation, North America can more easily get giant companies that can reach across the Atlantic.

The only real answer is protectionism, and there's a good chance that'll hurt more than it helps.


Because that famous EU welfare is funded via taxes. Having well performing companies funds your welfare system.

Currently EU welfare systems are under massive strain, with huge waiting lists, due to an ageing population and an economy that hasn't kept up to fund them.

There's no free lunch here. You need big companies with scale that pay huge wages, as those mean a lot more tax revenue. Saying no to that kind of money out of some made-up idealism is just silly copium.

The EU income taxes paid by a single FANG salary employee would be the equivalent of the taxes paid by ~10 average workers. Pretty sure Germany and every other EU country would like to have such taxpayers contributing into the welfare system and not say no to it.

Europe's share of global GDP has been on a constant decline at the expense of US and Chinese growth. Yeah, it's nice to have a better welfare system than China or the US, but how will you fund it in the future if you keep having less money? Political idealism doesn't pay your food and rent.


[flagged]


Even if a company doesn't pay taxes, its masses of highly paid workers do. Each of those SWEs making $300k+ is paying more in taxes than the entire earnings of the average EU dev.


Aha, the cries of the once well-paid SWE laid off and replaced with some cheap hire overseas have come up at least once a week for a year or two already. The funny thing about transnationals - they do not care about the society they operate in.

But what about the rest, not these lucky SWEs who had a good run for the last 10-15 years? How is it going: education, medicine, crime, inequality? All is splendid, I assume?


I still don't get why it's so complicated to understand that more well-paid workers = more taxes for the state, and why some are vehemently demanding a "source" for this.


Yes, this is why European politicians are bending over backwards to get American tech CEOs to open offices in their countries.


[flagged]


Stop trolling and throwing around wild accusations. Here are your words:

>You need big companies with scale that pay huge wages as those mean a lot more tax revenue.

Now provide proof that we need big corps dodging taxes, including FAANG, rather than more small and medium-sized businesses.


No


> There is one exception, founded in 1984

Europe clearly has many problems stimulating investment and creating a competitive environment for startups and tech companies. But you can't say ASML is the only "real" tech company. What about Adyen, Spotify, Klarna, N26, Revolut, etc.?


Most of those companies you mentioned aren't anywhere near as wealthy, or anywhere near as high in market cap, as US big tech.

Most of them are just payment middlemen, not some innovative product nobody else can build, and Spotify survives on monopolizing and squeezing artists, not on some innovative product. Kind of like Netflix, except Netflix has some cutting-edge streaming tech as a product, not just IP licenses.

ASML is the only product innovator there, except their innovative EUV light sources are licensed from Sandia Labs in the US and made by Cymer in the US, which ASML bought and licensed on the condition of not selling to China. So a US invention at the end of the day.


We aren't exclusively talking about big tech here, just tech. Companies like Monzo dominate the local markets because they executed on ideas no-one else tried before. We forget that huge, international tech companies are pretty much an exclusively American phenomenon; they don't really exist anywhere else.


N26, Klarna and Revolut are not tech companies. Neobanks are still banks. And Klarna is just modern store credit cards. These have been around for decades.

Adyen is very underrated, and Spotify is definitely tech.

Stripe should be on the list. DeepMind at one point.


Stripe was founded in California. It's an American business that focused exclusively on the American domestic market in their first years of operation. Many tech companies in the US are founded by immigrants from Europe and elsewhere. That the Collisons chose to start their business in the States is no coincidence.


Many EU start-ups choose to launch their product in the US first due to the 300+ million people market speaking only English, rather than bother with the tiny fragmented markets at home.

It's just much cheaper and easier for start-ups developing a SW product to sell it in the US market first, and only when you've made money there, slowly bring it to the EU.

Starting off SW products in the EU is suicide (unless you're targeting some niche in the local market that's safe from competitors from abroad because it ties into some local idiosyncrasies on language, culture and law).


Revolut is as much a tech company as Netflix or Stripe are - using modern software to fix an old problem.


>How can a wealthy continent with 750 million people produce no big tech companies? It's a big problem.

Much more difficult to scale a product across 26 different countries and nearly as many languages and regulatory jurisdictions. The US is one country, not a collection of countries fighting each other, meaning your product is instantly available to 300M people speaking the same language under (nearly) the same regulations.


>US is one country, not a collection of countries fighting each other

The US is a republic of 50 states. Each state has a huge amount of sovereignty and autonomy. There are 50 state-level regulatory jurisdictions. Not to mention the local-level of government.

But in spite of this, the US does not over-regulate. This is the big difference to Europe (I say this as an American expat living in Europe).


American states have generally less autonomy than Canadian provinces do. Federalism is common worldwide, it's far from an American thing. EU countries are _vastly_ more independent than American states, and when you throw in the cultural differences, it grows hugely again.


Have you tried launching a product in your new EU country and then taking it abroad to another EU country? You'll find out it's not exactly like doing the same thing between US states. And I'm not even talking about the language barrier.


Language aside, the entire point of the EU is the single market so you don’t have 26 different rule sets. (There are some exceptions such as health care but that is no different in the US.)


Except you do have 26 different rule sets.

It's a single market on paper, as the EU only mandates a small subset of common rules and regulations, such as removing tariffs or freedom of movement, but have you ever tried in practice to launch your company from Belgium to France, or from the Netherlands to Belgium, or from Austria to Germany, or from Romania to Italy?

It's much more difficult when the rubber hits the road, as every country has various extra laws and protectionist measures in place to protect its domestic players from outsiders, even if they come from within the EU. And that's besides the language barrier, which means added costs. This is much less efficient than the US market.

EU countries and voters still value their national sovereignty and culture (both with the upsides and downsides) above a united EU under the same laws and language for everyone, ruled from outside their country's borders. See what happened with Brexit and the constant internal squabbling and sabotaging over critical EU issues that affect us all, like the war in Ukraine or illegal mass migration. A US-style unification just won't work here, since every little country wants to be its own king while having its cake and eating it too.


Europe is a single market with regards to imports and exports, and that's what matters most for a business. Low wages more than compensate for regulatory annoyances. No shortage of subsidies either.

California has more burdensome regulations and higher taxes than other states and yet it's home to silicon valley.


>Europe is a single market with regards to imports and exports, and that's what matters most for a business.

We're talking about scaling internal companies across the EU, not about imports and exports. And scaling a local start-up across the EU is a regulatory and legal nightmare for small companies.

Shipping and selling imports and exports of commodities has been a solved problem for decades, but scaling an online notary service, for instance, that works both in Germany and in Italy isn't. The EU doesn't help much with that, as it only says you should have no tariffs between each other, not that you shouldn't have various legal, cultural and bureaucratic protectionist idiosyncrasies in place. The EU won't and can't force countries to improve that to make doing business easier for cross-country start-ups.

EU countries have a lot more roadblocks between each other than US states do when it comes to scaling businesses.


Yes, and it's doing a mediocre-at-best job of it. You can remove tariffs and harmonise regulations, but the EU is trying to unite countries with cultural borders older than Christianity.


I would argue that the EU is quite a successful organization given the task of setting up general market rules for 26 countries.

Obviously there is a lot to criticize about the EU and I can offer you a gigantic list there too. However, I do not see any clear failure of the EU's approach as a single market so far. Additionally, part of the philosophy was establishing peace in a region that was torn up by wars for a lot longer than Christianity has existed. I would argue the EU was quite successful there too.


It's definitely successful, and I was probably too harsh there. But I genuinely think the barriers that are left are damn near insurmountable. An awful lot has to change before a Greek tech worker can move to Sweden as easily as a Virginian can move to California.


What would you change if you could? The EU already offers freedom of movement to EU citizens. Of course the US being a country instead of a collection of countries offers a more streamlined experience, and surely the shared language plays a huge role too. But when I try to come up with examples, such as a Greek person having completely different retirement, health care and legal schemes in Sweden compared to Greece, it seems the US is not so dissimilar there either, given that states sometimes have very different approaches to health care, taxes, labor laws, etc.


Mostly stuff you can't really change. Like, the cultural difference between Sweden and Greece is _way_ bigger than between California and Virginia. You've got the language barrier too. This isn't gonna be fixed, maybe ever, but will continue to cause friction.


I had two Greeks on my team at a large tech company in Sweden.


Do you think there are more Greeks in Sweden, or Virginians in Washington?


I just thought it was funny your example was my experience. But I also don't think it applies that much to tech workers; they don't need to learn Swedish at all. Certainly not easy, but not insurmountable as you say.


> Europe has almost no real tech companies

How do you define "tech"? Europe's domestic markets are jam-packed full of local tech companies.


The grandparent claimed that the DAX 40 has real businesses that make things as opposed to American tech businesses that just serve the attention economy. Clearly not true. European businesses run on American technology. American businesses do not run on European technology.

Europe has many small and not very profitable tech companies. Almost no large and profitable ones. https://pbs.twimg.com/media/GNDtCtTXcAAiwFk?format=jpg&name=...


Wages are a proxy for how valuable your work is, but not a measure of how valuable your work is. To support a high salary, something has to happen: either the product sold is very expensive or it's being subsidized by investors. No company can indefinitely pay its employees more than what they are able to generate selling the product they worked on.


Okay, so you are saying I should move to America, where apparently a lot of people struggle hard to even get a job?

Nah, I'll take my very good wagie pennies here and have plenty of jobs available, plus good health insurance and whatnot.


German unemployment is 3.2%. US unemployment is 4.0%. Neither of these are at all high by historical standards.

https://www.bls.gov/news.release/empsit.nr0.htm https://www.destatis.de/EN/Press/2024/06/PE24_217_132.html


Don't forget to take both of those stats with a grain of salt though. The US has a lot of gig workers which are not always counted correctly or the same, and Germany has a large low-wage sector, where people are employed but earn less per month than they would get in unemployment benefits, and so the state pays the difference.


Sure, and I can only judge from what I'm hearing, but the internet does make it sound like it's very hard to find programming jobs as a graduate or junior. I don't know what the truth is. Maybe it's just a minority crying out loud because they don't get hired at FAANG.


That is a perfectly reasonable approach. On the other hand, working in tech on a US salary, if not blown on a luxurious lifestyle, gives one a good chance of quickly reaching financial semi-independence (say, 3 years of full living expenses in the bank), which can be a powerful thing. My 2c.


Or do both, work for a US company remotely. Most large ones even have an entity in the larger EU countries that can employ you with all the benefits of being an EU employee. They might not pay the exact same as in the US but still considerably above market rate.


>Okay, so you are saying I should move to america

Please stop breaking HN rules. I never said that. HN rules state you need to reply to the strongest interpretation of someone's argument, not the weakest that's easiest to criticize.

I just pointed out one country's economic performance for comparison; if you want to extrapolate from that that you should move there, that's your issue to deal with, not my argument.


Yes

But there's a long list of German companies not on the DAX

(though Germany's DAX really deserves to be worth less than NVidia)


The DAX is made of the 40 most valuable German companies. That’s how it is defined. So the companies not in it, again by definition, matter less.


> The DAX is made of the 40 most valuable German companies.

Not to be too nitpicky here but these are only the publicly traded companies. You have a number of pretty large German companies that are still entirely private such as Aldi, Schwarz Group, Boehringer or Bosch.


Also many medium-sized companies which are productive and competitive but not public, so they never grow to huge sizes. I view this as a feature, not a bug… smaller companies have a more direct connection with their workforce and tend to behave better toward them.


> DAX is made of the 40 most valuable *listed* German companies

https://www.famcap.com/top-500-german-family-businesses-the-...

Not all of those are listed, or listed in Frankfurt


No it isn't? It's the list of the biggest blue-chip Germany-HQ'd companies trading on the Frankfurt Stock Exchange. If you're a German company with exclusively German employees trading in London, or Paris, or New York, you don't qualify.


The CDAX index has about 360 companies and appears to have a market cap of around 2 trillion EUR vs 1.7 trillion for the DAX 40.


Stock value doesn't reflect a company's income or its ability to pay its workers.


The Frankfurt Stock Exchange or the DAX is mostly irrelevant. Germany has a strong, family-owned Mittelstand; those companies are not publicly traded and thus not listed. Plus, we have some giants that are also not publicly listed but belong to the richest Germans (the discount grocers Lidl and Aldi, but also automotive OEM Bosch).


We are in the middle of an LLM bubble.

Nvidia's problem will sort itself out naturally in the coming months/years.


As someone put it: we are in the 3D-glasses phase of AI. Remember when all TVs came with them?


The same thing was said about Nvidia's crypto bubbles, and look what happened.

Jensen isn't stupid. He's making accelerators for anything so that they'll be ready to catch the next bubble that depends on crazy compute power that can't be handled efficiently on CPUs. They're so far the only semiconductor company beating Moore's law by a large margin thanks to their clever scaling tech, while everyone else is like "hey look, our new product is 15% more efficient and has 15% more IPC than the one we launched 3 years ago".

They may be overvalued now but they definitely won't crash back to their "just gaming GPUs" days.


They got extremely lucky with AI following crypto. The timing was close to perfect. I'm not sure there will be another wave like that at all for a long while.


Maybe, but it's not like all those AI compute units or whatever Nvidia calls them will be thrown in the dumpster after the AI bubble pops. There are a lot of problems that can be solved on them, and researchers are always looking for new problems to solve as compute becomes accessible.

I'm tired of hearing about Nvidia's "luck". There was no luck involved. Nvidia has shipped CUDA on consumer GPUs since 2006. That's almost 20 years researchers have had to find use cases for that compute, and Nvidia made it possible. In other words, the AI bubble happened because Nvidia laid the necessary groundwork for it to happen; they didn't just fall into it by luck.


They will not be thrown in the dumpster, but that's actually a bad thing for NVIDIA. We had a very short period when lots of miners dumped their RTX cards on eBay and prices fell a lot for a while (then AI on RTX became a thing at small scales). When the A100/H100s get replaced, they will flood the market. There are many millions of dollars stuck in those assets right now, and in a few years they will dominate research and the top end of the hobbyist market. Only high-profile companies/universities/researchers will look at buying anything newer. Maybe NVIDIA can do 2 generations of those cards, but ASIC-based solutions will hit the market and generic CUDA will become a problem rather than a blessing. Same story as BTC miners and graphics cards.

Sure, they didn't get lucky with the tech they had to offer - that was well developed for years. They just got lucky that the next big thing was compute-based. If the next thing is memory/storage-based, they're screwed: the compute market will be saturated for years and they'll have only gamers left.


> they're screwed and the compute market is saturated for years

If all this extra computing power is available, smart people will find a way to use it somehow.


There is no recurring revenue on them though. So NVIDIA needs to continue selling obscene amounts of new chips.


There could be other "AI" waves after LLMs. We have still not hit the self-driving car wave; that could happen in the next 20 years (or 40). Nor general-purpose robots, which could also happen. Personalized medicine has also not happened yet. Nor virtual reality, which might take off one day (or not). There are still many industries that could go big in terms of computational demand.


It wasn't a coincidence either though: the amount of compute available is probably the main driver of this wave.


I think there's also a very high prospect of virtual worlds with virtual people (SFW or otherwise) becoming popular, rendered with Apple/Meta goggles... that could require insane amounts of compute. And this is just one possibility. Relatively cheap multimodal smart glasses you wear when out and about, offloading compute to the cloud, are another.

Nvidia could just as easily triple in short order as get cut in half from here imho.


I thought Meta Horizon and the sales numbers of the Vision Pro[0][1] already prove your thesis wrong. Even Zuckerberg stopped talking about it.

[0]https://www.macrumors.com/2024/04/23/apple-cuts-vision-pro-s... [1] https://www.macrumors.com/2024/04/22/apple-vision-pro-custom...


I am referring to the future (the actual future, not simulated ones), so it is not possible to know if I am wrong.

I predict this is yet another domain rich with opportunity for AI.


You mean like Nvidia's Omniverse, where you can simulate entire factories, or, as Nvidia does, data centers before they are built?

Or how about building the world virtually? https://www.nvidia.com/en-us/high-performance-computing/eart...


Ya, I saw those demos (I own the stock); incredible. But I'm thinking of things more like just a single VR friend who has memory and (optionally) prior knowledge of your background. I think these could be an absolute blessing for a lot of people who have lots of time on their hands but no one to talk to.

Or, leaning more towards your examples, a Grand Theft Auto style environment containing millions of them, except life like.


Intel's all time high was in 2000, if I'm reading the charts correctly.

Of course the equivalent can happen to Nvidia. Seems almost certain.


Oof, I really didn't intend to start a flamewar.


Those are rookie numbers


I'm an AI scientist and train a lot of models. Personally I think AMD is undervalued relative to Nvidia. No, the chips aren't as fast as Nvidia's latest, and yes, there are some hoops to jump through to get things working. But for most workloads in most industries (ignoring for the moment that AI is likely a poor use of capital), it will be much more cost effective and achieve about the same results.


The market (and selling price) reflects the perceived value of Nvidia's solution vs AMD's - comprehensively including tooling, software, TCO and manageability.

Also curious how many companies are dropping that much money on that kind of accelerator just to run 8x 7B-param models in parallel... You're also talking about being able to train a 14B model on a single accelerator. I'd be curious to see how "full-accelerator train and inference" workloads would look, i.e. training a 14B-param model and then inference throughput on a 4x14B workload.

AMD (and almost every other inference claim-maker so far... Intel and Apple specifically) have consistently cherry-picked the benchmarks they claim a win on and ignored the remainder, which all show Nvidia in the lead - and they've used mid-gen comparison models, as many commenters here have pointed out about this article.


MI300X wins in some inference workloads; H100 wins in training and some other inference workloads (FP8 inference with TensorRT-LLM; ROCm is young but growing fast).

In a single system (8x accelerators) running LLMs, the MI300X has very competitive inference TCO vs the H100.

Also:

AMD Instinct MI300X Offers The Best Price To Performance on GPT-4 According To Microsoft, Red Team On-Track For 100x Perf/Watt By 2027

https://wccftech.com/amd-instinct-mi300x-best-price-performa...


wccftech is an untrustworthy source.


The article contains a quote from Satya Nadella: https://x.com/ryanshrout/status/1792953227841015897

None of the mentioned claims from the article are confirmed there.

The market and the selling price also include sales strategies: penetrating a sector dominated by a strong player with somewhat "smart" sales tactics *1, and doing so with a growing but certainly less mature product (especially the software), requires suitable pricing and allocation strategies.

1. https://www.techspot.com/news/102056-nvidia-allegedly-punish...


The price of the H100 reflects, and has reflected, the fact that there is a total monopoly in the training sector.

AMD is successfully attacking the inference sector, increasing its advantage with the MI325 and aiming at training from 2025 with the MI350 (plus Infinity Fabric and the other interconnect options arriving for the various topologies), which will probably have an advantage over Blackwell, then fall behind against Rubin, then come back ahead with the MI400.

At least, that is how it seems, as long as ROCm keeps improving.

Personally I am happy to see some competition in the sector, and especially on the open-source software side.


This stuff is the actual reason nvidia is under antitrust investigation.

Boo boo, a GTX 670 that cost you $399 in 2012 now costs $599 - grow up, do the inflation calculation, and realize you're being a child. Gamers get the best deal on bulk silicon on the planet, R&D subsidized by enterprise, fantastic blue-sky research that takes years for competitors to (not even) match, and it's still never enough. "Gamers" have justified every single cliche and stereotype over the last 5 years; absolutely inveterate manbabies.

(Hardware Unboxed put out a video today with the headline+caption combo "are gamers entitled"/"are GeForce GPUs gross", and that's what passes for reasoned discourse among the most popular channels. They've been trading segments back and forth with GN that are just absolute "how bad is nvidia" / "real bad, but what do you guys think???" tier shit, lmao.)

https://i.imgur.com/98x0F1H.png

this stuff is real shit, nvidia has been leaning on partners to maintain their segmentation, micromanaging shipment release to maintain price levels (cartel behavior), punishing customers and suppliers with “you know what will happen if you cross us”, literally putting it in writing with GPP (big mistake), playing fuck fuck games with not letting the drivers be run in a datacenter, etc. You see how that’s a little different than a gpu going from an inflation-adjusted $570 to $599 over 10 years?

(And what's worse, the competition can't even keep that much; they're falling off even harder now that Moore's law has really kicked the bucket and they have to do architectural work every gen just to make progress, instead of getting free shrinks etc… let alone having to develop software! /gasp)

In entirely unrelated news… Gigabyte suddenly has a 4070 Ti Super with a blower cooler. Oh, and it's single-slot with an end-fire power connector. All three forbidden features at once - very subtle, extremely law-abiding.

https://videocardz.com/newz/gigabyte-unveils-geforce-rtx-407...

And literally, gamers can't help but think this whole FTC case is all about themselves anyway…


MI300X production is ramping. In the latest earnings report Lisa Su said 1H2024 was production-capped, while 2H2024 has increased production (and still some to sell), probably thanks to improved CoWoS and HBM3(e) supply.

Large orders for those accelerators are placed months ahead.

Meanwhile, the MI300X capacity on Microsoft Azure is fully booked...

https://techcommunity.microsoft.com/t5/azure-high-performanc...

"Scalable AI infrastructure running the capable OpenAI models These VMs, and the software that powers them, were purpose-built for our own Azure AI services production workloads. We have already optimized the most capable natural language model in the world, GPT-4 Turbo, for these VMs. ND MI300X v5 VMs offer leading cost performance for popular OpenAI and open-source models."


I'm wondering if the tensor parallelism settings have any impact on the performance. My naive guess is yes, but I'm not sure.

According to the article:

> AMD Configuration: Tensor parallelism set to 1 (tp=1), since we can fit the entire model Mixtral 8x7B in a single MI300X's 192GB of VRAM.

> NVIDIA Configuration: Tensor parallelism set to 2 (tp=2), which is required to fit Mixtral 8x7B in two H100's 80GB VRAM.
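
For anyone curious what that difference looks like in code, here is a minimal sketch using vLLM's Python API (the model identifier and prompt are my own illustrative assumptions, not taken from the article):

  from vllm import LLM, SamplingParams

  # AMD setup: tp=1, the full Mixtral 8x7B fits in a single MI300X's 192GB.
  # Nvidia setup: tp=2, the weights must be sharded across two 80GB H100s.
  llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1",
            tensor_parallel_size=1)  # or tensor_parallel_size=2 on 2x H100

  outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
  print(outputs[0].outputs[0].text)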


I personally find such comparisons unfair. A good comparison should optimize for each device configuration, which means using a model within the VRAM limit, quantizing to 8 bits where it boosts performance, etc., and avoiding the shortcomings of both devices unless necessary.


AMD seemingly has better hardware - but not the production capacity to compete with Nvidia yet. It will be interesting to see margins compress when real competition catches up.

Everybody thinks it’s CUDA that makes Nvidia the dominant player. It’s not - almost 40% of their revenue this year comes from mega corporations that use their own custom stack to interact with GPUs. It’s only a matter of time before competition catches up and gives us cheaper GPUs.


> their own custom stack to interact with GPUs

lol completely made up.

are you conflating CUDA the platform with the C/C++ like language that people write into files that end with .cu? because while some people are indeed not writing .cu files, absolutely no one is skipping the rest of the "stack" (nvcc/ptx/sass/runtime/driver/etc).

source: i work at one of these "mega corps". hell if you don't believe me go look at how many CUDA kernels pytorch has https://github.com/pytorch/pytorch/tree/main/aten/src/ATen/n....

> Everybody thinks it’s CUDA that makes Nvidia the dominant player.

it 100% does


Can you explain the cuda-less stack a little more or provide a source?


some people emit llvm ir (maaaaybe ptx) directly instead of using the C/C++ frontend to CUDA. that's absolutely the only optional part of the stack and also basically the most trivial (i.e., it's not the frontend that's hard but the target codegen).


LLVM IR to machine code is not the part that AMD has traditionally struggled with. What you call "trivial" is. If everyone started emitting IR and didn't rely on NVidia-owned libs then the space would become unrecognizable. The codegen is something AMD has always been decent at, hence them beating NVidia in compute benchmarks for most of the past 20 years.


> LLVM IR to machine code is not the part that AMD has traditionally struggled with.

alright fine it's the codegen and the runtime and the driver and the library ecosystem...

> If everyone started emitting IR and didn't rely on NVidia-owned libs then the space would become unrecognizable.

I have no clue what this means - which libs are you talking about here? the libs that contain the implementations of their runtime? or the libs that contain the user space components of their driver? or the libs that contain their driver and firmware code? And exactly which of these will "everyone emitting IR" save us from?


I am talking about user and user-level libraries, so from PyTorch to cuBLAS. The rest is currently serviceable and at times was even slightly better than Nvidia's. If people start shipping code that targets, say, LLVM IR (which then gets converted to PTX or whatever), like one would do using SYCL, we only have to rely on the bare minimum.


AMD is struggling with unsafe C and C++ code breaking their drivers.


> but not the production capacity to compete with Nvidia yet.

That's just a question of negotiating with TSMC or its few competitors.

(Also, didn't TSMC start building fabs in the US and/or EU?)

I mean, Nvidia uses TSMC, and so does AMD.


Yes it is - but Nvidia has larger contracts _right now_. Nvidia has been investing more money in producing more GPUs for longer, so it’s only natural that they have an advantage now.

But now that there’s a larger incentive to produce GPUs, their moat will eventually fall.

TSMC runs at 100% capacity for top tier processes - their bottleneck is more foundries. These take time to build. So the question becomes - how long can Nvidia remain dominant? It could be quarters or it could be years before any real competitor convinces large customers to switch over.

Microsoft and Google are producing their own AI hardware too - nobody wants to depend solely on Nvidia, but they’re currently forced to if they want to keep up.


Isn't their moat primarily software (CUDA) rather than supply-chain strength?

A good start for AMD. I am also enthusiastic about another non-Nvidia inference option: Groq (which I sometimes use).

Nvidia relies on TSMC for manufacturing. Samsung is building competing manufacturing infrastructure, which is also a good thing, so Taiwan is not a single point of failure.


Without proper statistical metrics (why use the average when the 95th percentile is widely used?) and performance/watt, this is a useless comparison.


And performance/price -> that's the bottom line.


The average says more about throughput, right?

The 95th percentile would be nice too.


INT8/FP8 benchmarks would've been great; both cards could have loaded the model in around 60GB of VRAM instead of needing TP=2 on the H100.
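
As a rough illustration of what that could look like with vLLM (the quantization flag name is an assumption on my part based on recent releases and may differ by version):

  from vllm import LLM

  # Hypothetical FP8 run: at 8 bits the Mixtral 8x7B weights should fit on a
  # single 80GB card, so no tensor parallelism is needed on either vendor.
  llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1",
            quantization="fp8",        # assumed flag; check your vLLM version
            tensor_parallel_size=1)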


We just got higher performance out of open source. No need for MK1.

https://www.reddit.com/r/AMD_MI300/comments/1dgimxt/benchmar...


> Hardware: TensorWave node equipped with 8 MI300X accelerators, 2 AMD EPYC CPU Processors (192 cores), and 2.3 TB of DDR5 RAM.

> MI300X Accelerator: 192GB VRAM, 5.3 TB/s, ~1300 TFLOPS for FP16

> Hardware: Baremetal node with 8 H100 SXM5 accelerators with NVLink, 160 CPU cores, and 1.2 TB of DDR5 RAM.

> H100 SXM5 Accelerator: 80GB VRAM, 3.35 TB/s, ~986 TFLOPS for FP16

I really wonder about the pricing. In theory the MI300X is supposed to be cheaper, but whether that is really the case in practice remains to be seen.


RunPod [0] is pricing MI300X at $4.89/hr vs $3.89-4.69/hr for H100s.

So, probably around the same price?

The tests look promising, though!

[0] https://runpod.io/


We are starting at $4.50/hr [0]. The catch is that we won't have availability until mid-August.

The weird thing on RunPod is the virtual CPUs; you can't run the MI300X in virtual machines yet. It is a missing feature that AMD is working on.

[0] https://hotaisle.xyz/pricing/


It doesn't matter. AMD has offered better compute per dollar for a while now, but no one switched, because CUDA is the real reason all serious ML people use Nvidia. Until AMD picks up the slack on their software side, Nvidia will continue to dominate.


Microsoft recently announced that they run ChatGPT 3.5 and 4 on the MI300X on Azure and that the price/performance is better.

https://www.amd.com/en/newsroom/press-releases/2024-5-21-amd...


I've used ChatGPT on Azure. It sucks on so many levels; everything about it was clearly dictated by bean counters who see X dollars for Y FLOPS with zero regard for developers. So choosing AMD here would be about par for the course. There is a reason why everyone at the top is racing to buy Nvidia cards and pay the premium.


"Everyone" at top is also developing their own chips for inference and providing APIs for customers to not worry about using CUDA.

It looks like the price to performance of inference tasks gives providers a big incentive to move away from Nvidia.


There are only about 3 AI-building companies with the technical capability and resources to afford that, and 2 of them either don't offer their chips to others or have gone back to Nvidia. The rest are manufacturers desperately trying to get a piece of the pie.

Large corporate customers like Microsoft and Meta do not use CUDA. They all use custom software. AMD doesn’t have enough GPUs to sell them yet, that’s the real bottleneck.


That's a pretty big claim, that Microsoft and Meta have their own proprietary cuda-replacement stack. Do you have any evidence for that claim?


I'm guessing what they meant is that they use toolchains that are retargetable to other GPUs (and typically compile down to PTX (Nvidia's low-level assembly-like format) on Nvidia GPUs rather than going through CUDA source -- GCC and Clang can both target PTX). For example, XLA and most SYCL toolchains support much more than Nvidia.
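
Triton is another concrete example of this kind of retargetable toolchain (my example, not the parent's): the same Python-level kernel is lowered to PTX on Nvidia and to AMD's ISA on ROCm, without the author ever writing a .cu file. A minimal sketch:

  import torch
  import triton
  import triton.language as tl

  @triton.jit
  def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
      # Each program instance handles one BLOCK-sized chunk of the vectors.
      offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
      mask = offs < n
      x = tl.load(x_ptr + offs, mask=mask)
      y = tl.load(y_ptr + offs, mask=mask)
      tl.store(out_ptr + offs, x + y, mask=mask)

  n = 4096
  x = torch.randn(n, device="cuda")   # "cuda" also maps to the ROCm device on AMD builds
  y = torch.randn(n, device="cuda")
  out = torch.empty_like(x)
  add_kernel[(triton.cdiv(n, 1024),)](x, y, out, n, BLOCK=1024)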


Even then, it's an insanely bold assumption that a company other than Nvidia could build a better framework than CUDA for compiling to PTX. Especially since CUDA is already so C-like. I've never seen anyone go deeper than that outside of academia.

How many customers/consumers will care whether services are built with CUDA?

If they need a chatbot that uses a model with the same accuracy and performance on non-CUDA hardware, would they still want CUDA-based hardware?


Who is going to build the architecture and compile the device-specific kernels? You have to pay those people as well, and you can save tons of money and time if you do it with CUDA.

Unless you develop directly in CUDA, you can easily run training code (e.g. PyTorch) written for Nvidia hardware on AMD hardware. You can even keep the .cuda() calls.
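
A minimal sketch of what that looks like, assuming a ROCm build of PyTorch (which exposes AMD GPUs through the regular torch.cuda namespace):

  import torch

  # On a ROCm build this reports True on an MI300X, just as it does on an H100.
  print(torch.cuda.is_available(), torch.cuda.get_device_name(0))

  model = torch.nn.Linear(1024, 1024).cuda()   # unchanged .cuda() call
  x = torch.randn(8, 1024).cuda()
  loss = model(x).sum()
  loss.backward()                              # backward pass runs on the AMD GPU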


In theory. But if you actually work with that in practice, you're already going to have a bad experience installing the drivers. And it's all downhill from there.


And this shouldn't be too hard if you know the ins and outs of the hardware and have a reasonable dev team. So why aren't they doing it?


> and have a reasonable dev team

probably this.


I'm skeptical of these benchmarks for a number of reasons.

1. They're only comparing against vLLM, which isn't SOTA for latency-focused inference. For example, their vLLM benchmark on 2 GPUs sees 102 tokens/s at BS=1, while gpt-fast gets around 190 tok/s: https://github.com/pytorch-labs/gpt-fast

2. As others have pointed out, they're comparing an H100 setup running with TP=2 vs. 2 AMD GPUs running independently.

Specifically,

> To make an accurate comparison between the systems with different settings of tensor parallelism, we extrapolate throughput for the MI300X by 2.

This is uhh.... very misleading, for a number of reasons. For one, at BS=1, what does running with 2 GPUs even mean? Do they mean that they're getting the results for one AMD GPU at BS=1 and then... doubling that? Isn't that just... running at BS=2?

3. It's very strange to me that their throughput nearly doubles going from BS=1 to BS=2. MoE models have the interesting property that small amounts of batching don't actually improve their throughput much, and so in their Nvidia vLLM benchmark they only go from 102 => 105 tokens/s when going from BS=1 to BS=2. But on AMD GPUs they go from 142 to 280? That doesn't make any sense to me.


Is this an ad for a new, closed-source, GPGPU backend?



Pretty much, and the test suite is optimized to get the results they wanted.


Pretty sure a useful benchmark for this kind of thing would calculate performance per watt (or per watt and dollar).

That info is conspicuously absent from the article.


The electricity consumption in the cloud is not really important.

The H100 rents for about $4.5/hr while consuming about 0.7 kWh in that hour, which will likely cost them less than 7 cents.


> The electricity consumption in the cloud is not really important.

That just says you don't run a cloud for profit :)


Shouldn't the right benchmark be performance per watt? It's easy enough to add more chips to do LLM training or inference in parallel.

Maybe the benchmark should be performance per $... though I suspect power consumption will eclipse the cost of purchasing the chips from NVDA or AMD (and the cost of chips will vary over time and with discounts). EDIT: I was wrong about it eclipsing the purchase cost; I'm still looking for a more durable benchmark (performance per billion transistors?) given it's suspected NVDA's chips are over-priced because demand is outstripping supply for now, and AMD's are under-priced to get a foothold in this market.


Not quite. Assume 1kW power consumption (with cooling). At $0.08/kWh (the average US industrial rate) that is about $700 per year. Adjust for more cooling etc. and for, say, 5 years of usage, and you still won't be anywhere near the $25k MSRP of an H100.
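
Back-of-the-envelope, with the same assumptions:

  power_kw = 1.0                    # assumed draw, including cooling overhead
  usd_per_kwh = 0.08                # rough US industrial electricity rate
  hours_per_year = 24 * 365

  yearly = power_kw * usd_per_kwh * hours_per_year   # ~$700 per year
  print(yearly, 5 * yearly)                          # ~$3,500 over 5 years vs ~$25k MSRP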


Given that a lot of projects are written or optimised for CUDA, would it require an industry shift if AMD were to become a competitive source of GPUs for AI training?


Every hardware vendor is working to provide something with their own technology. I don't know if it's possible, but a lot of very resourceful companies are doing their best to break CUDA's dominance. I really hope it works, and hopefully a non-proprietary standard emerges.


The model code is tiny compared to PyTorch or CUDA itself. Translating models from CUDA/C could be laborious, but it's not a barrier.

Making AMD work effortlessly with PyTorch et al. should make the switch transparent.


These kinds of comments make me think few people have actually tried. My experience has been one work day of setup to get things working the same as before for training and testing (PyTorch).


You have to consider that the average person who tried to do machine learning on AMD GPUs got burned in the past decade and has no reason to change their opinion. Also in the past it was much harder to get access to cutting edge GPUs from AMD. The fact that AMD drops GPU support for ROCm quickly also earns them scorn. I don't think it is an unfair assessment. They earned their reputation.


ROCm has improved a lot, and you can rent MI300X in the cloud now. So if you have a program that runs on Nvidia GPUs, it takes no time to test it on a cloud MI300X. If it works, you can use it and save some money in the process.

AMD supports PyTorch out of the box; these comments make me feel like no one has tried it or is even working on this stuff.

The comparison is between setups with different amounts of GPU RAM and there's no quantification of final performance/price.


So? If you get twice the RAM at a comparable price and that leads to twice the performance, what's wrong with comparing that?


Nothing wrong - just for transparency.

Also, the price difference is not quantified.

Additionally, CUDA is a known and tangible software stack - can I try out this "MK1 Flywheel" on my local (AMD) hardware?


None of those things matter if all you're looking to do is run your existing workloads on MI300X in the cloud. You get more bang for your buck by going AMD.
