Another interesting challenge for Intel (and AMD to a lesser extent). Between the share of compute moving to AI accelerators (and NVIDIA), and most cloud providers having in-house custom chips, I wonder how Intel will position itself in the next 5 to 10 years.
Even if they could fab all of those chips, the margin gap between the fab business and CPU design is pretty drastic.
TSMC fabs the leading node, and has consistently for several cycles now, so its margins probably benefit from a premium.
If Intel can make their foundry business work and keep parity with TSMC, the net effect is that margins for leading node compress from the increased competition.
> the net effect is that margins for leading node compress from the increased competition.
That is true in perfectly competitive markets, but I'm skeptical about that idea holding true for high-end chip nodes.
I'm not sure there is enough competition with Intel joining the market alongside TSMC, Samsung, and all the other (currently) minor players in the high-end space. You might see a cartel form instead of a competitive marketplace, which is a setup where the higher margin is protected.
My best guess is the price will remain high, but the compression will happen around process yields. You could successfully argue that is the same as compressing margins, but then what happens after peak yield? Before prices compress, all the other compressible things must first squeeze.
> You might see a cartel form instead of a competitive marketplace, which is a setup where the higher margin is protected.
Wouldn't it more likely be that players just carve out niches for themselves in the high-end space where they DON'T compete?
If you're Intel - it seems like a fool's errand to spend $50B to maybe take some of TSMC's customers.
You'd probably rather spend $10B to create a new market - which although smaller - you can dominate, and might become a lot larger if you execute well.
I figured you'd see margin compression from the major, volume-limited buyers: e.g. Apple, Nvidia, etc.
Paying a premium to get a quota with TSMC looks different, if there's competitive capacity, at least for large customers who can afford to retask their design teams to target a different process.
Even if only as a credible stalking horse in pricing negotiations with TSMC.
All I wanna know is how it compares to AWS Graviton2/3/4 instances. Axion needs to be cheaper, faster, or lower emissions to be worth even considering. Everything else is just talk and vendor lock-in.
By lock-in, I’m referring to my EC2 committed use savings plan that would prevent me from considering migrating to GCP until it expires next year, even if Google’s instances are quantifiably better.
It’s difficult to quantify emissions because the power generation is mixed and changes based on demand. Water consumption should also be a factor, but there’s even less data available for that.
My hunch is that this is theater. The executives wanted this program as a weapon. Google didn't like being at the mercy of processor suppliers. Now they can threaten to just make their own. It could ultimately turn out to be a bluff, if their suppliers drop their price and improve their lineup. Source: made it up.
This is probably one reason why Intel is moving towards providing foundry services. The barrier to entry for doing chip manufacturing is higher than for designing chips now. It’s still an open question if Intel can compete with TSMC and Samsung though.
Intel 4 is superior to anything Samsung offers, and not nearly as far behind TSMC's 3nm density as people are led to believe. The "open question" is mostly about fab scaling and the business side of foundry management. Their silicon works just fine in the market as it stands.
I've read a lot about how important it is for a foundry to be able to work with customers, and that used to be missing from Intel's company DNA: cell libraries, process support, that sort of thing. We shall see.
Right, Google has been investing in non-x86 since 2016 afaik (I was on the team supporting ARM and PowerPC inside Google). At Google's size, it can pretty much break from any vendor without damaging its core businesses.
The Cloud Next Conference is obviously timed to the quarterly results, which is obviously timed to the ARM processors... wait, no, the ARM processors are obviously timed to the Cloud Next Conference, which is obviously timed to the quarterly results.
I'm divesting as much as possible from cloud offerings these days. We have our data "lakes" in colocation now, at 1/40th the cost of the cheapest cloud offerings, but with greatly expanded CPU, memory, and disk, so it's not even an apples-to-apples comparison.
What I am jealous of, though, are these x86-competition announcements by AWS/GCloud, as there is simply nothing available outside the Ampere Altra, and it has not seen a refresh in a while. The NVIDIA Grace CPU is eternally delayed, and I'm guessing it will not be at a competitive price point if it ever makes it to market. I've come to appreciate how important memory bandwidth is after using an M1 CPU for a while.
I am interested in the market impact of offloads and accelerators. In my work, I can only realistically exploit capabilities that are common between AWS and GCP, since my services must run in both clouds. So I am not going to do any work to adapt my systems to GCP-specific performance features. Am I alone in that?
Of course, but even if Google somehow makes networking 50% more efficient, I can't architect my projects around that because I have to run the same systems in AWS (and Azure, for that matter).
That's the flavor of it, but they didn't give us enough data to really think about. But the question stands generally for accelerators and offloads. For example would I go to a lot of trouble to exploit Intel DSA? Absolutely not, because my software mostly runs on AMD and ARM.
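For illustration, here's a minimal sketch of the guard that ends up wrapping any accelerator-specific path; the device path and driver behavior are assumptions about a Linux host with the idxd driver, not anything GCP- or AWS-specific:

    import glob
    import platform

    def dsa_available() -> bool:
        # Best-effort probe: with the Linux idxd driver loaded, Intel DSA
        # devices are typically enumerated under /sys/bus/dsa/devices.
        # Treat the path as an assumption and verify on the target kernel.
        if platform.machine() != "x86_64":
            return False
        return bool(glob.glob("/sys/bus/dsa/devices/dsa*"))

    def copy_buffers(src, dst):
        if dsa_available():
            ...  # accelerator-specific path, only worth writing if every
                 # target cloud/instance type actually exposes the device
        else:
            dst[:] = src  # portable fallback, needed anyway on AMD/ARM hosts

The portable fallback has to exist regardless, which is exactly why the accelerator-specific branch is often not worth the effort in a multi-cloud setup.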
I think it’s a heck of a lot more than that. 30 million seems puny compared to the revenue their services business generates. Though to be fair, much of it probably doesn’t run on public clouds.
From what I understood, everything runs on public clouds. They tried Microsoft, Google, and Amazon. Sooo they should have enough experience by now.
The sourcing here is a 2019 article about just one AWS contract. Apple also uses Google Cloud and Azure extensively, not just as tryouts; they were one of Google Cloud's biggest customers. They are also building their own data centers. (TL;DR: it's much more complicated than these comments would indicate at face value.)
I find the posturing as a thought leader and industry leader (on this topic especially) a bit ironic. A cloud provider licensing ARM Neoverse and throwing an ARM chip into their cloud compute boxes is not exactly a novel business practice.
I'm happy to see this, and it should be all goodness, but... the posturing... I don't want to be negative for the sake of being negative, but I don't understand how anyone can write that first paragraph with a straight face and publish it when you're announcing ARM chips for cloud in 2024(?, maybe 2025?).
I’m all for investment in less power-hungry chips. Even if it’s from Google (for a short period of time, at least; who knows how long these chips will be supported).
Server CPUs are not power hungry. Only the CPUs used in desktops and workstations are power hungry.
Server CPUs (including x86 parts like AMD Genoa/Bergamo) normally consume between 2 W and 4 W per core, the same as the big cores of laptop or smartphone CPUs.
A server CPU consumes hundreds of watts only because it has a very large number of cores, so it is equivalent to a very large number of smartphones or laptops colocated in the same box.
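A rough back-of-envelope with publicly listed core counts and rated TDPs (approximate spec-sheet figures, not measurements) lands in that 2-4 W/core range:

    # Approximate spec-sheet numbers; treat them as ballpark, not measurements.
    parts = {
        "AMD EPYC 9654 (Genoa)":    (96, 360),   # (cores, TDP in watts)
        "AMD EPYC 9754 (Bergamo)":  (128, 360),
        "Ampere Altra Max M128-30": (128, 250),
    }

    for name, (cores, tdp_w) in parts.items():
        print(f"{name}: ~{tdp_w / cores:.1f} W per core at the rated TDP")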
In complete servers, there is a significant fraction of the power that is not consumed by the CPUs but by the many high-speed interfaces. That is unavoidable and it does not depend on the architecture of the CPUs.
Any newly announced server CPU should be more efficient than those already in production; otherwise there would be little advantage in introducing it. But the core architecture used does not have enough influence to ensure that it will not be leapfrogged by whatever server CPU is announced later, regardless of the core ISA.
When Apple switched to Arm, it was able to improve efficiency by using much lower clock frequencies than Intel. This was only an indirect effect of the Arm ISA, which makes it much easier to decode a large number of instructions concurrently. For server CPUs, the Arm ISA cannot enable similar efficiency gains, because all server CPUs already run at the most efficient clock frequencies, in the 2.5 to 3.5 GHz range, so it is not possible to repeat an efficiency gain like Apple M1's by reducing the clock frequency.
All the cloud vendors have chosen Arm because it is currently the only ISA they can use to design their own high-performance SoCs, not because it provides efficiency gains over the Intel/AMD cores. The overall efficiency of a custom SoC will indeed be better, because it is streamlined for their specific system architecture, compared to a general-purpose CPU from Intel, AMD, or Ampere.
I would be very surprised if there is much overlap between this server SKU and any mobile SKU, mainly because Google doesn’t design the compute core; they integrate some number of cores together with many different peripherals into an SoC. Mobile hardware needs vastly different integrations than server hardware, and the power/area optimizations are very different (mobile is heavily limited by a maximum heat-dissipation limit through the case). The reusable bits like Arm cores and PCIe/network interfaces might be shared between designs, but many of those come from other vendors like Arm or Synopsys.
> Google did not provide any documentation to back these claims up and, like us, you’d probably like to know more about these chips. We asked a lot of questions, but Google politely declined to provide any additional information. No availability dates, no pricing, no additional technical data. Those “benchmark” results? The company wouldn’t even say which X86 instance it was comparing Axion to.
It's Google, so probably not. Their claims should always be taken with a grain of salt. They went the build-your-own route for the SoC in their Pixel phone line, and those SoCs are consistently much worse than the competition and are the main reason those phones have such terrible battery life compared to Qualcomm-powered phones.
My point is we shifted from "compared it to Python in fine print" to "compared it to Objective-C and Python in bold focused callout". People are jaded, we can argue that's actually the same thing, that's fine, but it is a very different situation.
Eight testimonials and it's clear the companies haven't been able to test anything yet. "[company] has been a trusted partner for customers adopting Arm-based virtual machines and an early adopter of Arm for our own operations. We’re excited about Google Cloud's announcement of the Axion processor and plan to evaluate it [...]'
This is not an end-user service; this is custom-built hardware that they know is more efficient than what they are already using. Did they kill their VCU and TPU, which were created for the same reason?
Google does kill a lot of products but that site rubs me the wrong way, they really stretch the definition of "killed" to run up the numbers as much as possible. Products that were simply rebranded or merged into another product are listed as "killed" even though they functionally still exist.
The old Google Drive desktop client is on there, for example, when they still have a Drive desktop client. You may as well list Chrome versions 1 through 122 as products they've killed by that standard.
Yes, thank you, I feel the same way. It's especially frustrating because I would really like to have an accurate and fair list that I can reference when I need that information, or refer people to when their memories are short. It's not very helpful in its current form.
Edit: I don’t get the downvotes because I completely agree with the commentators that Google is unreliable. I almost pasted the killedbygoogle link myself!
With more and more big players starting production of customized, private, proprietary hardware, compatibility becomes increasingly difficult. Which works out well for the big players, who can leverage it as lock-in. Regular people won't be able to buy the chips from Microsoft or Google, and you only get M* chips if you buy Apple hardware. Good luck with the frankenMacs now.
At the same time, devices that you buy get locked into the company as well, and if that company goes out of business you are screwed. Like I was when Ambi Climate closed up shop and left me with e-waste: all the hardware I need is in there, but I can't do anything with it. Or when Google decided to close down access for Lenovo Displays because they didn't want to support 3rd-party displays anymore. Two more e-waste devices for me. (There might be a way to save the displays; I just haven't gotten it working yet.)
Open, compatible, standardized omni-purpose hardware seems to be dying. Much more profit in lock-in.
> you only get M* chips if you buy Apple hardware.
Former M* team members are now at Qualcomm and HP/Dell/Lenovo will ship their Arm laptops later this year, including Windows 11 and upstream Linux support.
PC hardware (and hardware in general) has never been particularly open. We simply seem to move from one dominant player to the next. I don't think AWS/GCP using custom chips for their cloud offerings changes much of the situation (well, at least before they start adding weird custom instructions).
It isn't a big issue. But ARM doesn't have a universal boot loader/device discovery/etc standard like EFI/ACPI, so there is some more work to support them.
Arm servers do have exactly that set of standards ((U)EFI/ACPI). See Arm SystemReady. You'll notice the blog linked above mentions Arm SystemReady-VE, which uses those standards.
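A quick way to see which world a given Arm Linux box booted into (standard sysfs/procfs paths assumed; a SystemReady-style machine booted via UEFI+ACPI should show ACPI tables):

    import os

    # UEFI+ACPI (SystemReady-style) boots expose ACPI tables to the kernel;
    # embedded-style boards usually expose a flattened device tree instead.
    has_acpi = os.path.isdir("/sys/firmware/acpi/tables")
    has_dt = os.path.isdir("/proc/device-tree")
    print(f"ACPI tables present: {has_acpi}, device tree present: {has_dt}")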
Downside is that GCP still has the same faults regardless of which CPU is being used. Things like poor customer interaction, things not working as designed etc. Switching to ARM won't solve that.
As much as I don't trust Google and their customer service is trash, their infrastructure is mostly good and some of their services are very aggressively priced. I wouldn't 100% buy into GCP, but a multi-cloud solution that leverages some of the good bits is definitely a top-tier approach.
I don't think they're a big deal if you're hitting vertex for inference and cloud run for long tail services. If you're 50/50 with AWS though that might be a different story.
It's not about which is easiest to use but the way in which problems are handled. I am aware of a case where something failed after GCP assured the customer it would work... when they got a GCP tech lead on the video call, he started by saying it was all 'best effort' and that it might take several days to resolve. Ultimately it was fixed in 8 hours, but that sort of laissez-faire attitude has led to the company in question making plans to go elsewhere.
Feels like they've changed their pricing structure for BigQuery multiple times in the past couple of years. They've never turned the status page yellow or red, but there have been a few incidents where, from our perspective, the service was clearly degraded.
Yeah, on the whole (data eng by day -- data eng contractor by night), using both AWS and GCP, I much prefer GCP to AWS. I find it far simpler to use, with sane defaults, a UI that isn't harmful to users doing clickops, etc.
AWS gives you low level primitives to build the moon or shoot yourself in the foot.
That'd be like complaining AWS doesn't sell Graviton chips to the public. Why would they? They're a cloud provider building their own chips to get a competitive edge. Selling hardware is a whole other business.
https://cloud.google.com/blog/products/compute/tau-t2a-is-fi...
So I guess this may be the end for Ampere on GCP?