AMD CEO: The Next Challenge Is Energy Efficiency (ieee.org)
396 points by sohkamyung on Feb 22, 2023 | 309 comments



AMD has done extremely well with multi-chip(let) modules. Zen cores & zen clusters (on Core Chiplet Die, CCD) are wonderfully small, and a huge amount of the regular stuff cores do is relegated to the IO Die (CCX), which is not as cutting edge.

But wow there's a bunch of power burned on interconnect between CCDs and CCX. And now AMD's new southbridge, Promontory 21, made by Asmedia, is another pretty significant power hog, and the flagship X670 tier is powered by two of these.

There's absolutely a challenge to bring power down. I'm incredibly impressed by AMD's showing, & they've done very well. But they've been making trade-offs that have pretty large net impacts, especially if we measure power at idle.


CCX stands for Core Complex

CCD stands for Core Complex Die (and neither term refers to the IO die)


Whoof, oops, thanks. I'd been using CCD and IOD as terms until this post, but had an "omgosh, I've been doing it wrong" panic & changed it to what we have here. My mistake. Thank you for correcting us back!!

https://hn.algolia.com/?query=rektide%20ccd&type=comment


Manufacturing costs force them to go down the chiplet path. It's actually impressive they can remain competitive at all given TSMC's margin of 67%. [0] If Intel Foundry manages to keep up with TSMC, that's a lot of pricing-power advantage over AMD. Or they could make lower-power CPUs that would be uneconomical for AMD.

[0] https://www.macrotrends.net/stocks/charts/TSM/taiwan-semicon...


Intel can't figure out how to make the next generation of chips. They used to innovate on a steady, regular cadence, but they're still trying to get 7nm right after working on it for many years, while TSMC is on to 3nm. Makes you wonder what happened to that company that it fell so far behind.


Why is EBITDA margin better in this context than net margin? Wouldn’t manufacturing facilities have a ton of depreciation and amortization that should be incorporated into the costs?


Gross margin is the appropriate measure of per unit cost/profit. Yes, EBITDA is not a good measure at all (for part economics).


But don't forget that TSMC's margins for massive bulk purchasers like AMD will be lower than the average.


Does TSMC even have non-bulk customers at 7nm/5nm/4nm? I would have thought the mask costs are so high that it isn't economical except for the biggest companies.


7nm yes; 5nm/4nm not as much, and Apple pretty much bought out the entirety of 3nm. 7nm is a relatively mature process at this point.


Cerebras has probably bought less than 100 7nm wafers, total. Though they may have gotten many more sales since I last checked. Tesla Dojo is probably about the same. I've seen loads of random chips like that on 7nm. I guess it depends what you consider "bulk".


Yup, obviously they are aware of AMD competition and can adjust their margins accordingly.


I'm not up-to-speed on modern chipsets, but WTF is the Southbridge doing that it needs that much power? Is Thunderbolt going through it or something? I think of the Southbridge as a mostly ignorable part of the chipset (up until something goes horribly wrong).


Single word: PCIe.

Lots of fast PCIe lanes eat a lot of energy. A server motherboard contains tons more PCIe devices than a consumer desktop system, and they are not add-in cards but the small units embedded on the motherboard itself that enable the advanced server features.

Thunderbolt is just a PCIe encapsulator of sorts, which can also do a plethora of other things.


The one contradiction I have to point out here is that server motherboards don't need big southbridges: the CPUs themselves have gobs of PCIe. 1P and 2P AMD Epyc servers have 128 lanes of PCIe.

I wish modern chips did a better job of breaking down where power goes. It'd be so interesting to know how much power goes to USB controllers and how much goes to PCIe. I'd also hope that they could shut down parts of the chip if there's no USB or PCIe device plugged in. But these chips seem to have a pretty high floor of power consumption. Although maybe that's in part because the first examples were flagship motherboards with a whole bunch of extra things peppered across the board - fancy NIC chips, supplementary Thunderbolt controllers, sound cards, wifi - so maybe there was just an unusual amount of extra stuff going on. But it has been shocking seeing idle power rise so much on the modern platforms. It feels like there's a lot of room for improvement in power-down.


> Single word: PCIe

This explains Intel’s squandering of PCIe lanes for consumer desktops versus AMD’s generosity.


There's very little difference in the number of PCIe and M.2 slots between Intel and AMD on their consumer platforms. The only real difference between AM5 and LGA 1700 motherboards is that a lot more AMD boards have an M.2 PCIe 5.0 slot, while only the very top Intel boards have this feature.


AMD has 24 PCIe 5.0 lanes directly from the CPU available to the user, while Intel has 16 5.0 + 4 4.0. Cheaper motherboards might not expose all of those, or may downgrade some to 4.0 to save on on-board components. In addition, both have more lanes on the chipset, which is connected over (additional, reserved) PCIe 4.0 lanes. The best AMD chipsets have 12 4.0 and 8 3.0 lanes, while the best Intel ones have 20 4.0 and 8 3.0 lanes. An important point is that the connection between the chipset and CPU is twice as wide on Intel (8x vs 4x).

So overall, AMD has more and faster IO available directly from the CPU, but fewer lanes from the chipset, and with a weaker connection to the chipset. If PCIe 5.0 drives become available and the transfer speed to storage is important, I'd say AMD is better; otherwise I'd say Intel has more IO.


> very top Intel boards

Where "top" means "most expensive". The Z790 board I recently purchased for around $300 was pretty barebones and lackluster (no TB, meager IO from ports and headers, wattage constrained VRM relative to 13th gen TDP, etc), but it was the least costly way to work with an Intel proprietary technology.

It'll last another four or five years, but it was my first Intel build since the slocket days, and likely my last.


AM5 boards aren't any cheaper than that. I'd say that Intel boards are generally cheaper in fact, though that has narrowed with Z790.

You do get to use the AMD boards for more than two CPU generations though.


Oh, I assumed that PCIe was on the northbridge. Of course PCIe eats energy.


To me the most impressive part is what an incredible job they did in the years of Intel dominance. They almost bulldozered themselves.


I see what you did there.


Please explain!


Bulldozer [0] was a major flop. It sounded too good to be true: 2 cores sharing the same FPU, because integrated GPUs were supposed to execute those instructions much faster anyway. Maybe it was just ahead of its time and they simply couldn't deliver.

But it failed hard, which coincided with Intel releasing a major winner with their last planar architecture - Sandy Bridge.

As a result, AMD spent years circling the drain and their stock dipped below 2 dollars. Some people made good money buying around that time.

[0] https://en.wikipedia.org/wiki/Bulldozer_(microarchitecture)

Edit: oh, I missed the exclamation mark haha. Oh well. Too tired to even feel ashamed


It was a pun: "Bulldozer" as in "(v.) to destroy" and also a former AMD microarchitecture which, as sibling reply mentions, was a flop.


Energy efficiency and improved packaging are things I can readily agree with. The last thing though - “hybrid algorithms leveraging AI efficiency” sounds an awful lot like a buzzword sales pitch.

This article reminds me of this other one [1] posted about a month ago.

It’s an interview with some guys that just got done building an exascale supercomputer, in which it was originally estimated to need 1000 megawatts but ultimately only needs 60. The reporter asks about zettascale and the power requirements; they wave it off and say that the big question about whether it will even be possible in the next 10 years is getting the chip lithography small enough so that you can physically build a working zettascale supercomputer.

[1] https://news.ycombinator.com/item?id=34604319


Re. The “hybrid algorithms” bit: I was at this talk. The example she gave was a physics sim like CFD, iterating between a fast/approximate ML-based algorithm and a slow/accurate classical physics algorithm, with the output of each feeding in as the starting point of the next round. But this was just an example; clearly there are lots of areas where you could apply a similar approach.


The main thing AMD has in their accelerators that enables this is unified memory between CPU and GPU. That's really interesting.


This has been something I've been incredibly pleased with on the Apple Silicon SoCs. Albeit slowly, being able to load large datasets or Blender scenes on a portable, efficient laptop and still use the GPU is a nice touch.

Of course performance-wise it doesn't touch the $1k+ graphics cards with crazy amounts of RAM, but for students, and when I need to do something quick on the go, it's a really useful tool.


Don't forget that AMD also bought Xilinx


@WithinReason

I expect them to use Xilinx's AI engines primarily in their CPUs, APUs and GPUs - not so much the FPGAs.


I expect Xilinx's AI engines to never be integrated into anything AMD. Because Xilinx AI engines are VLIW - SIMD machines running their own instruction set.

----------

AMD is doing the right thing with Xilinx tech: they're integrating it into ROCm, so that Xilinx AI engines / FPGAs can interact with CPUs and GPUs. But there's no reason why these "internal core bits" should be shared between CPU, GPU, and FPGA.


How do you expect FPGAs to be useful here?


I think it's also leaning into their new product MI300, which has 24 CPU cores with 8 compute dies. Both CPU and GPU (and memory) on a single package.

Conventional processing + AI together. Hybrid approaches.


Having worked in HPC, one area where this can be employed is error/bias correction of CFD models. E.g. weather models have various biases that need to be corrected for - so far this is just done with some relatively simple statistics, afaik.
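
To make the "relatively simple statistics" concrete, a minimal sketch of a linear (MOS-style) bias correction fitted on past forecast/observation pairs - the numbers below are entirely made up, purely for illustration:

    import numpy as np

    # Hindcasts vs. observations for one station/variable (made-up numbers).
    forecast = np.array([21.0, 18.5, 25.2, 19.9, 23.1])
    observed = np.array([19.8, 17.9, 23.8, 19.0, 21.9])

    # Fit a linear bias model: observed ~= a * forecast + b
    a, b = np.polyfit(forecast, observed, deg=1)

    def debias(new_forecast):
        # Apply the fitted correction to fresh model output.
        return a * new_forecast + b

    print(debias(np.array([22.0, 24.5])))

The ML angle is basically replacing that linear fit with something that can also condition on flow regime, season, neighbouring grid points, and so on.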


Yes they were really crowing about MI250X adoption and touting the upcoming MI300


So differentiable programming with NNs as approximators? Cool!


I imagine something like AlphaZero would also benefit: It's basically a hybrid of tree search and neural network.


“hybrid algorithms leveraging AI efficiency”: An example I came across recently:

https://arxiv.org/abs/2202.11214

A neural network-based solution for weather forecasting: "FourCastNet is about 45,000 times faster than traditional NWP models on a node-hour basis."


I assume it means using ML to choose between multiple algorithms instead of more traditional heuristics.

That or the algorithms use a combination battery and gasoline.


AMD already uses a perceptron for branch prediction. I would say they're talking about support for ML speed-ups in hardware, but maybe the plans also include a more complex, complete neural net in hardware for data prediction.
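
For the curious, the perceptron predictor idea (Jimenez & Lin) is simple enough to sketch in a few lines - this is the textbook version, not whatever AMD actually ships, and real hardware saturates the weights and evaluates the dot product in parallel:

    import numpy as np

    HIST_LEN = 16
    TABLE_SIZE = 1024
    THRESHOLD = int(1.93 * HIST_LEN + 14)        # training threshold from the paper

    weights = np.zeros((TABLE_SIZE, HIST_LEN + 1), dtype=np.int32)
    history = np.ones(HIST_LEN, dtype=np.int32)  # +1 = taken, -1 = not taken

    def predict(pc):
        w = weights[pc % TABLE_SIZE]             # one weight vector per (hashed) branch
        y = int(w[0]) + int(np.dot(w[1:], history))
        return y >= 0, y                         # predict taken if the sum is non-negative

    def update(pc, taken):
        global history
        pred, y = predict(pc)
        t = 1 if taken else -1
        # Train on a misprediction, or when the output wasn't confident enough.
        if pred != taken or abs(y) <= THRESHOLD:
            w = weights[pc % TABLE_SIZE]
            w[0] += t
            w[1:] += t * history
        history = np.roll(history, 1)
        history[0] = t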


What are hybrid algorithms supposed to be anyways? Half algorithm & half dictionary?

Half algorithm & half user manual?

Half algorithm & half class?


Generally the concept of hybrid algorithms is:

- Half "hard maths" algorithms, i.e. combinatorics, geometry, etc.

- Half "fuzzy maths" algorithms. i.e. heuristics, approximation, machine learning.

The idea being to solve the parts that can be easily solved by hard maths with those hard maths so that you can reduce the problem space for when you apply the fuzzy maths to solve the rest of the problem.

In other words, it's taking the problem, breaking out discrete pieces to solve with well established hard maths, using heuristics & numerical solutions to tackle the remaining known problems without "easy" analytical solutions, then using ML to fill in all the gaps and glue the whole thing together.
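
A minimal sketch of that split on a toy problem, where a (hypothetical) learned surrogate stands in for the fuzzy half and plain Newton iteration is the hard-maths half that finishes the job:

    import numpy as np

    def f(x):
        return x**3 - 2.0 * x - 5.0          # solve f(x) = 0

    def f_prime(x):
        return 3.0 * x**2 - 2.0

    def surrogate_guess(problem):
        # Stand-in for a trained ML model that cheaply proposes a rough answer.
        return 2.0

    def newton_refine(x, tol=1e-12, max_iter=50):
        # Classical Newton iteration: fast and exact-ish, *if* the start is close.
        for _ in range(max_iter):
            step = f(x) / f_prime(x)
            x -= step
            if abs(step) < tol:
                break
        return x

    x0 = surrogate_guess(None)   # fuzzy maths: cheap, approximate
    root = newton_refine(x0)     # hard maths: refines to machine precision
    print(root, f(root))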


Half conventional HPC simulation that runs best on CPU, half Neural Networks that need GPUs. As proposed for example here

https://www.nature.com/articles/s41586-019-0912-1

and here

https://www.nature.com/articles/s42256-021-00374-3

> The next step will be a hybrid modelling approach, coupling physical process models with the versatility of data-driven machine learning.

The Frontier HPC system that AMD just delivered is aimed fully at that problem.

https://en.wikipedia.org/wiki/Frontier_(supercomputer)


The article gives a concrete example in the same paragraph: "For example, AI algorithms could get close to a solution quickly and efficiently, and then the gap between the AI answer and the true solution can be filled by high-precision computing."

Interestingly the example is backwards (statistical reasoning first, hard reasoning second) compared to traditional usage of "hybrid" in AI and control contexts.


That's not concrete at all.


From the article it looks like half AI guessing the solution, half some static algorithm fixing the result to be better. Not sure how it's supposed to really work.


One variant of this is SciML approaches where you use an ODE solver wrapped around a NN. The ODE solver guarantees you get the right conservation laws, which NNs don't do well, and the NN is more accurate than the hand-written model since it doesn't ignore the higher-order effects.
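
A minimal sketch of that pattern with scipy, where the "NN" is a tiny placeholder net with made-up weights (in practice you'd train it against data) bolted onto a hand-written damped-oscillator model:

    import numpy as np
    from scipy.integrate import solve_ivp

    # Hand-written physics: damped oscillator, deliberately missing some effect.
    def physics_rhs(y):
        pos, vel = y
        return np.array([vel, -pos - 0.1 * vel])

    # Placeholder "neural network" correction (weights would come from training).
    W1 = np.array([[0.10, -0.05], [0.02, 0.03]])
    W2 = np.array([[0.01, -0.02]])

    def nn_correction(y):
        return (W2 @ np.tanh(W1 @ y)).item()   # scalar correction to the force term

    def hybrid_rhs(t, y):
        # The solver + physics term keep the structure/conservation behaviour;
        # the NN only nudges the higher-order effects the model ignores.
        corr = np.array([0.0, nn_correction(y)])
        return physics_rhs(y) + corr

    sol = solve_ivp(hybrid_rhs, (0.0, 20.0), y0=[1.0, 0.0], rtol=1e-8)
    print(sol.y[:, -1])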


Half SQL statements and half array.sort() methods I assume.


    (solve (this :by strong-ai) (that :by weak-ai))


> The last thing though - “hybrid algorithms leveraging AI efficiency” sounds an awful lot like a buzzword sales pitch.

It's supposed to. Investors want to hear this, not some crap about efficiency, when the entire world is talking about AI.

Edit: TBC, I care about efficiency and it's not crap, but that's likely the view of investors.


> The last thing though - “hybrid algorithms leveraging AI efficiency” sounds an awful lot like a buzzword sales pitch.

Oddly, to me this just sounds like efficiency gains by potentially introducing massive security holes i.e. the vector of the Meltdown/Spectres of the future. It also seems like they're trying to sell AI as some sort of secular qubit that they'll be error-correcting.


Not a CPU designer, but I remember AMD has been using AI for branch prediction for a long time now. Hopefully they mean AI in that sense and not only branding.


From the perspective of a consumer, my MacBook Pro is basically the perfect laptop, at least in theory.

I love the battery life and performance of the hardware, not to mention the unrivaled build quality of the MacBook (screen, trackpad, keyboard).

In practice, however, MacOS limits the capabilities of the hardware such that I cannot daily drive my MacBook Pro as a work or personal computer (poor containerization support, an annoying development toolchain, and no _real_ support for video games).

When Asahi Linux is mainlined, stable, and features full hardware acceleration, the MacBook running Linux will likely be the best laptop money can buy. Until then, please AMD, Intel, release some mobile hardware that's at least as good. It sucks so bad seeing what is possible with today's technology while it remains exclusive to a company determined (so far unsuccessfully) to ring-fence you into their API ecosystem.


My biggest complaint is that I have a "16-core Neural Engine" in my MacBook, but nothing that can be run on it. Sure, internal macOS tools such as the fingerprint reader or even the webcam might use it, but not a single ML project on GitHub makes use of it. I'm lucky if ML projects don't depend on CUDA to run.


Here's one project

https://machinelearning.apple.com/research/stable-diffusion-...

https://huggingface.co/blog/diffusers-coreml

Not surprisingly, they are quite a bit slower than those power-hungry nVidia GPUs.
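
There's no public low-level ANE API, but coremltools at least lets you ask Core ML to schedule supported ops onto it. A rough sketch - the toy model, names and shapes here are arbitrary, and predict() only runs on macOS:

    import numpy as np
    import torch
    import coremltools as ct

    # Any small traced PyTorch model will do for illustration.
    model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 10)).eval()
    example = torch.rand(1, 128)
    traced = torch.jit.trace(model, example)

    # ComputeUnit.ALL lets the runtime pick the Neural Engine for ops it
    # supports, falling back to GPU/CPU for the rest.
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(name="x", shape=(1, 128))],
        compute_units=ct.ComputeUnit.ALL,
    )
    print(mlmodel.predict({"x": np.random.rand(1, 128).astype(np.float32)}))

Whether a given op actually lands on the ANE is opaque, which is a big part of why so few GitHub projects bother.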


Honestly, I think I'll sell my MacBook Pro and just buy a new one with less cores. Bought it expecting to be able to abuse it towards ML, but it's barely even supported.


From my experience developing C++ on it, as long as you don't use anything exotic, everything is portable (i.e. just recompile).

On the other hand, for the missing apps and other stuff, I'm running a Linux VM via VMware Fusion Pro. It works efficiently, and interoperability is good. Just add another internal network card to the VM, and keep an always-on SSH/SFTP connection. Then everything works seamlessly.

Never had any problems for 7 or so years when developing that application and writing my Ph.D. in the process.


From my experience developing C++ on it, 99+% of everything is portable, but the ~1% that isn't, causes a disproportionate amount of annoyances and extra work. We still have a program that we have to run in an x86 docker container (with all the problems that brings), just to get it to work on our M1 Macs.

That being said, the M1 Pro is a more than good enough piece of hardware, so I'm willing to put up with this.


Been there done that since the era of Powerbooks - and it always causes my laptop's battery to deplete much faster.

Because I'm using my laptop to, well, work on the go a lot; this is a showstopper.


From my experience, the newest VMWare versions and Linux kernels are very good at conserving power when the VM is mostly idle.

I'm doing that on my 2014 MBP, and it didn't cut the battery endurance in half, but I need to re-test it for exact numbers. However, it doesn't appear in the "power hungry applications" list unless you continuously compile something or run some service at 100% CPU load. Also, you can cap the resources the VM can use if you want to limit it further.


That's so good to hear. Very well done to everyone involved.


I tried a few Apple Watches, and almost got used to the routine of daily charging, but other annoyances made me try alternatives. Eventually I settled on a Garmin smartwatch, and only having to charge once every 10-15 days makes a huge difference (you give up a few features, but surprisingly little, at least for my use case). I hope this new emphasis on energy efficiency enables this on laptops (and cellphones).


> poor containerization support

I use UTM and Docker on a daily basis and it is extremely smooth. What exactly is missing?

> an annoying development toolchain

What are you talking about exactly? For example most Python and Rust builds just work out of the box.

> no _real_ support for video games

https://docs.unity3d.com/Manual/Metal.html

These claims look like a bit of an exaggeration.


Wanted to have a little rant about this:

> Rust builds just work out of the box

Works great until you're the author of an application written in Rust and want to distribute MacOS binaries which are automatically generated in CI/CD.

The only (legal) way to compile a Rust binary that targets macOS is on a Mac. So your CI needs a special case for macOS running on a macOS agent. Annoyingly, cross-compiling between CPU architectures doesn't work, so you need both an Intel and an arm64 Mac CI agent - the latter being unavailable via GitHub Actions.

To make things even more bizarre, Apple doesn't offer a server variant of MacOS or Mac hardware, which seems to indicate they expect you to manually compile binaries on your local machine for applications you intend to distribute.


Nested virtualization is missing on M1/M2 on both macOS and Asahi - it's unclear if this is a hardware or software limitation.


Asahi said it's present in M2 hardware and will likely be supported in a future macOS.

https://social.treehouse.systems/@marcan/109838053800961073

  Asahi Linux is introducing support for some brand new Apple Silicon features faster than macOS.. M1 has a virtual GIC interrupt controller for enhanced virtualization performance. Linux supports it, macOS does not.. M2 introduced Nested Virtualization support. The patches for supporting that on Linux are in review; macOS still doesn't support it.


Thanks for link, that is very exciting if it works as expected.

Curious what will enable this in M2 vs M1. Looking at https://developer.arm.com/documentation/102142/0100/Nested-v... it appears to indicate nested virtualization is in Armv8.3-A and both M1 and M2 are ARMv8.5-A according to https://en.wikipedia.org/wiki/Apple_M1 and https://en.wikipedia.org/wiki/Apple_M2

Will be interesting to see what other arm cpus gain this with the Linux patches.

Have you used (non-nested) virtualization at all on Asahi? If so what is your experience with performance and overall thoughts so far?


Apple appears to have a one of a kind special license for ARM (due to being a founder of the company) so they can pick and choose otherwise "required" extensions to support and add their own extensions as well. You can't directly compare an Apple design to a specific ARM version because of this.


> What exactly is missing?

Kernel integration and a virtualized filesystem that isn't bottlenecked by APFS. Docker is excruciating on Darwin systems.

> What are you talking about exactly?

Apple makes hundreds of weird concessions that are non-standard on UNIX-like machines. Booting up a machine with zsh and pico as your defaults is not a normal experience for most sysadmins, nevermind the laundry-list of MacOS quirks that make it a pain to maintain. For personal use, I don't think I'd ever go back to fixing Mac-exclusive issues in my free time.

> no _real_ support for video games

Besides Resident Evil and No Man's Sky (this generation's Tomb Raider and Monument Valley), nobody writes video games for Metal unless Apple pays them to.

For a while, MacOS had a working DirectX translation stack for Windows games, too. Not since Catalina though.


> Kernel integration and a virtualized filesystem that isn't bottlenecked by APFS. Docker is excruciating on Darwin systems.

Docker is great as long as you don't use bind mounts. I use it daily for development in dev containers.

> Besides Resident Evil and No Man's Sky (this generation's Tomb Raider and Monument Valley), nobody writes video games for Metal unless Apple pays them to.

There are plenty of great games on macOS. Factorio, Civilization, League of Legends, Minecraft. But you're right that there aren't too many AAA games.


> the unrivaled build quality of the MacBook (screen, trackpad, keyboard)

Typing this on a ThinkPad X13, after years of MBPs, I beg to differ. The screen on the ThinkPad is better, even at a slightly lower resolution of 1920 x 1200. It's IPS like the Mac of course, but also Anti-Reflective (matte), which Apple hasn't offered since 2008?

The ThinkPad has Left, Right, and Scroll mouse buttons, as well as a TrackPoint stick. Not so on the Mac.

Finally, the keyboard. You're going to laud Apple for their quality keyboards, really? The ThinkPad has a nicer keyboard feel (subjective, I know), has actual Home, End, PageUp, PageDn, and Delete keys, two Ctrl keys, and praise Jesus, gaps between the function keys, so you can use them confidently w/o looking.


> no _real_ support for video games

Would Wine (Proton) work out of the box on M1 Linux? My knowledge of Wine is limited. Based on the How Wine works 101 article [1]:

> The code inside the executables is “portable” between Windows and Linux (assuming the same CPU architecture).

because the CPU architecture is different, Wine would need changes to support it too - just like x32 and x64. Is that correct?

1 - https://werat.dev/blog/how-wine-works-101/


It will be possible in the near future.

https://www.phoronix.com/news/Hangover-0.8.1-Released


Apple has released Rosetta for Linux. I believe the use case is running x86 binaries inside of an ARM virtual machine on Apple Silicon, instead of emulating an entire x86 CPU and running the entire OS as x86. Apparently it works pretty well and some people have even used it on non-Apple ARM chips. Anyway, I wonder if it could be used in combination with something like Proton to emulate x86 on Linux.


It's not impossible but it is extremely hard to do in an efficient way.


> poor containerization support, an annoying development toolchain

Just run an always-on (headless!) Linux VM in the background, and don't use the host macOS for anything besides desktop apps (Slack, VSCode, Browser, mpv, terminal emulator but always ssh into the VM, etc). The same way you deal with a Windows machine.

This works well enough unless you work on hypervisors or other bare-metal-only tech. But hey, that's currently nonexistent on M1 MacBooks (or undocumented and locked to the Apple ecosystem) anyway.

> and no _real_ support for video games

That's the real deal breaker if you are into gaming.


The Windows machine (X86) has WSL2 though, which seamlessly integrates with VSCode. Add Windows Terminal and for me it made Windows the better dev platform. I still prefer Linux though.

Sometimes working local is just the easiest/fastest most convenient.


> Just run an always-on (headless!) Linux VM in the background, and don't use the host macOS for anything besides desktop apps (Slack, VSCode, Browser, mpv, terminal emulator but always ssh into the VM, etc).

This is what I do. Specifically, I use Canonical's Multipass, and treat tmux as my "window manager," -- I even mapped my iTerm profile's Command+[key] to the hex code for tmux-prefix+[key], so that Command essentially feels like the Super key in, say, i3. For example, rather than having to type Control+p h, Command+h selects the pane to the left.

The Multipass VM is flawless. Closing the laptop doesn't shut it down, and with the tmux resurrect plugin, sessions persist between Mac restarts (which are rare). If I didn't know better, I'd think it was just a native terminal session.

If I need proper x86_64, I just ssh into my super beefy Linux NAS at home via Tailscale. Both Linux machines are identical in terms of dotfiles/etc, so it feels exactly the same.

I've truly never been happier with Linux. I no longer obsess over my window manager (which used to be a serious time sink for me), and I still get what is IMO the best desktop experience via my Mac. The only tinkering I do is checking out the occasional new neovim plugin, but I really enjoy doing that, as it has a tangible benefit to my dev workflow, and kind of feels like gardening, in a way -- I like the slow but persistent act of improving and culling my environment.

I still have a PC, but I actually recently uninstalled WSL2. It never felt truly finished or "right", and Windows Terminal can be incredibly sluggish -- keystrokes have far more latency than iTerm. I've actually started to embrace just letting "Windows be Windows," even learning Powershell (and enjoying it more than I'd expected).

I've also pretty much moved from gaming on a PC to PS5. So, for me personally, I don't really see a place for Windows anymore. Every single time I boot into my PC, something is wrong -- most recently, it literally won't shutdown unless I execute `shutdown /s`, and no amount of troubleshooting has been able to fix it. I know Windows like the back of my hand, and still it's a constant feeling of death by a thousand cuts.


MacOS has seemingly gotten worse over the years not better. Features removed. Interface has been changed for the worse. The drive to unify the desktop experience with iOS. I could go on.


In the same situation. I’ve unfortunately realised that “worse is better” applies to hardware too if it supports the software you need.


Millions of people use macs for work successfully.


I did for ages. But there’s an impedance mismatch for a lot of people. Sometimes it’s a square peg for a round hole.


Windows too. That isn’t the point.


i've used a mac for development for various projects which have been globally impactful for over a decade. it's literally unix, man.

docker on mac can be improved, but if i'm developing for other architectures it's much easier to just test natively. toolchain for all the languages i use is exactly the same as any other *nix.

gaming i'll give you a point, but that's why i have windows dual boot at home. /shrug


"it's literally unix, man"

Yup, it's NOT Linux.

Been using MacOSX since Powerbook up to first gen Macbook - and I finally gave up: I installed Linux on it instead.

Pretty much all servers were running Linux even back then. Using Unix on a work computer/laptop causes way too many encounters with various quirks and glitches. It continuously drags down your productivity, many times every day.

After much stress, I installed Linux on my Macbook instead - but then I encountered various hardware-related glitches instead.

So I finally gave up, and bought a Thinkpad.


It’s Unix but it’s not Linux. When Linux is your target it makes sense to use it on your development stack entirely. The irony being that Microsoft seem to make the best Linux OS for development for my use cases of docker / server side / cloud operations.

And it runs Office better than Linux or Mac.


"And it runs Office better than Linux or Mac"

Yup, been doing that for decades. It's called "vendor lock-in". And its unethicality has been discussed comprehensively for decades as well.

And when countries tried to move to an open document format - they'd bully the country, using the strong arm of Uncle Sam, until it was back on MS Office again.

So when people are amazed by Bill Gates' charities - I don't. His money comes from the sufferings of countries.


I understand that corporations are bastards and built on foundations of the crushed skulls of children and involuntary human sacrifice but Office on the Mac is not a bad product and is improving. But the windows version has just been around longer and had more work done on it. And don’t get me started on LibreOffice - it’s buggy as hell. Even more than Office on a bad day

I’ve met Bill. He wouldn’t be out of place among HN’s defective half: the dubious pro SaaS VC funded US university alumni…


I wasn't talking about the product - I was talking about MS using the product to lock the whole world into its own =proprietary= format (so no one could reliably open & process it - and then others got blamed for it, not Microsoft), AND then aggressively attack those who try to escape from its lock, even countries.


All the protocols and standards are documented here https://learn.microsoft.com/en-us/openspecs/main/ms-openspec...


> The irony being that Microsoft seem to make the best Linux OS for development

WSL?


Yes WSL2


While WSL2 is great for what it is, it just doesn't compare to working in a Linux distro. There are a lot of pain points where external tools won't always work well with WSL.

I also like the productivity customizations possible in Linux, while the same are difficult or impossible on Windows.

Of course, if your work is tied into the Windows ecosystem, WSL is good to have.


Completely agree. For me it’s a trade off. I’d rather use Linux on the desktop myself but it doesn’t work for me or the business I work for.


Controversially, I have a better development experience on Windows using msys2 + zsh (basically "git-bash" on steroids). I would put that development experience almost on par with MacOS.

WSL2's virtualized workflow just causes too many issues for me. WSL1 was better IMO but it wasn't significantly better than msys2 and also had issues (like you still need remote development tools to mount codebases inside editors) - unless you want to run/develop Linux binaries while on Windows.

For making basic non-containerized applications (simple web applications, web servers), Windows is pretty good.

For anything more involved that requires multiple containers/compose/etc., I prefer Linux, as it has the tools I need available natively and no gotchas or performance penalties.

That said, credit to Microsoft on WSL2. The auto-scaling hardware provisioning inside the VM has made containerized workflows on Windows much better. To me, it's just not better than running Linux inside VMware/Hyper-V/VBox and "DIY"ing WSL2 yourself, something I had been doing for years before WSL2 anyway. WSL2 is more fool-proof than hand-rolling a Linux VM, so there is that.


WSL2 is great if you're stuck on Windows, but it's still really not there yet. For example:

https://github.com/microsoft/WSL/issues/8725


Better to run docker native than docker desktop. Doesn’t leak.


That's interesting to hear. I've not used Windows in over a decade (I primarily work on mac) but I've heard that WSL was quite compromising and not a great experience for people who primarily use Linux tools.


bash (and other shells), coreutils, pipes, git, text/cli utilities etc. work just fine on WSL, I'd call them Linux tools. My in-shell workflow consists of using mainly those + VSCode (with the WSL plugin) + ssh'ing somewhere now and then, and it's entirely sufficient for this purpose. I haven't tried running typical webserver/db services on it though.


I’ve run 120 containers in native docker on it and it worked fine :)


Mac being a certified Unix just means that Unix certification is meaningless, bought and paid for drivel. OS X was a certified Unix before it had atomic renames!


Su is just talking up AMD’s strengths, many of which come courtesy of TSMC and for which the original R&D was largely funded by Apple. She is not wrong, and AMD has certainly made large gains in HPC recently, but AMD does not monopolize all possible paths to success here.


I agree that a lot of AMD's wins have come from TSMC. That being said, I feel like the biggest win for them over the past 5 years (well, 6 now)... was moving to chiplets, which all started at GlobalFoundries. Having the same die, from the lowest-end consumer chip all the way to the top-end server chip, means that they can spend less time and money developing processors for every segment, and they just bin the chips by quality.

Intel is starting to make changes towards this structure but they haven't fully committed to it yet.


A move to chiplets would have just prolonged their terminal suffering if they hadn't replicated Haswell in the first Ryzen chips, making them somewhat performance-competitive and, due to chiplets, also economically viable, with the ability to wow consumers with the first 8-core desktop chips and later 12/16/24/32-core ones.


5nm? It was also funded by AMD, since AMD is a TSMC customer, and by many other customers.


AMD just released a 5nm product a few months ago. Apple’s M1 is a year older. Apple funded 5nm


AMD is having some challenges at the moment.

In particular, the RDNA3 graphics card line launch has been a dud.

The Nvidia 4090 turned out to be far ahead of the AMD 7900XTX.

Then there turned out to be an overheating issue on the AMD cards.

And now it just looks like both Nvidia and AMD are price gouging GPU buyers instead of competing with each other. They are deliberately keeping prices high and creating artificial shortages of GPUs, because that's what kept prices high during covid/crypto mining.

I used to be really cheering for AMD as the underdog, but I guess its true that none of these companies are your friend, they're just there to shake you down.

AMD had a chance to really pull ahead of Nvidia by being "the good guy" and actually offering end users great value for money, but instead they've chosen to emulate Nvidia.

Substantial customer good will has been lost by AMD.


Their graphics division has certainly been lacking vs Nvidia's cards, but are still far ahead of Intel's. Their desktop and server CPUs are crushing the market.

> They are deliberately keeping prices high and creating artificial shortages of GPUs

That's simply not true. Check out TechTechPotato's youtube channel for the explanation on this, but under-shipping isn't price fixing. They're just shipping less to distributors because there's less demand, allowing the distributors and retailers to keep a stable amount on hand.


One big reason that demand is low is that the current generation of GPUs is way too expensive, with top-end GPUs costing about twice what they did a few years ago, very poor improvements in performance-per-$, and anything below the top end offering even more dubious value for money, especially on the AMD side. Originally those high prices were a result of high demand compared to supply, but the companies seem to have gotten greedy and decided they can permanently keep prices high and just throttle back supply to keep them there.


One big reason why demand is low is that a 1080 Ti or RX 580 can still service 95% of gaming needs, so what's the point of upgrading?


I’m consistently surprised just how good the RX580 really is. It can handle most games on medium at 1440p, but I tend to just chill on ultra at 1080p. Plus I play on TV, so the smaller resolution is actually better from where I’m sitting


RDNA2 (particularly the narrower range from the 6600 to the 6800) is a huge step up in performance per watt. The lowest-end one in there, the 6600, is faster than a Vega 64 (which is much faster than an RX 580), yet uses less than half the power.

RDNA3's lower end chips, once they hit the market, are expected to further improve on this.

Most gamers won't upgrade to the current RDNA3 chips, because the current RDNA3 chips are top of the line, expensive, ~300w monsters.


Amen to that. The need for gamers to be on the hardware treadmill is no longer relevant. Five+ year old hardware can still run basically anything, albeit at reduced fidelity.

I keep eyeing a new build, but realistically, I know it’s just a vanity project because so few games will take full advantage of the better hardware. My favorite games in the past years could have run on ten year old hardware.


I think "service" is the key word here: I have an RX 580, and while it's kind of an incredible card in its longevity, it's really creaky at 1440p even with older games.

Performance per watt has really come a long way since GCN.


> Their desktop and server CPUs are crushing the market.

Seeing plenty of people choose 13th gen over Zen 4, the platform pricing for Zen 4 just wasn't very attractive. AMD [had to] significantly cut prices across the lineup by 20-30 %.


Let’s not forget that 12th and 13th gen are actually good chips


Also worth remembering that for the vast majority of people, any performance differences between Intel and AMD are utterly and absolutely insignificant as to be completely meaningless.

Nobody needs two-digit CPU core counts and 5~6GHz clock speeds to do their emails, communicate on Skype/Discord/Teams/Slack/Zoom/whatever, browse Facebook and Twitter, watch Youtube, and even play some vidja gaemz. An i3 or even a god damn Celeron is perfectly fine.

So at that point, Intel's superior stability (read: less jank) wins out by a hair and otherwise nobody really cares because there's no practical difference. The vast majority of people will just buy whatever's cheaper or just happens to be on the display table that day.


I keep having similar thoughts, but the software industry has shown a remarkable capacity to squander available hardware resources.


> The Nvidia 4090 turned out to be far ahead of the AMD 7900XTX.

The 7900XTX is on par with or better than the 4090 in a lot of games, though some games favour Nvidia more than AMD... (obviously not talking about ray tracing)

Coupled with a lower price...

I'm not quite sure what you're talking about, since you're saying the opposite of all the reviews I've seen.


Maybe you're confusing the 4080 with the 4090?


You sure? 7900 is probably better value but the 4090 outperforms it by at least 20% in games.


It really depends on the game. In Modern Warfare 2, for example, the 7900XTX outperforms the 4090 at every resolution, while in Fortnite the 4090 outperforms the 7900XTX by even more.

@ 4K gaming the 4090 on average is /much/ better than the 7900XTX, but looking at 1080p/1440p that lead diminishes a lot.

edit: At the end of the day though, it's all too damn expensive now.


The lead diminishes at 1080p/1440p because games become more CPU bottlenecked than GPU.


>They are deliberately keeping prices high and creating artificial shortages of GPUs

you've seen evidence for this, or it's your opinion?

>AMD had a chance to really pull ahead of Nvidia

with a graphics card line that you just told us is far behind Nvidia's?


AMD said it themselves. They're "undershipping" to "reduce downstream inventory". [1]

The take from multiple[2][3][4] journalists on that call is they're trying to avoid a "supply glut" and maintain high prices.

[1] https://seekingalpha.com/article/4574091-advanced-micro-devi...

[2] https://www.pcgamer.com/amd-undershipping-chips-to-help-prop...

[3] https://gamerant.com/amd-undershipping-graphics-cards/

[4] https://www.extremetech.com/computing/342781-amd-ceo-says-it...


you put "supply glut" in scare quotes, but again, do you know that their motive is cynical?

they're talking to the public, i.e. investors, and they could easily be saying "our current sales figures are lower not because our product is not popular, but because there is currently a large inventory downstream to meet current demand. When that glut is cleared, expect our sales to resume."

if downstream sellers have sufficient inventory, the only way to induce them to buy more would be for AMD to drop prices. If AMD cards are in hot demand and selling out immediately, restricting supply would be artificially boosting prices. But if the downstream pipeline is full, it's not right to say that reduced demand from wholesalers while that glut clears is AMD artificially boosting prices.


>you've seen evidence for this, or it's your opinion?

Probably alluding to the under-shipping of GPUs; stated by Lisa Su during the recent investor call[0]

[0]https://www.fool.com/earnings/call-transcripts/2023/02/01/ad...


> with a graphics card line that you just told us is far behind Nvidia's?

Cheap and good enough can absolutely be a way to win.


Yup.

If AMD actually competed on price then they'd be shifting the GPU market away from Nvidia.

They seem content, however, to just be second best while making bank.


They had a chance to meet a much better price to performance but IMO nvidia showed that the market is willing to pay a premium for GPUs and both major players are exploiting that.

Nvidia relaunched the 4080 12GB as the 4070 Ti, reducing the price by $100. There has to be one hell of a profit margin on the high-end cards.


> They are deliberately keeping prices high and creating artificial shortages of GPUs,

I'm not sure if you're referring to AMD "undershipping", but if you read how they use that word it's pretty clearly a bad thing. AMD has been shipping less (to retailers) than what they could, or would like to.


> The Nvidia 4090 turned out to be far ahead of the AMD 7900XTX.

I just don't get why most people care? It's 60% more expensive, too. Even the 7900XTX is ludicrous at $1000.

Give me a $400 card from this generation that competes with a $500 card from the previous generation, and I'll call it a win.

A $1600 card winning anything seems like an irrelevant battle. Is the volume / sales for those cards high enough to be the real focus, when cards like the GTX 1060 were the volume leaders by a long shot?


This is not a completely hashed-out thought. But I'll share it and see what others think.

My impression is that the simplest way to improve energy efficiency is to simplify hardware. Silicon is spent isolating software, etc. Time is spent copying data from kernel space to user space. Shift the burden of correctness to compilers, and use proof-carrying code to convince OSes a binary is safe. Let hardware continue managing what it's good at (e.g., out-of-order execution.) But I want a single address space with absolutely no virtualization.

Some may ask "isn't this dangerous? what if there are bugs in the verification process?" But isn't this the same as a bug in the hardware you're relying on for safety? Why is the hardware easier to get right? Isn't it cheaper to patch a software bug than a hardware bug?


A good reason why memory virtualization has not been "disrupted" yet seems to be fragmentation. Almost all low-level code relies on the fact that process memory is contiguous, that it can be extended arbitrarily, and that data addresses cannot change (see Rust's `Pin` trait). This is an illusion ensured by the MMU (aside from security).

A "software replacement for MMU" would thus need to solve fragmentation of the address space. This is something you would solve using a "heavier" runtime (e.g. every process/object needs to be able to relocate). But this may very well end up being slower than a normal MMU, just without the safety of the MMU.


> This is an illusion ensured by the MMU (aside from security).

Even in places where DMA is fully warranted, IOMMU gets shoe-horned in. I don't think there's any running away from costs to be paid for security (not the least for power-efficiency reasons).


I doubt it. Special purpose hardware is usually more efficient than a software implementation running on general purpose hardware.


But in this case the job of the hardware is to prevent the software from doing things, and it pays a constant overhead to do so whereas static verification as integrated into a compiler would be a one-time cost.


A problem to consider:

Arbitrarily complex programs make even defining what is and isn't a bug arbitrarily complex.

Did you want the computer to switch off at a random button press? Did you want two processes to swap half their memory? Maybe, maybe not.

A second problem to consider is that verification is arbitrarily harder than simply running a program -- often to the extent of being impossible, even for sensible and useful functionality. This is why programs that get verified either don't allocate or do bounded allocations. But unbounded allocation is useful

It is possible to push proven or sandboxed parts across the kernel boundary. Maybe we should increase those opportunities?

Also separate address spaces simplify separate threads -- since they do not need to keep updating a single shared address space. So L1 and L2 cache should definitely give address separation. Page tables is one way to maintain that illusion for the shared resource of main memory... Probably a good thing

That's not to say there isn't a lot of space to explore your idea. It is probably an idea worth following

One final thought: verification is complex because computers are complex. Simplifying how processes interact at the hardware level shifts the burden of verification from arbitrarily long-running, arbitrarily complex and changing software to verifying fixed and predefined limitations on functionality. The second one has got to be easier to verify.


I like this idea, and given today's technology it feels like something that could be accomplished and rolled out in the next 30 years.

If the compiler (like Rust's) can prove that OOB memory is never accessed, the hardware/kernel/etc. don't need to check at all anymore.

And your proof technology isn't even that scary: just compile the code yourself. If you trust the compiler and the compiler doesn't complain, you can assume the resulting binary is correct. And if a bug/0day is found, just patch and recompile.


The reality is that we do want to run code developed and compiled and delivered by entities we don't fully trust and who don't want to provide us the code or the ability to compile it ourselves. And we also want to run code that can dynamically generate other code while it's doing so - e.g. JIT compilers, embedded scripting languages, javascript in browsers, etc.

Removing these checks from the hardware is possible only if you can do without it 100% of the time; if you can trust that 99% of the binaries executed, that's not enough, you still need this 'enforced sandboxing' functionality.


This sounds like an exokernel design. What forces you to use the compiler that generates the trusted code to replace the MMU?


Perhaps instead of distributing program executables, we can distribute program intermediate representations and then lazily invoke the OS's trusted compiler to do the final translation to binary. Someone suggested a Vale-based OS along these lines, it was an interesting notion.


WASM could be such an IR unironically https://github.com/nebulet/nebulet. But I doubt that we would be gaining performance/efficiency this way.


You're thinking of single address space OSes.

I do not believe such OSes can ever be secure given how often vulnerabilities are found in web browsers's JS engines alone. Besides, AFAIK the only effective mitigation against all Spectre variants is using separate address spaces.


My understanding is that's more or less what Microsoft was looking at in their Midori operating system. They weren't explicitly looking to get rid of the CPU's protection rings, but ran everything ring 0 and relied on their .NET verification for protection.


eBPF does this, but its power is very limited and it has significant issues with isolation in a multi-tenant environment (like in a true multi-user OS). Beyond this one experiment, proof-carrying code is never going to happen on a larger scale: holier-than-thou kernel developers are deathly allergic to anything threatening their hardcore-C-hacker-supremacy and application developers are now using Go, a language so stupid and backwards it's analogous to sprinting full speed in the opposite direction of safety and correctness.


Put another way: if AMD (and especially Intel) don't do something about this they're going to get completely eaten alive by ARM.

The amount of processing power available in a modern smartphone is truly mind-boggling. I'd love to see a chart showing the chip cost and energy cost of M1-level compute in each previous year. I would guess that 30+ years ago you'd be in the millions of dollars and watts of power, but that's just a guess.

As we see from the modern M1/M2 Macbooks, these lower TDP SoCs are more than capable of running a computer for most people for most things. The need for an Intel or AMD CPU is shrinking. It's still there and very real but the waters are rising.


> Put another way: if AMD (and especially Intel) don't do something about this they're going to get completely eaten alive by ARM.

AMD’s latest parts are actually quite close to M1/M2 in computing efficiency when clocked down to more conservative power targets.

They crank the power consumption of their desktop CPUs deep into the diminishing returns region because benchmarks sell desktop chips. You can go into the BIOS and set a considerably lower TDP limit and barely lose much performance.

Where they struggle is in idle power. The chiplet design has been great for yields but it consumes a lot of baseline power at idle. M1/M2 have extremely efficient integration and can idle at negligible power levels, which is great for laptop battery life.


People keep repeating that Zen4 and M1 are close in efficiency but what is the source with actual benchmarks and power measurements?

At any rate, using single points to compare energy efficiency isn't a good comparison unless either the performance or the power consumption of the data points is comparable. Like, the M1's little cores are another 3-5x more efficient when operating in an incomparable power class, and Apple's own marketing graphs show the M1's max efficiency is also well below its max performance. [1]

Those perf/power curves are the basis of actually useful comparisons; has anyone plotted some outside of marketing materials? It might even be possible under Asahi.

[1] https://www.apple.com/newsroom/2021/10/introducing-m1-pro-an...


> but what is the source with actual benchmarks and power measurements?

Every notebookcheck.net review. For example https://www.notebookcheck.net/AMD-Ryzen-7-6800U-Efficiency-R...

They also do the same to a lot more laptops they test.

Look at the multi-core results, Zen3+ comes pretty close.

Also the single thread result shows what GP said: AMD CPU drains too much power at idle.


Their results are invalid because they used Cinebench. Cinebench uses Intel's Embree engine, which is hand-optimized for x86, not ARM. In addition, Cinebench is a terrible general-purpose CPU benchmark. [0]

Imagine if you're testing how energy efficient an EV and a gas car is. But you only run the test in the North pole, where the cold will make the EV at least 40% less efficient. And then you make a conclusion based solely on that data for all regions in the world. That's what using Cinebench to compare Apple Silicon and x86 chips is like.

[0] https://www.reddit.com/r/hardware/comments/pitid6/eli5_why_d...


Cinebench/4D does have "hand-optimized" ARM instructions. It would be a disaster for the actual product if it didn't. That's what makes it interesting as a benchmark: that there's a real commercial product behind it and a company interested in making it as efficient as possible for all customer CPUs, not just benchmarking purposes.

Albeit for later releases this is less true since most customers have switched to GPUs...


> Cinebench/4D does have "hand-optimized" ARM instructions.

It doesn't. As far as I know, everything is translated from x86 to ARM instructions - not direct ARM optimization.

Cinema4D is a niche software within a niche. Even Cinema4D users don't typically use CPU renderer. They use the GPU renderer.

The reason Cinebench became so popular is because AMD and Intel promote it heavily in their marketing to get nerds to buy high core count CPUs that they don't need.


Generally you see this in the lower class chips that aren’t overclocked to within an inch of instability. It’s not uncommon to see a chip that uses 200w to perform 10% worse at 100w, or 20% worse at 70w.

I can’t be bothered to chase down an actual comparison, but usually you’ll see something along those lines if you compare the benchmarks for the top tier chip with a slightly lower tier 65w equivalent.



Cheers for the link - if I read it right there's a cliff around 100W where power use goes way up for extremely marginal improvements.

Below 100W it's more linear, but that might depend on undervolting and the like as well.


It's actually this idle power that defines battery drain for most people. All these benchmarks about how much it can do for a certain compute-intensive task are not that important, considering that most of the time a laptop is doing almost nothing.

We just stare at an article in a web browser. We look at a text document. We type a bit in the document. An app is doing an HTTP request. The CPU is doing nothing basically.

Once in a while it has to redraw something, do some intense processing of an image or text, but it takes seconds.

It's the 99% spent idling that counts, and there most laptop CPUs suck.

Even when watching a video the CPU is not (should not be) doing much as there are HW co-processors for MPEG-4 decoding built in.

It's quite embarrassing how AMD and Intel have screwed up honestly.


And that's why, so far, AMD's mobile processors have been monolithic and not chiplet-based. That is supposed to change with Zen 4's Dragon Range; however, most of the mobile lineup will still be monolithic, and those high-power/high-performance processors should go exclusively to "gaming" notebooks.


I care a lot about idle power, even on my desktop PC. It seems crazy to me that in 2023 I still need to consider whether maybe I should shut down my computer when I'm not using it.

What should I be buying to not have to ask myself that question?


Deep sleep is enough.

Idle means the computer is turned on; a Mac at idle consumes less power than an x86 machine at idle, and they both consume ~zero in deep sleep.


A Mac?


A laptop?


If you take a Zen 3 running at optimal clocks for efficiency (such as the 5800U), its performance per watt is competitive with the M1 if you account for the difference in node size (which TSMC claims gives 30% less power consumption at the same performance). As the article points out, the real efficiency gains will come from domain-specific changes such as shifting to 8-bit for more calculations.
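
Back-of-the-envelope, the node adjustment is just scaling the power term - the numbers below are entirely invented, only to show the arithmetic:

    # Hypothetical benchmark scores at a fixed 15 W power limit (invented numbers).
    zen3_score, zen3_watts = 1400.0, 15.0
    m1_score,   m1_watts   = 1500.0, 15.0

    zen3_perf_per_w = zen3_score / zen3_watts
    m1_perf_per_w   = m1_score / m1_watts

    # TSMC's claim: the newer node gives the same performance at ~30% less power,
    # so normalise Zen 3 as if it drew 30% fewer watts for the same score.
    zen3_node_adjusted = zen3_score / (zen3_watts * (1.0 - 0.30))

    print(zen3_perf_per_w, m1_perf_per_w, zen3_node_adjusted)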


It really isn't competitive.

First of all, when you downclock anything, you're going to gain efficiency. If Apple downclocks M1, it can get even more efficient.

Second, most of these tests use Cinebench, which is highly optimized for x86, not ARM instructions. Geekbench should be used instead.

Third, the M1 is a SoC. Everything is on it. Everything is efficiently connected directly inside the chip.


Both the M1 and 5800U run around 15W and are already clocked for efficiency. The M1 Max is their higher clocked less efficient offering.


This is false. The 5800U will boost well beyond 15w. Ignore their TDP marketing ratings.


How would x86-64 be as efficient with the same transistor & power budget when they have to run an extra decoder and ring within that budget? Seems physically impossible.


I found this to be a pretty expansive answer to this question: https://chipsandcheese.com/2021/07/13/arm-or-x86-isa-doesnt-...


Thank you for the detailed article!


All else being equal, they can't. But the difference isn't as big as some people like to think. For a current high end core, probably low single digit %. And x86-64 has had a lot more effort going into software optimization.


As I understand it, the actual processing part of most chips nowadays is fairly bespoke, with a decoder sitting on top. I doubt decode can make up that large a portion of a chip's power consumption (probably negligible next to the rest of the chip?), so other improvements can make up for the difference.


How can AArch64 be as efficient when it implements all of the old 32-bit extensions?

Some implementations don't, but the ones in a phone do.


The latest ARM Cortex CPUs (models X2, A715 and A510) drop 32-bit support. Qualcomm actually includes two older Cortex-A710 cores in the Snapdragon 8 gen 2 for 32-bit support. Don't know much about Apple Silicon but didn't they drop 32-bit a couple of years back?

Google has purged 32-bit apps from the official Android app store, but as I understand it the Chinese OEMs that ship un-Googled AOSP ROMs with their own app stores haven't been as aggressive about moving to 64-bit.


Apple also dropped 32-bit support entirely from their ARM systems.


The decoder has negligible power consumption and die area on a modern CPU.


Not if you have a complex ISA like x86 and want a very wide decode.


yes if you want to keep the transistor count low

M1 has 16 billion of them

AMD is below 10 billion


Because the more complex decoder is traded, in this case, for a denser instruction set, which means they need less instruction cache (and caches are power hungry).


> when they have to run an extra decoder

it's not that expensive

it's a tradeoff, you lose something there, you gain somewhere else.

x86 has a more complex decoder exactly because it was less powerful and had to save on computing power and energy consumption, not being a mainframe.


My guess is it's related to the higher transistor count. The M1 for example has 16B transistors compared to the 5800U with 10.7B.


Honestly I don't understand why there's not something like a 256 core ARM laptop with 4TB RAM.

The benefit of ARM is scale of multitasking due to not requiring the same kind of lock states that Intel's architecture requires, and it can additionally scale much better than only one physical+virtual core pair.

I guess the only thing that's holding back ARM is Microsoft, as laptops are expected to run a desktop OS that people are comfortable with. Windows RT wasn't really a serious desktop OS, rather a joke made only for some IoT enterprises instead of end users.

I wish there was more serious hardware than the standard Broadcom or MediaTek chips; I'd definitely want some of that... be it in a mini-ATX desktop/server format (e.g. as a competitor to the Intel NUC or Mac Mini) or as a laptop.

With the ongoing energy crisis something like solar powered servers would be so much more feasible than with x86 hardware.


> Honestly I don't understand why there's not something like a 256 core ARM laptop

The high power ARM cores aren’t that small. If you took the M2 and scaled it up to 256 cores, it would be almost 7 square inches. You can’t just scale a chip like that, though, so the interconnects would consume a huge amount of space as well. It would also consume over 1000W.

The latest ARM chips are great, but sometimes I think the perception has shifted too far past the reality.


7 square inches would also include an enormous GPU and tons of accessories.

The actual cores are about 0.6 mm² (E-core) / 2.3 mm² (P-core), and local interconnects and L2 roughly double that.

So with just those parts, 256 P-cores would be about 1.5 square inches, and 256 E-cores would be about half a square inch. And in practical terms you can fabricate a die that's a bit more than a square inch.

Of course it wouldn't use 1000 watts. When you light up that many cores at once you use them at lower power. And I doubt a 256 core design would have all that many P cores either.

As a rough estimate, you could take the 120mm² M1 chip, add 28 more P-cores with 110mm², 220 more E-cores with 300mm², 128 more MB of L3 cache with 60mm², 100mm² of miscellaneous interconnects, and still be on par with a high end GPU.

That sounds doable but is pushing it. A 128 core die, though, has nothing stopping it except market fit.
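For what it's worth, running the per-core numbers above through the same back-of-envelope math (0.6 mm² / 2.3 mm² per E/P core, doubled for L2 and local interconnect, ~645 mm² to a square inch) lands in the same ballpark, a touch above the ~1.5 figure for the P-core case:

    MM2_PER_SQ_INCH = 25.4 ** 2          # ~645 mm^2 in a square inch

    def cluster_sq_inches(core_mm2, count):
        # core area roughly doubled to account for L2 + local interconnect
        return core_mm2 * 2 * count / MM2_PER_SQ_INCH

    print(f"256 P-cores: ~{cluster_sq_inches(2.3, 256):.1f} sq in")   # ~1.8
    print(f"256 E-cores: ~{cluster_sq_inches(0.6, 256):.1f} sq in")   # ~0.5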


even a 128 core part made like that will perform pretty atrociously. scaling up the core count without scaling the cache count means you have a lot of cores waiting for memory. also when you have 128 cores, you almost certainly need more memory channels to have enough bandwidth.


I explicitly included more cache.

And the memory controllers aren't that big on the die. You could include a bunch more on a 128 core model.


Could we make the chips go slower, say around 1 GHz? Maybe that's not feasible with the current software architecture if we want a great user experience.


> The benefit of ARM is scale of multitasking due to not requiring the same kind of lock states that Intel's architecture requires

I have no idea what you mean by this. The only x86 feature I can think of that might qualify as a 'lock state' is a bus lock that happens when an atomic read-modify-write operation is split over two cache lines. That has a very simple solution ('don't do that'--you have no reason to), and anyway, one can imagine more efficient implementation strategies.

> can additionally scale much better than only one physical+virtual core pair

I have no idea what you mean by this either. Wider hyperthreading? It can be worthwhile for some workloads (and e.g. some ibm cpus have 4-way hyperthreading), but is not a panacea; there are tradeoffs involved.


I'd guess they're referring to ordinary reads/writes having acquire/release semantics on x86 and relaxed on ARM.


The largest number of high-performance ARM cores you can get in a single socket is the Ampere Altra Max with 128 ARM Neoverse-N1 cores. At 2.6 GHz the processor consumes 190 W, and at 3.0 GHz up to 250 W. This is a server chip, not something you can put in a laptop.

Source: https://www.anandtech.com/show/16979/the-ampere-altra-max-re...


I think because general compute is hard to parallelize, so 256 cores doesn't help much in practice. (Compute that does parallelize well already runs on GPU).


I get that hugely parallel applications already run on the gpu, but wouldn't something like 4 power and 28 efficiency cores kinda make sense?


Not as much as say, 4 power cores, 4 efficiency cores, and 24 gpu cores.


>I guess the only thing that's holding back ARM is Microsoft

It's not Microsoft holding it back. It's Qualcomm.

Apart from their very latest SoC (designed by a bunch of ex-Apple employees, no less), their CPUs are significantly worse than x86 in terms of general performance and have persistently lagged 4 years behind Apple (3 years behind x86). They sell for the same price per unit as x86 CPUs do, so there aren't very many OEMs that take them up on the offer, given the added expense of having to design a completely different mainboard for a particular chassis.

As such, x86 is the only game in town if you're buying a non-Apple machine; Qualcomm's products aren't cheaper and perform much worse outside of having better battery life. Sure, Qualcomm owns Nuvia now, but that acquisition will still take some time to bear fruit.


> that acquisition will still take some time to bear fruit.

It might be a very long time considering Arm is suing to get Qualcomm to destroy Nuvia's work.


Really looking forward to buying a 256-core laptop and seeing almost all tasks using 1 single core.

Let's get real here, most things can't be parallelised at all. We must strive for better single core performance.


Let's get real. Most times you're not doing a single task on your laptop.

These days I only see a single core loaded up to 100% when I grep through a big directory or when I encounter a bug in some software.

Most of the time, it's either all cores are equally idling or equally doing something heavy (like building a big project).


>> I would guess that 30+ years ago you'd be in the millions of dollars and watts of power but that's just a guess.

30 years ago I don't think the compute power of a modern phone chip was available at any price, even in supercomputers.

On a tangential note, there are economists who think this increase in compute is somehow an increase in one of their measures - I don't recall which one. I disagree, because with that logic we all have trillion dollar tech in our pocket. Making a better product over time is expected, it's not some kind of increase in output.


The Top500 supercomputer list[1] started in June 1993, just about 30 years ago. At the top was the CM-5/1024 by Thinking Machines Corporation at Los Alamos National Laboratory, with 1,024 cores and peaking at 131.00 GFlop/s (billion floating-point operations per second).

It's an Apples-to-ThinkingMachines-Oranges comparison, but CPU-Benchmark[2] ranks the GPU of the Apple A16 Bionic used in the latest iPhones, in its "iGPU - FP32 Performance (Single-precision GFLOPS)" section, at 2000 GFlop/s.

GadgetVersus[3] reports a GeekBench score for the A16 Bionic of 279.8 GFlop/s (the SGEMM matrix-multiplication test, it seems).

AnandTech[4] was reporting that the ARMv7 A15 architecture came in at 6.1 GFlops in the "GeekBench 3 - Floating Point Performance" table (SGEMM MT test result), back in 2015.

[1] https://www.top500.org/lists/top500/1993/06/

[2] https://cpu-benchmark.org/cpu/apple-a16-bionic/

[3] https://gadgetversus.com/processor/apple-a15-bionic-gflops-p...

[4] https://www.anandtech.com/show/8718/the-samsung-galaxy-note-...
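Putting the quoted figures side by side (values exactly as cited above, just a quick ratio check):

    cm5_1993  = 131.0   # GFLOP/s peak, Top500 #1 in June 1993 [1]
    a16_gpu   = 2000.0  # GFLOP/s FP32, A16 Bionic iGPU [2]
    a16_sgemm = 279.8   # GFLOP/s, A16 Bionic GeekBench SGEMM [3]

    print(f"A16 GPU vs CM-5/1024:       ~{a16_gpu / cm5_1993:.0f}x")    # ~15x
    print(f"A16 CPU SGEMM vs CM-5/1024: ~{a16_sgemm / cm5_1993:.1f}x")  # ~2.1x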


Interesting. I would have thought a few GFLOPs today would have been faster than the old super computer, but nope. The GPU is faster though. Still, the phone has both and can run on battery power while fitting in your pocket ;-)


In my experience talking to semiconductors folks, ARM is just not a concern anymore. The future is RISC-V, and ARM is already being seen as legacy tech. ARM's progress in the server space has stalled, the ARM Windows ecosystem is dead, Android has laid the groundwork for a move to RISC-V, and ARM has never and will never touch the desktop market.


> ARM has never and will never touch the desktop market

That’s a bold statement as I type all day on an M1 Mac. My FT100 company just made the leap to them as dev machines.


My company has entire teams and regions that do NOT buy PCs and only Mac laptop for employees. Started with the M-series.


I think you mean "Do not buy Windows computers". Macs are also PCs.


To be fair, that ship sailed with the "I'm a Mac / I'm a PC" series of adverts.


Ah, I'm not in the US so I avoided those ads.


[flagged]


> It proves my point beautifully when the only response to my comment

Your comment was beyond ignorant, and wrong. Most folks here are too smart, or busy, to reply to such nonsense. I am neither.

ARM is by far the most shipped and used arch every year. AMZ is even going in heavier on it. It's not legacy tech at all. So a person decided to show you how wrong you were, by listing what's probably the most impressive chip in all of our lifetimes, and it's guess what, ARM.

> Obviously I was referring to Linux/Windows workstations

The creator of Linux is using an ARM machine as a workstation today, AFAIK.

> if everyone was smart enough to pick up on that I wouldn't be paid as much as I am

If you're making more than a burger flipper at Wendy's, the world just isn't fair.


The one viable server ARM CPU core is now tied up in a Qualcomm-ARM legal spat and probably won't see the light of day, which has made it pretty clear to anyone not grandfathered in like Apple that it's not worth designing your own ARM core. ARM itself has been hemorrhaging employees, both because of better offers from Apple and the RISC-V stealth startups, and because since the SoftBank push to get their money back it has simply become a worse and worse place to work. Their ability to execute is extremely compromised.

Because of the long tail of the hardware industry, the writing can be on the wall long before it's clear from what you can go out and buy off a shelf today.


I get it that you want RISC-V to succeed - so do I - and to advocate for it but I really don’t understand why it needs this sort of comment about Arm. I see exaggerated criticism of the Arm ISA elsewhere from people who ought to know better too - it’s really CISC, it’s 5000 pages vs 2 for RISC-V etc. It’s just not necessary.


I mean, nothing I said is exaggerated here. ARM doesn't even have a viable server core that can compete with x86, even as vaporware. SoftBank ruins everything they touch, and is super focused at the moment on robbing Peter to pay Paul to get something out of the upcoming ARM IPO since their attempt to sell it off to Nvidia fell through. The rumor is they've been cutting R&D funding hard to temporarily boost profitability. If anything this is more a dig at how vulture capitalism ruins productive companies.

As an aside, ARM has always been a hybrid CISC/RISC core. It has nothing to do with the number of instructions, but with the fact that not having an I$ on the ARM1 forced it to have microcoded instructions, mainly to support LDM/STM. That's not a dig at ARM. It's a valid design, particularly at that gate count.


You jumped in in support of a comment that said Arm is ‘legacy’ tech. You said they don’t ‘even have a viable server core’. They are ‘haemorrhaging’ staff. Softbank have ‘ruined’ them.

Sounds more apocalyptic than exaggerated tbh.

I still don’t know why you think this is necessary.


I guess I don't see why you think it's necessary to not criticize them.


The unwritten rule of HN: You do not criticise The Rusted Holy Grail and the Riscy Silver Bullet.


ssshhhhhh, the Rust Evangelism Squad will hear you.


The M1/M2 Macs run Linux pretty well. It's not perfect yet, but perfectly usable (especially as a desktop machine!) and support is improving every day.

I believe you're trying to move goalposts to avoid admitting you're wrong.


Graviton, M1/M2, Ampere etc but I’m sure you’ll be able to explain why Arm is seen as ‘legacy’ tech when billions of smartphones are being shipped every year with Arm CPUs.


Oh look, you named 4 areas where ARM development has already peaked. Hyperscalers are already looking to evolve from ARM in the near future; just look at how much attention Ventana got at the RISC-V Summit. M1/M2 are an Apple-ecosystem-specific phenomenon that hasn't inspired any copycat products. Ampere has been a massive disappointment to everyone in the industry; see the fact that Nuvia had their entire business dead to rights pre-acquisition. ARM simply isn't at the cutting edge of the semiconductor industry anymore. Just because Apple and Qualcomm use it to great effect doesn't mean ARM is making any major innovative strides relative to the competition.


> ARM simply isnt at the cutting edge of the semiconductor industry anymore.

What you really mean is Arm isn’t the hot new thing anymore. Well it hasn’t been that for 20 years. Meanwhile billions of arm devices in leading edge nodes are being shipped. Oh well.


> Hyperscalers are already looking to evolve from ARM

citation needed

https://www.theregister.com/2023/02/08/5_percent_cloud_arm/

"5% of the cloud now runs on Arm as chip designer plans 2023 IPO"

5% does not seem to me like a position you'd want to move away from.

ARM in everyday computing outside of mobile phones and SBCs is just getting started, as I see it.


If RISC-V support by Microsoft is as bad as it has been for ARM, then I'm afraid RISC-V will never touch the desktop market, at all. Contrary to ARM, which is being pushed there with great success by Apple. Server-wise of course it's a different story...


If great success to you is that they put the M1 and M2 in a tower, I don't know what to tell you. Intel, AMD, and the x86 industrial complex don't care in the slightest what instruction set your Mac runs


Might I suggest taking a step back, re-reading your first comment and all the replies under it, and asking yourself "is it possible I might not be 100% correct, and maybe other opinions have enough merit to be worth considering why people aren't agreeing with me, rather than just changing my argument to make sure I'm still the winner of this thread"?


I’m not sure I expressed my point clearly. It wasn’t quite about Apple. So I will reformulate it here: the fate of any instruction set on the desktop is primarily decided by Microsoft.

Do you have any information that Microsoft is planning to support RISC-V at least as well as x86/x64? (That is to say, not with something like Windows RT, or Windows CE)

That would be tremendously good news, I shall add.


>In my experience talking to semiconductors folks,

Most, if not all, semiconductor "folks" I know are very pragmatic, the way a Real Engineer should be (unlike software engineers). And in my experience, only HN and the Internet are suggesting that ARM is dead and everything will be RISC-V.


Yup, difference to the engineer is minuscule.

It's getting traction because chip companies don't want to pay ARM for licensing, not because it is particularly better at anything.


>The future is RISC-V

I hope this never comes to pass, because a RISC-V future is a Chinese future.

A Chinese future will not be kind to western ideals that most of us hold dear.


Huh? China has licenses for both the x86_64 and ARM ISAs. What about RISC-V makes it an advantage for Chinese companies?


RISC-V is free and open as in libre, by contrast to x86 and ARM which must be licensed from Intel/AMD and ARM and are thus subject to potential western economic sanctions.

Now, yes, China will just espionage and kangaroo court their way through and around such legalities anyway, but nonetheless RISC-V is less effort for more reward for China if it becomes at least on par with x86 and ARM.

Put more basically, it's a matter of national security. China can have an entire RISC-V ecosystem indigenously, unlike x86 and ARM.


Can Zhaoxin's x86 license, or the various Chinese companies's ARM licenses, just be revoked?


If the US and/or UK place sanctions on exporting microprocessor technologies to China then that's that. Intel/AMD and ARM are subject to US and UK laws and regulations respectively.

RISC-V by contrast is much, much harder for any given country to regulate because of its free and open nature. At most the US and UK can embargo individual developments made within their jurisdictions, but they can't regulate the entire architecture. RISC-V doesn't have a kill switch named Intel/AMD or ARM.


China has licenses for x86-64 “designs” and ARM’s design. Not the ISA. Although “ARM China” is probably enough for them.


They have access to VIA's x86 license. They have entered a deal to use it for their domestic designs and have been doing so for 10 years now.


Really? No Chinese company has the ARM architecture license? That's honestly a bit surprising if true


ARM China is a wildly different animal than ARM. They went rogue a few years back and though SoftBank/ARM did a lot to get things back in line, it still shows up like this:

https://www.reuters.com/technology/arm-china-says-its-ousted...


I'm loving it. I used cheap risc-v boards for several of my projects, most notably a GD32V in my keyboard. The equivalent stm boards weren't too expensive, mostly in the 10-20$ range, but weren't as easily available (and 10$ is still 3x the price of the chinese risc board)

Though the rp2040 has largely ended my cheap risc-v addiction


As an experiment quite a few years ago I got a laptop with a special version of the Intel CPU that was not as fast but much more power efficient.

ASUS UL30A-X5

Really an excellent computer, ran linux great (games didn't really exist yet though), and with tuning was coming in under 10W if the display brightness was turned down. First time I was able to get through flights without the system running dead.

I think what's going on in this case is that temperature rises increase resistance in a chip and therefore lower efficiency. If you can keep it cool, you can keep it more efficient. The move seems like a necessary one; a computer as powerful as that UL30A is probably inside a phone today if you turn off the radio and display, and that thing still had a giant battery and only lasted 10-12 hours.

I've seen AMD do some pretty impressive things, I wouldn't count them out. They're at least willing to attempt to compete on price.


From what I've seen, whether the CPU/ALU/decode at the center is ARM or x86 may make less difference than you think. The amount of circuitry and silicon area (correlated with power) outside the core is significant: MMUs, vector units, a complex cache hierarchy, high-speed IO (DDR, PCIe, you name it), and an extremely complex network-on-chip (Infinity Fabric) to connect the CPUs. Look at the IO die size vs the CCD size. As one poster pointed out, chiplets have great advantages, but there is a power hit. Thankfully newer tech is bringing that power down too. I'd love to see a power breakdown of a full chip to see what % is attributed to the CPU core itself.


In my own experience, the supposed ARM chip superiority claims are almost entirely marketing. I get significantly better performance (15-50%) from nearly all of my CPU workloads on modern Intel/AMD hardware vs the ARM Apple devices.


The article is about energy efficiency. Do you get 15-50% better performance per watt from nearly all of your workloads?


When I was an undergrad, one of my professors was exploring "approximate" computation (I forget what it was technically called). The gist was that you build mathematical circuitry that approximates an answer instead of giving you an exact answer (kinda like floating point, but also applied to Boolean algebra, integer math, etc.). The reasoning was that the approximation could let you reduce the power. I wonder where that line of research has gone.


It's used inside Google's TPUs[0]. Useless for regular logic programming (you wouldn't want a nuclear power plant, or your payment processor, to only approximately work), but it has found a use case in scaling ML pipelines, where the accuracy of individual values is less important compared to billions of parameters in aggregate.

[0] https://ieeexplore.ieee.org/document/9264836


I don't know that it's "useless" for logic programming. I agree it's less likely to be used for normal everyday stuff, but it could see more proliferation (e.g. video decoding). I think TPUs are an interesting first step, but the holy grail (which indeed may be impossible) is to be able to reduce large-scale programs to run on approximate circuitry. For example, if Chrome used a fraction of the memory, CPU, and power, that would be significant, even though no one would notice that the page renders slightly imperfectly if executed well. As the paper notes, TPUs are the first application, although I think what we're doing there is probably quite primitive compared to what the researchers in the area are working on long term.


"Fuzzy logic" was the "AI" hype buzzword of the 90s.


Fuzzy logic is not approximate, and definitely not high performance.


nVidia's DLSS works somewhat like that: the game engine generates a low-res/noisy/low framerate image, and the GPU refines it with an ML model.


What if all the text in your comment was approximately the correct set of letters in each word?



This line was worth reading!

> dig the life that both are scary and jamming lore whose biased top must be recoverable.


As long as the first and last are good you can kinda figure it out


Yis, wjat eef al teh tekst inn mie coment waz aproximetly tha corekt sett ove letres iyn eech worud?

For many things, like neural networks, it's probably good enough.


Approximate computing approximately means using 16 or 8 bit floats instead of standard 32 bit floats.
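To make the "fewer bits" point concrete, here's a minimal numpy sketch (assuming numpy is available; the numbers just show resolution, they're not a benchmark):

    import numpy as np

    # Machine epsilon and max value: float16 resolves only ~3 decimal digits
    # and tops out near 65504, which is fine for many ML weights/activations
    # but not for exact bookkeeping.
    for dt in (np.float32, np.float16):
        info = np.finfo(dt)
        print(dt.__name__, "eps:", info.eps, "max:", float(info.max))

    x = 0.1234567
    print(np.float32(x), np.float16(x))   # the float16 value is visibly rounded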


Analog computing?


Well, given the chips Apple have released and all the work it has done on efficiency I think this is the current challenge, not the next.


Happy to hear that AMD takes these issues seriously. The power draw of current Intel and Nvidia chips is seriously getting out of hand. There is absolutely no justification for drawing 40-50 watts of power for rendering a website!


They could score a quick win for power efficiency by making the Ryzen 9 5900 and Ryzen 9 3900 12-core 65W CPUs generally available instead of only selling them to OEMs.


If your production process and binning don't create enough supply for these parts, you only sell them to OEMs, because OEMs can silently discontinue or limit availability of these models when parts can't be found.

However, it'll be a PR disaster if the parts have spotty availability at retailers, and it will cause wild rumors to spread, from "AMD doesn't want you to have this CPU" to "AMD is going bankrupt".


Sure. But since AMD has been producing these CPUs for a while, their yield should be good by now, shouldn't it?

Anyway, I hear some people saying you can just buy the 105W "X" variant and limit the power usage to 65W in the BIOS. Does it really give the same result, also in idle power consumption?


I'm running my 5800x in 45w Eco mode. It does not improve idle power from what I can see. It only limits the max it can draw.


It's not a yield issue if your product is only a byproduct of another one.


I'd imagine all of the low power binned chiplets are going into EPYC and sold at multiples of desktop prices.


Low power with stable low frequency and low power with peaks to high frequency make for different bins. The former goes to servers and the latter to enthusiast laptops.


Or go even further, and make a Ryzen 9 5950 non-X, hardcoded to -30 PBO offset and 88W maximum PPT out of the box.


It's amazing how you can power limit the newest Ryzen to only 105 W and get big energy savings for a much smaller performance reduction.

But the review game (played by the same reviewers who complain about power usage) drives AMD/Intel to chase the last 5% of performance for a 20% increase in power.
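If you want to check the perf-per-watt tradeoff yourself rather than take a reviewer's word for it, one rough way on Linux is to read the RAPL package-energy counter around a fixed workload at each power limit. Minimal sketch; the powercap path below is an assumption (it varies by system and usually needs root), and counter wrap-around is ignored for brevity:

    import time

    RAPL = "/sys/class/powercap/intel-rapl:0/energy_uj"   # package domain (assumed path)

    def read_uj():
        with open(RAPL) as f:
            return int(f.read())

    def measure(fn):
        t0, e0 = time.time(), read_uj()
        fn()
        t1, e1 = time.time(), read_uj()
        joules = (e1 - e0) / 1e6          # microjoules -> joules
        return joules, joules / (t1 - t0)

    def workload():
        sum(i * i for i in range(10_000_000))

    j, w = measure(workload)
    print(f"{j:.1f} J total, {w:.1f} W average")

Run the same workload at stock and at the lower PPT/Eco setting and compare joules per run, not just wall-clock time.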



I don't think Intel or Nvidia got the memo on this. Both are pumping out seemingly more and more desperate products where their solution to more performance is just to throw more power at it, causing obscene levels of heat in the process.

Meanwhile, down at the mid-range you've got Apple chipping away at them with ultra-efficient ARM chips.


What? The 40-series cards are the most energy-efficient graphics cards ever made. Ignore the 4090; they just scaled the top end to have high power and high performance. Look at the lower tiers: they all have significant power-requirement reductions compared to the 30-series cards. Watts per fps is down.


When I worked on London Underground, we investigated regenerative braking and found it wasn't worth it for the power savings BUT it was worth it to reduce temperatures on platforms (LU applied a £ value to the comfort of passengers for modelling purposes).

Interesting to see the same fundamental issue at the nano scale...


The next challenge? You mean the one that started 15 years ago?


Right? Like, isn't that one of the reasons people have been raving about Apple's MX chips? Because they're very energy efficient and have good performance characteristics?


Looking forward to this. It seems lately you can only get < 35 W x86 CPUs through laptops, some very niche vendors, or even mystery sellers on AliExpress. I'd like something that idles at sub 5 W like ARM SBCs do, but without the whole OS or boot jank.

I'm pretty sure people will suggest x86 options that fit that description, but again it looks like you have to scrape the internet for stuff like the Intel 1[2,3][1-9]00T or whatever the AMD alternative is.


Anecdotally, I use a CPU from an AliExpress mystery seller in my desktop machine. It's an i9-9980HK with about a 40 W TDP...

Which I unlocked to ~300W and installed a water cooling system.

It can idle at around 5W, but a synthetic load (e. g. prime95) quickly makes it draw full 300W and then throttle a bit. Fun stuff, definitely wouldn't do it again due to enormous strain on the PSU and an unreasonably high power draw relative to performance.


There are many, many machines out there with soldered x86 CPUs, all of them idling at under 10 W, e.g. the ASUS PN series with Celeron CPUs. You can even find them in physical stores. I have a PN40 that idles at less than my RPi 4B...


Has the environmental impact of less efficient processing been studied? I've been thinking about this since seeing those stories a while back about bitcoin miners collectively having the same power consumption as some smaller countries.

If anyone knows a dataset suggesting what % of world energy usage is used by computing hardware (and what proportion of that hardware is idle vs fully utilised) I'd love to see it.

Also, less on topic, what's the environmental impact of writing your app in something programmer friendly but power inefficient? I strongly suspect some big tech companies will be suffering with this (namely anyone who is still on RoR at scale).
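For the back-of-envelope flavour of that last question, every number below is an assumption picked purely for illustration (overhead of an inefficient stack, fleet size, grid carbon intensity), not a dataset:

    extra_watts_per_server = 50        # assumed overhead of an inefficient stack
    servers                = 10_000    # assumed fleet size
    grid_kg_co2_per_kwh    = 0.4       # assumed grid carbon intensity

    kwh_per_year = extra_watts_per_server * servers * 24 * 365 / 1000
    tonnes_co2   = kwh_per_year * grid_kg_co2_per_kwh / 1000
    print(f"{kwh_per_year / 1e6:.1f} GWh/year, ~{tonnes_co2:,.0f} t CO2/year")

With these made-up inputs it comes out to roughly 4.4 GWh and ~1,750 tonnes of CO2 a year, which is why the aggregate cost of inefficient software can matter even when each individual server looks cheap.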


AMD CEO, Dr. Lisa Su, has identified energy efficiency as the next challenge for the company. With the growing demand for high-performance computing, it is crucial to ensure that energy consumption is minimized to reduce the environmental impact and operating costs. AMD has already made significant progress in this area with their latest processors that deliver excellent performance while consuming less power. Dr. Su's focus on energy efficiency underscores AMD's commitment to sustainability and innovation, and we can expect to see further developments in this area in the coming years.


The last time a chip company talked about "efficiency" in this way, we got 10+ years of 5% increments in performance and 50% price increases for 2- and 4-core CPUs.

I know it's my PTSD talking; the words make sense, but I really hope this is not code for "we can't get much more performance out of our architectures, so we'll focus on selling 'efficiency'..."

I really hope I'm wrong, since the power consumed by these chips has gotten a bit out of hand (not as much as the GPUs...).


ARM CPU manufacturers have been doing that for more than a decade :)


Yes, but real-world performance (aka compute speed) has been like desktop Linux: "it's here, but not really" (we are talking about general desktop, laptop, and server here, not mobile).

Having said that, Apple's M1 was the real one that changed things, also ARM on the server is starting to actually be a mainstream thing, so I get why AMD and Intel are sweating, I just hope they can pull it off because a World where you need to buy a +$1000 aluminum box attached to the CPU you want to get is not a better World.. or worse: Doing business with Qualcomm.


Desktop Linux has been better than Windows out of the box for more than a decade now. Unless you're talking about market share, which will never increase without a hundred million dollar marketing blitz behind it.


If only Intel could think about the same.

My ThinkPad X1 Nano Gen 2 with Alder Lake gets only 50% of the battery runtime that the X1 Nano with the 1160G7 managed, for pretty much the same performance. Power constrained to, say, 5 or 7 W, the 1160G7 feels faster.

I hope that Lenovo can offer something as light as the X1 Nano with AMD inside. Technically it should be both possible and feasible, given that the AMD CPUs are much more efficient/performant at low power levels.


So.... that means an Apple M1 killer is in the works? Anybody? That's the type of energy efficiency everyone can get onboard with.

Also, IMO software needs to take a lot of responsibility for energy efficiency; we just pawn that off on the hardware vendors. I wonder what the carbon cost of JavaScript is. I don't think I'd want to see the results, or Python's for that matter.


We can see in practice the energy-efficiency gains that come from ever-closer integration: from short-distance interconnects, to chiplets, to everything on one die. That's our mobile phones, down to the Apple M1/M2 laptops, whose motherboards pack everything into something close to the size of a phone board.


Will neuromorphic chips ever go mainstream among AI practitioners? The neurons on these chips are spiking, which is a whole different paradigm from what is currently used in neural networks. These chips are, however, a thousandfold more efficient.
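For anyone who hasn't seen what "spiking" means in practice, here's a toy leaky integrate-and-fire neuron, the standard textbook model (not tied to any particular neuromorphic chip); information is carried by the timing of discrete spikes rather than by continuous activations:

    # Toy leaky integrate-and-fire neuron: the membrane potential leaks toward rest,
    # integrates the input current, and emits a spike when it crosses a threshold.
    def lif(currents, dt=1.0, tau=20.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
        v, spikes = v_rest, []
        for t, i in enumerate(currents):
            v += dt / tau * (-(v - v_rest) + i)
            if v >= v_thresh:
                spikes.append(t)          # the output is spike times, not values
                v = v_reset
        return spikes

    # Constant drive produces a regular spike train; stronger drive -> higher rate.
    print(lif([1.5] * 100))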


For people criticising the AI talk: remember that they achieved great CPU performance improvements by adding AI to the lookaside buffers.

Looking forward to even more efficient Zens, my next laptop will definitely be an AMD


I've been using a Zen 2 notebook for the past few years, and was honestly surprised by the processor's performance for the first 3 or 4 months. Then the typical updates happened. And some quirks got in the way too, like it having a decent iGPU that I can't use because the BIOS will only let me pick one of them (down to it not having a mux to pick one or the other? Not sure what's going on there, just that I can't do it). In general it's a great machine, but there's a host of little details that make the experience a bit worse than it should be.

Per my usual upgrade schedule, I'm looking at at least a Zen 5(+?) upgrade in a few years' time, so I hope they improve these kinds of things in the future. However, that's entirely up to the OEMs deciding to make a good product, and a bit out of AMD's grasp.


If they could integrate everything into one chip, it would make for great gaming PCs. Quieter, smaller, performant, low power.


It seems the more power saving features I turn on, the more my computer crashes. No thanks!


“I will take tech futures for $800.”

“What is AMD’s next challenge?”

“Efficiency.”

“Sorry, the answer we were looking for was ‘less.’”


Just in time for micro nuclear reactors to be commercial I suppose


It is, time to convert all projects back to C !


Written by programmers who live in their parents' unheated/non-airconditioned basements?


The next challenge is software stack.


AMD should build an ARM chip. There's only so much efficiency you can build into the ageing x86_64 architecture.


The only reason ARM is competitive with x86 is due to heavy borrowing of its tricks (i.e. OOO execution). At the end of the day, Apple M2 and AMD Ryzen cores are not all that dissimilar.


But Ryzen has to support a bunch of legacy instructions, and M2 does not.

That’s a huge advantage


I don't think it's a huge advantage. x86 instructions are not executed directly on fixed-function hardware anymore. Everything is microcode.


Why don't they just make the chips bigger so there is more surface-area contact with the heat sink?


That might work in the narrow case of desktop.

In mobile (anything with a battery really), it obviously doesn't work because you care about battery life.

In server, it doesn't work, since two of the main costs and limiting factors in data centers are power and cooling.


And RISC-V ISA.


Way to skate to where the puck was AMD.


To reduce power usage by any significant amount, you need to get rid of the clock.


I have never seen a realistic proposal for how this would make a practical consumer PC work better.

Having a clock that every component can agree upon means that components don't have to worry about each other anymore. Physics and information theory would suggest that removal of this centralized clock signal necessarily introduces additional latency in order to safely determine or modify system state.


Instructions unclear, threw alarm clock in dumpster.


We have known this for a long time. This is one of the benefits of ARM64 CPUs, as they require much less energy. What's new about his point?


They definitely don't require "much" less energy. Zen 4 is very close to the M series in perf per watt


Yes, the ARM ISA requires less power. Why do you think we don't have mobile chips with Intel x86 instructions...


Typical for a company that is stuck: calling out what they should have been doing for the last decade as the thing to focus on. Basically he's promising more of the same.

What he should be doing is announcing the next big thing. Which I imagine might include trendy things like tackling AI with some hardware/software stack that is energy- and cost-efficient and competitive with Nvidia. Or a non-Intel-architecture chip intended for high-end gaming/AR/VR type devices, where energy efficiency and performance are going to matter more than compatibility with legacy PC hardware. AR is going to suck if you have to be tethered to a huge power supply or battery and carry a liquid-cooling apparatus with you. This requires a different approach.

And even those things really should have been the focus for the last ten years. A slightly faster version of the thing they've been selling for the last ten years is not going to turn things around, and there are only so many people still assembling PCs from parts who actually know and appreciate AMD as a brand.


1. She. AMD's CEO is Lisa Su.

2. This is a keynote at a supercomputing conference. The audience are people who do supercomputing. She's not going talk about consumer stuff.

3. She did announce new, integrated products that are in at least some niches better than what nVidia has.

4. AMD brand is flying very high on the server and supercomputing side right now. They have beat Intel for 3 generations straight now.

In general, you should read the article instead of commenting based on just headlines. Doing this just makes you look really stupid.


I think it’s a she.



