Wow does much better than Geekbench's prior top processor (https://browser.geekb...

alpha64 · on Oct 20, 2021

You sorted by single core performance, then compared multi core performance. Sort by multi core performance, and you will see that the i9-11900K is nowhere near the top spot.

For example, the Ryzen 9 5950X has single/multi core scores of 1,688/16,645 - which is higher in multi core score than the M1 Max, but lower in the single core.
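To make the sorting point concrete, here is a minimal sketch that ranks two chips by each column. The 5950X figures are the ones quoted above; the M1 Max figures (1783/12693) are the ones quoted elsewhere in this thread. The `scores` table and the `ranked` helper are illustrative names, not anything from Geekbench.

```python
# Geekbench 5 (single-core, multi-core) scores as quoted in this thread.
scores = {
    "Ryzen 9 5950X": (1688, 16645),
    "M1 Max": (1783, 12693),
}

def ranked(by):
    """Return chip names ordered best-first by 'single' or 'multi' score."""
    index = {"single": 0, "multi": 1}[by]
    return sorted(scores, key=lambda name: scores[name][index], reverse=True)

print(ranked("single"))  # M1 Max comes first on single-core
print(ranked("multi"))   # 5950X comes first on multi-core
```

The same two rows swap places depending on the sort key, which is why the column you sort by decides which chip looks like the "top" one.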

GeekyBear · on Oct 20, 2021

Interestingly, the iPhone's A15 SOC did get a newer version of Apple's big core this year.

>On an adjacent note, with a score of 7.28 in the integer suite, Apple’s A15 P-core is on equal footing with AMD’s Zen3-based Ryzen 5950X with a score of 7.29, and ahead of M1 with a score of 6.66.

https://www.anandtech.com/show/16983/the-apple-a15-soc-perfo...

On floating point, it's slightly ahead. 10.15 for the A15 vs. 9.79 for the 5950X.

lostmsu · on Oct 20, 2021

Which is still not that much higher. Of the "consumer" CPUs only 5900X and 5950X score higher. And their stress power draw is about 2X of speculated M1 Max's.

tedunangst · on Oct 20, 2021

That's maybe not a bad way to sort? Most of the time I'm interacting with a computer I'm waiting for some single thread to respond, so I want to maximize that, then look over a column to see if it will be adequate for bulk compute tasks as well.

28933663 · on Oct 20, 2021

Perhaps they were referencing the highest 8C chip. Certainly, a 5950X is faster, but it also has double the number of cores (counting only performance on the M1; I don't know if the 2 efficiency cores do anything on the multi-core benchmark). Not to mention the power consumption differences - one is in a laptop and the other is a desktop CPU.

Looking at a 1783/12693 on an 8-core CPU shows about a 10% scaling penalty from 1 to 8 cores - suppose a 32-core M1 came out for the Mac Pro that could scale only at 50% per core, that would still score over 28000, compared to the real-world top scorer, the 64-core 3990X scoring 25271.
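The arithmetic behind that estimate can be sketched as follows. The 32-core part and its 50% per-core scaling are the hypothetical from the paragraph above, not a real product; the variable names are made up for illustration.

```python
# Scores quoted above for the 8 performance cores of the M1 Max.
single, multi, cores = 1783, 12693, 8

# Fraction of ideal linear scaling actually achieved going from 1 to 8 cores.
efficiency = multi / (single * cores)
penalty = 1 - efficiency

# Hypothetical 32-core chip, assumed to keep only 50% of per-core performance.
hypothetical = 32 * single * 0.5

print(f"scaling efficiency: {efficiency:.1%}, penalty: {penalty:.1%}")
print(f"hypothetical 32-core score: {hypothetical:.0f}")
print(f"beats the 3990X score of 25271: {hypothetical > 25271}")
```

This gives a penalty of roughly 11% and a hypothetical score of 28,528, which matches the "over 28000" figure.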

eMSF · on Oct 20, 2021

M1 Max has 10 cores.

andy_ppp · on Oct 20, 2021

But the two efficiency cores are less than half a main core though, right?

hajile · on Oct 20, 2021

1/3 the performance, but 1/10 the power. Not adding more was a mistake IMO. Maybe next time...

andy_ppp · on Oct 20, 2021

Really? I mean if it gets me 10-14h coding on a single charge that’s awesome…

hajile · on Oct 20, 2021

The A15 efficiency cores will be in the next model. They are A76-level performance (flagship-level for Android from 2019-2020), but use only a tiny bit more power than the current efficiency cores.

At that point, their E-cores will have something like 80% the performance of a Zen 1 core. Zen 1 might not be the new hotness, but lots of people are perfectly fine with their Threadripper 1950X which Apple could almost match with 16 E-cores and only around 8 watts of peak power.

I suspect we'll see Apple joining ARM in three-tiered CPUs shortly. Adding a couple in-order cores just for tiny system processes that wake periodically, but don't actually do much just makes a ton of sense.

thenthenthen · on Oct 20, 2021

Still 8 more than my desktop pc :p

mrtksn · on Oct 20, 2021

The single core is second to Intel's best but the multicore is well below in the scale, comparable to Intel Xeon W-2191B or Intel Core i9-10920X, which are 18 and 12 core beasts with TDP of up to 165W.

Which means, at least for Geekbench, Apple M1 Max has a power comparable to a very powerful desktop workstation. But if you need the absolute best of the best on multicore you can get double the performance with AMD Ryzen Threadripper 3990X at 280W TDP!

Can you imagine if Apple released some beast with similar TDP? 300W Apple M1 Unleashed, the trashcan design re-imagined, with 10X the power of M1 Max if it can preserve similar performance per watt. That would be 5X over the best of the best.

If Apple made an iMac Pro with similar TDP to the Intel one, and kept the performance per watt, that would mean a multicore score of about 60K, which is twice that of the best processor there is in the X86 world.

I suspect these scores don't tell the full story. The Apple SoC has specialised units for processing certain kinds of data, with direct access to the data in memory, so it could be unmatched by anything on those workloads, yet comically slow for other kinds of processing where X86 shines.

sudhirj · on Oct 20, 2021

John Siracusa had a diagram linked here that shows the die for M1 Max, and says the ultimate desktop version is basically 4 M1 Max packages. If true, that’s a 40 core CPU 128 core GPU beast, and then we can compare to the desktop 280W Ryzens.

pocketarc · on Oct 20, 2021

Interestingly, the M1 Max is only a 10 core (of which only 8 are high performance). I wonder what it will look like when it’s a 20-core, or even a 64-core like the Threadripper. Imagine a 64-core M1 on an iMac or Mac Pro.

We’re in for some fun times.

spacedcowboy · on Oct 20, 2021

John Siracusa - no the chart isn't real, but maybe qualify that with "yet"...

https://twitter.com/siracusa/status/1450202454067400711

kzrdude · on Oct 20, 2021

Hm, related to that reply https://twitter.com/lukeburrage/status/1450216654202343425

Is this a yield trick, that one is the "chopped" part of another? So they'll bin failed M1Max ones as M1Pro, if possible?

GeekyBear · on Oct 20, 2021

Bloomberg's Gurman certainly has shown that he has reliable sources inside Apple over the years.

>Codenamed Jade 2C-Die and Jade 4C-Die, a redesigned Mac Pro is planned to come in 20 or 40 computing core variations, made up of 16 high-performance or 32 high-performance cores and four or eight high-efficiency cores. The chips would also include either 64 core or 128 core options for graphics.

https://www.macrumors.com/2021/05/18/bloomberg-mac-pro-32-hi...

So right in line with the notion of the Mac Pro getting an SOC that has the resources of either 2 or 4 M1 Pros glued together.

concinds · on Oct 20, 2021

I wish there were laptop-specific Geekbench rankings because right now it seems impossible to easily compare devices in the same class

wmf · on Oct 20, 2021

The M1 Pro/Max are effectively H-class chips so you can search for 11800H, 11950H, 5800H, 5900HX, etc.

zsmi · on Oct 20, 2021

Your comment got me wondering if there was actually a method to Intel's naming madness, and it turns out there is!

https://www.intel.com/content/www/us/en/processors/processor...

11800H = Core i7-11800H -> family=i7 generation=11 sku=800 H=optimized for mobile

11950H = Core i9-11950H -> family=i9 generation=11 sku=950 H=optimized for mobile

I didn't look up the AMD names.

So, now that I know the names, why not use Core i9-11980HK?

family=i9 generation=11 sku=980 HK=high performance optimized for mobile

It seems like it exists https://www.techspot.com/review/2289-intel-core-i9-11980hk/

P.S. General rant: WTF Intel. I'm really glad there is a decoder ring but does it really have to be that hard? Is there really a need for 14 suffixes? For example, option T, power-optimized lifestyle. Is it really different from option U, mobile power efficient?
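The decoding scheme described above can be sketched as a small parser. This is a minimal sketch that only handles names shaped like the ones in this comment (family, then generation plus a three-digit SKU, then a letter suffix); the `decode` function, the regular expression, and the partial `SUFFIXES` table are assumptions for illustration, not Intel's full scheme.

```python
import re

# Suffix meanings as quoted in the comment above; a partial table,
# not Intel's full list of suffixes.
SUFFIXES = {
    "H": "optimized for mobile",
    "HK": "high performance optimized for mobile",
    "T": "power-optimized lifestyle",
    "U": "mobile power efficient",
}

# Family, then 4-5 digits (generation + 3-digit SKU), then an optional letter suffix.
PATTERN = re.compile(r"^(?:Core )?(i[3579])-(\d{4,5})([A-Z]*)$")

def decode(name):
    """Split a model name such as 'Core i9-11980HK' into its parts."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    family, digits, suffix = match.groups()
    return {
        "family": family,
        "generation": int(digits[:-3]),  # everything before the last 3 digits
        "sku": digits[-3:],              # last 3 digits
        "suffix": suffix,
        "meaning": SUFFIXES.get(suffix, "unknown"),
    }

print(decode("Core i7-11800H"))
print(decode("Core i9-11980HK"))
```

Names that don't fit this shape raise `ValueError` rather than being guessed at.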

erk__ · on Oct 20, 2021

I really wonder how a single z/Architecture core would fare on this benchmark, though I imagine it's never been ported

LASR · on Oct 20, 2021

Probably not as good as you might expect. Z machines are built for enterprise features like RAS, and performance on specific workloads.

The ultra-high-clocked IBM cpus are probably significantly faster at DB loads, and less than the best at more general benchmarks like Geekbench.

systemvoltage · on Oct 20, 2021

Per core performance is the most interesting metric.

Edit: for relative comparison between CPUs, per core metric is the most interesting unless you also account for heat, price and many other factors. Comparing a 56-core CPU with 10-core M1 is a meaningless comparison.

PaulDavisThe1st · on Oct 20, 2021

Not when building large software projects.

ur-whale · on Oct 20, 2021

> Not when building large software projects.

Or run heavy renders of complex ray-traced scenes.

Or do heavy 3D reconstruction from 2D images.

Or run Monte-Carlo simulations to compute complex likelihoods on parametric trading models.

Or train ML models.

The list of things you can do with a computer with many, many cores is long, and some of these (or parts thereof) are sometimes rather annoying to map to a GPU.

Someone · on Oct 20, 2021

It seems Apple thinks it _can_ map the essential ones to the GPU, though. If they didn’t, there would be more CPUs and less powerful other hardware.

‘Rather annoying’ certainly doesn’t have to be a problem. Apple can afford to pay engineers lots of money to write libraries that do that for you.

The only problem I see is that Apple might (and likely will) disagree with some of their potential customers about what functionality is essential.

eyelidlessness · on Oct 21, 2021

Related content: the round Mac Pro

josephg · on Oct 21, 2021

> Not when building large software projects.

While working in rust I am most limited by single core performance. Incremental builds at the moment are like, 300ms compiling and 2 seconds linking. In release mode linking takes 10+ seconds with LTO turned on. The linker is entirely single threaded.

Fast cold compiles are nice, but I do that far more rarely than incremental debug builds. And there’s faster linkers (like mold[1] or lld) but lld doesn’t support macos properly and mold doesn’t support macos at all.

I’m pretty sure tsc and most javascript bundlers are also single threaded.

I wish software people cared anywhere near as much about performance as hardware engineers do. Until then, single core performance numbers will continue to matter for me.

[1] https://github.com/rui314/mold

PaulDavisThe1st · on Oct 21, 2021

My project [0] is about 600k lines of C++. It takes about 5m40s to build from scratch on a Ryzen Threadripper 2950X, using all 16 cores more or less maxed out. There's no option in C++ for meaningful incremental compiles. Typically working compiles (i.e. just what is needed given whatever I've just done) are on the order of 5-45 secs, but I've noticed that knowing I can do a full rebuild in a few minutes affects my development decisions in a very positive way. I do 99% of my development work on Linux, even though the program is cross-platform, and so I get to benefit from lld(1).

The same machine does nychthemeral builds that include macOS compiles on a QEMU VM, but given that I'm asleep when that happens, I only care that the night's work is done before I get up.

[0] https://ardour.org/

gchokov · on Oct 20, 2021

Like… everyone builds large projects all the time?

akmarinov · on Oct 20, 2021

If you don’t then you don’t really need the top end do you?

ajuc · on Oct 20, 2021

Most people who buy fast cars don't need them and it's the same with computers.

YetAnotherNick · on Oct 20, 2021

By that logic you could build an array of mac mini if you don't care about price/heat.

semicolon_storm · on Oct 20, 2021

What compiler could even make use of 10 cores? Most build processes I've run can't even fully utilize the 4 cores.

PaulDavisThe1st · on Oct 20, 2021

Compilers typically don't use multiple cores, but the build system that invokes them does, by invoking them in parallel. Modern build systems will typically invoke commands for 1 target per core, which means that on my system for example, building my software uses all 16 cores more or less until the final steps of the process.

The speed record for building my software is held by a system with over 1k cores (a couple of seconds, compared to multiple minutes on a mid-size Threadripper).
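The "one single-threaded compiler per core" idea can be sketched roughly like this. It is a toy illustration only: `compile_unit` is a hypothetical stand-in that just computes an object file name instead of running a compiler, and a real build system would also track dependencies between targets.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def compile_unit(source):
    """Stand-in for invoking a single-threaded compiler on one file (hypothetical)."""
    # A real build system would launch the compiler as a subprocess here.
    return source.rsplit(".", 1)[0] + ".o"

def build(sources, jobs=None):
    """Compile every source with up to `jobs` compiler invocations in flight."""
    jobs = jobs or os.cpu_count() or 1  # default: one job per core
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() preserves input order even though the work runs concurrently.
        objects = list(pool.map(compile_unit, sources))
    return objects  # a final (often serial) link step would consume these

print(build(["main.cpp", "audio.cpp", "ui.cpp"], jobs=3))
```

Threads are enough for the scheduling side here because the heavy work would happen in separate compiler processes; the last link step is the part that typically can't be spread across cores this way.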

spacedcowboy · on Oct 20, 2021

I can stress out a 32+32 core 2990WX with 'make -j' on some of my projects, htop essentially has every core pegged.

ukd1 · on Oct 20, 2021

Just running the tests in our Rails project (11k of them) can stress out a ton; we're regularly running it on 80+ cores to keep our test completion time ~3 minutes. M1 Max should let me run all tests locally much faster than I can today.

andy_ppp · on Oct 20, 2021

Wow, what is the system doing to have 11000 tests?

postalrat · on Oct 21, 2021

  add(1, 1) = 2
  add(1, 2) = 3
  add(1, 3) = 4
  add(1, 4) = 5
  ...

pornel · on Oct 20, 2021

Rust (Cargo) does, and always wants more.

jlmorton · on Oct 20, 2021

Often a single compiler won't make use of more than a core, but it's generally easy to build independent modules in parallel.

For example, make -j 10, or mvn -T 10.

destitude · on Oct 20, 2021

Xcode has no issues taking advantage of all cores.

throwawaywindev · on Oct 20, 2021

C++ compilers probably will.

5faulker · on Oct 20, 2021

Just when you think things hit the top, another kid's out of the town.

DeathArrow · on Oct 20, 2021

What about this? https://browser.geekbench.com/v5/cpu/7421821

andy_ppp · on Oct 20, 2021

That’s a strangely shaped laptop, what is the battery like on it?

mcphage · on Oct 20, 2021

It's actually compatible with a tremendous range of third party external batteries like so: https://www.amazon.com/dp/B004918MO2

And forget about fast charging—you can charge this battery up from 0% to 100% in less than a minute just by pouring some gasoline in the thing!

It's the very pinnacle of portability!