Apple Silicon M1 Die-Shots (techinsights.com)
185 points by stefanhoelzl on Dec 26, 2020 | 107 comments



I work at TechInsights.

Depending on what the community is looking to know, there may be some things we can share publicly. I’m in the software group and not intimately involved with the RE parts, but I will act as an ambassador and see if I can get the right people involved.

Throw some comments on this thread and we’ll see what we can do to help. Might be next week due to holidays.


I am actually interested in the non-labeled parts, which no one on the Internet has even guessed at: the image signal processor (ISP), video decoder and encoder, an FPGA (there should be one somewhere), the Secure Enclave, the NAND controller, the PCI-Express / USB3 / Thunderbolt controllers, etc.

Mostly because all the other parts are already well known.


I wonder what they'll do for the "M1X" and pro machines?

Seems like heat isn't an issue, so can they increase the clock speed of the M1?

And will they create a larger die with 2xGPU and 2xCPUs, or will they "stitch" multiple M1 dies together on one package to have a higher yield and reduce supply chain concerns?

An M1X with 2xGPU will bring it to Radeon 5500M level, which is what I have in my current 16", but still not quite to Nvidia 1080 level.

So they'll definitely need a more powerful GPU for the pro users. I wonder if they can make a 4xGPU, or if the yield will become poor as the die size increases. Maybe they'll have to do a discrete GPU, but that doesn't seem to match their current trajectory.

And then there have been "leaks" (rumors, really) about a 32-core M1-family CPU, but I imagine the yield of such a beast would be horrible, with all the other SoC components on the same die. Especially if they put a 32-core GPU on the same die.


(microelectronics graduate in 2001, turned software engineer, I have worked for ARM and Imagination Technologies on simulations of CPUs, GPUs, networking products and other bits of SoCs)

Yield wouldn't be that bad - ARM CPUs are small in comparison to their x86 cousins. I would expect that you would get to 32 CPUs the same way everyone else does - build a 32-core design, blow enable fuses on bad cores, then badge as appropriate.
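(For intuition, here's a rough sketch of the standard Poisson defect-yield model in Swift; the defect density and area numbers are made up for illustration, not real process data. It shows why fusing off bad cores keeps effective yield high even on a big die.)

    import Foundation

    // Simple Poisson yield model: Y = exp(-A * D),
    // with A = area in cm^2 and D = defects per cm^2. Numbers are assumptions.
    let defectDensity = 0.1                 // defects per cm^2 (assumed)
    let coreArea      = 0.03                // cm^2 per CPU core (assumed)
    let fullDieArea   = 3.5                 // cm^2, hypothetical 32-core SoC

    func dieYield(_ area: Double, _ d: Double) -> Double { exp(-area * d) }

    // Naive view: the whole die must be defect-free.
    print("perfect-die yield:", dieYield(fullDieArea, defectDensity))     // ~0.70

    // Binning view: a defect in a core only disables that core.
    // Probability that at least 28 of 32 cores are good (binomial tail).
    let pGood = dieYield(coreArea, defectDensity)
    func atLeast(_ k: Int, of n: Int, p: Double) -> Double {
        (k...n).reduce(0.0) { sum, i in
            let comb = (0..<i).reduce(1.0) { $0 * Double(n - $1) / Double($1 + 1) }
            return sum + comb * pow(p, Double(i)) * pow(1 - p, Double(n - i))
        }
    }
    print("chip with >= 28 good cores:", atLeast(28, of: 32, p: pGood))   // ~1.0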

To get a desktop-class GPU they will need to address memory bandwidth. I am assuming they are using a tile-based rendering process, so memory bandwidth is less of an issue in comparison to scan-line rendering (do desktop GPUs still do that?). I would assume that they are enjoying the benefits of the system-level cache, with the CPU and GPU sharing it. I would expect there to be some careful benchmarking of increasing GPU cores, memory speed, memory size and system cache size going on at Apple.

There isn't anything stopping Apple from supporting external GPUs, but it would require a new SoC with more PCIe lanes. External buses are much more power hungry and take up space on the die. I don't have a mental model of how much space plonking a PCIe x16 interconnect would cost in terms of area or power (taking into account you need to get that data across the chip too), but my gut reaction is that it isn't likely unless there is a customer use case they can't solve on chip.


>Seems like heat isn't an issue, so can they increase the clock-speed of the M1?

They are running into their current node's limits, so don't expect 4GHz+ unless they decide to remake the whole thing on a high-performance node. (Which means it will cost a lot more for a relatively niche volume.)

>I wonder if they can make a 4xGPU

A GPU increase should be a given, and die size / yield would not be a problem. The question is how they will handle the additional memory bandwidth requirement. As you can see in the die shots, the DDR I/O already takes up much of the edge space. Going with HBM2? Or LPDDR5?
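(For a rough sense of those options, a back-of-the-envelope Swift comparison; the transfer rates are public peak figures, and the bus widths are assumptions chosen to make the comparison concrete.)

    import Foundation

    // Peak theoretical bandwidth = transfer rate (MT/s) x bus width (bytes).
    let options: [(name: String, mtps: Double, busBits: Double)] = [
        ("LPDDR4X-4266, 128-bit (M1-like)",     4266, 128),
        ("LPDDR5-6400, 128-bit",                6400, 128),
        ("LPDDR5-6400, 256-bit",                6400, 256),
        ("HBM2E, one 1024-bit stack @ 3.2Gbps", 3200, 1024),
    ]
    for o in options {
        let gbPerSec = o.mtps * 1e6 * (o.busBits / 8) / 1e9
        print(o.name, String(format: "~%.0f GB/s", gbPerSec))
    }
    // Prints roughly 68, 102, 205 and 410 GB/s respectively.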

>32xcore M1-family CPU, but I imagine the yield of such a beast would be horrible,

A 32-core CPU / GPU M1 would only be a 4x increase over its current form. If you exclude the I/O, the CPU + GPU + cache is only a little more than 50% of the die, or ~60mm2. 4x that is only ~240mm2. The final product would be sub-350mm2 if you include everything else, which is roughly in line with the current estimate of the AWS Graviton2 die size (~400mm2).
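(The arithmetic in that estimate, spelled out as a quick Swift sketch; the ~120mm2 M1 die size and the 50% compute share are the estimates used above, not measured values.)

    // Back-of-the-envelope scaling of the numbers above.
    let m1DieArea       = 120.0  // mm^2, approximate M1 die size
    let computeFraction = 0.5    // CPU + GPU + cache share of the die (estimate)
    let computeArea     = m1DieArea * computeFraction        // ~60 mm^2
    let scaledCompute   = computeArea * 4                     // 4x the compute blocks
    let fixedOverhead   = m1DieArea * (1 - computeFraction)   // I/O, glue, etc.
    print(scaledCompute + fixedOverhead)  // 300.0 mm^2, inside the sub-350mm^2 guess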

But yes, the M1 (A14X) isn't really that interesting. The most exciting part is how they will go about doing a high-performance chip, especially for the iMac and Mac Pro.


Well, the prevailing rumor currently is a processor with 8 high-performance cores (as opposed to the 4 on the M1). I don't know if the number of high-efficiency cores would also be doubled so that it's like 2x the CPU when compared to the M1.


Yes, that's the M1X rumor, but that isn't going to be enough to compete with high-end CPUs in iMac and Mac Pros. For those, we have the 32-core rumor (but that is a massive die).

And doubling the M1's GPU will not be enough to even surpass current MacBook Pro offerings. So they'll also need to do something more on that front.


For the truly high end they will need an external GPU. Removing the integrated GPU should free up enough space for lots more Firestorm cores.

And given that Firestorm is the fastest available CPU core, they probably don’t need 32 of them to match the highest multicore performance levels of existing AMD and Intel CPUs.


I'd like to see a pro machine with daughterboard slots, allowing you to add extra CPUs. A bit like the Acorn RISC PC. I guess Apple have already done this to some extent with the FPGA cards that you can buy for the current Mac Pro. I do wonder if that was a taste of things to come?


The TechInsights page returns an error.

Archive.org seems to work:

https://web.archive.org/web/20201227010504/https://www.techi...


How does one identify which regions of a die shot are the CPU or GPU? Are the structures on silicon easily recognizable from such pictures?

How far can someone with time and money go towards reverse engineering a chip from such pictures?


In short: educated guesswork. The lighter blocky regions in the die shot are recognizable as memory; a lot of the rest is based on recognizing duplicated or mirrored elements (like the two CPU1 cores) and making guesses based on their number, relative size, and what they're close to.

Identifying the GPU, for example, is based on recognizing that there are eight repeated elements within it, and matching that with the known fact that there are eight GPU cores.


This [1] is a talk by John McMaster about "Low-Cost IC Emission Reverse Engineering" using, IIRC, commodity cameras.

[1] https://www.youtube.com/watch?v=iAuLXg2fVz0


For very old chips reverse engineering is relatively easy; here's one example: http://www.righto.com/2020/04/inside-am2901-amds-1970s-bit-s... For modern chips you need an electron microscope and there are a lot of metal layers to deal with.


You don't need an electron microscope to discern large-scale repeated structures like CPU and GPU cores, or to tell the difference between random logic and highly regular memory arrays (usually cache, but sometimes you can see microcode or register files if you look at a small enough area). An electron microscope is only useful if you want to fully reverse-engineer the entire circuit schematic, but that's overkill for basically any purpose except manufacturing your own clone.


So none of the underlying layers matter much for the large and intermediate scale structures?


With the right camera modifications, you don't have to remove any layers from a chip to discern the core count: https://www.flickr.com/photos/130561288@N04/48258310027/ That's an IR photograph taken with a Sony APS-C camera that has had its IR-blocking filter removed. That's enough to make out the cores and L3 cache slices. (The rest of Fritzchens Fritz's Flickr uploads are worth checking out too, especially the video showing how he preps a die for visible-light photography.)


That’s cool! How would the major companies reverse engineer these chips?


TechInsights employee here.

The chip design and fabrication firms have their own internal RE divisions, mostly to QA their own chips. They do competitive RE, but it's often less expensive for them to get a subscription to TechInsights. Economies of scale and all that.

In terms of how it's done, it's very proprietary. I can tell you it involves clean rooms, ion beams, electron microscopes, chemistry, and bleeding-edge software.

If you want to come work for us (it's enjoyable, you should :) ), check out our careers page. We are hiring like crazy for many positions.

https://www.techinsights.com/about-techinsights/overview/car...


Not to underestimate the engineering effort, but couldn't the real barrier be higher up the chain (software APIs like Apple's OS) and lower down (the equipment to produce the chip)? Is the chip IP really that important?


In general, the other comments give an idea of how to do it. But Apple shared their own die shots in their presentation ;)


And they were actually accurate which is unusual for this sort of thing.


I assume you could at least make an educated guess based on the layout and ratios of the area of the caches (assuming you know their topology and sizes) and therefore their proximity to non-cache stuff.


> How does one identify which regions of a die-shot is the CPU or GPU? Are the structures on silicon easily recognizable from such pictures?

Infrared camera


Interesting to see the CPU isn't actually that big a part of the chip. What are the unannotated parts? Maybe peripheral interfaces?


It may be interesting to look at a more completely labeled die image (eg. a recent AMD laptop chip [1]).

I/O controllers, various PHYs, the display pipeline and the media engine account for a significant fraction of chip.

We also know the M1 has an integrated SSD controller, ISP and Thunderbolt which probably take a good chunk of room too.

[1] https://potatopc.net/amd-renoir-die-annotation-raises-hopes-...


> We also know the M1 has an integrated SSD controller, ISP and Thunderbolt which probably take a good chunk of room too.

SSD controller and even Thunderbolt. That's the most integrated SoC I've heard of.


Cache, GPU, Secure Enclave, Neural processors, peripheral controllers. GPU takes the lion’s share.


The Cache, GPU, and NPU are all already labeled in the image.

The Secure Enclave is a good one to point out that is in the unlabeled space.

Certainly some non-DDR peripherals are in the unlabeled space, but it would be neat to have a more comprehensively labeled diagram. I wouldn't expect most peripherals to take up so many transistors compared to the processor cores.


There are probably 4 lanes of PCIe to connect the NVMe SSD, a couple more lanes to run the Thunderbolt controllers, 2-3 USB controllers plus a hub to connect the keyboard and touchpad, an audio DAC, and some thermal and power-usage monitoring. They might even put the Ethernet controller that shows up on the Mac mini in there.


Sorry, was looking at the marketing image. The enclave would be quite tiny, though. I'd say most of the undocumented part is custom glue logic which isn't particularly identifiable.


What's all the stuff that isn't annotated with a yellow rectangle?


This is an SoC, not a CPU. The majority of the non-annotated area is glue logic, I/O, etc.


That's a lot of glue! I would say ~50%! What is the system cache? Never mind, see the question below: https://news.ycombinator.com/item?id=25547796


It has a lot of roles. You have memory controllers and I/O controllers. Not sure if the SSD controller is also part of it. Then there's the dedicated video encode/decode, the image processing, the neural engine. There are a lot of features.


It lists the NVMe drive as being attached to "Apple Fabric", so that sounds like an internal bus of some description...


Can someone explain to me, what the term "bionic" means in this context? How is the chip derived from nature / natural systems?


Pure speculation: the word bionic means using an artificial body part. The A14 Bionic includes a new Neural Engine for Artificial Intelligence.

EDIT: the name Bionic has been used since the A11 when the Neural Engine was introduced. Fusion introduced big.LITTLE.


Could a genetic algorithm have been partially used in the chip design also?

For example, in the layout to the right of the DDR on the M1, it looks like some significant optimization was done, but only in certain areas.


Does anyone know if it's possible (or has been tried) to do RAM upgrades with higher-capacity chips by replacing the DRAM, like one does in the newer chip-stacked iPhones?


What is a "system cache" ? Is that an L3 cache?


The system-level cache isn't an L3 cache specific to the CPU only, it's common across all the blocks in the SoC.

It is used by the GPU, neural engines,... in addition to the CPU itself.


I know nothing about chips, but I guess it's probably the last-level CPU cache that the GPU shares as well, similar to Intel iGPUs, so they decided to call it a "system cache".


It's L3, but I think it's called 'system cache' because it is also used for the bus that loads memory from RAM into the GPU.


I would really like to know where the L1 caches are for proportions!


The L1 caches are part of each CPU itself.


Sure, but which parts are the L1... I can see a bunch of patterns similar to the L2/L3 caches, but I would like someone who knows to draw a border around them!


Can we step back and marvel at the absurdity that one of the most powerful CPUs ever created was for a phone?


From a device class where the newly all-important perf per watt was always the north star, in a market with the largest chip volume, by a company with ample margins and mountains of cash? Does not seem all that absurd :)


Yeah, it's the world's largest company (by market capitalisation), whose main product is its flagship phone and whose main differentiator is the CPU (almost everything else is beaten by or available in competing products, like the camera, number of apps, etc.). The moment CPU performance falls below Android devices, people will look at iPhones differently. iOS is a differentiator but doesn't have the X factor to justify the iPhone's premium tag.


CPU speeds were similar to competitors' until around the A10 era a few years ago. The real differentiator has always been software and hardware integration happening inside the same company. The hardware team gets the insight of the software team on what areas of performance should be looked at, and the software team can optimize for the exact specifications of a future CPU line.


I thought Apple crushed other chipsets in single-core and managed to stay ahead overall.

https://www.phonearena.com/news/apple-a14-vs-a13-vs-snapdrag....


You're saying the iPhone is mostly popular because the A13/A14 are faster than the competition.

The grandfather points out that iPhones up to the A10 had performance comparable to the competition and were still popular, so having faster CPUs can't be the major reason for that popularity.


I suspect for the majority of buyers the differentiator is the logo, then the OS, then maybe the CPU.


The logo alone wears off after a few years.


I always assumed that they would be inferior in terms of absolute performance if you were going for performance per watt. I never imagined just how efficient the silicon could be.


But these devices aren’t ‘phones’ any more are they? That’s disingenuous.

They’re hugely powerful general purpose computing devices - many people’s primary computing device if you go outside the tech bubble. They do compute intensive tasks like photo manipulation, on-line video editing, grading, and compression, they do face detection, depth estimation, etc etc.

I don’t know why people still say ‘wow all this for a telephone’ like all it does is make analog phone calls.


It's a shame we can't 'dock' them to a larger monitor to do desktop work yet. There have been attempts by a few players, but if Apple can do that it would be another revolution.


I was pretty excited about Linux on DeX for Samsung devices since it had a capable CPU but unfortunately they discontinued it when Android 10 was released.


The standard dex is also quite capable. You can install terminals which have all the standard utils. I was even able to run vs code.


Because it fits in your hand and runs on a battery and is called a phone.

iPhone

As an aside, I was wondering what the iPhone 12 does that can even begin to tax all this horsepower?


Right - 'one of the most powerful CPUs ever created was for a phone' sounds funny, so to get your head around it think instead 'one of the most powerful CPUs ever created was for a miniaturised 4k on-line editing suite and live broadcast system' and it doesn't sound so absurd anymore does it?


I assume there are some games that can utilize the power.


It's an ultra-portable computer that, for most people, will spend only a minority of its time being used as a phone.


It's not absurd at all, nor surprising that the most power-efficient chip came out of the most portable general computing device. Plugged-in desktop computers certainly weren't going to bring us here.

This just sounds like "DAE think phone=bad XD" anti-intellectualism.


I think people are misconstruing what I am saying here. I said marvel at it. It's crazy that a handheld battery powered device has one of the greatest processors in the history of human existence in it.


If you were used to using mainframes, you could have said much the same thing about desktop CPUs in the 90s. "That tiny thing doesn't even fill up a room!"


It's only a die shot; it's disappointing that there's no commentary or explanation. But the website advertises:

> A DFR report will be available to TechInsights subscribers on Apple's M1 disruptive chip architecture in January 2021.

So I assume the actual write-up is still a work-in-progress. Perhaps we just need to be patient and come back a month later.


The "reports" are targeted at competitors who want to know how and what Apple did. They are quite expensive for a consumer or casual user.

They are more than just a document. A DFR includes in-depth analysis, hundreds of supporting images, and usually access to the Circuit Vision for it. CV is a viewer that allows drilling into the chip, down to viewing individual transistors along with the chip schematics. Think map tiles, but for a chip.

Disclosure: I work at TechInsights.


The actual report will be thousands of dollars because it's for industry professionals trying to compete with Apple's M1.


> Only a die-shot, it's disappointing that there's no comment or explanation.

A die shot that Apple themselves showed at the event, which has had the same level of annotation previously from AnandTech.

Sure, the DFR in early 2021 will be much more detailed, but also expensive to the point that 0.001% of HN viewers will actually be able to see it. As such, I think your skepticism of this being worthy of home page billing is not out of line.

For the record, this is not meant to be a dig at TechInsights. I’ve seen a DFR before and they really are good for what they are, it’s just not something most HNers would gain value from, even if it were free.


Kind of sad to think about this: why doesn't the marketing department actually show these, and instead put up a cartoonized version of Apple's chips [1]? That's in every keynote. This is a rhetorical question; the real mystery is why the average consumer wants to see unreal things to be convinced something is cooler/better. You'd think seeing actual die shots would be more convincing to the average consumer, but they love marketing that's untruthful, fake and exaggerated. Not the fault of the marketing department; this is what consumers want and this is what sells. Sad to think how shiny things sell, and about the whole luxury industry in general.

[1] https://www.apple.com/v/mac/m1/a/images/overview/chip_cpu__d...


Get real - which consumer ever really cares about die shots?

I'm a rather technical folk and even I don't care.


They do have a shot on https://www.apple.com/mac/m1/ (the first non-logo image) which looks like it matches up (you'll need to rotate one by 180°). So it's not like they're hiding it or actively avoiding it. It apparently just doesn't suit their needs.


The degree to which they hype UMA (a term of art within the PC industry for decades) on that page is amusing --- if Intel did the same sort of marketing for its integrated GPUs, they would almost certainly be laughed at.


From what I've read, it seems to be different from Intel's UMA. It's a cache-friendly single pool of memory directly accessible from the different blocks of the SoC: CPU, GPU, NPU, etc. Since there's no division of RAM between different parts of the SoC, copying operations aren't needed.
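(As a concrete illustration of what "no copying" means in practice, here's a minimal Metal sketch in Swift, not a description of Apple's internal mechanism: a buffer created with shared storage mode is the same physical memory for both the CPU and GPU on Apple Silicon, so no staging copy is needed.)

    import Metal

    // One shared allocation, visible to both CPU and GPU.
    guard let device = MTLCreateSystemDefaultDevice() else { fatalError("no Metal device") }

    let count = 1024
    // .storageModeShared places the buffer in the single unified memory pool.
    let buffer = device.makeBuffer(length: count * MemoryLayout<Float>.stride,
                                   options: .storageModeShared)!

    // The CPU writes directly into the buffer's backing memory...
    let ptr = buffer.contents().bindMemory(to: Float.self, capacity: count)
    for i in 0..<count { ptr[i] = Float(i) }

    // ...and a compute pass can bind the very same buffer with
    // encoder.setBuffer(buffer, offset: 0, index: 0); no blit or upload pass
    // is required, which is the "no copying" property described above.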


I also read somewhere that they made sure the data formats used by the various modules are identical. Without that, you have to copy data to do format conversions. I don’t know how much of a problem that is in general, though.


This is consumer marketing; “people who have a solid grasp of fundamental SoC architecture” are not the target audience.


What do people with such a grasp have to say about it?


M1's performance is great, but there isn't much evidence to attribute it to the memory subsystem. Also, people wanting to fit 32GB worth of apps into 16GB is just wishful thinking that isn't supported by evidence either.


Quite a lot of testing in Final Cut and Logic etc showing very good performance and efficient swap. So for most consumer apps it works well. No magic if you need a big block of memory for ML or analytics.


Do we know what benefit they're getting by putting the memory on the chip to begin with? I had initially assumed there was some non-trivial performance advantage in it, but now people are saying not so much, so why did they do it?

It seems weird to artificially limit the max memory if there was no advantage in it.


I think it's a power advantage, not performance.


The benefit is they can charge you $200 for memory they bought from Samsung for $15. There’s no performance advantage to the user.


I’d love to know where you’re picking up LPDDR4X-4266 RAM for $15?


Are you actually asking? The spot price for LPDDR4X 16G is only $9. I was actually being generous. No doubt the contract price is far lower.


Yeah a link would be good. I haven’t seen anything approaching those prices - all well over $100


As a technical consumer I don’t know why I would want to see the actual die shots - it really doesn’t tell me anything at all.

The graphics they put on slides or marketing sites aren’t supposed to be technical reference material, usually it’s just about conveying the idea of a SoC or performance improvements, or relative sizes, or whatever.


Because it's more real, and more inspiring to look at real technical material.


For some people yes, for others it’s a confusing jumble of fuzzy shapes that distract them from the pertinent information. I think Apple approach of providing simplified diagrams, and also illustrative die shots (yes, they did provide die shots, so the whole premise of this discussion is stupid) was perfectly fine.


> Because it's more real

Why is more real better for this purpose?

Isn't a big idea of diagrams that they aren't real so can be simpler and easier to understand? Otherwise we'd just use photos or actual schematics.

> more inspiring

Most people just need to get the info so they can get on with their job and day and get home to their kids. They aren't looking to be inspired right now. That's fine.


The inquiry I am making is more general - the infantilization of consumers and the idea of dumbing things down for them like they're blobs of meat with cash to spare.

You don't need to see the cartoonized version of it either. Maybe it's easier to explain as a schematic? Not sure. A list of the M1's features would work equally well. I think it's just marketing pizzazz.


Your refusal to take a moment to think about why Apple's marketing team did this is very funny. Do you think they just throw a dart at a spinning board when making decisions like this? There's a difference between "dumbing" things down and effectively conveying your message. Suggesting that they include an actual die shot in their ads for general consumers is one of the most HN-esque comments I've read this month.


> Do you think they just throw a dart into a spinning board when making decisions like this?

I think the opposite. Apple has perfected marketing in a lot of ways.


They are simplifying the presentation for educating a wider audience of non-ECEs.

This is not a new phenomenon in human communication.

But we’re all just blobs of meat, that’s true. Maybe one day Apple will better tailor their marketing to sentient Bitcoin algorithms.


Though there’s still the possibility that they could add it into the detailed specifications page, or perhaps even a link in the footnotes of the specifications page? Surely that would be a better match for the target audience that would care.


There is a list of features here:

https://www.apple.com/mac/m1/

Or are you looking for more detail, like the width of the decode pipeline or depth of the reorder buffer?

I would think that a majority of software engineers don’t know what those are, fewer still would understand the significance compared other chips.


Apple isn't an education company, so it shouldn't be surprising that their marketing department simplifies things in service of its real goals. At the end of the day the customers are blobs of meat with cash to spare, and the advertisement is there to change the latter part, not teach the former.


Does it make a difference? The ones from the marketing department convey the message better and are prettier to look at. These are not as cool to the average consumer.


They've actually done this for some web pages, you likely can find them on Wayback.


The other discussion here recently about how Apple is very silo'd internally is relevant: the marketing department very likely doesn't have access to real die photos.


They do, die shots are literally on Apple's marketing page: https://www.apple.com/mac/m1/

A quick visual comparison shows this to be pretty accurate, just inverted and rotated.


I think if Apple wants to be a tech leader they should open up more. The secrecy is totally unnecessary.


Are you claiming that they are not a tech leader right now?


They're in a weird spot. They're obviously making a lot of money, but most companies in their position do a lot of work to court developers and accessory makers etc. Apple's position is essentially that they control the market so you need them more than they need you.

But that makes others resent them. Which is a vulnerability because it means you have all the rest of the industry ready to sell Apple down the river the first chance they get, whether it's in making antitrust claims or lobbying against them or just fervently supporting any half viable competitor in the hopes that it becomes fully viable.


This is how it seems to me too. I still have their hardware I got years ago and it works great, but it's been a couple of years since I've decided to refuse any software upgrades from them, and to avoid their hardware. Their attitude seems to be getting more and more "our way or the highway", and I'm not interested in playing those games.

Previously I already switched from Windows to Mac as my primary environment, so going to GNU/*ix was not a big stretch. I already know several non-hackers who have gone this route as well.


> fervently supporting any half viable competitor in the hopes that it becomes fully viable.

Great. I’m an Apple fan and I’d love it if there were more half-viable or better competitors. Seems like Apple is doing a good job.



