They are still building i386 packages for use on 64-bit kernels, for the sake of precompiled 32-bit software (games etc). That's scenario B from Simon McVittie's reply in this thread, and it sounds like that's not affected.
I was able to upgrade my 32-bit IBM ThinkPad T42 from Lubuntu 16.04 to 18.04 using do-release-upgrade. It's not downloadable on the website, but I think for 18.04 the build is still there.
I have a load of 32-bit laptops. "I'll put ChromiumOS on them", nope, that's not a thing any more. OK, err, Linux? Apparently not. Could put Windows 10 on them, I guess.
You could try Debian, which is essentially Ubuntu minus some themes. I doubt Debian will drop i386 support soon (I just searched and couldn't find any hints that i386 support is ending).
I remember being an early adopter of 64-bit Linux with my Athlon64 back in the day. Lots of stuff was broken, and there was a lot of debate about whether it was any faster or worthwhile at all. It's really cool to see the technology curve come full circle.
My dad bought me an Athlon64 machine for Christmas and we put it together. Not knowing much about anything, I remember being pissed off/confused that I didn't have Windows XP 64-bit in spite of having that Athlon64 sticker on the front.
A few years later my brother was setting up a web server to host a forum focused on the digital TV switch-over for the Madison, WI market. Shout-out and RIP, madcityhd.com.
Watching him set up Fedora on the box on the floor of his bedroom and using Compiz wobbly windows was enough to hook me. I stole his install CD and nuked my drive (much to the irritation of my dad, who knew I would be stealing one of his nights to reinstall Windows when I eventually realized I could no longer play Counter-Strike). It was fun to see it come full circle a year or two after I was ticked off about my Athlon64 not running a 64-bit operating system, when I started trying other distros and realized, "Hey, I can actually use the 64-bit one now!"
Fun times. I wonder how many kids got hooked on Linux by wobbly windows. I know that’s what brought me in, haha.
Worthwhile? We had PAE for a long time, sure, but I think it was fairly obvious when the Athlon 64 appeared that 4GB was not going to cut it, given some people already had 1GB at home...
PAE was always a terrible hack though. It caused so many problems that I'm super glad native 64 bit became the norm shortly before it would have had to become widespread.
It's not a terrible hack, it's just 32-bit virtual addresses with a larger physical address space.
It only caused problems because some kernels used to expect that all physical memory is mapped at all times (and some hardware could only DMA to 32-bit physical addresses, but that's a problem with 64-bit CPUs as well).
At the time 1GB was a ton of memory and few applications would use anywhere near that. 64-bit kernels caught on fairly quickly, but actually compiling 64-bit applications took a lot longer. The consensus at the time was that the memory overhead of larger pointers outweighed the benefits of the extra registers.
I wouldn't call it consensus, just a persistent claim by a large fraction of developers. They finally got to prove their claims with x32 ABI, which was ILP32 in amd64 mode, including the additional registers. Few if any people used it, nor did it show better performance in practice.
Regarding 64-bit support: by the time amd64 chips shipped people had been writing software for the 64-bit Alpha for 10 years, and 7 years for sparc64. IME most open source software was already 64-bit clean and worked well out-of-the-box. Back then the open source community, and especially GNU projects, heavily emphasized platform and hardware portability.
Perhaps the situation on Windows was different. It also didn't help that Windows kept long at 32 bits, which had the effect of breaking code that cast between pointers and long (intptr_t didn't come until C99). The 64-bit ABI for all Unix platforms (AFAIU) carried forward the relationship between long and pointer. I don't recall ever seeing Unix or open source code casting a pointer type to int, only to long; it was Windows software that presumed pointer-to-int conversions worked.
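To make that concrete, here's a minimal sketch (mine, not from any of the posts above) of why round-tripping a pointer through long worked on LP64 Unix ABIs but silently truncates on LLP64 Windows, and why C99's intptr_t is the portable spelling:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        int x = 42;
        void *p = &x;

        /* On LP64 Unix ABIs long is 64-bit, so this round-trip is lossless.
         * On 64-bit Windows (LLP64) long stays 32-bit and the cast truncates
         * the pointer -- the breakage described above. */
        long as_long = (long)p;

        /* intptr_t (C99) is guaranteed wide enough for a pointer on either ABI. */
        intptr_t as_intptr = (intptr_t)p;

        printf("sizeof(long)=%zu sizeof(void*)=%zu sizeof(intptr_t)=%zu\n",
               sizeof(long), sizeof(void *), sizeof(intptr_t));
        printf("round-trip via long ok?     %d\n", (void *)as_long == p);
        printf("round-trip via intptr_t ok? %d\n", (void *)as_intptr == p);
        return 0;
    }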
> Few if any people used it, nor did it show better performance in practice.
We use it for our specialized analysis framework. The real-world performance gain is just shy of 30%, which is impressive since all it took was a bunch of compiler flags.
Personally I think x32 is a vastly underused ABI. Every application that is unlikely to use more than 2GB should use it. It is literally free performance.
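For anyone curious, a rough sketch of what that looks like in practice, assuming an x32-enabled GCC toolchain and kernel (CONFIG_X86_X32); the file name is just illustrative:

    /* The same file built three ways:
     *   gcc -m64  check.c   -> 8-byte pointers, 16 GPRs (amd64)
     *   gcc -mx32 check.c   -> 4-byte pointers, still 16 GPRs (x32)
     *   gcc -m32  check.c   -> 4-byte pointers, 8 GPRs (i386)
     */
    #include <stdio.h>

    int main(void) {
        printf("sizeof(void *) = %zu, sizeof(long) = %zu\n",
               sizeof(void *), sizeof(long));
    #if defined(__x86_64__) && defined(__ILP32__)
        puts("x32: amd64 instructions and registers, 32-bit pointers");
    #elif defined(__x86_64__)
        puts("amd64: 64-bit pointers");
    #elif defined(__i386__)
        puts("i386: classic 32-bit");
    #endif
        return 0;
    }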
> If anything, we've only learned to squander all available memory.
If I've learned anything in my computer career, it is that this is an evergreen comment. I remember reading Q-Link posts from Commodore 64 users with this complaint when the wasteful C128 came out.
Go ahead and buffer 4k video streams to disk, I dare you :) There's a reason that even modern appliance platforms like Apple/Android TV need multiple GBs of memory, and it ain't bloat.
If we aim high and say we need 50 Mbps of bitrate for 4K HDR, that's still only 6-7 MB/s. So buffering 30 seconds is still less than 200 MB.
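A quick back-of-the-envelope check of those numbers (decimal megabytes assumed):

    #include <stdio.h>

    int main(void) {
        double mbit_per_s  = 50.0;               /* assumed 4K HDR bitrate */
        double mbyte_per_s = mbit_per_s / 8.0;   /* 6.25 MB/s */
        double buffer_mb   = mbyte_per_s * 30.0; /* 30-second buffer */
        printf("%.2f MB/s, 30 s buffer = %.1f MB\n", mbyte_per_s, buffer_mb);
        /* Prints: 6.25 MB/s, 30 s buffer = 187.5 MB */
        return 0;
    }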
It is entirely possible to make modern systems that use far less memory than they do today. If you're gaming it's a different beast, but for a normal OS and programs it's certainly bloat.
An aside: that's often true in Google's data centres. So much so that someone wrote a system to sort-of swap out memory to other machines' RAM over the network (instead of to disk, which Google doesn't do in general).
This. The basic stuff they did back then didn't magically inflate. Except it seems from afar that that's exactly what happened. But it didn't, and it doesn't have to look like that from afar.
Athlon64 was a swift kick in the nuts for Intel, but AMD really didn't follow up until Ryzen. I wonder if they're going to stick around and fight this time.
Well Intel wasn’t exactly sitting idle during the Athlon 64 era. It was just unfortunate that none of their plans worked out.
NetBurst was a bust. IA-64 flopped too. Both were attempts to progress from the P6 architecture - which they felt had reached its limits.
In the end, the market spoke: it wanted the P6. No one was willing to rewrite software (or even recompile). So back to the P6 it was, in the form of the Pentium M, followed by the Core/Core Duo series and finally the i3/5/7/9 that we use today, which you'll notice hasn't improved much per core; as mentioned, the P6-style design is pretty much tapped out and all they can do is "squeeze blood from stone" for a few percentage points of improvement here and there.
AMD recently caught up but frankly, they aren't doing much better. Performance per core is just on par with Intel. Their main selling point is that they made it cheaper by splitting the L3 cache in two: the Ryzen is basically two quad-cores glued together for better yields.
> which you would notice hasn’t improved much per core
Not sure which data you're referring to? I just made a comparison of single-threaded, integer-heavy code between my old Core 2 Duo and a Skylake, and got a 16x normalized perf improvement (from 2.5 cycles/byte to 0.15). Same code, same compiler.
So sure, progress has slowed down and they are adding more and more specialized stuff (AVX-512, AES-NI...), but still.
The real question isn't if Intel will fight back but how. They've been known to innovate shady marketing tactics just as much as their CPU architecture.
I've still got my AMD Athlon 64 3200+ in a bin somewhere. I remember being in sheer awe that I could just drop a dual-core CPU into my nForce4-based mobo. It felt revolutionary at the time.
Back then, AMD kind of had to license the x64 stuff to Intel for cross license reasons. They aren't under any pressure this time to give away anything.
Really? When did that change? I thought they had some sort of in-perpetuity cross licensing agreement going back to when AMD was a second source for x86 for Intel with Government contracts.
The idea is to expose the 64-bit instruction set and registers, but keep memory and pointers per-process 32-bit as most apps use less than 4 GB of memory.
Do you know how much memory we can address with 64 bits?
(64-bit addresses were arguably a mistake. A 48-bit word size would probably have been better, but doesn't sound as cool.)
Having said that, 128 bits can make sense for certain calculations. So floating-point units and GPUs support longer registers for some of what they are doing.
(And having said that, Google figured out that they don't actually need all that precision when using GPUs for machine learning, and made TPUs with much smaller words.)
Your modern Intel chip only has 48 addressable virtual address bits. The recent Ice Lake processors support 57 bits. I deal daily with Python processes that have "6.1t" in the "resident" column in top. Hitting the 48-bit 256 TiB limit isn't science fiction to me. I can already hit that with a moderate amount of effort (by memory mapping ~100 large kv-stores). It isn't that I need to do it, but I could imagine that someone does.
It is well and proper that Intel rounded up to 64 bits. It should serve us well for the next 40ish years, which is good enough for my professional lifetime at least.
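For scale, a little sketch (my own arithmetic) of the virtual address space sizes being discussed, with 48-bit, 57-bit and full 64-bit limits:

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        int bits[] = {48, 57, 64};
        const char *note[] = {"4-level paging", "5-level paging (e.g. Ice Lake)",
                              "full 64-bit"};
        for (int i = 0; i < 3; i++) {
            double bytes = ldexp(1.0, bits[i]);  /* 2^bits */
            printf("%2d bits -> %10.0f TiB  (%s)\n",
                   bits[i], bytes / ldexp(1.0, 40), note[i]);
        }
        /* 48 bits -> 256 TiB, 57 bits -> 131072 TiB (128 PiB),
         * 64 bits -> 16777216 TiB (16 EiB) */
        return 0;
    }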
> Why would we want to switch to 128 bit addresses
Predicting the future is difficult. 30 years ago (I am old enough to remember) it was hard to imagine that every household would have tens of devices connected to the internet. My university VAX, serving 16 concurrent student users, had less memory than is required to show the splash screen when today's phone boots.
So what if, in 30 years, the computing paradigm has changed and we directly address memory over the internet? I must admit that in my imagination we have reached a point where growth will slow down. But I have learned that my imagination is not always good enough.
Direct addressing that way seems unfeasible; it would require extending the IP stack to support direct addressing for it to work, and it already supports 128-bit addresses. A far more feasible model is direct addressing through a unique IP, and we do not need a bigger address space for that.
That said, considering the current amounts of data Google holds, I could see the theoretical point about uniquely addressing every single byte they have. 64-bit addressing only allows for a single order of magnitude of growth in that scenario.
The fact that you can put everything, ever into 128-bit means you can easily shove ten thousand 20-terabyte storage devices into a single 128-bit system's memory-mapped I/O space and still have 3 exabytes left over.
No more packet-switched serial storage I/O. You now have first-class ability to ask for any byte anywhere, really really really fast.
Because I/O request speed is now only limited by the memory controller (which already goes at TB/sec in x86 hardware), a fast storage controller now has the opportunity to optimize and batch requests downstream to storage devices much more efficiently. Because if the storage devices go at a certain speed but suddenly the addressing infrastructure is A LOT LOT faster, your optimization window just went through the roof and you can coordinate much more effectively.
I forget the exact architecture, but one of IBM's 128-bit boxen already does this. Various random bits of the hardware use MMIO as a first-class addressing strategy. The OS does the rough equivalent of `disk = mmap2(/dev/sda)` at bootup. Maybe this is a z series thing.
The calculations are off. 128-bit addressing allows for more than 3.40 x 10^38 addresses, and your storage example is 1.76 x 10^18 bits. That is, even if we addressed individual bits (we don't), we don't even have a name for the unit denoting the magnitude of addressable space left over when using 128-bit addresses.
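A quick numeric check of that correction (assuming 20 TiB per device and bit-level addressing, done in floating point since 2^128 doesn't fit in any standard C integer type):

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double devices    = 1e4;                      /* ten thousand drives */
        double dev_bytes  = 20.0 * ldexp(1.0, 40);    /* 20 TiB per device */
        double total_bits = devices * dev_bytes * 8;  /* ~1.76e18 bits */
        double addresses  = ldexp(1.0, 128);          /* ~3.40e38 addresses */
        printf("storage: %.2e bits, 128-bit address space: %.2e\n",
               total_bits, addresses);
        printf("fraction used: %.2e\n", total_bits / addresses);
        return 0;
    }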
Thing is, an “address” could be much more than an address. One example is the CHERI hardware capability stuff and its Arm counterpart, Morello: sizeof(void *) is 16 there.
And yeah, getting real world software, such as Postgres or Nginx, to work requires some fixes, but it’s really not that bad.
And I don't mean that in an absolute sense. People might very well get up to those orders of magnitude of RAM; but you are unlikely to have that amount of RAM available to a single processor.
The extra margin of 64 bits might help with address space randomization, though.
PAE is a kludge because the architects needed to add another tiny page table level to keep 32 bits of virtual space, but didn’t really want to add another level.
The concept of having more physical address bits than virtual bits is reasonable, although it falls apart a bit with virtualization. The idea of having magic registers that fill themselves in for you and can’t be read at all (such as the PAE PDPTR registers) is a bad idea that unfortunately repeats itself in x86 design. Architects: don’t do this.
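For reference, a small sketch (my own, based on the public PAE layout) of the split being described, where the 4-entry PDPT is the "tiny" extra level:

    /* PAE splits a 32-bit virtual address as 2 + 9 + 9 + 12 bits, versus
     * classic 32-bit paging's 10 + 10 + 12. The entries themselves grow to
     * 64 bits, which is where the extra physical address bits live. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t va = 0xC0123456u;             /* arbitrary example address */
        unsigned pdpt_i = (va >> 30) & 0x3;    /* 4-entry PDPT (the PDPTRs) */
        unsigned pd_i   = (va >> 21) & 0x1FF;  /* 512-entry page directory  */
        unsigned pt_i   = (va >> 12) & 0x1FF;  /* 512-entry page table      */
        unsigned offset =  va        & 0xFFF;  /* 4 KiB page offset         */
        printf("PDPT %u, PD %u, PT %u, offset 0x%03x\n",
               pdpt_i, pd_i, pt_i, offset);
        return 0;
    }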
This is only 13 years from now. I really doubt that people will need 16 EiB by then in a single processor for a single process. Plus by that point you will probably have moved to message passing between distributed systems or something.
I think the issue is not the hardware needed but simply the cost of the extra compute. Linux doesn't have an "official" CI server but relies on third parties running it, and none of them want to spend $ on compute for an architecture they will never use.
Not always. Cache speculation bugs, for example, I don't think are covered by emulation. (At least not explicitly. If the MMU model is the same and transparent enough, I suppose those could 'pass through', but only if they're implemented on the same platform.)
You can certainly exploit Meltdown from a VM. The hypervisor tries hard to stay out of the way, and on a properly configured VM the vmexits should be few enough while running compute-bound code that microarchitectural side channels are very well exploitable.
For clarity, in this context I think @vetrom was talking about emulation, not a virtual machine. A virtual machine makes use of the native instruction set (relying on the hardware) to create a machine within a machine and naturally can only create virtual machines of identical hardware capabilities (or a subset thereof). Emulation does all of that in software and could emulate even a completely different architecture.
Though to get fast emulation, you can re-use techniques from virtual machines (and vice versa).
Btw, emulation is how you can, in principle, run Linux on an 80286. (Normally you'd need a 80386 at least to get memory protection.)
Of course, Linux on the 80286 is gonna be even slower than Linux in your browser. (https://bellard.org/jslinux/)
Using the emulation idea you can go almost arbitrarily primitive. But the 286 is a nice target, because it was a sometimes requested feature in Linux's past.
No, that's the i386 architecture, which is still well supported AFAICT. x86_32 is a funny architecture that requires 64-bit chips but uses 32-bit pointers.
This post is about i386 / x86_32. What you're referring to is "x32," a different thing entirely, which uses the x86-64 instruction set. You can tell this post is not about "x32" because it talks about "segments, gates, and other delights from the i386 era," which don't exist on the x86-64 architecture regardless of how long your pointers are, and it says things are "fine on a 64-bit kernel" (x32 uses a 64-bit kernel; it's a userspace ABI only).
For all x86 processors there is now a config name "X86_32", which "depends on !64BIT" (i.e. the 64BIT flag has to be off),
and a config name "X86_64", which "depends on 64BIT" (i.e. the 64BIT flag has to be on).
Under the new convention, X86 is a common prefix for both 32- and 64-bit kernels for x86 processors; the suffix _32 means the kernel uses only 32-bit instructions, and _64 means it uses 64-bit instructions.
There is also handling for the old arch name "i386", which is recognized to mean 64BIT is off; if only "x86" is given as the arch name, there is a prompt asking whether 64BIT should be yes or no. It also notes that the old name for a kernel with 64BIT off was "i386" and the old name for a kernel with 64BIT on was "x86_64".
Finally, support for the X32 ABI (allowing executables that use 64-bit instructions but only "short" 32-bit pointers to run on a 64-bit kernel, https://en.wikipedia.org/wiki/X32_ABI) is enabled in my 64-bit distro.
No, that’s “x32”. “x86_32” is not an official name for anything, but based on the contents of the post, I believe Andy Lutomirski is using it to refer to 32-bit x86, aka i386.
Your comment should be the top comment. I had no clue i386 and x86_32 were different. I thought the article was saying 32-bit Linux is broken... but no, some strange "alternative" mode on Linux is broken... who cares.
Unfortunately, upstream doesn't really notice if 32-bit x86 breaks. This isn't just a kernel problem; glibc managed to push out a release back in 2017 that broke memchr on 32-bit Atom in a way that tended to cause programs calling it to segfault. It turns out that amongst a few other things, including Python, the glibc build process itself relies on memchr working... I don't think they actually tested the code path in question after modifying it.
The really fun part was when they launched the first 64-bit Atom CPUs (Bay Trail) and then a bunch of them got shipped with only 32-bit UEFI and could not boot in 64-bit mode.
They have to _boot_ with a 32-bit UEFI bootloader, but can run a 64-bit kernel and userland. Fedora Linux even allows for this while keeping secure boot on; Debian supports it as well but the interaction with secure boot is buggy, so you need to turn it off. Not sure about other distros, however.
Casual computer users don't buy new computers each year. Pretty sure set top boxes and routers with 32-bit x86 processors have been sold since then, too.
If only... I have been hoping for a native x64 version of Visual Studio for almost a decade, and signs do not look good for that ever coming to fruition. As it is, loading up solutions that consume nearly 2 GB of RAM still brings the IDE to its knees and makes it thrash and page.