> In order for the document to help me, I clearly needed to find the four error parameters that used to be displayed with the blue screen on older versions of Windows. Windows 10 hides all the information by default, but I discovered that it’s possible to re-enable display of the extra error information by adding an entry to your registry.
I hate stuff like this. What's the point of hiding the parameters, instead of tucking them away in the corner of the screen or something? It just makes things needlessly difficult.
I know the answer is going to be "but it makes the UI look beautiful and most users don't know what the parameters mean anyway!" Error UIs don't need to be beautiful, they need to be functional. Definitely have a nice, clear user-friendly error message, but keep the overall screen technically informative. I can imagine that the only thing more infuriating than having a computer that won't boot is having one that won't tell you the problem unless you change registry settings (which you can't, because it can't boot). Leaving the extra information there does no harm, and it gives people something to Google if they do run into a problem.
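For anyone who wants those parameters back in the meantime: if I remember right, the tweak the article refers to is a single DWORD under CrashControl. A rough sketch with Python's winreg (value name assumed to be DisplayParameters; run it elevated and reboot for it to take effect):

    # Rough sketch: re-enable the bugcheck parameters on the Windows 10 blue screen.
    # Assumes the value is DisplayParameters under CrashControl, as the article describes.
    import winreg

    key = winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE,
        r"SYSTEM\CurrentControlSet\Control\CrashControl",
        0,
        winreg.KEY_SET_VALUE,
    )
    winreg.SetValueEx(key, "DisplayParameters", 0, winreg.REG_DWORD, 1)
    winreg.CloseKey(key)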
Indeed. Regular users have no idea what to do with "friendly" error messages telling them about some unrecoverable failure either, while those messages just leave the techies with nothing to go on.
When raising exceptions in code, I make sure the error message contains as much relevant detail as I can stuff in there. Users will just mail a screenshot to support regardless (if they want help before the full error report gets home), so might as well give as much info as possible to make troubleshooting easier.
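A minimal sketch of what I mean, with made-up names (load_config here is hypothetical); the point is that the context goes into the message itself instead of into some log the user will never find:

    import getpass, os

    def load_config(path):
        # Hypothetical loader; only here so there is something that can fail.
        with open(path) as f:
            return f.read()

    try:
        cfg = load_config("settings.ini")
    except OSError as e:
        # Stuff in everything support would otherwise have to ask for.
        raise RuntimeError(
            f"Could not load settings.ini "
            f"(cwd={os.getcwd()}, user={getpass.getuser()}, cause={e!r})"
        ) from e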
Yes. What I can't understand is why this extra diagnostic information is not available with a keyboard shortcut, rather than with a change to the registry. If you get a blue screen on boot, you typically have no way to alter the registry. Holding shift or something would make much more sense.
Something similar is how remapping the Caps Lock key in Windows to another key has to be done via a registry edit instead of a Control Panel setting. Given the renewed drive from Microsoft to become the developer’s OS of choice over the last five to ten years, choosing not to offer anything more accessible than a registry edit seems contrary to that drive to me.
Older Sun keyboards used to have Control in that location, labelled as such, by default. I remember buying a Happy Hacking Keyboard back in the early 2000s that also had that as the default configuration, with a DIP switch to flip it.
After using a Chromebook for a while I discovered how much I love having my Caps Lock key remapped to launching the desktop search tool, and now I do it on all my computers. E.g. Milou for KDE, I think the MacOS one is called Spotlight.
The Windows input API is multi-leveled, and the top-level one sends only key combos and characters. Because keyboard drivers are system-level (although configurable per user), you can only remap via system reconfiguration, which makes sense.
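For the curious, that system reconfiguration is the well-known "Scancode Map" binary value. A rough sketch of writing the classic Caps-Lock-to-Left-Ctrl mapping with Python's winreg (needs admin rights and a logoff or reboot to take effect):

    import winreg

    # Scancode Map layout: two zero DWORDs of header, a DWORD entry count
    # (mappings plus the terminator), one DWORD per mapping (new scancode in
    # the low word, physical key's scancode in the high word), then a zero
    # terminator. 0x1D = Left Ctrl, 0x3A = Caps Lock.
    scancode_map = bytes([
        0, 0, 0, 0,  0, 0, 0, 0,   # header
        2, 0, 0, 0,                # one mapping + terminator
        0x1D, 0x00, 0x3A, 0x00,    # Caps Lock now sends Left Ctrl
        0, 0, 0, 0,                # terminator
    ])

    key = winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE,
        r"SYSTEM\CurrentControlSet\Control\Keyboard Layout",
        0,
        winreg.KEY_SET_VALUE,
    )
    winreg.SetValueEx(key, "Scancode Map", 0, winreg.REG_BINARY, scancode_map)
    winreg.CloseKey(key)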
Microsoft is capable of creating an additional tab in Control Panel to do exactly this: if it requires a system restart or a user logout, then it is fine even if not ideal.
It took switching from Windows to Linux to figure out why my PC had issues booting sometimes.
In Windows, it would just spin a little circle forever until I rebooted. After installing Linux and getting actual error messages, I learned that my SATA AHCI controller wasn't always working at boot and was fixed by plugging my HDD into a different port.
The problem could have been fixed so much earlier had Windows just told me what was wrong.
> Error UIs don't need to be beautiful, they need to be functional
They need to be both. If it can’t be both, there are as many arguments for making it human friendly as for making it actionable.
I think it comes down to it being a human/machine interface, and what the human side does with it. If for more than half of the users the next step from the error UI is to rage-phone their IT department, and a sympathetic UI might actually stop them from doing it (or at least be considerate), I’d argue a friendly UI should be prioritized over an overly informational one.
If it can be both, that’s better, but I wouldn’t be surprised if after user testing it appears it’s just damn hard.
Also having an error displayed on screen that can’t be retrieved any other way (logs, diagnostic dump on a console somewhere else, whatever) is a design choice I’d loathe way more than not having the error number on the BSOD.
Another reason why I like Xiaomi software over other Android phones:
When an app crashes, Android simply tells you that "Unfortunately, App has stopped". Xiaomi's UI additionally presents you with the stack trace.
BTW: Android used to (?) have a "Report" button, but it seems it's gone in newer versions (since 5.0?). Does anybody know where these reports were sent? Probably to the Android OS team, just like MS sends some crash info about 3rd-party apps to themselves.
You can see the stack trace on stock Android when you click "Report", then "Show system info", then scroll down to "Stack trace" (didn't check exact wording).
The Report button is still there - maybe it is not shown if there is no one to report to? For apps on Google Play, the user reports can be seen in Google Play Developer Console.
Man, this gave me flashbacks to trying to put 3.5-4GB of RAM into motherboards in 2003. After a bunch of iterations of telling the motherboard manufacturers what we wanted, and having them hack the BIOS to work for exactly one DIMM configuration, then fail on anything else, I eventually gave up expecting most of the BIOS to work at all. I added code to the Linux startup sequence that ignored everything the BIOS said about memory maps, queried what physical DIMMS were there, and rebuilt my maps by hand.
It's been so long that I can't tell an e820 map from an MTRR, but this was still a fun read.
> I added code to the Linux startup sequence that ignored everything the BIOS said about memory maps, queried what physical DIMMS were there, and rebuilt my maps by hand.
I'd love to see this code; doing memory management entirely in the OS and ignoring the BIOS sounds like fun (modulo working around BIOS-specific memory reservations so the OS doesn't get its memory stomped on by the BIOS).
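In the meantime, on a modern kernel you can at least dump the raw firmware-provided map under /sys/firmware/memmap (when CONFIG_FIRMWARE_MEMMAP is enabled), and the memmap= boot parameters let you override chunks of it by hand. A quick sketch of reading it:

    # Dump the firmware-provided (e820) memory map as the kernel saw it.
    # Assumes /sys/firmware/memmap exists (CONFIG_FIRMWARE_MEMMAP).
    from pathlib import Path

    entries = []
    for d in sorted(Path("/sys/firmware/memmap").iterdir(), key=lambda p: int(p.name)):
        start = int((d / "start").read_text(), 16)
        end = int((d / "end").read_text(), 16)
        mtype = (d / "type").read_text().strip()
        entries.append((start, end, mtype))

    for start, end, mtype in sorted(entries):
        print(f"{start:#018x}-{end:#018x}  {mtype}")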
I spent some time last year running various scripts to get an NVIDIA GPU working over thunderbolt in windows on a macbook pro. The problem is the DSDT table in many macbooks doesn't allocate enough space for pci-e devices. The NVIDIA driver in windows tries to allocate memory via that table to talk to the eGPU and it fails.
For some reason it works fine under MacOS - either macos ignores the DSDT table completely, or it allocates memory a bit differently than windows. In any case, the answer is to use obscure tools to download and patch the DSDT tables to allocate more RAM toward PCI-e. Doing this through UEFI feels very magic.
I don't know why so much of this stuff is up to the hardware vendor. Maybe there's a good reason, but I would expect the windows memory manager could do a much better job if it didn't have a bunch of memory range sizes hardcoded by apple.
My Toshiba Libretto has similar issues where the BIOS will lie about the capacity of the hard drive; it's so bad that when reinstalling Windows I've got to format the drive in another computer.
One of the things you'll quickly learn from working in IT (and probably other fields) is that "not supported" can mean anything from "physically impossible" to "we don't want you to".
In the case of RAM, "physically impossible" would mean something like the CPU not having that many address bits or the PCB traces not routed, but there are a lot of different not-well-advertised configurations of DIMMs available with the same size[1], so it could be that the manufacturer specified a lower maximum just to avoid having to answer subtle questions. For example, 2x8GB may be OK but 1x16GB not.
It could also be a "X if Y except Z else B" situation, and they just couldn't be bothered to document all the possible combinations and/or explain the details of the PC system architecture that result in such limitations.
To test if all the memory is present (and if it all works correctly), running something like MemTest86 might be sufficient.
> "not supported" can mean anything from "physically impossible" to "we don't want you to"
More like:
from "physically impossible" to "we haven't tested that"
> they just couldn't be bothered to document all the possible combinations
Each combination adds exponential testing and documentation requirements.
Not tested != doesn't work. But that doesn't make not supporting it malicious. There are practical and financial costs to testing every combination, ultimately borne by the consumer.
My favorite is when they say "not supported" and it's easy to make it work and is just their way of scaring Enterprise customers to their bread & butter. (One example that comes to mind is a bunch of Rigol oscilloscopes that, even though they have FPGA's on them, can still have more functionality enabled with resistors.)
- Something that will be rather obsolete soon, so it won't be tested
- Something that usually works, but has poorly understood corner cases or other implications
- Something that is so infrequently used that it's not worth it to build up proper testing infrastructure to prevent regressions
In this case, who thinks Foxconn has an engineer on call, who's familiar with the motherboard, who can actually answer this question? I personally doubt it.
My Dad bought us an IBM PCjr with 256KB of RAM, which was kind of elite in 1984.
But he wanted more. So, he purchased a pair of thick memory expansion boards that brought it up to an unheard-of 640KB of RAM.
Of course, these slabs generated an impressive amount of heat, so we had to set up a series of cooling fans, or else computations would go awry and weird bugs would appear in programs as it overheated.
It had all manner of strange behaviors, and IBM engineers on the support contract would dutifully visit and give us stacks of floppies containing custom builds of MS-DOS to try and help us out when he called about problems with his mods.
The base PC AT shipped with 256k of RAM in 1984, the Fat Mac with 512KB. I think the elite thing here is not the amount of memory but buying a PC jr and then really committing to trying to upgrade it.
> All I know is it passes my RAM tests. Since Linux has been working fine with the 16 gigs of RAM installed, I am not too worried. It’s possible I would have problems if I had more PCI/PCIe cards installed or something, but in my use case, it seems to behave fine. Obviously, your mileage may vary
I run into this so often. "Look, it works for me. Don't try it at home. Who knows if it keeps working? What can I say I'm going to ride it until the wheels fall off".
Well, it's not like you're in any real danger in trying it: if your system becomes unstable you just take out that extra RAM. It's exactly the same as overclocking, your box might not work as stably as before, but you get the extra speed, so thousands of people do it...
It's a long-term difference between the Unix culture and the Windows NT culture (which dates back to VMS) that Windows will give up when things are wrong and Unix will barrel ahead anyway.
I think there's reasons that either approach is the right one, but Microsoft has the benefit that if they panic on boot, there's a good chance the issue will be fixed, and most likely before the product is released. (In this case, the fix was apparently just to list a smaller maximum ram size though)
This is it. Firmware and OEM system configuration in the PC industry is basically garbage, and always has been. OEMs tweak until it boots whatever version of windows will ship on it, rush it out the door, and never touch it again unless some volume customer comes back with a bug report.
Linux needs to come in after the fact and run on whatever garbage happened to ship.
(FWIW: in this case the root cause was a host bridge in the tables which had been granted a truly outrageous memory space despite having no devices in it. It was likely a typo, or some test stuff that got left in.)
An excellent quote. The line between the two isn't always clear either, though usually the uncertainty is in only one direction for me, i.e., things I thought were clever at the time turn out to be stupid down the line.
In another career I sat on a conf call where an engineering manager was really upset and went off on a tangent about how X code should not "freak out" every time it sees something unexpected.
I was not writing code at the time and had nothing to contribute.
So apparently they fixed it later.... then came the security issues, unexpected behavior, shit going sideways.
Sometimes a straight crash is not such a bad thing.
I would rather something crash and log than continue silently most of the time, especially with backend and server systems.
One of the things I like about Kubernetes is that the ecosystem (generally) tries to adhere to the "Not healthy? Then crash and keep crashing" mantra when something doesn't work. If I see something is in a CrashLoopBackOff I at least know it's b0rked. Stuff that reports it's up and running when it's actually hosed is really annoying.
> (In this case, the fix was apparently just to list a smaller maximum ram size though)
You're assuming this was an issue. Manufacturers religiously create artificial tiers in software to upsell users. This is true of hardware just as much as it is software.
That's because back in the day, Unix was designed for hackers to mess around and play games, while VMS was designed for actual production workloads (and Windows inherits this legacy).
Toward the end of the Bell System, AT&T used Unix to run the control plane of phone switches and also to do administrative tasks for the phone network. I remember a paper in the Bell System Technical Journal where they did auditing of billing records for the whole U.S. with a set of tools kinda like grep, awk, etc. but with binary-format data records.
There are situations where operating systems get behind the 8 ball and there isn't a perfect thing to do. Giving up to prevent data corruption is one choice, but trying to soldier on and do the best you can is another.
The L3 cache is large enough to run an 80s operating system in; there is no reason for BSODs to be so minimalistic. Information about why it may have failed could be shown:
Lists of recently switched processes, or the general category of error it falls into.
There's a school of thought that would say that by default, error messages shouldn't be a huge scary wall of text, because you have to consider that a majority of the people who would see an error message are not developers, and have no idea what to do with that information.
It should certainly be possible for someone who can diagnose an error to see the details, but making it the default can be problematic when the message has a chance of being shown to real end users.
> There's a school of thought that would say that by default, error messages shouldn't be a huge scary wall of text, because you have to consider that a majority of the people who would see an error message are not developers, and have no idea what to do with that information.
I think normal users have a pretty good idea what to do with it -- show it to the nearest tech person or paste it into Google.
Just giving it to them up front makes things easier, not harder. They need help so they take a screenshot. If it contains the relevant information, someone can help them. If it just says "an error occurred" then nobody can help with that because it contains no information. Now they have to go back and have the tech walk them through trying to extract the information from some log file, which is seventeen clicks through an interface they've never used and now there are 5000 entries and they don't remember exactly what time it happened etc.
It also deprives the user of any opportunity to learn. The user has a problem, they get a message, they ask the tech what to do and the tech tells them. Now the next time they get the same message they know what to do. But if the message is always the same regardless of the cause then they lose the ability to match known problems with known solutions and give up.
Not all users care to learn that sort of thing, but you take it when you can get it, not purposely inhibit the user from doing it.
Thank you for that perspective - I've never thought of it that way before. Would it be worth appending "if you don't know what this means, ask your local nerd"?
Sure. You don't have to take over the whole screen with a gdb console they don't know how to use. Make the message as friendly as possible, but it still ought to actually contain the information necessary to identify what happened.
I suspect the modern trend of making problems opaque comes from companies that do it on purpose. If the user can solve their own problems, what do they need with your expensive support contract? Why have the user fix problems with their existing device when you can sell them a whole new one the first time anything happens?
And then other developers who don't even use those business models still cargo cult the same UX.
That sounds like airbags in a car. On the majority of trips you don't need them, but their presence is of little consequence. When you do need them, you're out of luck if they're not there.
First of all, how would you localize it? English-only errors would not be acceptable today like they were in the 80s. Secondly, what if accessing/storing that diagnostic data is what caused the crash?
Not acceptable? English is the lingua franca of technology.
I'd rather have a detailed error report in one language and then have to find someone who speaks it to translate it (if Google doesn't do a good enough job), than fifteen poorly written versions. Not to mention, those error screens should be as minimal as possible in terms of features so less goes wrong.
I don't want animations, fancy colors, dealing with the horror show that is localization, or anything of the sort. Because the system when it hits a bsod is in an undefined state, so it's best to exercise as little as possible of the system, just enough to get the error message out.
Exactly what I'm saying! If all it does is show a bitmap, that's the simplest possible way of getting the error message out. The more detailed you want to make the BSOD screen, the more chance of something going wrong while rendering it. Localization is just one problem with making it detailed, but there are plenty of others too even if you settle for English-only errors.
Yeah, given the choice between a translated message that is utterly useless and an English-only informative error dump, I'm gonna bet everyone remotely capable of dealing with the error would choose the latter every single time.
> everyone remotely capable of dealing with the error
Already this excludes most people who will see the error. Meanwhile, experts already have more sophisticated tools available to them like the event viewer. Obviously that wouldn't be useful in a situation like this where boot-up is blocked, but like I mentioned previously, there's only so much diagnostics you can reliably provide on a system that is in the middle of crashing.
Well, you can't screenshot a BSOD, so I assume you mean they would take a photo of it (which would be awkward as hell just getting a high enough res copy from them that I could read it) or more likely they would just tell me the error message verbally and leave out all the advanced details. So I'm not sure this is really that good of a justification to make the BSOD screen more detailed.
A BSoD screen could display a QR code that encodes a URL at microsoft.com that unpacks all the data after the '?' in the URL, and provides an option to change culture and locale, including links to support pages known to be associated with that error, and a button to Bing-search for more information on the error. They could then copy-paste the URL for that page using their phone into a text or email to their favorite technical adept.
I meant that the QR code should be larger, to encode the additional information shown after the registry patch as additional URL parameters. On re-reading, I did not make that clear.
Nobody is decoding QR codes by hand, so it isn't that big a deal to go from "https :// www.windows.com / stopcode ? code=ACPI_BIOS_ERROR" to "https :// www.windows.com / stopcode ? code=ACPI_BIOS_ERROR & p1=0000000000000002 & p2=FFFF9A0..."
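And the jump in QR complexity is tiny. A rough sketch using the third-party Python "qrcode" package, with placeholder parameter values just to get a realistic length:

    # Compare the QR version (symbol size) needed for the short vs. long URL.
    # Assumes the third-party "qrcode" package; the p1..p4 values are placeholders.
    import qrcode

    short_url = "https://www.windows.com/stopcode?code=ACPI_BIOS_ERROR"
    long_url = (short_url
                + "&p1=0000000000000002&p2=FFFF9A0000000000"
                + "&p3=0000000000000000&p4=0000000000000000")

    for url in (short_url, long_url):
        qr = qrcode.QRCode(error_correction=qrcode.constants.ERROR_CORRECT_M)
        qr.add_data(url)
        qr.make(fit=True)
        print(f"version {qr.version} for {len(url)} chars")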
Ah, you're right. I didn't realize the bugcheck parameters weren't included in the code, that's definitely something which would be helpful. However then you still might run into the second problem of not being able to reliably render it in a crashing state. That's the only reason I could think why they would not include those parameters in the QR.
> If it is for the user then being able to google the exact error
And they already provide just enough information to be able to google the error. I am just opposed to adding more detailed information, the kind of information that's only relevant for experts and is not guaranteed to be available in the middle of a crash.
> In other news most programming languages aren't localized for example.
Programming languages aren't consumer products like operating systems are
My HP N40L "MicroServer" only supports up to 8GB RAM, but I have it running now with 16GB. Sometimes FreeNAS will boot saying there is 16GB, and other times it will boot saying there is only 8GB. I'm unwilling to risk losing that old reliable system via a potentially dangerous BIOS hack, but it sure would be nice not to have to reboot it until I win the lucky 16GB jackpot.
I would be concerned that there’s some electrical reason why it’s detected intermittently, and that the extra 8GB could easily disappear while the system is running.
Low-level software is responsible for most memory issues, but there are definitely electrical reasons, too.
For example, I have a devboard that only supports 1 GiB of DDR2 RAM, but it's a 64-bit system and the memory controller on the CPU was supposed to support at least 2 GiB of RAM. Meanwhile, another board that uses the identical chip runs 2 GiB of RAM without problems.
The engineers of the devboard briefly explained that the problem was electrical: the memory controller itself has inadequate drive strength, and adding more RAM chips would increase the load on the DDR bus and destabilize the system. On the other hand, the other board had a better PCB layout, so the problem did not occur.
I have a Mini-ITX board, and I noticed that if I activate XMP the board ceases to work, but it works with 4 GiB of RAM at standard frequency.
So it seems memory is a general problem among Mini-ITX boards? Perhaps the reason is that these boards have less space available for routing, fewer layers, and target a lower price, so they tend to have worse electrical characteristics?
Quality of the PCB and the number of layers definitely play a factor. I'm sure it isn't limited to ITX though. I have noticed compatibility differences between super robust Intel ITX boards vs thin and wobbly ECS ITX boards, with the ECS board having more issues, quirks and what not.
I had something similar with my 2011 MBP. I had it running just fine with 16GB of ram even though it officially only supported 8GB. After having the main board replaced because of a defective graphics card I could not run it with more than 8GB without it crashing continuously.
I once had a defective 512MB RAM module. Usually it would be detected as a 256MB module and it would work fine. Sometimes it would be detected as a 512MB module, but corrupted data would crash the system within a few minutes.
You used to be able to scan memory consistency and pass a kernel parameter to skip bad regions of memory. I ran a system for years with known bad memory that way.
...and it probably already has, multiple times, but you haven't noticed because you've not used that much RAM.
The BIOS does a (relatively) quick memory check in the POST to detect how much memory is actually available, basically by writing a series of patterns to all addresses and then reading them back to confirm; some desktops have a "fast boot" option which mostly skips it (I believe it's something like testing one byte per 4KB instead of every byte), and servers usually have a much more thorough test that can take many minutes.
The best way to check whether the memory is functional when 16GB is detected is to run a memory tester like MemTest86.
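For a feel of what those tests do, here's a toy sketch of the write-patterns-then-read-back idea, with a stride parameter to mimic the "fast boot" sparse check. It runs in user space on an ordinary buffer, so it's nowhere near as thorough as MemTest86:

    # Toy POST-style pattern test: write patterns, read them back, compare.
    # stride=1 touches every byte; stride=4096 mimics a "fast boot" sparse check.
    def pattern_test(buf: bytearray, stride: int = 1) -> bool:
        for pattern in (0x00, 0xFF, 0xAA, 0x55):
            for i in range(0, len(buf), stride):
                buf[i] = pattern
            for i in range(0, len(buf), stride):
                if buf[i] != pattern:
                    return False
        return True

    # User-space demo on a 64 MiB buffer; a real tester walks physical memory.
    print(pattern_test(bytearray(64 * 1024 * 1024), stride=4096))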
Are you sure about that? I bought mine with 16gb a number of years ago and still use it as a NAS + plex server (zfs loves to chew through ram). The wikia page also talks about it: https://n40l.fandom.com/wiki/Memory
I've done the same thing with a Synology NAS. Both Synology and the Intel CPU datasheet say it only supports 8 GB of RAM, but put 16 GB in it and it will use it.
The Synology web UI only shows 8 GB and it breaks the graph display in the memory usage monitor (although when I first installed it, it used to show 16 GB but after some update, or random reboot, it stopped. Haven't tried rebooting to see if it's just random)
I got two from the same guy on craigslist and they already had 16GB and it works every time. Not sure what he did but I'm glad I wasn't the one fiddling with the BIOS.
As kstenerud said, https://n40l.fandom.com/wiki/Memory discusses this issue. More accurately, the comments discuss it, with some reporting your issue. I am among those who've never had problems with detecting 16GB; I use two KVR1333D3E9S/8G.
Love this. I've got an old Pentium box I thought I was going to have to hack up the BIOS to get larger IDE disks working, but don't you know it the motherboard manufacturer (AOpen) still had the BIOS update files on their site and after updating I could boot 32 GB disks on a board that only claims to support up to 8 GB. I think the lesson is, at least check for BIOS updates before you throw in the towel :) Maybe not everyone needs to go as far as hacking ACPI tables in memory, but it sure sounds like fun.
In that era, you could often just ensure that the BIOS could load the bootloader/kernel by placing it into the region still reachable by the BIOS. OSes like Linux (and anything Windows NT based) practically completely disabled[1] the BIOS anyway, and their own IDE drivers would be recent enough to understand how to address the full disk.
It was not uncommon to have a small boot partition at the beginning of the disk for that purpose.
[1] Bit of a simplification. In reality, the BIOS being 16bit real mode code meant that you had to jump through very elaborate hoops if you wanted to use it in your protected mode OS in any way, and then for questionable gain.
Oh yeah, definitely. I could get it to boot by marking the disk as 8 GB in BIOS, but this had side effects:
- I'm using CompactFlash as many do, since it's fast, cheap, and reliable. Some of my CF cards are smaller than 8 GB, and if I hard code it I can't boot with those.
- I actually do run OSes on that machine that thunk out to BIOS (Win98 for example). If I really wanted a computer for useful things, no doubt I'd run Linux on it, but all the same, I've got plenty of smaller computers with far more power (even an RPi is much faster).
(FWIW: autodetect properly detected the right parameters, it just locked up on boot. My guess is it was some kind of simple integer overflow bug or something.)
> It was not uncommon to have a small boot partition at the beginning of the disk for that purpose.
Still had to do that a few weeks ago, because on a Dell R720 GRUB did not want to boot my zfsonlinux rootfs on a 6TB pool - it's either the Dell HBA controller, GRUB or some other limitation, but once you go beyond 2TB or 4TB disks I always run into strange behaviour.
EFI doesn't solve it directly, but GPT does. Regular MBR can't handle disks larger than 2TB. Of course, when running EFI you always want to use GPT. Linux might be able to handle >2TB MBR somehow, but in doing so it might be confusing your BIOS.
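The 2TB figure falls straight out of the MBR on-disk format: 32-bit LBA fields and 512-byte sectors.

    # Why classic MBR tops out at 2TB: 32-bit sector counts, 512-byte sectors.
    max_bytes = (2 ** 32) * 512
    print(max_bytes, "bytes =", max_bytes / 2 ** 40, "TiB")  # 2199023255552 bytes = 2.0 TiB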
I remember running into this back in the day on one of my Windows NT boxes. Knowing that NT only uses the BIOS during the boot process, I just installed NT on a partition that was below the 8GB limit. Once the system booted the HAL communicated with the hardware directly and not through the BIOS. So it saw the full size of the drive, and I was able to create a second partition to use the full drive.
Yep, this was one of the reasons that a boot partition for Linux and other operating systems became so common back in the early 2000s. Disks outpaced the hardware, and a lot of systems didn't support reading from further into the disk. Operating systems could use drivers to talk directly to the hardware and get around that limitation once booted, but everything they needed up to that point had to be available early on the disk.
The workhorse oldie-but-goodie A1278 MBP mid-2012 non-Retina supports 16 GiB and two SSD's (one in the optical bay, and one in the HDD bay).
The Lenovo T480 supports 64 GiB. A 13" hackintosh-friendly laptop with:
- dual m.2 slots
- WQHD 2560 x 1440
- Thunderbolt 3
- water-resistant keyboard with drains
- 9 hours of battery life with the second, extended battery
- officially user-serviceable parts/guides
- & 64 GB!
The iPhone 6S, even with the headphone jack, is IP67 in all but name, but Apple didn't sell it as a feature... it has all the gaskets of the 7, but supposedly the headphone jack was an issue... mine's been in the shower a few times for YouTube morning news and still works.
macOS and most apps have nice UX and mostly work well together. Docker usually works on top of the built-in macOS intel-origin hypervisor. (VMware Fusion is another $olid option for $$; had problems with VirtualBox.)
What's neat about the T480 is there's a utility on the Windows partition to flash the BIOS startup boot logo from the red "Lenovo" to whatever you want.
A true hacker. And a wonderful read. Reminds me that the ability to really understand and control your system is one of the biggest achievements and power of Linux and FOSS.
> I clearly needed to find the four error parameters that used to be displayed with the blue screen on older versions of Windows. Windows 10 hides all the information by default
A somewhat disappointing trend in user interface design…
About six years ago I updated a laptop to 16gb of RAM that only officially supported 8gb, and I remember thinking "the greedy manufacturers lied! It works fine!" However, I was running Linux, and now upon reading this I wonder if I would have had the same issues if I had tried booting into Windows.
My old 2011 MacBook supports only 8GB officially - though the Intel chipset onboard will support 16GB - It's run happily with 16GB of RAM since 2012...
Oh yeah, I actually might have had that same model of MacBook, and I also think I put 16gb in there with an SSD; gave it to my brother in law. Amazing how well those things can hold up long term.
I remember being 14 years old playing with my NEC PC because IRQ conflicts caused it to reboot at unexpected times. My dad might have fixed it, but he didn't have time so I had to rely on myself if I wanted to play Warcraft ever again. I didn't really know what I was doing, but it was the first time I took something apart trying to figure it out. I explored BIOS and all the hidden parts that Windows tries to hide from casual users. Looking back, that's really where my perseverance in working through hard problems began.
I was expecting this to be "Linux roolz, Microsoft droolz," but I really enjoyed this. It's been a while since such simple, fun hardware exploration came across my desk.
I have a kind of opposite problem on an old Lenovo laptop where Windows boots just fine every time but any Linux distro I try randomly fails to boot with an ACPI error.
It's not a RAM issue though. AFAIK there's some bug in the UEFI/BIOS even on the latest version which Windows handles just fine but Linux randomly doesn't. I can boot Linux every time if I disable ACPI at boot but then I lose power management and battery percentage. Latest kernels (can't remember the version exactly, 4.somethingoranother) should handle the issue according to some forum posts I found but it doesn't for me.
I wonder what the Mac equivalent 'fix' would be? The 2010 15" MacBook Pros only supported 8GB, whereas the 13" models supported 16GB. They used the same i5 chip.
Seriously. I tried putting 16GB of RAM into the 2010 15" MBP, and it would boot into Linux, but not macOS.
Running memtest it was clear that there were issues accessing RAM in various segments, but I didn't have the knowledge to muck around with the ACPI tables to see if that would fix it.
Back in 2012 I had a white MacBook. Apple said it only officially supported 4GB, but there were numerous reports of people getting it to recognize 8GB without issue. Back then I was still in high school and only semi-technical, so I thought it was some sort of black magic.
Gather 'round, young 'uns: the 1998-vintage PowerBook G3 "PDQ" officially supported only 192 MB of RAM. Unofficially it was possible, if you bought the right DIMMs, to upgrade it to 512 MB. Which I did, just for laughs. (It was also possible, using XPostFacto, to upgrade it past the official OS X 10.2.8 maximum to 10.4.11. But it was so slow it was practically unusable.)
Nice find! Reminds me of the Dell Vostro 230. It will "officially" support 8GB memory but if you buy 8gb "low density" DIMMs it will accept 16GB just fine. Not nearly as difficult as your setup but similar in nature.
I had a similar adventure on the Vostro 230, followed by editing the BIOS to activate the vmx flag so that I could do hardware virtualisation with the Xeon processor that it shipped with!
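(If anyone wants to sanity-check a machine before going that far, here's a quick sketch of looking for the hardware-virtualisation CPU flags in /proc/cpuinfo. Note this only shows what the CPU advertises; whether the BIOS actually has it enabled is a separate question, and KVM will complain in dmesg if it's locked off.)

    # Check /proc/cpuinfo for virtualisation flags (vmx = Intel VT-x, svm = AMD-V).
    flags = set()
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags.update(line.split(":", 1)[1].split())
    print("VT-x:", "vmx" in flags, " AMD-V:", "svm" in flags)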
Since it's kind of related to something I wondered about - can you spoof a device ID with those tools, so an operating system just eats it up? I know it's possible with VMs and device passthrough.
My every day workstation is a Dell Studio 540 from 2009 that I've upgraded to the max in every possible way but without changing the motherboard and case.
The only upgrade I haven't pursued (because I thought it was impossible) is bumping up the RAM beyond 8GB. If anyone has experience bumping 2009 era Dell motherboards (in particular the M017G motherboard) beyond 8GB RAM and has any tips to offer, please let me know!
If I could bump up the RAM to 16GB, I might be able to keep using my beloved desktop for another 10 years :)
Looks like it's an Intel G45 chipset, so 16GB should work. However, you need to be careful when selecting memory - there can be a maximum size for the chips on the memory module (there are 8GB modules with 8x1GB chips, or some with 2x2GB chips, etc.). We had some trouble upgrading older Dells with the newer modules that have fewer chips. It's called low density, I think.
I think I would also run the memtest under the VM to make sure that the memory map is actually all reachable within Linux and doesn't mess anything up.
I don't know much about kernel memory management, but if he was booting memtest86+ bare metal, I wouldn't be surprised if he wasn't triggering the same edge cases Linux would - which is why I'd prefer to make the actual system try to allocate as much memory as possible, use it, and see if the results are sane.
I'm the author of the blog post. My first time here on Hacker News...very cool that it was picked up! I did my memory testing inside of Linux using memtester. I honestly want to do more thorough testing at some point to prove that it's actually working okay. If some of the PCI hardware is still paying attention to that overlapped range, weird stuff could definitely happen.
My MacBook Pro 13 mid-2010 "officially" only supports 8GB RAM maximum - but I run 16GB - thanks to the research by other folks on the internet related to the Intel hardware inside. The particular brand & model of sticks had to be fairly specific, but I can confirm it's been rock solid for 3 years and certainly extended its useful life for me when money was tight.
> don’t know if the motherboard is actually hardwired to look at those addresses for PCI stuff or what. All I know is it passes my RAM tests.
I guess you could allocate a humongous matrix with random cell values and do a CRC or something (once while you're filling it in, and again from square 1) to make sure the values are indeed what you think they are.
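Something along those lines - a user-space approximation, assuming the kernel actually hands out pages spanning the suspect range once you allocate enough:

    # Fill a large buffer with pseudo-random data, hashing it as it's written,
    # then re-read and hash again; mismatching digests would point at flaky RAM.
    import hashlib, os

    SIZE = 12 * 1024 * 1024 * 1024   # try to cover most of the 16 GB
    CHUNK = 64 * 1024 * 1024

    buf = bytearray(SIZE)
    write_hash = hashlib.sha256()
    for off in range(0, SIZE, CHUNK):
        chunk = os.urandom(min(CHUNK, SIZE - off))
        buf[off:off + len(chunk)] = chunk
        write_hash.update(chunk)

    read_hash = hashlib.sha256()
    view = memoryview(buf)
    for off in range(0, SIZE, CHUNK):
        read_hash.update(view[off:off + CHUNK])

    print("OK" if read_hash.digest() == write_hash.digest() else "MISMATCH")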
With respect to booting with a conflict in the system memory map: the Windows behavior is the right one, IMHO. An operating system shouldn't just ignore self-contradictory system configuration information and try to continue anyway. Who's to say that the resulting behavior is the right one? You're in undefined behavior territory. Failing fast when encountering invalid state, as Windows does, leads to more robustness in the end: fail-fast gets problems fixed, not swept under the rug.
Fail-fast only has a hope of getting the problem fixed if it occurs with popular software within a few months of product launch. In the case of using a motherboard with DIMMs that didn't exist until the MB was obsolete, you're never going to get the vendor to fix it, unless you have a very expensive long-term supply and support contract of the sort that doesn't exist for consumer parts.
I don't think it's a bug, it's a feature (no pun intended).
If there is a problem with the values reported by the BIOS, it's a good thing Windows does not continue. BIOS issues (such as a RAM address out of the reported range) can lead to unpredictable behaviour. Troubleshooting such issues is next to impossible.
If I were trying to explain to someone who knew computers but not cars why I would never buy a car with an engine swap, this is exactly the post I would point them to.
Making a joke about how miserable it is for some of us to deep-dive into systems to get stuff working. I wasn't expecting any upvotes, but it was neither a rude nor an idiotic thing to say, so y'all need to chill.
Sure. I commented before I read the article. Now I see. In fact the BIOSes I ended up using during the last years of my BX PC (before I bought a laptop) were community-patched. Now I wonder if it is possible to patch my Dell E5500 to support more than 4 GiB, or if it's actually an unbreakable limitation of its chipset...