Meltdown Update Kernel doesnt boot

hansendc · on Jan 10, 2018

This arch/x86/events/intel/ds.c fix is unlikely to have rendered too many things unbootable. I missed ds.c entirely when doing the original implementation. I can't fault the nice folks at Canonical for mis-merging a tiny hunk like this. It really only affects pretty specific hardware anyway.

MBCook · on Jan 10, 2018

What hardware does it effect?

fancyfacebook · on Jan 11, 2018

From the bug it looks like a smattering of Xeons, Celerons, i5s, not sure what's "specific" about it from first glance.

JackNichelson · on Jan 11, 2018

Pretty specific one.

ASalazarMX · on Jan 11, 2018

Take my hand, I'll guide you back to Reddit.

_pfxa · on Jan 10, 2018

Would this entire Meltdown/Spectre thing count as the biggest mess-up of computing history? When yesterday the PoC repo was posted here, the spy.mp4 demo video gave me some chills. And now I can't update my OS before making an installation USB because Canonical can't just follow Linus' releases. Thanks.

bArray · on Jan 10, 2018

>Would this entire Meltdown/Spectre thing count as the biggest mess-up of computing history? When yesterday the PoC repo was posted here, the spy.mp4 demo video gave me some chills.

It must be up there amongst the greats, probably with the "halt and catch fire" op code. Normally they just patch this stuff with microcode and never really tell anybody, this time that won't work.

I'm not entirely convinced it was a mistake at all (dons tin foil hat), Intel have been making suspicious design decisions to their chips for a well now (think this, Intel Me, hidden instructions, etc). It seems clear to me that this security by obscurity approach is quite frankly crap.

>And now I can't update my OS before making an installation USB because Canonical can't just follow Linus' releases. Thanks.

Linus' releases need some sort of buffering before they can be considered stable, often distributions will apply their own patches on top. Also consider the scenario where Linus releases a bad kernel and no testing has been performed before rolling out to all Linux users.

throwawayfinal · on Jan 10, 2018

I think it's absolutely unreasonable to imply that this was intentional. Besides the massive amount of complexity these systems have, there are plenty of "legitimate" places to hide backdoors, instead of in a performance architecture decision.

Keep in mind that whatever "evil agencies" would have asked for this would most likely find themselves vulnerable, and nobody would sign off on.

I do agree, however, the "security by obscurity approach is quite frankly crap". The fact that even large corporations (not the big 5) can't even get away from ME speaks volumes about why this is a bad idea. Facebook isn't the only company with important data.

dataflow · on Jan 10, 2018

> I think it's absolutely unreasonable to imply that this was intentional.

Amen. It blows my mind that some people think clever techniques like speculative or out-of-order execution must've somehow how nefarious intentions behind them. Come on HN...

Animats · on Jan 11, 2018

The Intel Management Engine is a backdoor. Speed variations in speculative execution are an inherent property of the technology. Until recently, few people thought this was exploitable, and it took a lot of work to figure out how to exploit this.

andrewflnr · on Jan 11, 2018

You do realize those are ideal properties for a backdoor, don't you? If you were writing the spec for a dream backdoor, you would write that down. The only way you could improve it would be "everyone thinks it's impossible, and they never figure it out."

netsharc · on Jan 11, 2018

This backdoor is too tricky to be a backdoor. A simpler backdoor would be "Call this opcode 45 times, followed by another opcode 20 times, and you will have activated backdoor mode where these opcodes are now available"...

jijji · on Jan 11, 2018

the ideal properties of a backdoor were visualized to me the day i hacked into an author of a largely distributed piece of smtp mail server, only to find sitting in his home directory an unpublished integer overflow exploit written by him years earlier for a version of the software that is currently in wide distribution...

dvfjsdhgfv · on Jan 11, 2018

That's close to perfect, indeed. The drawbacks in this scenario are that (1) not everybody runs an SMTP server, (2) if it's open source (and if it's very popular, then it is), some other smart people will look for the bug and publish it for fame. That's quite different from a backdoor built into a processor (although I really doubt Intel was really involved in any shady practices, it looks like they were not smart enough).

paulie_a · on Jan 11, 2018

Judging from the numerous decades old bugs recently found, the concept of many eyes needs to die.

And in the case of SMTP, it's basically a pinata of bugs for the last 30 years regardless of platform

sigi45 · on Jan 11, 2018

it is still way more likely a reasonable design decision for performance reasons than it is for a backdoor.

Alone the risk would not be worth to intel. Do you really think, nsa has enough money to compensate for this backslash and newscoverage?

Animats · on Jan 11, 2018

Yes, though it's moderately hard to exploit against a specific target. It's more useful for bulk attacks - getting everyone who visits a specific web site to run a DDOS attack, or ransomware.

eli_gottlieb · on Jan 11, 2018

If any quantity about what the processor does, outside the intended effect, has a different distribution when X happens versus Y, then the distribution of that quantity is exploitable. Period.

Any nonuniform distribution in any quantity that is not part of the spec is exploitable!

_0w8t · on Jan 11, 2018

It is only exploitable if one can measure the difference and extract useful information. Until Spectre guys discovered the double read technique, the expectation was that speculative execution did not allow to extract useful information besides extremely artificial theoretical cases.

galvin · on Jan 11, 2018

Adding a backdoor seems unreasonable but they may have chosen performance over security. Even if this individual bug wasn't intentional they are responsible for setting their priorities.

Filligree · on Jan 11, 2018

There are CPUs available which choose security over performance. They aren't made by Intel, but you can buy them, and they're even cheaper.

Oh, you don't want to do that?

alphonsegaston · on Jan 11, 2018

Well, I read somewhere the other day that this form of error/attack was conceived of in the academic literature back in 1992. I won’t believe it’s intentional without evidence in that direction, but this is conceivably the kind of obscure/complex attack you’d expect of a state actor.

throwawayfinal · on Jan 11, 2018

This has been a known issue in xbox 360 hardware since about 2010.

It just keeps popping up, someone finally thought to weaponize it.

oblio · on Jan 11, 2018

>It just keeps popping up, someone finally thought to weaponize it.

Someone published its weaponization, you mean :)

PatchMonkey · on Jan 11, 2018

Those undocumented features & byte code? HAP mode - something the NSA doesnt want you to know exists, but that they had put into intelME from Skylake onward.

But yet and still we found out. So yes, this security through obscurity approach is terrible (with a code embargo being the obvious exception).

They only update microcode when they have to. When doing otherwise risks... Well, this kind of mess.

You dont wanna know how many times I've rebuilt my gentoo system chasing after retpoline kernel & gcc builds that just... Break everything.

It should be interesting to see how it all develops

_pfxa · on Jan 10, 2018

Yeah ME is a scary thing also. WRT Linux, well my Xubuntu 16.04 (Xenial) is on 4.10 and no new kernels are available to ma ATM. So if they're going to patch my OS, that's probably going to be a backport to that version, not the latest release integrated to my OS version. I guess that's what caused this bug too, although I admit I only skimmed the conversation linked.

jlgaddis · on Jan 11, 2018

They've put out updates for 4.4 and 4.13 (HWE) for 16.04, if that helps.

See https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SpectreAn...

_pfxa · on Jan 11, 2018

I'm guessing that updates for LTS kernel will come later. I don't know if I can update to 4.13.

lscotte · on Jan 11, 2018

LTS for xenial is 4.4, and patches have already been released for it.

lscotte · on Jan 11, 2018

Note there won't be fixes for 4.10, as it's reached EOL for Ubuntu, so you'll need to move to the 4.13 patched kernel.

drb91 · on Jan 11, 2018

I think that title is currently held (deserved or not) by null pointers.

Tbh it’s not the most meaningful of statements, but it’s food for thought.

cratermoon · on Jan 11, 2018

A null pointer doesn't hold a reference, though.

drb91 · on Jan 11, 2018

It does if you have something at 0x0.

Or, to put it another way, I have no clue to what you're referring--what do references have to do with "The Billion Dollar Mistake"[0]?

[0]: https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retra...

EDIT: my apologies, that joke was actually pretty good.

MBCook · on Jan 11, 2018

I think it was supposed to be a joke.

drb91 · on Jan 11, 2018

Yes, thank you.

andrewflnr · on Jan 11, 2018

I think it was supposed to be a pun on "hold". As for the word "reference", your own link uses it.

drb91 · on Jan 11, 2018

I use it in a later comment! I was confused about the word in context.

However, I completely missed the pun. Cheers :)

cratermoon · on Jan 12, 2018

Cheers to you :)

pubby · on Jan 11, 2018

What modern systems even map memory to 0x0? Doing so breaks the C standard, among other things.

steveklabnik · on Jan 11, 2018

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc....

> On system reset, the vector table is fixed at address 0x00000000.

Also, I'm not an expert on the C standard, but in my understanding, it doesn't "break" it. That is:

* Address 0 and the null pointer are distinct

* A 0 literal treated as a pointer is guaranteed to be the null pointer

* The null pointer is not guaranteed to be represented by all zero bits

* If you get a pointer to address zero via pointer math or by other means than a 0 literal, you can still access address zero.

DSMan195276 · on Jan 11, 2018

Yeah, the NULL pointer is a pretty weird part of the standard - it makes some sense, but leads to weird situations. That said, I think your last point needs a bit of clarification. What you've described is actually already impossible per the standard - with a few exceptions, it is illegal to use pointer arithmetic to address past the size of an allocated object (Because those pointer values may not even be valid for the architecture), so it is technically impossible to use pointer arithmetic on a valid pointer to end-up with the NULL pointer - it would require calculating an address outside of the current object.

So the question of what happens when you actually do that is purely up to your compiler and architecture. In most cases, if you manage to get the NULL pointer value through pointer arithmetic, it will still compare equal to the 'actual' NULL pointer and treated as if it was a literal 0, so that doesn't allow you to get around NULL pointer checks. The only situation where it really matters if the NULL is only known at runtime, since that may have implications on optimizations. Since dereferencing the NULL pointer being undefined behavior, the compiler can remove such dereferences, but it can't remove the dereference completely if it can't prove the pointer is always NULL. There is nothing preventing the compiler from adding extra NULL checks in that aren't in your code however, which would foil the plan of generating a NULL pointer at runtime to dereference it. So unless your compiler explicitly allows otherwise, you cannot reliably access the memory located at the value of the NULL pointer - as far as the standard is concerned, there is no such thing.

Talking specifically about the ARM vector table, that largely works ok because only the CPU ever has to actually read that structure, normally you C code won't have to touch it (If you even define it in your C code. The example ARM programs define the vector table in assembly instead). If you did ever have a reason to read the first entry of that table from C though, you could potentially run into issues (Though I would consider it unlikely, since the location of the vector table isn't decided until link-time, at which point your code is already compiled).

On that note, it's worth adding that POSIX requires NULL to be represented by all zero bits, which is useful. Lots/most programs actually rely on this behavior, since it is pretty ubiquitous to use `memset` to clear structures, and that only writes zero bits.

(Sorry for the long comment, I've just always found this particular part of the standard to be very interesting)

steveklabnik · on Jan 11, 2018

Oh not at all! Thanks for this; I also find it very interesting, and was glad for the correction.

drb91 · on Jan 11, 2018

Again, I am unsure how this relates to "The Billion Dollar Mistake" I linked above and was referring to.

I am not sure your point. The reason modern systems don't map memory to 0x0 is because NULL pointers exist. It is a reflection of a leaky abstraction equating pointers to references. That leaky abstraction has (or so the argument goes) caused >$1B in software bugs.

The other mindset would be "malloc always has to allocate memory or otherwise indicate failure; you cannot cast from a integer to a pointer; you cannot perform arithmetic on a pointer to get a new one; you must demonstrate there are no hanging references when freeing". This is essentially what rust did for safe code.

The reason why I indicate so much skepticism is that rust is the first time I've seen the problem solved well in the same problem space as C. Ada has problems of its own. It's more about how small assumptions can have massive economic (and health, and safety, and ethical) consequences. Certainly comparable to a speculative execution bug leaking memory in an unprotected fashion--in both cases the bugs find their way through human error in evaluating enormously complex systems for incorrect assumptions :)

hornetblack · on Jan 11, 2018

Web assembly.

Llvm won't use it for anything. (I think it starts putting things at 8). Trying to access it explicitly in C will generate `unreachable` instructions.

paulddraper · on Jan 11, 2018

"The Billion Dollar Mistake"

https://www.lucidchart.com/techblog/2015/08/31/the-worst-mis...

ggm · on Jan 10, 2018

"biggest" by number of affected CPUs? very possibly, yes. the march of time has that effect: there are more cpus potentially and actually affected, worldwide, than at any other time in history.

"biggest" by net financial loss to a single entity? I dunno. How much did that failed NSA launch cost the state again?

semi-extrinsic · on Jan 11, 2018

It's unclear whether it failed or if Northrop Grumman want us to think it failed; since the second stage actually did one full orbit with nominal performance, they might be trying to slip one past us. We'll know in a few weeks time I suppose, every satellite tracking enthusiast will be looking for it.

Y_Y · on Jan 10, 2018

> failed NSA launch

I didn't hear about that one

mbroncano · on Jan 11, 2018

Maybe this https://www.theverge.com/platform/amp/2018/1/9/16866806/spac...

ygra · on Jan 11, 2018

To which no government agency is officially attached and the failed thing is more a rumor. SpaceX said that on Falcon 9 side everything worked like it should and NG says they cannot comment on classified payloads. So there's literally no information.

eltoozero · on Jan 11, 2018

Link to spy.mp4?

Found it: https://www.youtube.com/watch?v=RbHbFkh6eeE

brokenmachine · on Jan 10, 2018

I watched that spy demo. How does it know what memory location contains the password being typed?

KallDrexx · on Jan 10, 2018

I'm assuming if you know either common byte patterns or string patterns you might be able to figure out where the password string is being allocated and watch that area of memory for changes.

brokenmachine · on Jan 10, 2018

Not sure if Meltdown is the same, but I read that Spectre can recover memory at about 10kb/sec. So it wouldn't be very efficient to scan the entire memory for a known pattern.

I suppose if there was an exploit targeted at a specific program, it would be possible to work out what location the secrets are stored in?

monk_e_boy · on Jan 11, 2018

I leave my machine on for weeks at a time. If something was scanning the memory even if it failed to find the location of my password 99.9% before it is erased eventually it will be lucky and get it.

brokenmachine · on Jan 11, 2018

Good point. I was only thinking about a single run but that makes sense.

Zakharov · on Jan 11, 2018

According to the paper, Meltdown can recover memory at about 500 kb/s

hvidgaard · on Jan 11, 2018

It's still "only" 1.7gb/hour. If programs follow reasonable security practices, it shouldn't be possible to stumble upon secrets in the memory. This underlines the importance of things such as ASLR and not holding your key in memory longer than needed and rotating them as well.

AstralStorm · on Jan 11, 2018

Once you know the location, if the process is not randomized, you can extract from that location. You may assume some things about implementation (e.g. libstdc++ or libc++, glibc memory allocator, general compiler version)

Additionally some hardening methods like stack protector make stack allocated objects stand out a lot from register values.

_0w8t · on Jan 11, 2018

Meltdown is fast enough to learn everything about layout of data structures in kernel or other programs and then use it to extract information from particular areas holding the keys.

jnordwick · on Jan 11, 2018

It appears to be known to the exploit. I feel that this is being so overblown and that the exploits we are seeing require more info that something in the wild would have.

monk_e_boy · on Jan 11, 2018

Code in the wild would have access to all memory (slowly) so could eventually find the correct location.

Given that whoever writes is would also have access to the other program, they would have a lot more information on where to look in memory.

topmonk · on Jan 11, 2018

I would think it would be. The strange thing is the markets didn't react at all. They actually went up on January 4th.

Santosh83 · on Jan 11, 2018

Because this has largely remained theoretical, unlike Maersk or Equifax.

DarronWyke · on Jan 11, 2018

What are you talking about? We've seen working POCs since last week. This isn't "largely theoretical", this is an actively exploitable hole.

eric_b · on Jan 11, 2018

Meh, it's not really very serious in the average case. It's a lot of sky-is-falling rhetoric from the infosec community. Remember Heartbleed and how it was end-of-times bad? Yeah, turned out to be a non-event. Information disclosure bugs like this are difficult to glean useful information from in widely targeted attacks.

(Obviously if you have nation states or serious criminal organizations trying to breach you regularly, this is more serious)

DarronWyke · on Jan 12, 2018

You clearly haven't been paying attention or reading about how this works.

Heartbleed was touted as being bad by those that didn't read too far into it. You could scrape memory, sure. But it was always random fragments. This lets you make targeted address attacks. Force a process to use that memory space through a NOOP and now you can start scraping at will. Or you can just do an entire memory dump and pull things out in plaintext (like scraping Firefox passwords, which we've seen done already).

The only reason this isn't worse is it requires the ability to execute code on the machine. It has high (near absolute) impact, but low-to-moderate on the ease of execution.

rsync · on Jan 11, 2018

"Would this entire Meltdown/Spectre thing count as the biggest mess-up of computing history?"

This title is held by autorun.inf which has caused over 20 years of broken, vulnerable behavior and, AFAIK, is still going strong.

aaomidi · on Jan 10, 2018

Link to the demo?

elihu · on Jan 10, 2018

from https://github.com/IAIK/meltdown/:

https://cdn.rawgit.com/IAIK/meltdown/master/videos/spy.mp4

bringtheaction · on Jan 10, 2018

YouTube mirror for mobile users anyone?

gregsadetsky · on Jan 10, 2018

https://www.youtube.com/watch?v=RbHbFkh6eeE

katastic · on Jan 11, 2018

I think Y2K had more practical impact across the business world. There was genuine fear that it could cause an actual apocalypse with all major computerized systems failing, medical machines killing people, banks being affected and all money and debts disappearing overnight.

It wasn't that bad, because people took it seriously. But there were still tons of practical systems affected and billions of corporate dollars associated with fixing it.

So when you say "biggest mess up" you gotta define specific qualifiers. Because Meltdown/Specter is going to be solved by simply... buying a new CPU. (And retrofitting the old ones). So it consist of mostly a patch.

A BIG important patch, granted. But it's still just a patch. But some ATM's aren't going to start spewing money like they did on Y2K.

_pfxa · on Jan 11, 2018

I didn't know Y2K was that big of a deal! I guess I'll have to read a bit more about it, as it seems to be an interesting topic. Thanks!

tripzilch · on Jan 14, 2018

I can't find any reports on ATM's spewing money after Y2K, or was that a figure of speech?

trendia · on Jan 10, 2018

(Copying my instructions from another post).

If kernel 4.4 doesn't work, I recommend compiling the 4.15 kernel. (Note, however, that you may need to apply a patch to NVIDIA drivers).

I've done this on Ubuntu 16.04 LTS, 17.10, and Debian 8 so far this week. To compile, set CONFIG_PAGE_TABLE_ISOLATION=y. That is:

    sudo apt-get build-dep linux
    sudo apt-get install gcc-6-plugin-dev libelf-dev libncurses5-dev
    cd /usr/src
    wget https://git.kernel.org/torvalds/t/linux-4.15-rc7.tar.gz
    tar -xvf linux-4.15-rc7.tar.gz
    cd linux-4.15-rc7
    cp /boot/config-`uname -r` .config
    make CONFIG_PAGE_TABLE_ISOLATION=y deb-pkg

lathiat · on Jan 11, 2018

I wouldn't really recommend doing this, but if you really want to do this, it would probably be easier just to use the pre-spun mainline kernels: https://wiki.ubuntu.com/Kernel/MainlineBuilds

gphreak · on Jan 11, 2018

Exactly, and use 4.14. According to a recent comment from a kernel dev both kernels use the same patch approach. 4.4 and 4.9 are using a different approach that’s less ideal, less complete and apparently less tested.

trendia · on Jan 11, 2018

I can somewhat understand where you're coming from if the person doing it only wants something that works.

But for those who would be willing to risk breaking a few things to try something out, building a kernel is a worthwhile effort, and the meltdown / spectre bugs provide a perfect excuse to do it.

nykolasz · on Jan 11, 2018

Waiting a few days to patch my own servers... Not sure what is more dangerous right now: applying these rushed patches or the vuln itself.

snuxoll · on Jan 11, 2018

What distro are you running? I trust Red Hat to get kernel updates right the first time, I just patched externally facing servers and systems that handle PHI tonight with no issues (outside of one of my PostgreSQL servers showing a non-neglible increase in CPU usage, damnit Intel).

Of course, I also go into any updates with a rollback plan. ITIL sucks, but one thing it taught me was the value of well documented plans any time you make changes to production systems.

gphreak · on Jan 11, 2018

Same. RHEL/CentOS went without a hitch. The age of the kernel start’s to concern me, though.

According to the top comment in one of the posts in HN even 4.9 and 4.4 use a less ideal patch: https://news.ycombinator.com/item?id=16085672

I can’t really judge how much RH engineers are capable of fixing that kind of stuff in a kernel that’s officially out of support upstream.

Based on the general quality of RHEL/RHV I trust them to do the right thing, but I have no insight whatsoever in how kernel development actually works.

snuxoll · on Jan 11, 2018

Red Hat pays the salary of a couple kernel developers, backporting security fixes is a pretty big part of their job. Keep in mind, RHEL/CentOS 7 doesn't even use something newer like 4.4 - it's still on 3.10 because Red Hat guarantees a stable kABI throughout the lifetime of a release

IceyEC · on Jan 11, 2018

This seems like one of those things that is very hard to get right: https://forums.aws.amazon.com/thread.jspa?messageID=823033

snuxoll · on Jan 11, 2018

I'm guessing Xen PV isn't well tested by Red Hat anymore since most (if not all) of their paying customers still (stuck) using it are likely on RHEL5, which they haven't released a patch for yet due to that very reason.

I'm kind of shocked Amazon doesn't have something like Linode's Fennix, but you can always do an EBS snapshot of your /boot volume and revert it if a kernel upgrade breaks stuff.

jnordwick · on Jan 11, 2018

Unless you are running remote code there is no reason to patch.

_jcwu · on Jan 11, 2018

This is wrong and bad advice. All you need is a remote code execution vulnerability in PHP or so.

Only don't patch if your server is isolated and not connected to the internet.

nitrogen · on Jan 11, 2018

RCE is already kind of game over though. If you have RCE on the server, you can probably get to everything interesting without having to go through a slow side channel.

jnordwick · on Jan 11, 2018

If you have a remote exploit, there are much bigger issues to worry about. And since this is a timing issue, I'm not even sure that would be enough.

lunorian · on Jan 10, 2018

See this is why you wait a day or two before patching :)

Whitestrake · on Jan 11, 2018

If everyone waited a day or two before patching, this bug would simply be opened a day or two later than it was.

snuxoll · on Jan 11, 2018

How hard is it to just boot an older kernel and rollback the default? Before I even thought about patching sensitive systems tonight the first thing our IT director asked was if I had a rollback plan. The answer? "Yes, boot old kernel, yum history undo [transaction id], reboot".

Always have a backout plan when doing upgrades, I'm just glad EL and derived distributions have an easy way to do it with yum's transaction history.

tripzilch · on Jan 14, 2018

(tinfoil hat) That may be this kernel boot bug exists--the agencies are just trying to squeeze a few more days in of extracting data from prime targets, conveniently under the guise of public knowledge about exploiting it.

noncoml · on Jan 10, 2018

What was the problem?

All the notes say is that 109 fixes it.

jonathonf · on Jan 10, 2018

https://launchpad.net/ubuntu/+source/linux/4.4.0-109.132 :

    linux (4.4.0-109.132) xenial; urgency=low
    
      * linux: 4.4.0-109.132 -proposed tracker (LP: #1742252)
    
      * Kernel trace with xenial 4.4  (4.4.0-108.131, Candidate kernels for PTI fix)
        (LP: #1741934)
        - SAUCE: kaiser: fix perf crashes - fix to original commit

diff'ing the two changes it was this:

    > diff -u linux-4.4.0/arch/x86/events/intel/ds.c linux-4.4.0/arch/x86/events/intel/ds.c
    > --- linux-4.4.0/arch/x86/events/intel/ds.c
    > +++ linux-4.4.0/arch/x86/events/intel/ds.c
    > @@ -415,7 +415,6 @@
    >  		return;
    >  
    >  	per_cpu(cpu_hw_events, cpu).ds = NULL;
    > -	kfree(ds);
    >  }
    >  
    >  void release_ds_buffers(void)

plus it's here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1741934...

AdmiralAsshat · on Jan 10, 2018

They consider a bug that renders the OS unable to boot a "low" urgency!?!

geofft · on Jan 10, 2018

I don't think the urgency field is meaningful in Ubuntu, especially for post-release fixes - I believe a package migrates as soon as it's approved and passes builds/tests. The urgency field comes from Debian, which uses it to describe how many days a package should sit in Debian unstable before migrating to Debian testing, in the hopes that if it's buggy, people will file a bug before it migrates to testing. The Debian default is now "medium" (5 days) instead of "low" (10 days), but people with older tools tend to generate changelog entries that say "low". (And even in Debian, I don't think the field has any meaning for post-release updates; I think it only applies to unstable-to-testing migrations.)

mst · on Jan 11, 2018

Given this seems to be affecting a relatively small number of systems, that's not necessarily unreasonable. It might be very urgent for the people affected, but still low urgency for the userbase overall as compared to other problems.

Though it seems more likely to me that this bug was filed as a placeholder for the already-written patch and verification thereof, and the person filing it simply didn't bother with the urgency field since it wasn't really a bug-report-as-such.

lunorian · on Jan 10, 2018

Sorry fam - security issues are more important ¯\_(ツ)_/¯

arcticbull · on Jan 10, 2018

I think it's fair to say build 108 has no security issues :P in fact, it's the most secure one yet.

dingo_bat · on Jan 11, 2018

Yup. It's so secure that they don't even load userspace into RAM!

revelation · on Jan 10, 2018

The last time I used Ubuntu, they hadn't yet realized what vertical display synchronization is, and nobody had explained to them that you don't do your rendering on the framebuffer you're currently scanning out. So an occasional boot of the recovery kernel truly vanishes behind the plain broken display they expect you to put up with.

agumonkey · on Jan 10, 2018

any report of windows update messing with the bios rendering motherboard non bootable (powers on , but no post, not even an error beep)

otakucode · on Jan 11, 2018

For AMD chips, yes. Yesterday Microsoft announced they were suspending rolling out updates to certain AMD chips because it was resulting in non-bootable systems. I didn't read the technical details so I can not say whether it was specifically BIOS-related. Both a total non-booting state and BSODs were mentioned in the article I saw (from general press, so might have been garbage, sorry).

rincebrain · on Jan 11, 2018

Microsoft's claim was "Microsoft has determined that some AMD chipsets do not conform to the documentation previously provided to Microsoft to develop the Windows operating system mitigations to protect against the chipset vulnerabilities known as Spectre and Meltdown"[1], and their docs simply suggest asking AMD for more details[2].

So it sounds like it was probably specific chipsets and not CPUs, but who knows.

[1] - https://www.engadget.com/2018/01/09/microsoft-halts-meltdown...

[2] - https://support.microsoft.com/en-us/help/4056892/windows-10-...

cpncrunch · on Jan 11, 2018

It happened to one of my HP boxes that has an AMD chipset. After installing the update, windows 10 just hangs at the blue windows logo. Only solution is to turn the machine off and on twice, which then results in it undoing the update.

The Microsoft link you provide says "Microsoft is working with AMD to resolve this issue", so they're not just brushing it off and telling customers to contact AMD.

krutzger · on Jan 10, 2018

I was under the impression that Ubuntu would automatically revert to last good kernel of the new one fails to boot. Was I mistaken?

michaelt · on Jan 11, 2018

There's something close-but-not-quite-the-same: by default the grub menu is hidden with an instant timeout (IIRC) but if a boot fails to complete, on the subsequent boot the menu won't be hidden. Google 'recordfail' if you're interested in the details.

bproven · on Jan 10, 2018

AFAIK worse case you should be able to select the previous kernel in grub2 upon reboot. It just hangs after grub on this (bad) kernel I think...

ams6110 · on Jan 11, 2018

Yeah but if you have a couple of hundred machines that aren't booting.... That's pretty worst case.

jlgaddis · on Jan 11, 2018

That is pretty worst case. OTOH, it teaches the lesson about testing updates on a small number of hosts before rolling 'em out globally.

One has to learn that lesson at some point.

bproven · on Jan 11, 2018

yeah - I hear you. :( Still better than being totally hosed...at least there is some option

myth_drannon · on Jan 10, 2018

No, you have to manually restart, boot into grub and select the one that works

nicc · on Jan 11, 2018

This is a little bit sensationalist (is that a word?).

It's not like Windows that bricks your laptop. It's a handful of hardware config, and you can easily boot with an older kernel.

nabilt · on Jan 10, 2018

I just updated my Dell, but haven't restarted. Do we know how widespread the problem is and should I roll back the update?

jlgaddis · on Jan 11, 2018

There's a -109 out now. I'd install it before rebooting and probably expunge -108 entirely.

bArray · on Jan 10, 2018

I would suspend your machine until we find out more, I have the same problem.

Nacraile · on Jan 11, 2018

You all do realize that (at least by default) ubuntu will keep old kernel versions around, and you can choose to boot them in GRUB, don't you?

This is certainly a pain, but it's hardly the first time a broken kernel has shipped. Reasonable recovery mechanisms are in place.

qznc · on Jan 10, 2018

For me the 108 kernel boots fine, but breaks suspend. :D

mycpuorg · on Jan 10, 2018

Please look under "Meltdown - x86" section in GKH's (The Stable Kernel Maintainer) blog: http://www.kroah.com/log/

del_operator · on Jan 11, 2018

Sounds like one way to stop the bug. :P

lurr · on Jan 11, 2018

So do we make snide remarks about fixes not being tested like we did when microsoft also had issues fixing CPU level bugs?

user5994461 · on Jan 10, 2018

Did ubuntu botched an update again? Or is it the upstream kernel?

eecc · on Jan 10, 2018

Has Ubutu botched or did Ubutu botch... please ;)

ksenzee · on Jan 10, 2018

Muphry's Law strikes again!

jwilk · on Jan 11, 2018

https://en.wikipedia.org/wiki/Muphry%27s_law

If you write anything criticizing editing or proofreading, there will be a fault of some kind in what you have written.

nicc · on Jan 10, 2018

Can't spell Ubuntu right, though.

eecc · on Jan 12, 2018

Whatever, it’s a commercial name (might have even been the autocorrect.) I recommended — nicely, no offense intended — the correct grammatical form.

Stop staring at my finger. Please ;)

ask098 · on Jan 10, 2018

it's the fault of Intel, why don't they recall all the CPU? just like vehicle company

_Nat_ · on Jan 10, 2018

Are you suggesting that we recall the great bulk of modern CPU's? Like, literally gut everyone's computers, including those in data centers and running critical infrastructure, until replacements are eventually manufactured?

Or did you mean something else?

cr0sh · on Jan 11, 2018

I'd think it'd be reasonable to get a refund in some manner, provided you could provide proof-of-purchase for the CPU in question.

I wouldn't expect them to replace any CPU, unless it was manufactured recently and still being manufactured.

But a refund in some capacity? That's reasonable, I think. In the meantime, we would have to settle for the software fixes.

swozey · on Jan 11, 2018

Why would you need a proof of purchase? Intel can verify that it's its own unpatched chip out in the wild being returned for a recall. It doesn't matter if it's the original owner or a woman 15 owners down the line, it's still a loose security flaw out in the wild; who knows where or who whose network it will wind up. I don't need a proof of purchase when I bring my Ford in for its 10 recalls a year. I don't even need to care about which dealer I bring it into. It has to be fixed. They look at the VIN and if it's not marked as fixed they fix it.

Is there a market of 99%+ seemingly authentic fake Intel chips out there?

krick · on Jan 11, 2018

I think this is pretty weird thing to talk about, because it's kinda pointless. Do I think Intel ought to refund us somehow? Hell yeah I do, especially given the fact that I have bought a laptop with Intel processor recently and why even bother buying products with a warranty if any fatal design flows don't qualify as refundable anyway? Do I believe Intel will refund or replace something? Of course not, it's hardly even realistic. Even if they wanted to (which they surely don't) what kind of loan do they have to get to afford even a partial refund of every single Intel CPU out there?

Consultant32452 · on Jan 11, 2018

Boxed Intel processors carry a 3 year warranty. It certainly seems reasonable for everyone who bought a CPU within the last 3 years to expect a warranty replacement with the manufacturer defect fixed.

In the EU virtually every product comes with a 2 year warranty. So every CPU sold in the EU in the last two years should be replaced for free by Intel, even through OEMs.

I wonder what potential class action lawsuits Intel might be facing.

rlanday · on Jan 11, 2018

Any sufficiently complex CPU surely contains some number of defects, perhaps even serious security defects, just as any sufficiently complex piece of software contains bugs and security holes. I wouldn't be surprised if someone tries to sue Intel over this, or even if they win, but this is way outside the scope of what a warranty would traditionally cover, which in the case of a CPU would be hardware failure. If a warranty had to cover every possible defect, a bunch of people would be constantly trying to get free CPUs out of Intel every time they updated their errata:

https://www.intel.com/content/dam/www/public/us/en/documents...

Note that the cost of overly onerous regulation (e.g. requiring that every computer manufacturer replace these chips even though the problems can largely be worked around in software) is of course passed onto consumers.

flukus · on Jan 11, 2018

> but this is way outside the scope of what a warranty would traditionally cover

The warranty and any other legalese from intel is irrelevant here, this is about consumer protection laws of various countries that supersede an intel warranty. A serious post sale drop in performance would be enough for a refund on any computer purchased in many countries. In Australia if I bought a computer 6 months ago I'd be entitled to take it back to the store for a refund, then it's up to them to argue with dell and dell to argue with intel.

> Note that the cost of overly onerous regulation (e.g. requiring that every computer manufacturer replace these chips even though the problems can largely be worked around in software) is of course passed onto consumers.

Demanding that a product works and in lieu of that offering a replacement or refund is not overly onerous regulation, it's a very basic standard protection.

rlanday · on Jan 11, 2018

I’m not convinced that a software update slowing down your phone or computer a few percent while performing certain operations should automatically qualify you for a refund. It’s widely understood that keeping your computer secure requires installing software updates, and it’s even more widely understood that installing updates often slows down your computer. If that’s going to be your bar, I think an iPhone would have to sell for about $25,000 so Apple could afford to give you a replacement every year for the rest of your life.

Consultant32452 · on Jan 11, 2018

Of course the cost of producing products that actually perform at the level they're advertised to perform is passed onto the consumer, regardless of regulation.

rlanday · on Jan 11, 2018

I guess it depends if everyone agrees on whether or not the product performs "as advertised" as not. If you have a defect that affects e.g. 1% of your users, but the government forces you to compensate 100% of your customers, that seems like an unnecessary cost.

For something like Meltdown/Spectre, the patches/workarounds reportedly barely affect some workloads, but cause drastic slowdowns for others. So already not everyone's affected to the same extent. Then you have computers with easily replaceable CPUs vs. stuff like phones and laptops which probably were only designed to work with a single CPU, and the manufacturer's already working on their next model and doesn't want to waste money building replacement parts for the previous one. At that point, maybe you have a complaint with e.g. Apple for selling you an iPhone that doesn't work as performed because they had to work around a security problem, and Apple might themselves go after Intel. The whole situation is a lot more complicated than "it should totally be covered under the warranty."

collinmanderson · on Jan 11, 2018

Intel would just say it functions exactly as designed. :)

(because it's a design flaw)

deathanatos · on Jan 11, 2018

They did say that:

> Intel and other technology companies have been made aware of new security research describing software analysis methods that, when used for malicious purposes, have the potential to improperly gather sensitive data from computing devices that are operating as designed.

> […]

> Recent reports that these exploits are caused by a “bug” or a “flaw” and are unique to Intel products are incorrect.

(https://newsroom.intel.com/news/intel-responds-to-security-r... ; emphasis mine.)

collinmanderson · on Jan 11, 2018

Right, right. And they'll just keep saying it. :)

JonRB · on Jan 10, 2018

Doesn't this affect all of their CPUs going a long way back? And how do you recall embedded or laptop CPUs, which are often soldered in-place?

A recall would be great, but there's no way they'd be able to do it. Vehicle recalls are a bit different because they impact physical safety. Digital safety doesn't get the same priority.

hollander · on Jan 10, 2018

I don't think they would be able to produce all necessary CPUs. Replacing all current stock would create a huge problem. Replacing all sold cpus from the last two years would be a huge problem, even if (and I don't know how complicated or not it is) it would be quite simple to redesign all these chips, how long would it take to do that?

Then imagine all chips from 1995 to 2015, having to make them again, they don't have the machines anymore.

semi-extrinsic · on Jan 10, 2018

Also vehicle recalls are usually done by fixing stuff next time the vehicle comes in for regularly scheduled service. How often does your computer get those?

vvanders · on Jan 10, 2018

Depends on the recall, the GM ignition recall was done on an independent appt basis.

(besides you should not be taking your car to the dealer if you value your wallet)

jabits · on Jan 10, 2018

You need a new dealer...most have greatly improved customer service experiences these days, and many independents are no panaceas...

vvanders · on Jan 11, 2018

Dealers make almost zero margin on car sales(aside from used and trade-in shenanigans). The majority of their margin comes from services so they'll happily gouge you on them.

QAPereo · on Jan 10, 2018

How about they go into bankruptcy with most of the world’s computer users as their creditors? Maybe not, but it’s terrifying that you can avoid responsibility by fucking up on a larger scale than most.

lunorian · on Jan 10, 2018

I'm sure Intel has a liability insurance policy that covers this type of thing.

QAPereo · on Jan 10, 2018

Maybe, but that assumes they get the payout, that they did everything on their end of that deal, etc. Insurers don’t like to pay.

taivare · on Jan 11, 2018

Someone on Twitter had the last safe cpu that Intel made it was date stamped 92'-93' he was asking a Bitcoin for it!

em3rgent0rdr · on Jan 10, 2018

even if it affects all speculative CPUs, if this happened in the car world, all the cars would be recalled. Not saying that is practical in computer world...just continuing with the analogy.

Spectre/Meltdown is a wakeup call for many things, one of them probably being for computer manufacturers to not solder the CPU to the Motherboard and for the x86 world to stick with a standard socket, to facilitate replacing parts.

morganvachon · on Jan 10, 2018

> "Spectre/Meltdown is a wakeup call for many things, one of them probably being for computer manufacturers to not solder the CPU to the Motherboard"

Good luck with that. A large portion of affected CPUs/SoCs are in mobile devices and ultrabooks. Socketed chips simply won't fly in those kinds of devices.

flukus · on Jan 11, 2018

Then the whole device should be replaced, that's the price they pay for their design decisions. Being "too hard" doesn't absolve you of your responsibility to consumers.

em3rgent0rdr · on Jan 11, 2018

Alternatively, even if the boards + CPU are tightly integrated, if used a particular standard like EOMA-68, then they could be easily replaced with rest of desktop/laptop/phone not being affected.

https://www.crowdsupply.com/eoma68/micro-desktop

occams_chainsaw · on Jan 10, 2018

That's why using non-OEM parts in your car voids the warranty.

zerohp · on Jan 10, 2018

No it doesn't. The Magnuson-Moss Warranty Act of 1975 forces them to honor the warranty unless they can prove that the non-OEM part caused the fault.

bitL · on Jan 10, 2018

And how would they replace the bad part when a good one doesn't exist?

QAPereo · on Jan 10, 2018

With money, and a handwritten apology soaked in the tears of their C-levels?

Or just the money actually. If you can’t replace my broken item, a refund is always appreciated.

imglorp · on Jan 10, 2018

Bryan Cantrill refrence has been noted.

bcantrill · on Jan 10, 2018

That was a reference to me? If so -- I'm flattered. I haven't yet asked Intel to soak a written apology with their tears, but it's an excellent idea! (I have, however, given them many other candid thoughts on how they can improve their handling of Spectre and Meltdown -- but so far, to no avail.)

dbenhur · on Jan 11, 2018

Most vehicle recalls are not of the form: return your vehicle, we give you a new fixed one; but rather: bring your vehicle to one of our dealers and they'll perform some action to repair the defect.

The later is pretty analogous to issuing firmware and OS patches to mitigate the flaw.

1stranger · on Jan 11, 2018

What if the fix to the car reduced your MPG by 30% to address a safety issue? This seems somewhat analogous to the number being bandied about as the CPU performance decrease. (Depends on workload, etc, etc.) I think most car owners would expect some kind of compensation for a product that no longer has the same efficiency as when they bought it.

zoe1337 · on Jan 11, 2018

This issue is similar to what happened with the Volkswagen emission cheat. After they fix the ECU to have the advertised NOx emissions, the car lost peak power and fuel efficiency.

willk · on Jan 10, 2018

Is this endangering the life of anyone? A car manufacturer doesn't do a recall if personal safety isn't at risk.

dmschulman · on Jan 10, 2018

They will often refuse to do a recall even if personal safety is at risk. The real quantifier is how much financial damage the auto company will endure in the event they do or do not do a recall

cr0sh · on Jan 11, 2018

> They will often refuse to do a recall even if personal safety is at risk.

Ford Pinto, anyone...

Another one that many people don't know about is a problem with old 2-door Tahoes; a bracket on the driver's side seat likes to fail upon quick acceleration - such as when getting on the freeway, for instance.

One minute you're upright, pushing the pedal to get up to speed, the next - whoops! - flat on your back! If you're lucky, you live to tell the tale...

AFAIK, GM never issued a recall about that one (it caused me to pass on a really nice lifted 4wd Tahoe a few years back)...

brewdad · on Jan 10, 2018

Eh, sometimes they do for the sake of brand preservation. If personal safety is at risk they will do a recall, either voluntary or mandated by the NHTSA.

derekp7 · on Jan 10, 2018

But they have yet to do any recalls for cars that are susceptible to being broken into using a slim jim.

tchaffee · on Jan 11, 2018

Volkswagen emissions recall.

pmontra · on Jan 11, 2018

And replace them with what? They don't have any current processor that doesn't have the same bugs. It will be years before they design and make one. The best we can do is some class action and get a refund, but not too much or they'll go broke and close.

tchaffee · on Jan 11, 2018

> And replace them with what?

An AMD CPU?

pmontra · on Jan 12, 2018

Which doesn't have Meltdown but still has Spectre. Furthermore you have to replace at least the whole motherboard on a desktop or probably all of your laptop except the discs and maybe the RAM.

tchaffee · on Jan 13, 2018

> still has Spectre

I didn't realize that when I made the comment, and I agree my suggestion falls flat now that I know.

> You have to replace at least the whole motherboard on a desktop or probably all of your laptop except the discs and maybe the RAM.

I'm ok with putting that responsibility on Intel to remedy the situation, even if it deeply hurt them financially or put them out of business. If you sell a faulty product that doesn't live up to its description, yes you risk actually going out of business. But with the fact that AMD has Spectre this idea of replacement no longer makes much sense and your original idea of a partial refund makes the most sense.

valarauca1 · on Jan 10, 2018

They really should.

The problem is this nearly every single processor Intel shipped for a decade so Intel doesn't have the cash flow to RMA that many replacements. They're going to fight tooth and nail to avoid this.

mjevans · on Jan 10, 2018

In reality they could probably argue standard depreciation on a product and offer the remaining value as a discount towards a working product... only there currently aren't any equivalent products on the market. AMD's CPUs are still affected by lower profile variants of this, but aren't as /horridly/ broken.

valarauca1 · on Jan 10, 2018

    In reality they could probably argue standard
    depreciation on a product and offer the remaining
    value as a discount towards a working product...

This could work for older models, but their flagship $15,000+ each (in bulk trays) are also affected. So its hard to argue depreciation on <2 month old silicon.

MBCook · on Jan 10, 2018

Not a decade or so. Since the Pentium Pro.

So since ‘95 or ‘96.

lunorian · on Jan 10, 2018

Wait if a court forces Intel to replace processors with a safe one does that mean I get new macbook?

StillBored · on Jan 10, 2018

I can't imagine a court doing anything. The cost to intel is simply to high, and its not like they were negligent given that ARM/POWER/etc are all vulnerable to some extent too.

This whole thing is the equivalent of discovering that if someone throws enough nails on the road your car will blow a tire, spin out of control and kill you. With the kneejerk reaction of trying to fill everyone's tires with concrete to avoid the tires blowing out rather than trying to figure out how to keep people from throwing nails on the road (with the idea that spiteful users are more likely on toll roads) in the first place.

derekp7 · on Jan 10, 2018

Actually, it is more like that someone can throw a rock through a car window and then compromise the car door lock. Or, probably more accurately, that a hitchhiker can knock you out of the car, and drive off with it.

Spectre/Meltdown ONLY is an issue if you run some untrusted code on your system. If you avoid this, there is not problem. Yes, we like to be able to run untrusted code (such as in a web browser / javascript), but that is not the fault of a car manufacturer that you like to pick up hitchhikers.

StillBored · on Jan 10, 2018

I'm with you on the untrusted code bit, which is why I think unmapping the kernel should be restricted to untrusted processes. Then it only applies to your browser, the KVM/qemu instances or whatever runs untrusted code.

Yup, this will hit the EC2/etc users hard, but those people have already IMHO given up on absolute performance by putting themselves in shared environments where bad neighbor syndrome can already hit their perf pretty badly.

But for whatever reason (probably because its easier) the current plan just seems to be to use the big hammer.

kardos · on Jan 11, 2018

The big hammer is the pragmatic approach for the short term. Everyone and their dog wants to claw back the lost performance, we're only week past the big reveal.

Your idea of black/white listing processes might bubble up as a solution in some scenarios. Perhaps it could be pledge-like; if you're savvy enough, try implementing it, or fleshing out the details.

CogitoCogito · on Jan 10, 2018

I agree with you that the courts probably won't do much, but you should not group all the CPU manufacturers together. Meltdown mainly (only?) affects Intel. It's spectre that affects basically everyone.

hodgesrm · on Jan 10, 2018

That might just create yet more problems. You can't just plug a new CPU into your motherboard--it has to match the socket, chipset, memory, installed OS, etc.

StillBored · on Jan 10, 2018

Well if they didn't change the socket every year...

Realistically though, they might be able to do it for cpu's that aren't soldered on (think just about every laptop) made in the last year or two, but would they really fab new versions of 10 year old cores? Its not like many of those lines are even running anymore, so they would basically have to redesign/layout and reverify everything.

Probably easier/cheaper just to send everyone a new machine.

hodgesrm · on Jan 10, 2018

Or give a price break on future hardware. The fix is turning out to be incredibly expensive for ordinary users, virtualization vendors, hardware vendors, OS providers, cloud providers, etc.

This incident demonstrates why you really don't want catastrophic bugs in the CPU. The fact that the hardware vendors missed this one makes you wonder what else is out there.

gruez · on Jan 10, 2018

>Or give a price break on future hardware

feels like this would happen:

>intel agrees to give consumers a $30 price break in response to meltdown/spectre

>in other news, intel raises the prices of next generation CPUs by $30

Dylan16807 · on Jan 11, 2018

Make it $60 and I'll take that, all the better for them to get undercut by AMD.

myth_drannon · on Jan 10, 2018

Faulty vehicle it's a potential accident with multiple casualities vs a hacked computer with potential loss of your private emails or whatever.

timonovici · on Jan 11, 2018

What if that computer is running something critical, say a reactor, and elevator or some device in a hospital? Computers are everywhere these days... Todays proof-of-concept becomes tommorow's metasploit module - and could cause large damage even in incompetent people's hands.

tchaffee · on Jan 11, 2018

Or just a case of emissions violation?

nicc · on Jan 10, 2018

I want to be there for question 099.

en4bz · on Jan 10, 2018

A * B * C = X

george3383 · on Jan 10, 2018

Is this a Fight Club reference?

jnordwick · on Jan 11, 2018

I will not be updating. I have yet to see this mythic JavaScript exploit, and I see too many other ways I, as an end user, can be affected.

I haven't even seen a proof of concept exploit that has the same conditions as in the wild. All the POC exploits seems to have been given some assistance in various ways (such as being given root perms or a preknown memory address).

Does anybody have an example of this JavaScript exploit or any exploit that would work in the wild?