I took a look at the diff linked in the article, with code that "we are all running". The top of the diff certainly looks interesting. They remove the bounds check in dict_put() and add a safe version, dict_put_safe().

This kind of change is difficult to make without mistakes because it silently changes the assumptions made when code calling dict_put() was originally written. ALL call sites would need to be audited to ensure they are not overflowing the dictionary size.

The diff I am referring to is here:

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=de5c5e41764...
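
For reference, the shape of the change as I read it (a paraphrased sketch; the struct layout and exact signatures are my assumptions, not the literal xz source):

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  struct dictionary {
      uint8_t *buf;    /* backing buffer */
      size_t pos;      /* next write position */
      size_t limit;    /* one past the last writable position */
  };

  /* After the change: no bounds check at all, so every single caller
     must now guarantee pos < limit on its own. */
  static inline void dict_put(struct dictionary *dict, uint8_t byte)
  {
      dict->buf[dict->pos++] = byte;
  }

  /* The new "safe" variant keeps a check, but only for exact equality. */
  static inline bool dict_put_safe(struct dictionary *dict, uint8_t byte)
  {
      if (dict->pos == dict->limit)
          return false;
      dict_put(dict, byte);
      return true;
  }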




Also, because the 'safe' version only checks

  dict->pos == dict->limit

and not

  dict->pos >= dict->limit

if you can get one dict_put call somewhere to go past the limit, all later calls of dict_put_safe will happily overwrite memory and not actually be safe.
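
A tiny demonstration of that failure mode, reusing the hypothetical definitions from the sketch upthread (again an assumption about the layout, not the actual xz source):

  #include <stdint.h>

  int main(void)
  {
      uint8_t buf[8];
      struct dictionary d = { .buf = buf, .pos = 7, .limit = 8 };

      dict_put(&d, 0x41);      /* fine: writes buf[7], pos becomes 8 */
      dict_put(&d, 0x42);      /* unchecked write past the end, pos becomes 9 */

      /* pos (9) != limit (8), so the == check passes and this writes
         even further out of bounds; a >= check would have refused it. */
      dict_put_safe(&d, 0x43);
      return 0;
  }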


No, because dict_put will update the limit value if the new pos exceeds it.


I don't see anything like what you are describing. What line exactly are you talking about?


Wow, that is 1000% obviously malicious


Agree, nice catch. Also, there are many other opportunities in this patch to hide memory safety bugs.

This is the kind of optimization I might have done in C 10 years ago. But coming back from Rust, I wouldn't consider it any more. Rust, despite its focus on performance, will simply not allow it (without major acrobatics). And you can usually find a way to make the compiler optimize it out of the critical path.


I agree, this looks extremely sketchy. Especially because the code is just writing a fully controlled byte into the buffer and incrementing its index.

This would give you a controlled relative write primitive if you can repeatedly call this function in a loop and go OOB.
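
Roughly the shape that makes it exploitable (a hypothetical sketch; next_input_byte() and the unchecked loop are my illustration, not the actual xz decode loop):

  /* If `len` and the bytes come from attacker-controlled input and
     nothing re-checks pos against limit, every iteration is one
     controlled byte written at buf + pos: a relative write primitive. */
  while (len-- > 0)
      dict_put(dict, next_input_byte());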


I think at this point it's clear that everybody has to assume that XZ is completely rotten and can no longer be trusted. Is XZ easy to replace with some other compression tool? Or has it been so widely adopted that it's going to take a huge effort to move away from it?


There is no reason to assume that. Even if you assume every commit since Jia became a maintainer is malicious, the version from 3 years ago is perfectly fine.

Zstd has a number of benefits over Xz that may warrant its use as a replacement for the latter, and this will likely be a motivating factor to do so. But calling it entirely rotten is going way too far, IMO.


There is an interesting argument to be made that pre-JT xz code is probably pretty secure due to the fact that the threat actors would have already audited the code for existing exploits prior to exerting effort to subvert it.


I always use "zstd --long=31 -T0 -19" to compress disk images, since that is a use case where it generally offers vastly superior compression to xz, deduplicating across bigger distances.

XZ offers slightly better compression on average, but decompression is far slower than with Zstd.


IIRC memory consumption is generally worse for Zstd at comparable levels of compression. Which, these days, is generally fine, but my point is you can't thoughtlessly substitute one for the other.


What keeps ringing in my head is the stray "." that was found, which silently broke a build check. I personally don't buy it (but that's just my opinion).


What do you mean "don't buy it"?


My bad. I thought that the person who made that commit was someone other than JT. I can't delete the comment or self-downvote it.


Huge effort, because it is the default .deb compressor in Debian, for example.


Arch Linux already replaced it with zstd in 2020. It's doable for the next major release of Debian.


Certainly, but we will need an xz decompressor to read the current Debian repo versions for decades to come, once they are oldstable or archived.


Decoding is easy.


This is either 100% malicious or a novice coder. And we surely know it's not the latter.

If you need an unsafe call, you add a dict_put_unsafe(). That, again, should of course be rejected in a code review.
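
Something like this, i.e. the short name stays checked and the unchecked fast path is explicit and greppable (a sketch with the same assumed struct as upthread):

  /* Checked by default; note >= rather than ==. */
  static inline bool dict_put(struct dictionary *dict, uint8_t byte)
  {
      if (dict->pos >= dict->limit)
          return false;
      dict->buf[dict->pos++] = byte;
      return true;
  }

  /* The caller must guarantee pos < limit; the name makes the
     obligation impossible to miss in review. */
  static inline void dict_put_unsafe(struct dictionary *dict, uint8_t byte)
  {
      dict->buf[dict->pos++] = byte;
  }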



