Hacker News
Implementing tile encoding in rav1e (rom1v.com)
60 points by pplonski86 on April 25, 2019 | 23 comments



This is a great example of the use of `unsafe` in Rust.

The memory access pattern they wanted is technically safe, but can't be expressed directly in safe Rust. But that wasn't a blocker, and the Rust language didn't need to be made more complex to handle such a case. Instead, the "missing feature" could be added to Rust with a bit of `unsafe`.

The big thing here is that instead of implementing the actual encoding using unsafe code, and spreading the risk of unsafety all over the complex parts of the codebase, all of the dangerous parts were contained in a minimal, easy-to-verify abstraction layer. And all the other code on top of that remained safe Rust, with the same safety guarantees.
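
To make the pattern concrete, here is a minimal sketch (with hypothetical names, not rav1e's actual API) of how a single `unsafe` constructor can hand out disjoint mutable tile views, while everything built on top stays safe:

    use std::marker::PhantomData;

    /// A mutable view over one tile of a larger row-major frame buffer.
    pub struct TileMut<'a> {
        ptr: *mut u8,
        width: usize,
        height: usize,
        stride: usize, // row stride of the underlying frame
        _marker: PhantomData<&'a mut [u8]>,
    }

    impl<'a> TileMut<'a> {
        /// Safe row access: bounds are checked against the tile.
        pub fn row(&mut self, y: usize) -> &mut [u8] {
            assert!(y < self.height);
            // SAFETY: rows of distinct tiles never overlap, and `y` is in bounds.
            unsafe {
                std::slice::from_raw_parts_mut(self.ptr.add(y * self.stride), self.width)
            }
        }
    }

    /// Split a frame into two vertical halves.
    /// SAFETY argument: the two views cover disjoint columns of every row.
    pub fn split_vertically(frame: &mut [u8], width: usize) -> (TileMut<'_>, TileMut<'_>) {
        assert!(width > 0 && !frame.is_empty() && frame.len() % width == 0);
        let (height, half) = (frame.len() / width, width / 2);
        let ptr = frame.as_mut_ptr();
        let left = TileMut { ptr, width: half, height, stride: width, _marker: PhantomData };
        let right = TileMut {
            ptr: unsafe { ptr.add(half) }, // still in bounds: half < width
            width: width - half,
            height,
            stride: width,
            _marker: PhantomData,
        };
        (left, right)
    }

All the dangerous reasoning lives in two small spots with an explicit disjointness argument; callers only ever see safe `&mut [u8]` rows. (Sending the views to threads would additionally need a justified `unsafe impl Send`.)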


That's the positive spin.

The negative spin would be: how to spend a lot of effort to solve a problem you would never have if you didn't use Rust. Yes, it makes for great blog posts, but it doesn't necessarily make his programs better.

I'm of two minds here. On the one hand, the safety is nice, and it might or might not find a few bugs in advance.

But in a real project, do you really have time to go through all this effort every time, just to do something legitimate that the Rust designers didn't quite consider?

If he had written it in C, he might have had enough time to work on the performance and actually get a performance gain.

I still haven't made up my mind whether the trade-off is worth it.


It does make his programs better, because what's the alternative? If he had used C, it would be equivalent to "spreading the risk of unsafety all over the complex parts of the codebase". What the writer is doing here is strictly better than that.


> The negative spin would be: how to spend a lot of effort to solve a problem you would never have if you didn't use Rust.

I can't totally disagree, I sometimes had this feeling during the implementation.

Contrary to other languages, you can't always choose your trade-offs: memory safety is in practice almost non-negotiable in Rust (you can use unsafe code locally, but you won't implement the whole application in unsafe), so everything else has to adapt. For example, in another language, I probably would have kept slice+stride to avoid many refactors.
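
For context, the slice+stride representation mentioned above looks roughly like this (a hypothetical illustration, not rav1e's code): one flat slice plus a row stride, where a tile is just an offset into the parent plane:

    /// A read-only plane region addressed as a flat slice plus a row stride.
    struct PlaneRegion<'a> {
        data: &'a [u16],
        stride: usize,
    }

    impl<'a> PlaneRegion<'a> {
        fn pixel(&self, x: usize, y: usize) -> u16 {
            self.data[y * self.stride + x]
        }
    }

The catch in Rust is the mutable variant: two tiles of the same plane would each need a `&mut` slice spanning from their first to their last row, and those spans overlap in memory (rows interleave), which the borrow checker rejects. Hence the refactors.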

In the end, I am quite happy with the tiling structures API, but a lot of work (and [boilerplate](https://github.com/xiph/rav1e/tree/f1c43dbdc52016f67ecf33383...)) was necessary to implement them.


> But in a real project, do you really have time to go through all this effort every time, just to do something legitimate that the Rust designers didn't quite consider?

That should be rare, because if your case is common, someone will already have "safetified" it (put it behind a safe interface) in a crate that you can pull in and use.

Secondly, the time you spend "safetifying" your unsafe code is just a trade-off against the time you'd spend debugging safety issues later, whenever that unsafe code gets used. Heck, Rust was also born to let people declare the ways their "unsafe" code can be used, in order to parallelize, because keeping all the unsafe contracts in your head (or even in documentation) is not scalable.


The cliché is "make it work, make it correct, make it fast", but one of the unstated complications is that making something faster without breaking its correctness can be hard. (Even more so if the output isn't deterministic, so you can't directly compare the outputs of the slow and fast approaches.)

The proof of this pudding is whether later developments are sped up or slowed by this work.

The progress in speed and quality that rav1e makes versus SVT-AV1 (which explicitly uses C to attract a wider developer audience) might be interesting to compare, though of course the level of resources that Intel/Netflix/Mozilla and the open-source community put into each will be a complicating factor.


> how to spend a lot of effort to solve a problem you would never have if you didn't use Rust

No, you don't have that problem even in Rust. Rust has C-like pointers, so if you're OK with them, you can just use them as if you were writing a C program.

The effort here was not forced by Rust, but was the author's choice to get a guarantee that the code is safe. And I'd say it's relatively low effort given that it's reusable, and it'll stay safe even after code changes. In C, you'd instead do meticulous analysis and debugging, and add a comment: "// careful, don't break this!"


> Rust has C-like pointers, so if you're OK with them, you can just use them as if you were writing a C program.

This would not be practical. Raw pointers in Rust are (probably on purpose) far less convenient to use (no `+` or `-` operators, no indexing with `[]`…).

Moreover, you would lose all the benefits provided by slices (iterators, etc.). And locally converting raw pointers to references just to call those methods could lead to undefined behavior (aliasing of mutable references is forbidden even in unsafe code; see https://stackoverflow.com/questions/54633474/is-aliasing-of-...).
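
To illustrate the ergonomics (a minimal sketch): address arithmetic that is one expression in C becomes explicit unsafe method calls in Rust, at every use site:

    fn fill_row(ptr: *mut u8, stride: usize, y: usize, width: usize, value: u8) {
        for x in 0..width {
            // C: ptr[y * stride + x] = value;
            // Rust raw pointers have no `+`/`-` or `[]`; it has to be spelled
            // out, and the caller must uphold the bounds contract:
            unsafe { *ptr.add(y * stride + x) = value }
        }
    }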

IMO, using C-like pointers for a whole Rust application would make no sense (it would be worse than just writing a C application).


Multi-processor encoding goes way back to the early days of video compression. The first real-time MPEG-2 encoder chips from C-Cube Microsystems were specifically designed to be used in parallel. Back then (1993), encoding an SD (720x480) image required twelve processors.

Here's a picture of a development board. This is half of a two-board sandwich that supported ten video processors. The CL4000 was the first member of the family and was used for the DirecTV rollout. It was not capable of MPEG-2, and for a few months, DirecTV was actually MPEG-1.

http://www.w6rz.net/scorpio.jpg


Presumably, the mentioned lack of a huge speedup is due to both hyperthreading and CPU clocks depending on how many cores are active (so as to stay within a power and thermal envelope).

Otherwise, profiling would obviously need to be done on the sequential parts, to figure out whether any of them are amenable to parallelisation while also being worth the effort, i.e. taking up a significant share of the total time.


And also be careful when benchmarking on a laptop, since the OS / governor may be throttling the CPU...


> the mentioned lack of a huge speedup is due to both hyperthreading

Correct :)

I just updated the article after similar comments on reddit: https://blog.rom1v.com/2019/04/implementing-tile-encoding-in...


A few years ago I implemented a vaguely similar image compression algorithm just for kicks. I basically did a separate binary space partition for each color channel of the image, with the splitting points determined by minimizing an error function. It was extremely simple to implement and worked pretty well, except that it represented big curves with little compression.
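
A sketch of that idea as I understand it (my reconstruction, not the original code), simplified to horizontal splits of one channel: recursively pick the split that minimizes the summed squared error, and store the mean color in each leaf:

    /// One color channel as rows of pixel values.
    fn mean(r: &[Vec<f64>]) -> f64 {
        let n: usize = r.iter().map(|row| row.len()).sum();
        r.iter().flatten().sum::<f64>() / n as f64
    }

    fn sq_err(r: &[Vec<f64>]) -> f64 {
        let m = mean(r);
        r.iter().flatten().map(|p| (p - m) * (p - m)).sum()
    }

    enum Node {
        Leaf(f64),                          // flat color for this region
        Split(usize, Box<Node>, Box<Node>), // split before row `at`
    }

    fn build(r: &[Vec<f64>], depth: u32) -> Node {
        if depth == 0 || r.len() < 2 {
            return Node::Leaf(mean(r));
        }
        // Choose the split minimizing total squared error
        // (the real thing would also try vertical splits).
        let err = |i: usize| sq_err(&r[..i]) + sq_err(&r[i..]);
        let at = (1..r.len())
            .min_by(|&a, &b| err(a).partial_cmp(&err(b)).unwrap())
            .unwrap();
        Node::Split(
            at,
            Box::new(build(&r[..at], depth - 1)),
            Box::new(build(&r[at..], depth - 1)),
        )
    }

Compression comes from stopping the recursion early: a shallow tree stores a few means instead of every pixel.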

The fun part was when I dropped the error minimization in the partition choice function and set it to choose them randomly:

https://pbs.twimg.com/media/DMr5LaMUQAA2h2E?format=jpg&name=...

https://pbs.twimg.com/media/DMr4Vc4U8AAOlyn?format=jpg&name=...


I wonder when the encoder will be good and fast enough that I can start re-encoding my old H.264 library. Any estimates? I wanted to start next year, but I am not sure, considering the current speeds; it still needs a lot of optimization.


Probably never, unless you're also downsampling because you only plan to watch videos on your phone / downgraded your TV forever / are losing your eyesight. Lossy transcoding barely makes sense when the source is a hilariously bad codec like MPEG-2 (by modern standards), and even then you lose a lot of texture (the main benefit is inverse telecine and dealing with any other weirdness of the source disc ahead of time). You pay in quality for every time it's been lossily encoded, but only reap the space savings of the last encode (to a first approximation). You might save a little HDD space, but you'd have very poor quality for the bitrate.


I am aware that some detail will be lost. However, with the H.265 encoding I tested about a year ago, I could shrink it by 40% without any noticeable loss. If AV1 can get me above 50%, the >2 TB saving is worth it, I think. It'll be a lot easier to handle backups.


Why do they need unsafe for a video encoder/decoder? Is there no other way around it in their example?


This is perhaps the main point of this article. They needed `unsafe` to implement a safe way of concurrently modifying a single contiguous buffer, which is not allowed in safe Rust.
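
Concretely (a toy illustration of why this is outside safe Rust): two tiles of a row-major frame occupy interleaved byte ranges of one allocation, so no safe split into contiguous `&mut` sub-slices covers them:

    // A 4x4 frame stored row-major: row y starts at index y * 4.
    let mut frame = [0u8; 16];

    // The "left tile" is columns 0..2 of every row and the "right tile"
    // columns 2..4: their bytes interleave, so `split_at_mut` can't
    // separate them. Safe per-row splitting works, but each borrow ends
    // at the end of the iteration, so you can't hand whole tiles to
    // two threads this way:
    for row in frame.chunks_mut(4) {
        let (_left, _right) = row.split_at_mut(2);
        // ...the borrows die here, before the next row
    }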


Could such a workaround come to the Rust language in the future? Maybe other programs will need a "safe" way to implement something similar that is currently impossible with the safe methods offered by Rust.


But why? That is exactly the reason why unsafe exists: to open an escape hatch to do something that is impossible in safe Rust (and then wrap it under a safe interface). Your safe Rust standard library is constructed exactly the same way: on top of unsafe. Unsafe doesn't magically make your code unsafe; it just means you tell the compiler to trust you to handle the problem correctly (usually by meddling directly with pointers).


For me, the question was more: isn't Rust supposed to have fearless concurrency, meaning no "unsafe" part needed to modify data between threads?


What would most likely happen here is abstracting this problem into a library that could isolate the "unsafe" in a place where it can be specified, tested, and examined in isolation.


You could add this to the standard library in the future. There is already the 1-D version of this (slices). But you don't need to; it would be perfectly fine living on its own.
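
For reference, the 1-D precedent is `split_at_mut`: a safe signature over a small `unsafe` core that proves the two halves are disjoint:

    let mut buf = [1u8, 2, 3, 4];
    let (left, right) = buf.split_at_mut(2);
    left[0] = 10;  // both halves are usable at the same time,
    right[0] = 30; // because their ranges are provably disjoint

A 2-D tile split is the same idea, just with a disjointness argument the standard library doesn't ship.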



