Malicious Subtitles Threaten Kodi, VLC and Popcorn Time Users (checkpoint.com)
467 points by seycombi on May 24, 2017 | 224 comments



Was annoying to find the details.

Looks like PopcornTime was rendering subtitle text as HTML inside their app (which is HTML/JS-based), creating an XSS vector (see https://github.com/popcorn-official/popcorn-desktop/commit/a..., https://github.com/butterproject/butter-desktop/pull/602). The JavaScript runtime they're using likely allows file access and execution of arbitrary executables, enabling the Metasploit shell shown in the demo.
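To illustrate the class of fix (not the actual Popcorn Time patch, which is JavaScript), here is a minimal Python sketch that escapes cue text before it is placed into an HTML view. The function name `render_cue_as_html` is made up for this example.

```python
import html

def render_cue_as_html(cue_text):
    """Escape subtitle text so it is displayed literally, never parsed as markup."""
    # html.escape turns &, <, > and (with quote=True) both quote characters into entities.
    escaped = html.escape(cue_text, quote=True)
    # Line breaks inside a cue are the only formatting this sketch re-adds itself.
    return escaped.replace("\n", "<br>")

# A cue that would run script if inserted into the page unescaped.
malicious_cue = '<img src=x onerror="alert(1)">\nHello'
safe_markup = render_cue_as_html(malicious_cue)
print(safe_markup)
```

The point is only that the subtitle file is treated as data: whatever tags it contains end up as visible text.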

For VLC there are a bunch of out-of-bounds reads and heap buffer overflows.

    f2b1f9e subtitle: Fix potential heap buffer overflow
    611398f subtitle: Fix potential heap buffer overflow
    ecd3173 subsdec: Fix potential out of bound read
    62be394 subsdec: Fix potential out of bound read
    775de71 subtitle: Fix invalid double increment.

The article implies that VLC and the others are affected by the same issue (leading to code execution), but according to available information it seems to be completely different issues.

The Kodi issue was a zip archive path traversal (i.e. no protection against zip files extracting files to parent directories).
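A rough sketch of the missing check, in Python rather than Kodi's own code: resolve every member's destination and refuse the whole archive if any entry would land outside the target directory. `safe_extract` and `make_zip` are names invented for this example.

```python
import io
import os
import tempfile
import zipfile

def safe_extract(archive, dest_dir):
    """Extract a ZipFile into dest_dir, rejecting entries that escape it."""
    dest_root = os.path.realpath(dest_dir)
    members = archive.infolist()
    # First pass: validate every entry before writing anything.
    for member in members:
        target = os.path.realpath(os.path.join(dest_root, member.filename))
        if os.path.commonpath([dest_root, target]) != dest_root:
            raise ValueError("blocked path traversal: %r" % member.filename)
    # Second pass: write the files manually to the validated locations.
    written = []
    for member in members:
        target = os.path.realpath(os.path.join(dest_root, member.filename))
        if member.is_dir():
            os.makedirs(target, exist_ok=True)
            continue
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with archive.open(member) as src, open(target, "wb") as dst:
            dst.write(src.read())
        written.append(target)
    return written

def make_zip(names):
    """Build an in-memory zip with one tiny subtitle per name (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"1\n00:00:01,000 --> 00:00:02,000\nhi\n")
    buf.seek(0)
    return zipfile.ZipFile(buf)

with tempfile.TemporaryDirectory() as subs_dir:
    print(safe_extract(make_zip(["movie.srt"]), subs_dir))
    try:
        safe_extract(make_zip(["ok.srt", "../evil.srt"]), subs_dir)
    except ValueError as err:
        print(err)
```

Python's own `ZipFile.extract` already strips `..` components, which is why the sketch writes files by hand to show the check explicitly.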


Thanks for that. I read the article and was really confused at first. I don't do a whole lot of video editing, but I've opened up a .srt file a handful of times and noticed that it was nothing more than timestamps and text. The fact that the article made it seem like this was some kind of universal vulnerability made me wonder, "A simple subtitle file should be opened in read-only mode. Are these programs just reading whatever is in the .srt file and EXECUTING it!?!" That would be beyond horrible.

The fact that it's multiple, independent vulnerabilities makes me feel a little better. I've used Kodi and OpenSubtitles before while watching a movie to search and download subs for the movie without ever leaving Kodi. When it works, it's nothing short of magical.


> The article implies that VLC and the others are affected by the same issue (leading to code execution), but according to available information it seems to be completely different issues.

Yes, those are very different issues.

From what I understood, one is an XSS (Popcorn Time), one is a heap-based buffer overflow (VLC), and one is a zip path traversal (Kodi).

And tbh, I don't see how you can exploit the bug for VLC (with ASLR and HEASLR).


Easy: you cannot count on an executable always being compiled for, and run on, an OS with ASLR and HEASLR enabled.

So it becomes a game of luck getting some users exploited.


Thank you! I was frustrated when I saw this last night and it didn't contain any details. I assumed buffer overflow but different attacks for each is more interesting.


Yeah, the article was a bit poor on details. I expected some libass or other common library/codec vulnerability.


Are those things vlc-specific or is there a common vulnerability shared with the underlying libs (libavcodec, libass?)


If only VLC had been re-written in rust this would never have happened. For shame.


We ban accounts for trolling, so please don't do that here.

Also, you've posted many uncivil and/or unsubstantive comments. We ban accounts for that too, so please don't do that either.


[flagged]


No, that's not the point and now you're antagonizing the moderating team. You were asked, and warned, "Do not do that" so, don't do it under your usual name, or a throwaway. There is no shortage of places on the internet to cause trouble. This isn't one of them.


Yes but it would also not have nearly as many features as it does right now.


Java would work much better for VLC.


No. Java doesn't have fearless concurrency, zero-cost abstractions or move semantics.


"Fearless concurrency" and "zero-cost abstractions" sound a lot like meaningless marketing terms.


Dunno about "fearless concurrency," but "zero-cost abstractions" and "move semantics" are straight off the front page of rust-lang.org. So they're kind of marketing-ish in trying to make you go "hmm sounds intriguing" and click to find out more.


"Fearless concurrency" is meaningless marketing; "zero-cost abstractions" is a valid term.


Feel free to rewrite VLC in Rust. No one is stopping you.


Pretty sure that was sarcasm.


Oh look, it's the Rust Evangelism Strike Force at work again.


Please don't react to provocation by making the thread worse (a.k.a. please don't feed the trolls).


How are you so sure it's trolling? Someone from the Rust Team has defended such behavior as good marketing [0].

[0] https://lobste.rs/s/wq6eov/changes_i_would_make_go#c_ofxtj1


One is never 100% sure, but note that the first part ("Please don't react to provocation by making the thread worse") holds regardless.


I did security research on VLC on Windows a year or two ago. I may be remembering incorrectly, but as I recall every module was protected by ASLR, which means that remote code execution is not likely because there is no scripting or network comms to dynamically create a valid ROP chain.

I also didn't check for executable heaps at the time, but assuming all heaps are non-executable (they really shouldn't be executable in VLC), again I don't see how RCE is possible. Maybe there is some way to validate and therefore brute-force addresses? I don't know. But there was no VLC POC, and I'm sure they would have made one if they could have.

Use VLC; it's the most secure media player I've seen.


ROP: return oriented programming ASLR: Address space layer randomization

Having ASLR is not bulletproof against remote code execution; e.g. iOS has had ASLR for a long time and can still be jailbroken (which usually involves code injection, etc.). The key is an info leak: e.g. if you can somehow reliably find the memory location of the open() syscall, the memory location of the whole libc can be inferred, and libc is usually large enough to construct a ROP chain. (I haven't worked in the security area for a long time, so correct me if I'm wrong.)

The researcher being unable to provide a POC for VLC could simply mean it's hard due to ASLR, not that it's impossible.

Also: I believe ASLR is a compiler option (with a supported OS), it should be relatively easy for Kodi and Popcorn Time to start using ASLR.
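The info-leak argument above is just subtraction and addition. A toy sketch in Python, where every address and offset is a made-up number for illustration and not taken from any real libc:

```python
# Hypothetical offsets inside a shared library image (known to anyone who has the file).
OPEN_OFFSET = 0x0F4A20     # assumed offset of open() from the library base
OTHER_OFFSET = 0x02A3E5    # assumed offset of some other code of interest

def library_base(leaked_open_address, open_offset=OPEN_OFFSET):
    """ASLR slides the whole image as one unit, so one leaked address fixes the base."""
    return leaked_open_address - open_offset

leaked = 0x7F3C1A2F4A20            # pretend this value was disclosed by a bug
base = library_base(leaked)
other = base + OTHER_OFFSET        # every other location follows from the base

print(hex(base), hex(other))
```

Without a leak, the base is one of many possible values, which is what the randomization buys.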


Most modern RCE POCs lean on a scripting engine (VBS for Office, JavaScript for browsers, ActionScript for Flash, etc.) in order to facilitate exploitation. The only ones which do not use a script engine are POCs exploiting a "network" vuln (like SMB).

Scriptless 0day RCE is still possible in a ROP+ASLR world, but exploitation is a real bitch. Ex: https://scarybeastsecurity.blogspot.fr/2016/11/0day-exploit-...


1) ASLR: address space layout randomization. 2) Yeah, libc is commonly ROPped against (though you'd need to check with a Linux guy). 3) Yes, ASLR is a compiler option (/DYNAMICBASE on Windows). On Windows a flag exists in the PE header; there's probably something similar in ELF. When loaded, the modules are fixed up so pointers and such are correct.


That's also why we're perplexed at the supposed code execution.

Also, the security researcher did not provide a demo for the VLC exploit. Their demo only covers Kodi and Popcorn Time.

But anyway, security issues mean releases.


> every module was protected by ASLR.

Address space randomization is not "protection". It's a form of security by obscurity. The odds of an exploit working are reduced, at the expense of more crashes due to exploit failure.

It helps developers ignore bugs, since they can no longer reproduce them.


In my experience, bugs are almost always easier to reproduce with address randomization. It's easier to see a process leave readable/writable memory than it is to see it overrun a buffer and only trash app code.

"Only" security by obscurity is the best we can get in the C/C++ world without compiling for a virtual machine.


>Address space randomization is not "protection". It's a form of security by obscurity.

This is somewhat akin to saying "Randomly generated passwords are not 'protection'. They are a form of security by obscurity."

If things are random enough that an attacker is significantly hampered in most cases, that's one measure of security, no?


It is going to vary quite a bit depending on the entropy of the ASLR implementation. Many have only had 8-12 bits of entropy to start with, and you sometimes don't need the full address. It is also important to note that services that crash typically restart, allowing retries (sometimes as many as you want). In this case, one might imagine trying to attack thousands of people: some of them will randomly work (and a lot of users are going to see VLC crash and will retry playing the file a number of times, increasing your probability).
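A quick back-of-the-envelope version of that argument, assuming each attempt is an independent guess against a freshly randomized layout (a simplification; real implementations differ):

```python
def success_probability(entropy_bits, attempts):
    """Chance that at least one of `attempts` independent guesses hits the right layout."""
    p_single = 1.0 / (2 ** entropy_bits)
    return 1.0 - (1.0 - p_single) ** attempts

# Low-entropy ASLR versus a few retry budgets.
for bits in (8, 12):
    for tries in (1, 100, 1000):
        print(bits, tries, round(success_probability(bits, tries), 4))
```

The same formula applies when the "attempts" are spread across many victims rather than many retries on one machine.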


Total facepalm to this comment.

Does modern ASLR increase costs (time, difficulty, money, skill, etc.) necessary for exploitation and decrease benefits (privs, chances of success, etc.)? If yes, then it's a protection. Any security engineer will tell you unequivocally ASLR is a protection. And one of the most successful ones to date.


Keeping the location secret is just like keeping the key secret in encryption. You also wouldn't call that security by obscurity.

Still you're perfectly right that ASLR does not provide perfect safety, but merely makes exploitation way harder.


> It helps developers ignore bugs, since they can no longer reproduce them.

Well, it would crash, so they can reproduce it, no?


Off topic: I love VLC but can't get it to use hardware acceleration on my late 2015 Mac. 4K 60fps @ 40 Mbps consumes all the CPU, and if I try to play a lower-compression 150 Mbps video it stutters and all my fans turn on. mpv and QuickTime play the same videos with 15-20% CPU. The poor performance of VLC on macOS makes it a no-go for me.


Try 3.0, this is fixed.


Kodi 17.2 with the fix for this flaw has now been released:

https://kodi.tv/article/kodi-v172-minor-bug-fix-and-security...


The thing that most amazes me about Popcorn Time is how they find the subtitles. It seems to succeed even when I can't find subtitles myself.

More related to the article, you would think that subtitles are literally the easiest file format in existence to safely handle. It's incredibly well-defined in terms of textual data and times.


> literally the easiest file format in existence to safely handle.

Well, which one of them? There are nearly a hundred different subtitle formats, and each one has a whole set of variants. Timed Text alone (XML) can have more layouts than one could count, especially since it's meant to be able to replicate technically all previous industry formats.


Let me phrase it this way: one would expect that the class of file formats for subtitles are easiest to handle (as opposed to say, the class of file formats for images or videos).

On the other hand, images and videos are likely to be handled using some library, which might be better at safely handling the files.


> it's meant to be able to replicate technically all previous industry formats

Even the DVD subtitle format, which is just a mostly transparent image overlaid on the picture? In XML?



They use a hash function to match subtitles.

http://trac.opensubtitles.org/projects/opensubtitles/wiki/Ha...
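As far as I understand the published description, the hash is the file size plus a 64-bit sum over the first and last 64 KiB of the file, rendered as 16 hex digits. A sketch under that assumption (the function name is mine, and it works on any seekable binary stream):

```python
import io
import struct

CHUNK = 64 * 1024                  # bytes hashed from each end of the file
MASK = 0xFFFFFFFFFFFFFFFF          # keep everything in unsigned 64-bit range

def _sum_words(data):
    """Sum the data as little-endian unsigned 64-bit integers, wrapping at 64 bits."""
    count = len(data) // 8
    return sum(struct.unpack("<%dQ" % count, data[:count * 8])) & MASK

def opensubtitles_style_hash(stream):
    """Size + checksum(first 64 KiB) + checksum(last 64 KiB), as 16 hex digits."""
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    if size < 2 * CHUNK:
        raise ValueError("file too small for this hashing scheme")
    stream.seek(0)
    head = stream.read(CHUNK)
    stream.seek(size - CHUNK)
    tail = stream.read(CHUNK)
    total = (size + _sum_words(head) + _sum_words(tail)) & MASK
    return "%016x" % total

# Tiny synthetic "movie": first 64-bit word is 1, everything else is zero.
sample = io.BytesIO(b"\x01" + bytes(2 * CHUNK - 1))
sample_hash = opensubtitles_style_hash(sample)
print(sample_hash)
```

Because only the size and 128 KiB of data are read, the lookup key is cheap to compute even for multi-gigabyte files, which is presumably why it matches releases that a filename search misses.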


The problem isn't only about matching subtitles to movies but also where to look for subtitles, e.g. opensubtitles.org, subscene.com, etc.



Very interesting. It contains a few subtitle providers I've never heard of before. Thanks for posting it.


I believe it just scrapes them all. I can't remember the last time opensubtitles didn't have a sub I was looking for


Perhaps the most difficult problem is to find a subtitle in multiple languages.


True I have only needed English subs


> More related to the article, you would think that subtitles are literally the easiest file format in existence to safely handle. It's incredibly well-defined in terms of textual data and times.

Depends on the format. SSA for instance can have embedded font and image files, which presumably have much more complex decoders.


It seems subtitles aren't important enough to have reduced the number of formats. From reading the comments, it seems like the world would benefit from a single format with most capabilities and have everyone convert all files to that. Until then, we need players that understand everything.


These are the VLC commits addressing the issue:

https://github.com/videolan/vlc/search?utf8=%E2%9C%93&q=subt...


As usual, the common set of friends we already know since the 80's:

> Fix potential heap buffer overflow

> Fix potential out of bound read

> Fix invalid double increment.


I've never seen a double increment exploited before. It's undefined behavior, but what is the typical route against that?


The double increment itself isn't undefined behavior. Note that the two increments were separated with a semicolon, making them separate statements. It's equivalent to psz_text += 2;.

The exploit would presumably involve structuring your data so that the excess increment skips over a terminator of some sort. If it's scanning until it hits a zero byte, and you get it to skip over the zero byte, then you have a buffer overflow.
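A toy model of that scenario in Python (not the VLC code; the escape handling and buffer contents are invented for illustration). The scanner walks a NUL-terminated buffer, and the buggy variant advances one byte too many after an escape sequence:

```python
BACKSLASH = ord("\\")

def scan_until_nul(buf, buggy):
    """Return the index where a C-style scan for the terminating NUL would stop."""
    i = 0
    while buf[i] != 0:
        if buf[i] == BACKSLASH:
            i += 1          # step over the backslash to the escaped character
            if buggy:
                i += 1      # the excess increment: one byte further than intended
        i += 1              # normal advance to the next character
    return i

# "a\n" (escape as two characters), its terminator, then unrelated neighbouring data.
memory = b"a\\n\x00SECRET\x00"

print(scan_until_nul(memory, buggy=False))
print(scan_until_nul(memory, buggy=True))
```

The correct scan stops at the string's own terminator; the buggy one jumps over it and keeps going through the adjacent bytes until it happens to find another zero. In C there is no guarantee one exists inside the allocation.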


Ahh, that does make sense, I didn't see that.


Use a memory safe language that doesn't require direct pointer manipulation to access string and memory buffers, with an optimizer able to elide bounds checking if proven safe to do so.


I can't seem to find a way to update VLC to 2.2.5 on Ubuntu (or Debian, or Mint for that matter). I understand Canonical does not provide updates in the repos, but the VideoLAN website's download URL for Ubuntu is just "apt://vlc". It would be nice to be able to download one or more .deb's too.

Do we have to build it from source?


Damn, building from source is nigh impossible, given the sheer number of plugin libraries that need to be installed: https://wiki.videolan.org/Contrib_Status/


You can use snaps, but they are currently broken due to build issues.


Holy crap, that code doesn't look good. I predict we will see more exploits for this project.

Maybe we should stop random people from contributing to complex C projects?


Wouldn't go that far from reading a single commit, but to anyone looking to pick up tips from a well-known respected C codebase: don't ever write

    (*(psz_text + 1 ) ) == '~'

when you can instead write

    psz_text[1] == '~'

Fewer tokens means less overhead for the human reader, and that asterisk-and-add pattern is exactly what the bracket array indexing operator does, so why not use it? This is one of my many C pet peeves, heh.

Also on a more personal note, if you're going to be putting things inside parentheses with whitespace, make it symmetrical.


"random people"? You mean there's some select group we know of that doesn't ever write bugs? (DJB doesn't make a group)


The main VLC developer is an amazing programmer. But if he uses his time to shave cycles off some SIMD decoding algorithm, then boring things like file processing are done by some random junior developer.

The problem is that boring stuff can also be very security sensitive.


You are more than welcome to contribute and since you have a very strong opinion it seems you know your stuff, so go for it, nobody is charging a dime to work there in any case.


> you have a very strong opinion it

Yes I do, this is the internet after all!

> seems you know your stuff,

Now you lost me :)


That was my hope when C was just gaining market share outside UNIX, and here we stand now.


Look at FFmpeg and all the multimedia libraries and you will be horrified.


I thought they cleaned up after the last round of exploits?


hahah :)

I wish :)


vlc has a bug and yet you talk shit about well developed and fuzzed by google projects. thats why vlc will never be better than mpv.


FFmpeg, VLC, MPlayer, libdvd*, libxvid, x264, libflac, libvorbis and all the other multimedia libraries have codebases that were started in the late 90s/early 2000s. No one cared much about security at that time.

All those projects are under-funded, done by volunteers, on countless platforms, doing very low-level stuff, and supporting many formats.

This has nothing to do with one project or another.


thats sad to hear, I didn't know volunteers did so much


Interestingly running VLC 2.2.4 on MacOS 10.12 and checking for updates returns 'VLC 2.2.4 is currently the newest version available.', obviously I downloaded 2.2.5.1 from videolan.org but still odd.


The update will be deployed today or tomorrow in the updaters.


Is that a default behavior or something you chose to do?

What if there's a bigger security fix you need to push to people asap?


It is something that we chose to do.

We usually wait between 24 hours and a few days before pushing an update, to watch for possible regressions.

From tag to release to updates can take only 4 hours, if we want enough mirrors.


Well, 10 days later and 2.2.4 is still shown as the latest version when trying to upgrade... :/


Same here. It appears to check http://update.videolan.org/vlc/sparkle/vlc-intel64.xml for updates and the newest version listed there is 2.2.4


Can confirm the same on Windows. I downloaded the newest version manually as well.


2.2.6 is deployed.


AFAICT every plugin to Kodi has full machine access. Subtitles of course you don't expect to install malware but I wish plugins ran in a sandbox


Slightly related to this: where can I find data sanitizers for common file formats (PDF, MP3 and so on)?


I strip all mp3 metadata using the 'id3mtag' tool[1].

    id3 -d *.mp3 ; id3 -2 -d *.mp3

That deletes all tags, both v1 and v2 ID3 tags.

I don't do this for security - I just don't like mp3 metadata competing with metadata in the filename and most mp3 metadata is laughably bad anyway[2] so I just wipe it.

[1] /usr/ports/audio/id3mtag on FreeBSD

[2] Misspellings, First Last instead of Last, First, ALL CAPS ALL THE TIME and using special characters/unicode that always breaks car stereo implementations.


what counts as sanitizing? How do you know a file is malicious?


Especially with PDFs, my "sanitization" can be your "stripped away all the fonts and functionality - might as well have given me a plain .TXT", and vice versa.


"might as well have given me a plain .TXT"

Yes, please - that sounds fantastic.


I agree - but it's 1. surprisingly complicated for a general solution (positioning and such), and 2. not really a solution for the usual end user (who might appreciate a JPEG instead)


(btw there's `pdftotext`, which is pretty good in most cases)


Read data according to spec, drop stuff that is incorrect and write it back.

For example if MP3 genre field is 999 bytes long cut it down to 32 bytes.
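A minimal sketch of that read/validate/rewrite idea in Python. The field names and size limits are hypothetical placeholders, not values from the ID3 specification:

```python
# Assumed per-field byte limits; a real sanitizer would take these from the format spec.
MAX_FIELD_BYTES = {"title": 64, "artist": 64, "genre": 32}

def sanitize_fields(fields):
    """Keep only known fields and clamp each one to its allowed length."""
    clean = {}
    for name, value in fields.items():
        limit = MAX_FIELD_BYTES.get(name)
        if limit is None:
            continue                 # drop anything the (assumed) spec does not define
        clean[name] = value[:limit]  # cut oversized data down instead of trusting it
    return clean

parsed = {"genre": b"x" * 999, "title": b"Song", "private_blob": b"\x90" * 4096}
rewritten = sanitize_fields(parsed)
print({name: len(value) for name, value in rewritten.items()})
```

The output file is then rebuilt from `rewritten` only, so nothing from the original byte stream is copied through unchecked.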


Can anyone recommend a video player written in a memory-safe language for OSX that handles MKV files? Or is the simple truth that the problem lies in the parsers, which are shipped as a library written in C, because no sane developer wants to rewrite parsers for 25 different subtitle formats when writing a video player?


There are none. You can use VLC inside VLC sandbox, but you won't get something perfect.


What about mpv? That's my preferred video player.


mpv is not affected, at least by these four vulnerabilities. They all seem to be specific to each video player, rather than affecting shared code or code in open source multimedia libraries.


While I too prefer mpv, I suspect that there are plenty of vulns in that player.


It's written in C, so I imagine that's almost guaranteed. In this case obscurity helps to protect you, however.


It would be interesting to see which subtitles are using these vulnerabilities and what they are achieving with them. We could estimate how long this has been around.


This is another reason you should use a tool like a parser generator when you have to parse untrusted data, rather than writing your own parser by hand.


Does anyone know if the subtitle hosting services added checks for this as well?


This is interesting to me for reasons outside of anything to do with exploits or malware. A while back I had a bit of a brain fart while playing with my Hue bulbs: would there be a way to use the subtitle track for a video to encode time-controlled data that can be sent to/read by another application that sends these values to a set of Hue bulbs or similar devices for synchronized ambient lighting?

I figured that subtitles were an obvious place to start because you can download them in small files, play them back alongside a video, and they are designed to be "timed out" to synchronize with a video already.

I looked into it for a bit but never really found a way (within my abilities at least) to do anything like this from within a .srt file or similar. I'd be interested in hearing if anyone else has more info on how you might do more with that "framework" than displaying text on screen.
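One way this could work, sketched in Python: keep the .srt container as-is and put machine-readable settings in the cue text, then have a companion script read the timing and payload. The `hue=... bri=...` payload syntax is invented for this example, and the parser only handles plain `HH:MM:SS,mmm` timestamps.

```python
import re

TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")

def to_ms(stamp):
    """Convert an SRT timestamp such as 00:01:00,250 to milliseconds."""
    hours, minutes, seconds, millis = map(int, TIMESTAMP.fullmatch(stamp.strip()).groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis

def parse_cues(srt_text):
    """Return a list of (start_ms, end_ms, payload) tuples from SRT text."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip anything that is not index / timing / text
        start, end = lines[1].split("-->")
        cues.append((to_ms(start), to_ms(end), "\n".join(lines[2:])))
    return cues

LIGHT_TRACK = """1
00:00:01,000 --> 00:00:04,500
hue=46920 bri=120

2
00:01:00,250 --> 00:01:02,000
hue=0 bri=254
"""

light_cues = parse_cues(LIGHT_TRACK)
for start_ms, end_ms, payload in light_cues:
    settings = dict(pair.split("=") for pair in payload.split())
    print(start_ms, end_ms, settings)
```

A separate process would still need the player's current playback position to fire the cues at the right moment; that part is not shown here.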


Speaking of Popcorn Time, last I heard there were a couple of forks and doubts about the safety of each and every one.

Is there any more clarity around the situation now?


Wow, that is bad. I'm always amazed by such vectors in supposedly passive formats, like fonts, images, and so on.

There is no excuse for these kinds of applications not being completely sandboxed. All you need is some kind of DLL: raw data in, raw pixels out. In the case of hardware-accelerated codecs: raw pixels in, surface pointer in, nothing out. There is no need to be able to access the filesystem, etc. To render subtitles on top of the video it's the same.

I wish a fraction of the energy we put into DRM would go into sandboxing instead.


Ha, the famous sandboxing remark. I wish it was that simple!

So, let me shed some light on sandboxing for multimedia (I work on VLC).

If you sandbox an application like VLC, in the current way of doing sandboxing, which we've done for macOS, WinRT/UWP, and snaps, you still need a lot of permissions.

Namely:

- you need to be able to open files without user interactions (no file picker), in order to open playlist, MXF or MKV files;

- you need the same if ever you have a database of files (media center oriented);

- you need raw access to /dev/* to play DVDs, CDs and other optical discs (and the equivalent on Windows);

- you need ioctl on such devices, to pass the MMC for DVD/Bluray;

- you need raw access to /dev/v4l* for your webcams and be able to control them;

- you need access to the GPU stack, which is running in kernel-mode, btw, to output video and get hw acceleration;

- you need access to the audio stack, also in low-level mode;

- you need access to the DSP acceleration (not always the GPU);

- on Linux, you need access to X11 for the 3 above features, which is almost root;

- you need access to /etc/ (registry) for proxy information, font configuration and accessibility;

- many OpenGL client libraries need access to the /etc too;

- you need access to the network, as input and output (think remote control);

- you need access to the system settings to disable screensavers, and adjust brightness;

- you need access to mounts to be able to see the insertion of DVD/Bluray/USB/SD cards and such;

- you need to expose an IPC (think MPRIS on Linux);

- you need to unzip, untar, decrypt, decipher and so on;

- you need access to the fonts and the fonts configuration (see fontconfig).

and I probably forgot one or another case.

The point is, all those features have good reasons to exist and very good use cases; but the issue is that for a media player, it will request almost all permissions except GPS and address book.

And quite a few of them are very close to kernel mode.

So, what is the solution?

Probably do a multi-process media player, like Chrome is doing, with parsers and demuxers in a different process, and different ones for decoders and renderers. Knowing that you probably need to IPC several Gb/s between them.

I've been working on such a prototype, but it's a lot of work... I accept donations :)


Thanks for that. This type of thing comes up all the time. I used to wonder how web sites could be so dangerous, but it becomes clear when you think about all the extra access developers wanted for good reasons - imagine a web browser that didn't have access to the file system, and so on. I still don't like this state of affairs, but I don't have an alternative solution. Wayland should be more secure than X, but they're starting to poke holes in there for various reasons (color picker, warp pointer for compat, etc...).


Not even multi-process. Threads on Linux can have their own seccomp profiles. You don't need to sandbox absolutely everything at the same time either. In this case opening the file in the main, unrestricted app and spawning a new thread that will read from the existing FD and only send you simple, time sorted messages over a shared IPC/pipe is not that crazy.

Other points may be more tricky, and it's a good list of potential issues, but we can start chipping away some stuff right now. There's a lot we can fix without fixing everything at the same time.


> Threads on Linux can have their own seccomp profiles.

Not on Windows or on macOS.

> new thread that will read from the existing FD and only send you simple, time sorted messages over a shared IPC/pipe is not that crazy.

Of course that does not solve anything, because your demuxer/decoders/output need access to the FS and to kernel mode, and those are the dangerous parts.


> > Threads on Linux can have their own seccomp profiles.

> Not on Windows or on macOS.

It's a shame, then, that Windows & macOS are holding back security improvements for software running on Linux. I understand (& even agree with!) your desire to have a sandboxing mechanism which runs acceptably on all supported systems; it's just sad that this security mechanism in the Linux kernel can't be taken advantage of in vlc.


Well, no. Because you can do it per-process. I don't see the reason of doing it per threads here.


I'm not sure what you're trying to say. Yes, I meant Linux. Yes, it can solve the issue of separate subtitle files, which this article is about. Read access to an existing FD is not the same as full FS access, and there's no demux involved here.


> Yes, I meant Linux

The demo is on Windows. The goal is to do a sandbox that works on most OSes.

And, it will not solve the decoder issue, since it is on the decoding side, which still has access to the GPU/Aout and the kernel.

> Read access to an existing FD is not the same as full FS access, and there's no demux involved here.

You're totally missing the point here. The issue is demuxers/decoders/output, not really the access.

Reading from an FD or not would not solve the buffer overflow exploitation (if it was actually exploitable).


> Not even multi-process. Threads on Linux can have their own seccomp profiles.

Feels kinda pointless, since all threads in a process share the same memory protection.


They don't have to. Clone can do a lot of magic without full processes.


But then you need to copy the memory from the decoder to the video output, or you're back to the same problem.


No, you can use a shared memory segment for a buffer just for that.

It's more coding, certainly, but it's possible. Security is an option if we wanted it.


That's exactly the point above. See my above comment.


I'm not sure, it sounds like you're saying we'd need to copy memory.

The shared memory segment can be a GPU image buffer, so I don't think that's true.


See comment above with "the solution".

Either you need to have multi-process and correct IPC, or you need to copy.


> Probably do a multi-process media player, like Chrome is doing, with parsers and demuxers in a different process, and different ones for decoders and renderers. Knowing that you probably need to IPC several Gb/s between them.

That's not actually how Chrome's renderer sandboxing works. Both Windows and OS X allow you to share a GPU-resident texture between processes (DXGI shared surfaces and IOSurface respectively), so there's no need to copy any video data.


But you need to pass data from the access to the stream_filter, from the stream_filter to the demuxer, from the demuxer to several decoders, from the decoders to potentially a few video-filters and chroma-converters, and then finally to the output. Each of them need different access policies, and several of them require FS access.

The last part is just one of the issues, very far from all of them.

Seriously, stop thinking that no one has given any thought to the question...


These shouldn't require IPC at GB/s speed either. Modern sandboxes, like the one in Chrome, have a broker process which can open filesystem objects, device objects and sockets (file descriptors or handles) and pass them to a sandboxed decoder/renderer process, so there would be no need to stream filesystem data to the sandbox when the sandbox could do the file I/O itself. Even for Matroska ordered chapters, where the demuxer would have to tell the broker which files to open, the broker could enforce certain rules, such as enforcing that local mkv files only reference other local files, the files are all in the same directory, and that the files are always opened in read-only mode.
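The broker rules described above can be sketched in a few lines of Python. This is only the policy check, with a made-up function name `broker_open`; it is not how Chromium's broker is implemented.

```python
import os
import tempfile

def broker_open(requested_path, allowed_dir):
    """Open a file on behalf of a sandboxed process, enforcing simple rules."""
    allowed_root = os.path.realpath(allowed_dir)
    target = os.path.realpath(requested_path)
    # Rule: the referenced file must live directly in the same directory.
    if os.path.dirname(target) != allowed_root:
        raise PermissionError("outside the allowed directory: %r" % requested_path)
    # Rule: read-only access, nothing else.
    return os.open(target, os.O_RDONLY)

with tempfile.TemporaryDirectory() as media_dir:
    linked_part = os.path.join(media_dir, "part2.mkv")
    with open(linked_part, "wb") as handle:
        handle.write(b"data")

    fd = broker_open(linked_part, media_dir)       # allowed: same directory
    first_bytes = os.read(fd, 4)
    os.close(fd)

    try:
        broker_open(os.path.join(media_dir, "..", "secret"), media_dir)
        escaped = True
    except PermissionError:
        escaped = False                            # rejected by the broker

print(first_bytes, escaped)
```

On Unix-like systems the returned descriptor could then be handed to the sandboxed process over a Unix domain socket (for example with `socket.send_fds` in Python 3.9+), so the sandbox reads the file itself without ever having filesystem access of its own.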

As for isolating decoders from video filters and chroma conversion, I'm not sure why that would be necessary, since those shouldn't require any additional privileges. I understand that retrofitting an existing program to use a multi-process sandboxing model is far from easy, and I'm definitely not volunteering to do it, but I don't think there is anything specific about a video player that is harder to sandbox than a web browser.


> I understand that retrofitting an existing program to use a multi-process sandboxing model is far from easy, and I'm definitely not volunteering to do it,

Yes, that's the core of the issue.


I don't think nobody has thought about it, but since you were apparently unaware that there was an alternative to performing several GB/s of IPC for moving buffers around, there are obviously some options that haven't been considered. The Chromium sandbox has to deal with every issue you've listed (it's even calibrated to run ffmpeg inside the sandbox, since that's something Chromium needs to do).


> but since you were apparently unaware that there was an alternative

I will refrain from responding to such attacks. As you seem to know better, I'm waiting for your patches.


It's not an attack. Each platform's methods of GPU IPC are pretty sparsely documented. Two months ago I wouldn't have known about them; I only learned by working on integrating Chromium's sandbox into an application that needed to work with the GPU within a sandboxed process.

That doesn't change the fact that none of the things you listed are unsupported by Chrome's sandbox model, and if you only need to establish a barrier around the video pipeline (and not e.g. VLC's ability to notice device status or interact with webcams) you don't even need 3/4 of what Chromium's sandbox has implemented. Like I said, I've actually walked the walk when it comes to using their sandbox for Windows and Linux with a process that needed to access certain user files, the GPU, and even each platform's font server equivalent, so this isn't me just spitballing about some theoretical solution.


What features could the OS offer you (to help your program be "sandboxed") that it currently does not?


I think we can do everything now for the major OSes, but I'd guess this is 50-100 man-months of work for VLC.


You don't need special fast IPC. Even uncompressed video is fine over standard IPC.


Blurays are 60Mbps.

Then with 4K60 + HDR, displaying is quite a lot of bandwidth.


A 10 year old PC carries 2.1 GiB/s (= 17 Gbps) over bog standard pipes without tuning or parallelism, as measured by "pv /dev/zero | cat > /dev/null". Uncompressed full HD is 1.5-3 Gbps. (Less actually, since codec output is going to be 4:2:2 or similar)

Yeah, you can come up with high bandwidth scenarios like stereo VR 144 Hz 4k HDR running on barely capable hardware. But 99% of users don't require such tricks and never see any upside from the performance-over-security compromise.
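As a rough check of the figures above, here is a minimal sketch of the arithmetic. The pixel formats and the 4K example are assumptions chosen for illustration; only the full HD range and the 2.1 GiB/s pipe figure come from the comment.

```python
def raw_video_gbps(width, height, bits_per_pixel, fps):
    """Bandwidth of uncompressed video in gigabits per second (10^9 bits)."""
    return width * height * bits_per_pixel * fps / 1e9

# Assumed formats: 8-bit RGB is 24 bits per pixel,
# 10-bit 4:2:0 averages 15 bits per pixel.
full_hd_30 = raw_video_gbps(1920, 1080, 24, 30)
full_hd_60 = raw_video_gbps(1920, 1080, 24, 60)
uhd_60_10bit_420 = raw_video_gbps(3840, 2160, 15, 60)

# The pipe throughput quoted above, converted from GiB/s to decimal Gbps.
pipe_gbps = 2.1 * 2**30 * 8 / 1e9

print(f"1080p30 RGB:       {full_hd_30:.2f} Gbps")
print(f"1080p60 RGB:       {full_hd_60:.2f} Gbps")
print(f"2160p60 10b 4:2:0: {uhd_60_10bit_420:.2f} Gbps")
print(f"plain pipe:        {pipe_gbps:.2f} Gbps")
```

Under these assumptions even 4K60 10-bit stays below the measured pipe throughput, though with much less headroom than full HD.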

Even if you decide basic IPC is not fast enough, a shared memory buffer for raw frame data is reasonably secure too.
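The shared memory idea can be sketched with Python's `multiprocessing.shared_memory`. This is only an illustration, not how VLC or Chromium do it: the tiny frame size and the function names are made up, and both ends run in one process here for brevity, where a real design would have a sandboxed decoder and a separate display process.

```python
from multiprocessing import shared_memory

# Hypothetical tiny frame so the example stays cheap to run.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 64, 36, 3
FRAME_SIZE = WIDTH * HEIGHT * BYTES_PER_PIXEL

def publish_frame(pixels):
    """Decoder side: copy one raw frame into a new shared segment."""
    if len(pixels) != FRAME_SIZE:
        raise ValueError("unexpected frame size")
    shm = shared_memory.SharedMemory(create=True, size=FRAME_SIZE)
    shm.buf[:FRAME_SIZE] = pixels
    return shm

def read_frame(name):
    """Display side: attach by name and copy the frame out."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        # Read a fixed size: the segment may be rounded up to a page.
        return bytes(shm.buf[:FRAME_SIZE])
    finally:
        shm.close()

def demo():
    frame = bytes([7]) * FRAME_SIZE
    segment = publish_frame(frame)
    try:
        return read_frame(segment.name) == frame
    finally:
        segment.close()
        segment.unlink()  # free the segment once both sides are done
```

The reader treats the buffer as opaque pixels of a fixed, pre-agreed size, so a compromised writer can produce a garbage picture but has no parser on the other side to attack.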


Knowing that today we still see bandwidth issues in VLC, even without IPC, I kind of doubt it.


All this means Linux is misdesigned for user apps, forcing low-level code instead of proper APIs. Maybe stuffing everything into the kernel isn't a good idea after all? All these things are exploit attack surfaces.


Interesting.

I am only interested in these features:

- you need access to the GPU stack, which is running in kernel-mode, btw, to output video and get hw acceleration;

- you need access to the audio stack, also in low-level mode;

- you need access to the DSP acceleration (not always the GPU);

- you need access to the system settings to disable screensavers, and adjust brightness;

- you need to unzip, untar, decrypt, decipher and so on;

- many OpenGL client libraries need access to the /etc too;

Is there a lighter version where these features are cut?

- you need to be able to open files without user interactions (no file picker), in order to open playlist, MXF or MKV files;

- you need the same if ever you have a database of files (media center oriented);

- you need raw access to /dev/* to play DVD, CD and other optical disk (and the equivalent on Windows);

- you need ioctl on such devices, to pass the MMC for DVD/Bluray;

- you need raw access to /dev/v4l* for your webcams and be able to control them;

- on linux, you have access to x11 for the 3 above features, which is almost root;

- you need access to /etc/ (registry) for proxy informations, fonts configuration and accessibility;

- you need access to the network, as input and output (think remote control);

- you need access to mounts to be able to see the insertion of DVD/Bluray/USB/SD cards and such;

- you need to expose an IPC (think MPRIS on Linux);

- you need access to the fonts and the fonts configuration (see fontconfig).


The first part are the dangerous parts.


You actually don't NEED a lot of these things. I'm perfectly fine with a default / embedded font. I don't have an optical drive. A database can be in the local app storage. I'm fine opening a subtitle file myself. Why would I need IPC? Why would I need to unzip anything? If it's subtitle files, it can be done in-memory. Are you sure we need low-level audio?

I don't have a remote, so I'd like it to be disabled by default. I don't need any access to the network.

etc. etc. etc


Those restrictions work for you, but would make VLC borderline useless for me.

> I don't need any access to the network.

90+% of what I use it for comes from my NAS or the Internet.

> I don't have an optical drive

Most of the rest is from optical discs.

> I'm perfectly fine with a default / embedded font. [...] I'm fine opening a subtitle file myself.

It's _fine_ but far from ideal. Both are useful quality of life features.

> Why would I need to unzip anything?

Non-essential, but being able to play video from a ZIP is a useful feature.


Let's play, "why isn't my use case the only use case."


Congratulations, you don't need those things. What about the other 19.999.999* users? Are you sure they don't need any of those things? :)

* Arbitrary number.


That's you. Most users expect one or more of the other mentioned features.


I don't understand why sandboxing the userland is not a thing on the linux desktop at least. When you sanely configure a un*x server you generally at least create a user per application. The web server runs as www, the database with its own user, the ssh process runs as the logged-in user (the daemon runs as root obviously, but only for just as long as it needs before forking a less privileged child) etc...

But when it comes to the desktop everything runs as $user and that's the end of it. While this makes sense for multi-user "mainframe"-style systems, for modern desktops it's almost an anti-pattern. I wish I could run my browser as its own user, my password manager as another, my code editor/toolchain as another, the closed source spotify client as yet another etc...

It's kind of doable today but it's not exactly friendly to set up. In particular Xorg is not exactly designed with client isolation in mind as far as I can tell; preventing one window from overtaking another without being too cumbersome is left as an exercise to the reader.

But really, at the OS level I feel like we already have all the functionality we need and we just completely ignore it. On a desktop the critical account isn't really root per se, rather it's the user account that contains all of my data.

Maybe we've just been doing it wrong the entire time and we should just log into our single-user desktop computers as root and then spawn our shells and other applications as various unprivileged users as necessary (this could easily be scripted in launcher scripts). I wonder if anybody has attempted to do that, but again I don't expect that Xorg would work very well in this configuration.


> Maybe we've just been doing it wrong the entire time and we should just log into our single-user desktop computers as root and then spawn our shells and other applications as various unprivileged users as necessary (this could easily be scripted in launcher scripts). I wonder if anybody has attempted to do that, but again I don't expect that Xorg would work very well in this configuration.

Replace '...as various unprivileged users' with '...as completely isolated virtual machines' and you've got the gist of what QubesOS does. I haven't tried it personally, but it sounds really interesting.


Thanks to your and andrian's comment I'm currently downloading a QubesOS ISO, I'm curious to see how usable it is.

Using VMs sounds a bit more heavy-handed than what I had in mind, but I guess on modern machines with good hardware support it should be pretty workable.


If that doesn't work (I've had issues with newer hardware), take RancherOS for a spin.

It's dockerized applications (as close to a VM as possible).


Well, there is firejail now. It does Xorg sandboxing through xpra, I don't know how it affects hardware acceleration though.


Oh that looks grand, thank you for pointing that out. I'll give it a try.


Maybe you'll like QubesOS.


QubesOS, while being a very interesting project, is much more extreme than running different applications as different users.


"There is no excuse that these kind of applications are not completely sandboxed."

Woah, what a sense of entitlement! What's your excuse for not having submitted a patch years ago?


Not the OP, but valid responses range from "I have a life" to "it's not my project" to "I don't want to". Odd that you think criticism depends on contribution.


Those are likely most of the same reasons these apps are not sandboxed (switching "not my project" with "I'm not paid to work on this").


Those are valid reasons not to improve a project, but it doesn't make your project immune to criticism. If you're not going to take the time to make your project better, that's fine, but other people are still free to point out that your project isn't very good or that your project could be much better if you managed your time differently.


Tepix's point was about entitlement, not criticism.

"There is no excuse for ___" definitely crosses the line from criticism to entitlement. :^)


I can't figure out how anyone could read entitlement into that statement.


So open source always gets a free pass? "You should be doing X as what you're doing represents a security threat." "Well, submit a patch then neener neener."

C'mon.


They don't, it's just that the statement "there is no reason for this software not to have a sandbox" is wrong, because, well, there are many reasons.

A correct statement might have been "there is no technical reason that prevents sandboxing, given a ton of work".


Video players are among the most difficult of applications to sandbox. The codecs involved may be in-app software or hardware reached via one of a dozen abstraction APIs. The same is true for audio and frame buffer access. There is a huge amount of code that needs holes poked in the sandbox walls to function.

And realistically once you do that you have another component out there with all that complexity and permissions to exploit. That's exactly what happened with Android. The apps have a clean sandbox, so all the exploits target the mediaserver process instead.


Moving 4K / 10-bit buffers at 30 or 60 fps (because 3D) between processes is not as easy as it sounds.


Both Windows and OS X provide ways for a sandboxed process to draw directly to a GPU-resident texture owned by another process. Chrome's renderer sandbox is a good example of this technique.


A media player is rarely that simple. Before writing to the GPU you have to access the content (file, network), demux, then decode and finally display.

The most dangerous areas are the demuxers and decoders so they have to be sandboxed.

So yes, you're right, but this doesn't solve the problem of moving the buffers between processes.


You can share file handles/sockets on Windows/OSX/Linux via IPC without giving the renderer process the ability to open the files/sockets themselves.


But that's just ONE of the issues. The issue here is that the decoder/parser is in a process that has too many privileges...


Sure, but then you need a multi-process sandbox, and it's not easy to do.

And the performance is not easy to obtain.


You can co-opt Chromium's. I've been doing that with a project I'm working on and it has worked quite well.

I'm also not sure where you'd lose much performance. If you hand the file handles/sockets and backbuffer to the renderer, you only need enough IPC to synchronize the drawing. Sending small messages on the order of 100 times per second between processes is not going to be a bottleneck.


But you need to pass data from the access to the stream_filter, from the stream_filter to the demuxer, from the demuxer to several decoders, from the decoders to potentially a few video-filters and chroma-converters, and then finally to the output. Each of them needs different access policies, and several of them require FS access.

It is not easy. If it was, people would have done it already.


You can get most of the security people are asking for without that level of granularity. If the decoder is exploited and is able to attack the demuxer, that's still worlds better than the decoder being able to run with full user privileges.

Do any of those components need unrestricted/unpredictable file access? Because if they don't you can just open the files in the main process that handles the UI and send them to the sandboxed process via IPC. None of Windows/OSX/Linux do permission checks when file handles are read from, they only check when the file is initially opened.
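The handle-passing approach described above can be sketched on Unix with `socket.send_fds` and `socket.recv_fds` (Python 3.9+), which wrap `SCM_RIGHTS` ancillary messages. The function names are hypothetical, and both ends live in one process here for brevity; in a real design the receiver would be the sandboxed process, and Windows would use handle duplication instead.

```python
import os
import socket
import tempfile

def send_open_file(sock, path):
    """Privileged side: open the file, then hand over only the descriptor."""
    fd = os.open(path, os.O_RDONLY)
    try:
        socket.send_fds(sock, [b"media"], [fd])
    finally:
        os.close(fd)  # the in-flight copy stays valid for the receiver

def receive_file(sock):
    """Sandboxed side: gets a descriptor for a path it never opened."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    return os.fdopen(fds[0], "rb")

def demo():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"fake media bytes")
    broker, worker = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        send_open_file(broker, tmp.name)
        os.unlink(tmp.name)  # the path is gone; the descriptor still reads
        with receive_file(worker) as f:
            return f.read()
    finally:
        broker.close()
        worker.close()
```

Because the permission check happens at open time, the receiving side can read the data without ever having had the right to open the path itself.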


But the issue is not only file access. File access is a small part of the main problem.


So far I did not know that there is anything in video subtitles, that needs interpretation. What is needed is not a sandbox, but simply code, which stops trying to do weird stuff with something as static as subtitles. They should be a timestamp for the time in the video where the text shall appear plus text itself nothing more. If it does not parse according to a specific format throw that stuff away and read the next line in a subtitles file or simply declare the whole file invalid and be done with it.

Don't try and start doing weird things with something like subtitles and we are fine.

Why does VLC or one of the other programs feel the need to do anything more than that, resulting in gaping security vulnerabilities? Is there any good justification? Or is this again about some overflow with unexpectedly long strings or something like that? (In such case it is the not so careful programming on VLC side that is the problem)

Furthermore the subtitles are often inside the video graphical data itself. I've actually never used a subtitles file. I tried a few times, but every single damn time they were off, and not only off but exponentially off, which made it impossible to get the correct text for all play positions in the video. If you ask me, so far all the subtitle files I tried for any movie suck anyway.

(This is ignoring any subtitle file specifications, which might exist.)


> They should be a timestamp for the time in the video where the text shall appear plus text itself nothing more.

I think the option to have positioning information was a good idea.


AFAIK, fonts aren't a passive format, they contain code which executes in a VM.


At least SRT files are a purely declarative sequence of lines with "time time text".
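A minimal sketch of the "parse strictly or reject" approach for that basic SRT shape (index line, time line, text). SRT has no formal specification, so the grammar and the length cap below are assumptions, and real-world files often deviate from them.

```python
import re

TIME_LINE = re.compile(
    r"(\d{2}):([0-5]\d):([0-5]\d),(\d{3}) --> "
    r"(\d{2}):([0-5]\d):([0-5]\d),(\d{3})",
    re.ASCII,
)
INDEX_LINE = re.compile(r"\d+", re.ASCII)
MAX_TEXT_CHARS = 500  # arbitrary cap, chosen for illustration

def _to_ms(hours, minutes, seconds, millis):
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)

def parse_srt(data):
    """Return a list of (start_ms, end_ms, text); raise ValueError on anything odd."""
    cues = []
    normalized = data.replace("\r\n", "\n").strip()
    for block in re.split(r"\n{2,}", normalized):
        lines = block.split("\n")
        if len(lines) < 3 or not INDEX_LINE.fullmatch(lines[0]):
            raise ValueError("malformed cue block")
        match = TIME_LINE.fullmatch(lines[1])
        if match is None:
            raise ValueError("malformed time line")
        groups = match.groups()
        start, end = _to_ms(*groups[:4]), _to_ms(*groups[4:])
        text = "\n".join(lines[2:])
        if end < start or len(text) > MAX_TEXT_CHARS:
            raise ValueError("invalid timing or oversized text")
        cues.append((start, end, text))
    return cues
```

The text is returned as an opaque string; whatever displays it still has to treat it as plain text rather than markup.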


"Officially", SRT files are only timecodes and text, but most players support html codes directly like <b> <i> <u> to support more formatting options than the basic SRT. I wonder if some of them simply render the text as html and could be vulnerable to similar attacks. I say "Officially" because SRT has no standard, it just evolved through usage and it's a fucking mess, as I'm a software dev working on a subtitling editor software.
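One way to support `<b>`, `<i>` and `<u>` without rendering arbitrary HTML is to escape everything first and then re-enable only those exact tags. This is a sketch of the general technique, not what any of the affected players do; `render_cue` is a hypothetical name.

```python
import html
import re

# After escaping, the only markup we re-enable is a bare b/i/u tag
# with no attributes, in opening or closing form.
ALLOWED_TAG = re.compile(r"&lt;(/?)([biu])&gt;", re.IGNORECASE)

def render_cue(text):
    """Escape all markup in a subtitle cue, then restore plain <b>, <i>, <u>."""
    escaped = html.escape(text, quote=True)
    return ALLOWED_TAG.sub(
        lambda m: f"<{m.group(1)}{m.group(2).lower()}>", escaped
    )

print(render_cue("<b>Hi</b> <script>alert(1)</script>"))
```

Tags may still come out unbalanced, so a renderer would want to close any that are left open, but nothing outside the three allowed names survives as markup.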


I've seen a similar problem in another context where a browser engine is used to render some simple HTML in an app for convenience, but then suddenly turns into something exploitable because nobody is thinking about updating the engine when a bug is found in an esoteric (for the app) feature.

Embedded web engines should probably have a minimalistic safe mode.


But SSA is not. It's specifying a font and sometimes even shipping a font inside the SSA file.


I get equally confused anytime Microsoft Office gives the "Files from the internet may contain viruses." warning. How do you mess up a document editor so badly that the document can affect the computer? I know that the answer is Visual Basic, and I know that there are legacy reasons why it will never be removed, but holy cow, it is ridiculous.


It's not just VB. Any code in a program which parses any kind of data, including "passive" data like images, docs and even plain text, is vulnerable to bugs in its own code. It's often not related to what the data is, but the way in which it is handled.


Please don't fall into the Dunning-Kruger[0] trap by assuming a straightforward task is also easy to perform. These things may very well be complex and include aspects that are not immediately obvious. And even if they don't, even simple code executing simple tasks can be vulnerable to bugs or flawed reasoning without the authors (or tooling) being stupid or naive.

[0]: https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect


And I find this especially true for anything media-related. I am amazed it even works for most users without problems [0]

[0] Obvious problems, like unplayable, etc. Minor problems (chroma placement error, transfer function, etc) seem to occur very frequently.


Basically, regardless of VMs etc, parsing any kind of binary data is inherently dangerous. Old binary formats were bespoke, there weren't always standard libraries to read the headers etc, so everyone wrote their own, with their own bugs. A common attack vector was simply using invalid lengths on header fields, causing a stack or buffer overflow to fool the host into executing the binary data as code.

That's one of the reasons these days there's a tendency to use text-based representations like JSON, but of course anything size-sensitive such as images and movies is still generally binary.
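The invalid-length attack described above comes down to trusting a size field from the file. A minimal sketch of the check a parser needs, using a made-up chunk format (4-byte magic `DEMO` followed by a little-endian 32-bit payload length); Python would not overflow here anyway, so this only mirrors the validation a C parser has to do before copying.

```python
import struct

# Hypothetical header layout, for illustration only.
HEADER = struct.Struct("<4sI")
MAX_PAYLOAD = 1 << 20  # arbitrary upper bound on a sane payload

def read_chunk(buf):
    """Return the payload, refusing headers whose length field lies."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    magic, length = HEADER.unpack_from(buf, 0)
    if magic != b"DEMO":
        raise ValueError("bad magic")
    available = len(buf) - HEADER.size
    if length > MAX_PAYLOAD or length > available:
        raise ValueError("declared length exceeds available data")
    return buf[HEADER.size:HEADER.size + length]
```

The key point is that the declared length is compared against what is actually present before it is used for any read or copy.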


Well, M$ Office makes sense, since it's macros.

What I really don't understand is Acrobat Reader. It has a "Protected View", which is the first WTF - .pdf-s are read-only, so there should be absolutely zero active code running anyways. The next, much bigger WTF is that you need to exit protected view to print the document.

How can the program read and render the document on screen, but not print it?! How is this even possible?


PDF can contain executable code (JavaScript to be exact), access remote URLs, and has several separate modes of display (which means that document rendered for printing is different than what you see on the screen).

PDF is an old complex format with a lot of features used in a lot of special cases that go light years beyond looking at a simple text file. It's the reason for all the issues, but keeping it useful as it is and magically waving away all issues is not really easy.


Javascript is part of the PDF standard (yeah...).


PDF is a very fancy wrapper around PostScript (massive over-simplification). PostScript is a Turing complete language. As such, PDF is essentially code.


If Popcorn Time renders all subtitles as HTML, would an exploit work if the subtitles were embedded in video container? Seed latest hit on Pirate Bay, root a lot of boxes. Yikes.


Is Media Player Classic affected?


Not according to this bug report: https://trac.mpc-hc.org/ticket/6169


Here is how it looks in real time:

https://www.youtube.com/watch?v=vYT_EGty_6A


Does this also work for Android versions of Kodi et al?


Android does have a sandbox, so impact should be pretty limited if ever exploitable.


Does this work on Linux and Mac OS, or is it limited to Windows systems?


I can't say for these vulns specifically, but in general, if software is vulnerable on one OS, it is very likely also vulnerable on other OSs. The differences aren't that big. Exploits generally have to be written for each OS separately, though.


It's sad that VLC checks updates over HTTP and not HTTPS


VLC updates are signed with asymmetric encryption.

HTTP or HTTPS does not change that.


HTTPS would increase user privacy by not leaking application details though.


Indeed, but that's not what GP is referring to.


What does the "IPS Signatures" section mean?


This is the source post: http://blog.checkpoint.com/2017/05/23/hacked-in-translation/

The ingenuity that goes into RCE exploits never ceases to amaze (and terrify) me. Can't wait for more details to be released.


Hollywood is resorting to shitty tactics


I would be impressed if this were actually "Hollywood". It's better than e.g. the RIAA lawsuits.


I'm not sure where you're getting this from; did you read the article?


Clearly VLC should be rewritten in Rust.


Looking at the bug fixes done in VLC, Ada or Modula-2 would be enough, although there are plenty of options actually.

Rust isn't the only alternative to write native code safer than C will ever allow.


Don't know about Modula, but have you tried Ada? The usability of it is nowhere near modern languages IMO. We learned a lot about nice code since then :-)


Some of us like readable languages not composed of hieroglyphs.

Ada was and still is a quite modern language, designed for software development done by large teams, where I can several years later still understand what I wrote.


Modula 2 is much like C in its close-to-the-metal performance abilities.

The downside, if you want to call it that, is a more prominent syntax (keywords instead of curlies, upper-case keywords, etc).

On the upside it lacks any unsafe operations, except for dealloc. In addition, it has actual modules in lieu of includes, hence it's blazingly fast to compile and/or recompile. It's a pity it didn't catch on; the language lacked a company to back and promote it. AT&T promoted C, Apple promoted Objective C, Microsoft promoted VB...


> Apple promoted Objective C

Actually Apple promoted Object Pascal, but then they decided to cater to the growing UNIX market and replaced the Mac OS SDK with a C and C++ (PowerPlant) one.

https://en.wikipedia.org/wiki/MacApp


I mean, Ada was spec'd to run flight computers for military aircraft and similar mission-critical stuff. If you want something that is secure, Ada can do it. It just won't offer many creature comforts in doing so...


Clearly it's the low-level code that VLC calls that is the problem. You can rewrite VLC, but other apps will happily run that code again and again.


Yes, they clearly have the resources to do that. Maybe you can propose a prototype in Rust?


Sarcasm can feel good, but it poisons the well of civilized discussion. Kindly refrain.


The original comment to rewrite VLC in rust was mostly sarcasm. Sure, if every app and library was rewritten in rust without using the unsafe features, we'd see a lot fewer of these kinds of bugs. But it's going to be a long long time before we live in that kind of world. That doesn't mean people can't start today - imagine if the top 5 codecs were written in a safe language, then they could warn users about "less safe" content for anything else without pissing off too many users. Maybe. And that alone is a big job.


So? Replying to sarcasm with more sarcasm makes things worse yet. It's still wrong.


Yep, or just use something a bit better-developed. VLC has always been a notoriously poor-quality codebase.


While I think Rust would be a good choice, the project could benefit from a rewrite even in the same language.


Treat data as data. Taking the Subrip format as an example, everything starts out fine so long as there is good bounds checking on the purely textual data.

Then, however, some dipshit decides to extend the format by adding tags for things like bold, italics, underline etc. This is completely unnecessary for subtitles because the emphasis can be inferred from the dialogue. The unnecessary complexity increases the potential for vulnerabilities.

Then some total dickhead decides to add an HTML5 tag, for no reason whatsoever, and it all goes to hell.

This is illustrative of the problem with most software: the absence of a clear-headed benevolent dictator to say, "no; you are an idiot; we're not doing that."


    This is completely unnecessary for subtitles because
    the emphasis can be inferred from the dialogue.
Seems useful for deaf people


It also seems like you could use it for applications like karaoke.


For certain films it can be vital that subtitles can be styled in multiple ways.

For example: I recently watched the movie "The Handmaiden" which includes both spoken Korean and Japanese. The language the characters speak in any given situation is relevant to the story. If all the subtitles were the same I would not have noticed this distinction.


> Then, however, some dipshit decides to extend the format by adding tags for things like bold, italics, underline etc. This is completely unnecessary for subtitles because the emphasis can be inferred from the dialogue.

Emphasis of an entire line can be inferred, but how can emphasis within a line be inferred when you don't know which utterances within the line correspond to which words in the subtitles (which, if you need subtitles because you don't know the language being spoken, you won't)?

While uncommon, I've occasionally seen font variants used for emphasis on professional subtitles for that reason.


> (which, if you need subtitles because you don't know the language being spoken, you won't)?

Or even more extreme, if you need subtitles because you are deaf.


Then people add full BASE64 fonts inside the subtitles.

Fonts that have a virtual machine in...


Formatting really adds a lot of depth that is important for deaf/hard of hearing people. Maybe you should try thinking about people with disabilities some time.


> Treat data as data.

That doesn't solve the problem. http://langsec.org/


These exploits will go nowhere without a catchy name ala HEARTBLEED...

I vote for SUB-DURAL HEMATOMA


> The attack vector relies heavily on the poor state of security in the way various media players process subtitle files and the large number of subtitle formats.

Well, last year's exploits against iOS, Android and Ubuntu were all related to media metadata processing. It is only natural that the same folks screw up this one too.


What same folks? iOS, Android and Ubuntu are not developed by the same people. More than that, it's not like these apps are actually developed by Apple, Google or Canonical.

Plus you're dissing some very complex projects. I think you're underestimating the complexity of the work these "same folks" are doing.



