Firefox Local Files Theft – Not Patched Yet (quitten.github.io)
114 points by akeck on July 3, 2019 | 73 comments



IMO, the approach taken by Chrome and other browsers is overly restrictive, basically killing off the file:// protocol. It's already impossible to load wasm files over the file:// protocol in Chrome, and I think this has also had implications for using wasm in Electron.

Ultimately, file:// is a great, cloud-free way of having web pages and applications, and it should stay that way instead of forcing everything to be networked and reliant on third-party computers or domain names.

There are so many reports, documents, etc. that live as non-networked HTML files. E.g. Rust documentation (cargo doc --open) is generated as HTML files on disk and then just displayed without the need for a web server. Starting a localhost web server is in fact less secure than file://, because now every user on the computer has access instead of just the users with read access to the files. This has thankfully been fixed in Chrome OS, though.

Progressive web apps are not the solution, as they still need "seeding" via the network. Maybe what's needed is some kind of standardized format: a glorified zip file with some metadata that, when you double-click it, opens in the web browser, which starts a web server in the background. That server runs a specially designated JS file in the zip, accepts the usual fetch requests, and has read access to the entire zip file. The browser's "UI" would then communicate with the server via well-known protocols.
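
A minimal, purely illustrative sketch of the static-serving half of that idea in Python (the app.zip name, port, and index.html entry point are made up for the example, not any existing standard):

    import webbrowser, zipfile
    from http.server import BaseHTTPRequestHandler, HTTPServer

    ARCHIVE = "app.zip"  # hypothetical packaged web app

    class ZipHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            name = self.path.lstrip("/") or "index.html"
            try:
                with zipfile.ZipFile(ARCHIVE) as z:
                    data = z.read(name)      # pull the resource out of the archive
            except KeyError:
                self.send_error(404)
                return
            self.send_response(200)
            self.end_headers()
            self.wfile.write(data)

    if __name__ == "__main__":
        webbrowser.open("http://127.0.0.1:8000/")
        HTTPServer(("127.0.0.1", 8000), ZipHandler).serve_forever()

Running the designated JS inside the archive (and sandboxing it) is the hard part this sketch skips.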


Learning web programming is amazing, since you can just start writing a .html file and open it in a browser. There is no need to learn how to set up a local HTTP server. Chrome is truly killing that experience; outside of Firefox, you can't even load JavaScript modules now without HTTP involved.

I really hope Mozilla doesn't "fix" this "vulnerability" by destroying the experience of people learning web programming like Google did.


Is it really that hard to run `python -m http.server` in your repository?


I started learning HTML at age eight. No one in my family had experience with programming or using a terminal (there was an HTML book on discount at the book store). I was able to get started because it was dead simple, and only used tools I was already familiar with: the web browser, Notepad, and regular files.


I had exactly the same experience.


For someone getting started with programming? Very hard - it steepens the learning curve significantly.


This. It's no longer just "open Notepad and save this text file, then open it in the browser".

Leave the complexity of learning what Python is, the 2 vs 3 issue, 32/64 bit versions, appending locations to your path, CLI, etc., for when the user is ready to start with dynamic web pages.


Luckily, the modern Python installers for Windows do all that.


When I was about 13 I wanted to learn C++. I bought a book, installed the software from the CD and tried compiling the first example. It failed with a cryptic error about a file not being found. I gave up on programming for a number of years.

The book had failed to mention that I needed to add the compiler to my PATH.


Even worse, the code examples of the (in)famous Bjarne Stroustrup were missing necessary #includes and contained deadly typos.


Honestly, if that's all that took you to give up on programming, it would not have gone on much longer anyway, especially with something like C++.


So what was I supposed to do with no expert on hand and no Internet? I went back to the bookshop and asked there. They didn't know either, so I returned the book.

I returned to programming when I learned about QuickBasic. That just worked and had an extensive help system that was suitable for self-study.


I don't think that's very fair. Some people are extremely gifted mathematicians with incredible talent for algorithmic thinking, yet can be totally shut down by build configuration bullshit. It's a huge pain point in C-land.


Please note, Python is not available out of the box on Windows machines, as it is on macOS and most Linux distros.

Personally, as someone who has used Python in my programming job, I don't remember the last time, if ever, I typed the command you mention to start a local server.


> Please note, Python is not available out of the box on Windows machines, as it is on macOS and most Linux distros.

It (almost, kind of) is now, actually. Running `python` on a recent Windows 10 box without Python installed will prompt you to install it.

https://devblogs.microsoft.com/python/python-in-the-windows-...


As much as I prefer the file protocol, another option would be:

$ busybox httpd -p 8080 -f

-p for port (it's also possible to bind to a specific IP: -p 127.0.0.1:8080), -f for foreground, plus an optional -h for the served directory (default is .).

But the difference between doing almost nothing (a single double-click, and then it stays there even between restarts) and giving a specific command in a terminal is monstrously huge.


No it is not, and it has been easy to run a webserver on Windows for ages (Apache, Roxen, IIS, OmniWEB, etc).

However, the barrier to entry is still higher than simply editing a .html file and loading it in the browser with file://

The question is, what is more important? Shouldn't this functionality be optional at the very least?


Unless you're on a Mac, or one of the various Linux distros where Python 2 is still shipped by default.


Mac comes with PHP, so you can do

    php -S localhost:3000 -t ./


And now you have: `php -S localhost:3000 -t ./` on Mac, `python -m http.server` on some versions of Linux, `python -m SimpleHTTPServer 8000` on others, and any of the above on Windows.

How is that easier than putting file://path/to/foo.html into my browser window, as someone getting started in web development?


Why should I install Python if I want to do a bit of JavaScript? That's not really the point. Of course it's not complicated to set up XAMPP, LAMPP or whatever.


If you run Linux, you won't have to install Python - it will be there by default in almost every distribution. I think this is why GP chose this example - there's a multitude of other options to instantly start a web server.

Edit: Here are 16 for example: https://gist.github.com/willurd/5720255


Most people are running Windows. That means most people trying to learn how to program are running Windows. I wouldn't want to figure out how to set up a web server on Windows myself, and I absolutely wouldn't expect someone learning to program on their own to be able to set it up without getting frustrated.

Even for people on Mac and Linux, having to start by learning the terminal is a pretty big and unnecessary obstacle to just getting to writing some JavaScript. Sure, you and I think it's easy, but it's a completely different paradigm which requires practice to get comfortable with.

Even if you can be there right by someone learning to program, and help them set up a web server, they will have to remember how to do that when they want to play with programming while you're not there. That's actually quite a lot to remember when you don't yet understand what the commands mean; you think of it as "go to the directory, start SimpleHTTPServer", but they have to memorize the text they have to type. Maybe you're the one who cd'd into the correct folder for them, and they remember to type `python -m SimpleHTTPServer` in the terminal you opened; they will then have problems they don't know how to solve when they go home, open a terminal, type the command, and it doesn't work because they weren't in the correct directory.


Absolutely, I don't think we are in disagreement at all. I just wanted to mention the availability of simple web servers (on Linux, as you rightly state).

On topic, I agree that this is throwing the baby out with the bathwater. A compromise would perhaps be an explicit whitelist of allowed directories to serve via the file:// protocol, set by the user (while disallowing traversal upwards of these, of course).


It definitely ends up being a tax on developer efficiency overall. It also makes web development more hostile to newbies, because they have to figure out how to configure a server or hop into some third-party toolset on the web to start experimenting. With the tendency to ban use of new features on non-HTTPS origins (hi again, Chrome), a newbie developer will soon need both a server and an HTTPS cert before they can even start experimenting. And if they host on a third-party HTTPS domain, that third party is now responsible for malware that might get hosted there.

It just sucks.


Regarding HTTPS, localhost is considered a "secure" origin: https://developer.mozilla.org/en-US/docs/Web/Security/Secure...


What are your thoughts on file:// URLs being opened/executed in a separate enclave from the network, with no access to the network?

This would allow the convenience of file:// and IMHO eliminate many if not all of the risks.


It would improve security, but ultimately as long as the file "enclave" can execute javascript, it has access to enough side channels to communicate any data it collects to the outside, so I guess the security concerns wouldn't be fully resolved.


Can you elaborate on these side channels?


One example: if two programs share the same CPU and have access to CPU resources, then one program could transmit information to the other by putting the CPU under load and removing the load again, in a kind of Morse-code fashion. On computers with fans this would probably be noticed by users, but not every computer has fans. It's not a very high-bandwidth channel, but it's enough to get SSH keys transmitted within a few hours of connection.
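
A hedged toy sketch of that kind of channel in Python (the slot length and threshold are arbitrary; real bandwidth and reliability depend entirely on the machine):

    import time

    SLOT = 0.5  # seconds per bit (arbitrary for the example)

    def send(bits):
        # Encode a 1 by saturating the CPU for a slot, a 0 by staying idle.
        for bit in bits:
            end = time.monotonic() + SLOT
            if bit:
                while time.monotonic() < end:
                    pass
            else:
                time.sleep(SLOT)

    def receive(n_bits, threshold):
        # Infer each bit from how much work we manage to do in the same slot:
        # contention from the sender means fewer loop iterations.
        bits = []
        for _ in range(n_bits):
            end = time.monotonic() + SLOT
            count = 0
            while time.monotonic() < end:
                count += 1
            bits.append(1 if count < threshold else 0)
        return bits

Run send() in one process and receive() in another on the same machine; with a calibrated threshold the receiver recovers the bit pattern, slowly.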


This won't allow loading images, files from CDNs, etc.


Correct. IMHO accessing the local filesystem is incompatible with accessing the network from a security standpoint.

I’m fine with having a very clear toggle to allow this behavior for developer types, but this should default to secure.


I think this summarizes the issue and is exactly the correct solution.

There are a lot of comments along the lines of "just run python -m SimpleHTTPServer" - but doing that makes your computer just as vulnerable as allowing file:// in the first place, in theory. It's only the very awkwardness of doing that that makes it any safer. Instead of hiding this fundamental incompatibility between security and local accessibility behind a layer of inconvenience, better to drag it out into the open, label it, and fully support it. While you're at it, you can slacken some of the other restrictions in "local mode" as well.

A browser is the modern VM, for better or worse. It should function locally, standalone.


This case makes me wonder: what if traditional filesystems didn't separate files and directories, so that there were just files, which may have both data and sub-files? Then an HTML file could include all its JS files etc. directly inside it, and we could grant multi-file permissions without leaking access to siblings.


The feature you are looking for already partially exists. Look for "extended attributes" (for most filesystems) or "alternate data streams" (for NTFS).

But in this case, it's much simpler to make a tar or a zip with the HTML and all necessary files. E.g. that's how Python "eggs" work.
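
For instance, a quick illustration of the extended-attributes idea using Python's os.setxattr/os.getxattr (Linux filesystems with xattr support; the attribute name and payload are made up here):

    import os

    page = "some-page.html"           # an existing local file
    script = b"console.log('hi');"    # data to attach alongside the page

    # User-defined attributes must live under the "user." namespace.
    os.setxattr(page, "user.bundled_js", script)
    print(os.getxattr(page, "user.bundled_js"))   # b"console.log('hi');"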


> I think this has also had implications for using wasm in Electron

It should not. The Electron runtime is not a browser, and the developer can tweak its security settings.

I would suggest another idea for local files: grant access to a folder whose name matches the HTML file name, with a suffix. For example: file some-page.html and directory some-page_files.
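
A rough sketch of how that check could look (the _files suffix and the function name are just this example's assumptions, not any browser's behavior):

    from pathlib import Path

    SUFFIX = "_files"

    def allowed(html_path: str, resource_path: str) -> bool:
        html = Path(html_path).resolve()
        companion = html.with_name(html.stem + SUFFIX)
        resource = Path(resource_path).resolve()
        # The resource must resolve to somewhere inside the companion directory.
        return companion in resource.parents

    # allowed("/docs/some-page.html", "/docs/some-page_files/app.js")  -> True
    # allowed("/docs/some-page.html", "/docs/secrets.pdf")             -> False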


> It should not. The Electron runtime is not a browser, and the developer can tweak its security settings.

Yes, that's what I meant. Sure, you can override Chromium, but you have to. https://github.com/adamvoss/electron-webassembly-hello/commi...

> I would suggest another idea for local files: grant access to a folder whose name matches the HTML file name, with a suffix.

That would certainly help. It would restrict some multi-HTML-file projects that use shared resources a bit, though. That could probably be helped by having a folder named html-fetchable-files or something.


Perhaps this should be addressed, but the fact that it requires the user to download and open a file means it's of limited use to attackers. If you can persuade a user to do that, then you needn't bother with browser exploits to get access to a user's files.

Edit: but wait, it can only access the directory the file was downloaded to? So that's almost certainly restricted to the user's download folder. That makes it even less useful.


The downloads folder is actually a perfect target. Lots of documents end up downloaded as PDFs; I certainly have enough documents there to dox me. Anyone who does their e-mailing through a browser will most likely have a bunch of exciting attachments in their downloads folder.

You'll probably also find a bunch of installers in the downloads folder; I could imagine a sophisticated attacker looking for installer .msi's or .exe's for software which is known to have vulnerabilities.


Sure, but you can get much more if you use a more exploitable file type in the first place. Why would you willingly jump back into a browser's sandbox when you've just persuaded the user to bypass it entirely?


It's not so much why someone would choose this over that, as it is what attack vectors are added to which surface. Why wouldn't a bad actor be willing to jump back into a leaky sandbox?


Because they already have the run of the user's profile. Why add additional complexity for less access?


Because you may have had zero access rather than some; for example, a web dev who wouldn't click on an .exe but would open an .html file without a second thought. More access isn't necessarily always the end goal either.


If someone is knowledgeable enough not to open a shady exe file, they'll probably not simply open any shady files, including doc, ppt, and html.


Nah, people are dumb (exhibit A: myself) and overly trusting of parsers/sandboxes.


Not true for html files. They are widely regarded as harmless.


I have never seen anyone saying HTML files are harmless, and would definitely never say it myself.


Easy solution: Firefox gets two sandbox modes for file://:

If the directory is the default download directory, a subdirectory of it, or a recent "Save To.." location, then the file will be sandboxed like in Chrome and show a warning.

If outside any such directories, it works as normal.
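
A rough sketch of that classification (the ~/Downloads location and the mode names are assumptions for the example; a real implementation would also track recent "Save To.." locations):

    from pathlib import Path

    DOWNLOAD_DIR = Path.home() / "Downloads"   # assumed default download location

    def sandbox_mode(html_path: str) -> str:
        p = Path(html_path).expanduser().resolve()
        try:
            p.relative_to(DOWNLOAD_DIR.resolve())   # raises ValueError if outside
            return "sandboxed"   # treat like Chrome: no same-directory reads, show a warning
        except ValueError:
            return "normal"      # outside the download tree: keep current behaviour

    # sandbox_mode("~/Downloads/invoice.html")     -> "sandboxed"
    # sandbox_mode("~/projects/site/index.html")   -> "normal"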


Does the vulnerability permit access to a directory index, in order to identify the filenames of exciting attachments to transmit?


Apparently not. You can create a test.html file with the following content:

    <script>
      fetch(".").then(r => console.log(r));
      fetch("/").then(r => console.log(r));
    </script>
The console shows errors when these URLs are loaded:

    TypeError: NetworkError when attempting to fetch resource. test.html:2:1


Yeah. The Downloads directory inevitably accumulates sensitive data over time, especially lacking automatic cleanup/expiration of files. For this reason alone, Firefox should follow the Chromium/Edge policy.

Running Firefox via snapcraft, I've come to realize that desktop Linux systems keep moving the problem around without fully solving it (though I'm quite grateful for the real security benefits of snaps).

Traditional users/permissions are ineffective because all your important data is readable by your ostensibly unprivileged user.

Then SELinux/AppArmor enforcement comes along, but it's of limited effectiveness because you end up with free-for-alls (for convenience's sake) like ~/Downloads.


I think SELinux/AppArmor is a good idea for services like web/ftp/irc/ssh servers, Wayland compositors, notification daemons (like dunst), external device management daemons (like udiskie), etc. But it isn't a good solution for applications which a user might legitimately use to read or change arbitrary things in the system. As I understand it, though, the Flatpak model of having the application sandboxed and then only allowing it to access files which are selected by the user in the GTK dialog is a great solution. It's just that I haven't seen one for command-line applications. E.g. how should we protect ourselves from someone exploiting a vulnerability in vim to access our private documents, which we might legitimately want to edit with vim ourselves?

EDIT: it seems there is some progress on making a Firefox Flatpak: https://bugzilla.mozilla.org/show_bug.cgi?id=1441922


> Yeah. The Downloads directory inevitably accumulates sensitive data over time, especially lacking automatic cleanup/expiration of files. For this reason alone, Firefox should follow the Chromium/Edge policy.

> Then SELinux/AppArmor enforcement comes along, but it's of limited effectiveness because you end up with free-for-alls (for convenience's sake) like ~/Downloads.

I keep this clean; when I am finished with a file, I usually have a terminal up and will move it somewhere in ~/ or ~/Documents.

I think I might use AppArmor to secure this like Tails does: https://tails.boum.org/contribute/design/application_isolati...

I found these profiles, which will act as a good basis: https://github.com/mk-fg/apparmor-profiles/blob/master/profi... It seems that Mike Kazantsev (mk-fg) has abstracted it a bit more into other files.

The ones that come with AppArmor look ancient: https://gitlab.com/apparmor/apparmor/blob/master/profiles/ap...

> E.g. how should we protect ourselves from someone exploiting a vulnerability in vim to access our private documents, which we could potentially want to edit with vim ourselves?

I think for me it would be about starting with high risk applications.

If you look at https://github.com/mk-fg/apparmor-profiles/tree/master/profi... you notice things like steam, skype, etc.


But it still seems horrible to me.

If you're supposed to peer review the static web page designed by your coworker (the one who was hacked), it may steal things from your work folder.


Compatible with the spec, and previously a feature. I agree that it should be patched, but Chrome's policy has historically made local web development and testing a pain in the ass, because now you have to configure a web server correctly and work around Chrome's broken cache-control policy (have fun Ctrl-F5'ing all the time or opening devtools to disable the cache, which deoptimizes your JS).

I struggled with this for years and opened multiple bugs about it. If you're trying to release software to end users that requires them to configure a whole damn web server, you get really tired of it.

Disabling content loading from file:// is one solution, but I feel like the "this was loaded from the internet" flag already used for downloaded EXEs and DLLs is a perfectly reasonable alternative that would work just as well without breaking things for local development. (To be clear, you don't offer the option to bypass that, since it would only be a security vulnerability.)
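
For reference, on Windows that flag is the Zone.Identifier alternate data stream attached to downloaded files; a small hedged sketch of reading it (Windows/NTFS only, with a made-up path):

    # Downloads get a Zone.Identifier ADS with a ZoneId (3 = Internet zone).
    path = r"C:\Users\me\Downloads\page.html"      # hypothetical downloaded file

    try:
        # The ADS can be opened like a regular file by appending ":Zone.Identifier".
        with open(path + ":Zone.Identifier", "r") as f:
            marked = "ZoneId=3" in f.read()
    except FileNotFoundError:
        marked = False                              # no stream -> not marked

    print("mark-of-the-web present:", marked)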


Does `python -m http.server 8000` not work for local development?


You need way more configuration than that; Python's http server doesn't handle keepalive and threading correctly (or didn't when I last used it; maybe that's fixed in 3).

I historically had to use IIS or Apache, and at work we used a full Apache install. If you used Python, your performance was much worse, if your application worked at all.

For C# apps you can at least just spin up an HTTPD directly in your app using system APIs, but it's also kind of flaky.
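
For what it's worth, the stdlib server has been threaded by default since Python 3.7 (ThreadingHTTPServer); a minimal sketch of spinning one up explicitly, serving the current directory on an arbitrary port:

    from functools import partial
    from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

    # Serve the current directory with per-connection threads (keepalive-friendly).
    handler = partial(SimpleHTTPRequestHandler, directory=".")
    ThreadingHTTPServer(("127.0.0.1", 8000), handler).serve_forever()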


I'm just surprised that file:// works better (when it's allowed).


It was very fast, at least for a while. It might be slow now.


Killing JavaScript access to file:// would be annoying for quick experimenting. Maybe denying network access to locally loaded JS, so that nothing could be transferred out, is a possible solution?


You need to spin up a server if you want to use modules anyway.

Well, it's not a big deal. Just run `python3 -m http.server` or install "http-server" via npm.


This is not something a kid learning basic HTML/JS can do easily, since they probably use Windows and don't even know how to navigate in the command prompt.

The "click on the html file on the desktop" just works(TM)


Can we at least disallow iframes containing file:// URIs? Essentially, treat them as if they had X-Frame-Options: DENY.


I am not a Web programmer, so what I suggest might be nonsense. Please correct constructively :)

The central statement seems to be:

> “Our implementation of the Same Origin Policy allows every file:// URL to get access to files in the same folder and subfolders.”

What about the following simple mitigations:

* Access by a file:// URL to the root and to the user's home directory proper is forbidden by the browser implementation (because granting access to all subdirectories of the home directory is always a bad idea, and root even worse). Hard-coded, error dialog, period; no way to override short of patching the browser or mount trickery.

* Access to any other subdirectory is allowed, but will cause a warning that by proceeding you grant access to the directory and all its subdirectories. The warning cannot be suppressed by normal settings, but appears only once a day. An average user will not get it frequently, so there is a chance that they read it (and if they click OK without reading, it's their problem). And for the web developer, clicking once per day is still easier than configuring an Apache. And ideally they know why they are clicking it.


Another good reason to use Firejail. You tell it which folders FF gets to access, and Bob's your uncle.


But if this is only leaking the directory you choose to download to, Bob's already your uncle?

The issue is that people don't, and don't want to, choose a fresh/clean-enough directory for every download.


It's easier to create a new user account and run firefox from it. Something like: xhost +si:localuser:firefox && su - firefox -c 'DISPLAY=:0 HOME=/home/firefox firefox 2>/dev/null' && xhost -


Yeah, because that's a lot easier than just running "firejail firefox".. And does your snippet work on Wayland?


To avoid this, I use Firejail [1] with a custom rule set to limit the browser's access to only Downloads and its own profile folder. Does anybody know of anything similar for macOS?

[1] https://firejail.wordpress.com/


From the article, access is already limited to the directory in which the HTML file is located, so this wouldn't help.

Either it would be in the Downloads directory, where it already has access, or it would be in another directory and you wouldn't be able to open the file at all unless you disable your sandbox.


So, how is this different from downloading a random exe and executing it?

Isn't that accessible by design, since the directory the HTML file is located in is also the root path of that page?

Besides that, don't both Windows and macOS warn you when you open a random file downloaded from the internet?


Mozilla's latest tactic appears to be to trash Chrome semi-weekly on HN and then try to bring in users to Firefox. Then, later that week a new vulnerability is discovered in Firefox.


This breaks the site guidelines by being flamebait, by insinuating astroturfing, and by using HN threads to fight some sort of pre-existing battle. You've done this before, and we've had to warn you before. If you keep doing it we're going to have to ban you, so please stop.

https://news.ycombinator.com/newsguidelines.html



