
Why would the user do that? You don't move away a program's data dir while it's running and expect everything to just work, right?



There are programs that rely on this working, which would continue to work with unveil(2) if it wasn't for this artificial limitation.

Think e.g. a daemon that processes data in /var/run/something and uploads it somewhere else, and gracefully handles failure if the directory disappears (e.g. an admin had to remove all existing data). Due to the unveil(2) interface such an action would require a full restart of the daemon, but that's not a normal limitation on *nix systems.

I can't think of any reason for it to work this way except that it's exposing the underlying limits of the implementation, or if they're making the trade-off of tying it to the inode so `mv ~/myapp ~/myapp.back` will continue to allow access to files under ~/myapp.back, but they don't document that.
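To make the path-versus-inode distinction concrete, here is a small portable sketch using plain POSIX calls only. It says nothing about how unveil(2) is implemented; it just shows that `mv` keeps the same inode under the new name, while a directory re-created at the old path is a different object. The helper name and the `/tmp/unveil-demo-XXXXXX` scratch location are made up for illustration.

```c
#define _XOPEN_SOURCE 700 /* for mkdtemp(3) */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of any real API).
 * Returns 1 if the renamed directory keeps its inode while a directory
 * re-created at the old path gets a different one, 0 if not, -1 on error.
 */
int path_rebinds_to_new_inode(void)
{
    char base[] = "/tmp/unveil-demo-XXXXXX";
    char app[64], backup[64];
    struct stat before, moved, recreated;
    int result = -1;

    if (mkdtemp(base) == NULL)
        return -1;
    snprintf(app, sizeof(app), "%s/myapp", base);
    snprintf(backup, sizeof(backup), "%s/myapp.back", base);

    if (mkdir(app, 0700) == -1 || stat(app, &before) == -1)
        goto cleanup;
    /* Equivalent of `mv myapp myapp.back`: same directory, new name. */
    if (rename(app, backup) == -1 || stat(backup, &moved) == -1)
        goto cleanup;
    /* A new directory at the old path is a different filesystem object. */
    if (mkdir(app, 0700) == -1 || stat(app, &recreated) == -1)
        goto cleanup;

    result = (before.st_ino == moved.st_ino &&
              before.st_ino != recreated.st_ino);

cleanup:
    /* Best-effort cleanup; failures here are ignored on purpose. */
    rmdir(app);
    rmdir(backup);
    rmdir(base);
    return result;
}
```

A restriction keyed on the inode would follow the directory to `myapp.back`; one keyed on the path string would apply to whatever currently sits at `myapp`.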


Even though the function's name is unveil, I'm thinking that this feature is for denying access rather than allowing. In that sense, I think it's a useful protection measure that the new directory doesn't suddenly become visible in your example -- smells like an attack vector. Also, I would guess that doing these checks via pathname every time files are accessed would have to be pretty bad for performance.

If such a feature is wanted, I would think that the daemon can just call unveil again when it's doing an automatic recovery.


Perhaps it's for performance reasons. I don't see how it would be an attack vector to permit /some/path/dir to be read after the "dir" is re-created or replaced. You'd guard against that with standard *nix ACLs. If someone can replace the directory they can also modify it.

    > [...]the daemon can just call unveil again[...]
Not if it previously called unveil(NULL, NULL), which is a significant part of the security selling point of this feature. If so it'll need a full restart.

You also won't be able to tell if the directory can't be accessed or if it truly doesn't exist, since syscalls will return ENOENT instead of EACCES in this case. So the state machine to recover from this becomes complex. You might do a full restart just to find that no, the directory really doesn't exist anymore, rather than getting replaced.

On the other hand this is consistent with how chroot(8) works, but in that case you're cd'd to the directory in question, which you'll lose access to if it gets replaced (standard *nix semantics, nothing to do with chroot per se).

Overall I really like these simple security restriction mechanisms OpenBSD is adding, and wish these sorts of APIs were available on more popular OSs, but if say Linux cloned this I hope they don't carry over this caveat, since it's not consistent with how path access works in general.


I think in linux you could already do this if you wanted to.

unprivileged user namespace + mount namespace -> mount some tmpfs, bind-mount (optionally read-only or noexec), pivot_root and you got an isolated view of the filesystem. You could write a wrapper around that which provides a streamlined API in the fashion of unveil.

It's what sandboxes and container runtimes do. Instead it could also be made into a security library which processes could use as part of their startup sequence.

Basically, the linux building-blocks (seccomp, namespaces, privileges) are more low-level and you need to assemble pledge/unveil-like abstractions from them. But I am not aware of any established library that does that in a convenient fashion.


The difference between pledge and seccomp, or unveil and namespaces, is that pledge and unveil limitations are not inherited.

Intuitively you would think that makes them less secure. But inheritance means that it becomes impossible to use system utilities. On OpenBSD /bin/sh is pledge'd; you could never do the same thing using seccomp because it would make the shell useless. At best you'd dynamically seccomp a single shell session according to a particular task, and you'd have to do it from the invoking process. Unsurprisingly, nobody bothers doing this.

So it's impossible to emulate pledge, and impractical to emulate unveil, on Linux. At this point, almost every program on OpenBSD uses pledge. No configuration. No option switches. It's done. Everything works like before, except now the security risk of bugs in all that code is substantially mitigated.

All the criticisms of pledge and unveil miss the basic point of these interfaces--to be easily useable by developers and easily used in all programs, not as low-level primitives to be used to write libraries to be used to write tools to be used by sysadmins for sandboxing services.


in linux, systemd can do that when starting a service. the config is in a unit file, not a library

probably better, as such code is likely to be very unportable. even if you standardise the calls in a library, on linux the right paths will probably vary between distributions and architectures


Pledge and unveil allow you to do privileged setup within the process, e.g. obtaining a bunch of file descriptors, or doing other things that need caps, then holding onto them and then dropping those privs.

That's not possible with an external container/sandbox launcher, you need to interleave it with the control flow.
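A minimal sketch of that interleaving, assuming OpenBSD's `pledge(const char *promises, const char *execpromises)` from `<unistd.h>`. The helper name `open_then_restrict` and the choice of the "stdio" promise are illustrative assumptions; on other systems the pledge call is compiled out, so the function only opens the file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: acquire the descriptor while the process can
 * still open files, then drop to a reduced set of promises.
 * Returns the open descriptor, or -1 on failure.
 */
int open_then_restrict(const char *path)
{
    /* Privileged setup happens first, inside the program's own flow. */
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

#ifdef __OpenBSD__
    /* After this, already-open descriptors stay usable under "stdio",
     * but opening new paths is no longer permitted. */
    if (pledge("stdio", NULL) == -1) {
        close(fd);
        return -1;
    }
#endif
    return fd;
}
```

An external launcher would have to apply the restriction before the program runs at all, so the program would never get the chance to do the `open()` step with its original privileges.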

Seccomp&co are too low-level. Jail launchers are too coarse.


If the daemon is going to call unveil on every file access then what purpose does unveil serve?


The trick is to unveil all the files the server needs to operate on and then lock further unveiling.

This way bugs/exploits of the server can't suddenly go read/write to e.g. /etc/passwd.
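Roughly, that looks like the sketch below, assuming OpenBSD's `unveil(const char *path, const char *permissions)`. The helper name `restrict_filesystem_view` and the "rwc" permission string are illustrative choices, not something from the thread; on non-OpenBSD systems the body is a no-op so the snippet still compiles.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: expose one directory, then lock the unveil list.
 * Returns 0 on success, -1 on failure.
 */
int restrict_filesystem_view(const char *data_dir)
{
#ifdef __OpenBSD__
    /* Allow read, write and create under data_dir only. */
    if (unveil(data_dir, "rwc") == -1)
        return -1;
    /* Lock: any later unveil() call fails, so an exploited process
     * cannot widen its own view of the filesystem. */
    if (unveil(NULL, NULL) == -1)
        return -1;
#else
    (void)data_dir; /* unveil(2) is OpenBSD-specific; nothing to do here */
#endif
    return 0;
}
```

A daemon would call something like `restrict_filesystem_view("/var/run/something")` once at startup, after which paths outside that directory behave as if they don't exist.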



