
I use this all the time for mocking / testing REST-ish services, since otherwise the test has to consume a port. Ports are limited in number, contended (eg. with real services or other tests), and may even be exposed to the public if your firewall is misconfigured.

On the client side of the test Curl makes this easy with CURLOPT_UNIX_SOCKET_PATH.
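To make this concrete, here's a minimal Python sketch of the pattern: a mock HTTP service listening on a unix domain socket, with a client speaking HTTP over the same socket (the same thing curl does with CURLOPT_UNIX_SOCKET_PATH or `--unix-socket`). The socket path and JSON body are made up for the example.

```python
import http.server
import os
import socket
import socketserver
import threading

SOCK_PATH = "/tmp/demo_rest.sock"  # hypothetical path for the example

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the test output quiet
        pass

# A stale socket file from an earlier run would make bind() fail.
if os.path.exists(SOCK_PATH):
    os.unlink(SOCK_PATH)

# UnixStreamServer binds an AF_UNIX stream socket at the given path.
srv = socketserver.UnixStreamServer(SOCK_PATH, Handler)
threading.Thread(target=srv.serve_forever, daemon=True).start()

# Client side: speak HTTP directly over the unix socket; no TCP port consumed.
reply = b""
with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as c:
    c.connect(SOCK_PATH)
    c.sendall(b"GET /health HTTP/1.1\r\nHost: localhost\r\n\r\n")
    while chunk := c.recv(4096):
        reply += chunk

print(reply.split(b"\r\n")[0].decode())  # status line, e.g. HTTP/1.0 200 OK
srv.shutdown()
os.unlink(SOCK_PATH)
```

No port is opened, so parallel test runs only need distinct socket paths.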

Agree with you about support in browsers.

Here's a fun and little-known fact about Unix domain sockets on Linux: they have two independent SELinux labels. One is the regular label stored in the filesystem, and the other has to be set by the process creating the socket (using setsockcreatecon). On a system with SELinux enforcing and a decent security policy you usually have to set both. I just posted a patch for qemu to allow the second one to be set: https://lists.nongnu.org/archive/html/qemu-block/2021-07/msg...




Check your local "listening" API. Unix permits you to just ask for "a port that is open", in a range defined by your kernel that is usually left untouched by everything else and will have no forwarding associated with it. Then you don't have to worry about the port selection process.

However, this is a prime example of a feature that many people writing abstraction layers around networking leave out of their layer, due to either not knowing it exists or not knowing the use case and figuring that nobody could ever have a use for it. So you may very well find it is not available to you (or not available conveniently) in your preferred environment. (See also "what are all these weird TCP flags? Eh, I'm sure nobody needs them. Abstracted!")

If you have Unix sockets working, by all means stick with them, but if you need sockets this feature can help. You can even run multiple such tests simultaneously, because each instance will open its own port safely. (I have test code in several projects that does this.)


If you call bind with port 0 on Linux, Windows, or macOS, the OS will bind to a random unused port within the unrestricted port range. Whether there is any forwarding or firewall associated with that port is outside the scope of the code that opens the port: if the application is assigned a random port such as 55555, there could be forwarding or a firewall on that port at the OS level that the application is unaware of. If an application opens random listening ports, it just needs to record that information somewhere so clients can actually connect to it. In the case of using this for testing localhost servers, the application can just print out the port number or save it to a file so that you can look it up easily without going through netstat or something.
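The bind-to-port-0 idiom looks like this in Python (a sketch; any language's socket API works the same way):

```python
import socket

# Bind to port 0 and let the kernel pick any free port in the ephemeral range.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen()
port = srv.getsockname()[1]  # the port the kernel actually assigned
print(port)  # record this somewhere so clients know where to connect

# A client can now connect using the recorded port number.
cli = socket.create_connection(("127.0.0.1", port))
conn, _ = srv.accept()
cli.close(); conn.close(); srv.close()
```

The getsockname() call after bind() is the "record that information" step: until you read it back, nothing knows which port the kernel chose.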

Opening a random listening port to serve content to clients that initiate connections to it is less common than the reverse: initiating a connection to a listening port elsewhere and sending content through that socket. Both use cases, though, open random local ports.


But that’s a worse solution unless you feel the need to test the kernel’s TCP stack.

You’re back to opening local ports that anyone on the box can access. You can work around that by isolating the tests into a container/namespace, but now you have more stuff to orchestrate.

Finally, the problem with binding to 0 and letting the kernel pick a port is now you have to wait for that bind event to happen to know which port to connect to from your test side. With domain sockets you can set that up in advance and know how to communicate with the process under test without needing a different API to get its bound port number.


It's definitely the last point which is the main problem. We can start the server side and tell it to pick a port, but then we have to somehow communicate that port to the test / client, and often that channel of communication doesn't really exist (or it's a hack like having the server print the port number on stdout - which is what qemu does). Unix domain sockets by contrast are an infinite private space that can be prepared in advance.


I don't have a problem running my test code on boxes with hostile people logged in to them, nor do my socket connections offer them anything they couldn't already do if they have that level of access. You sound like you may have a very particular problem, and if this is your situation I'm not convinced "Unix sockets" are the answer anyhow... you seem to have bigger problems with unauthorized access.


Isolation on a machine is a very basic security primitive. Binding to localhost circumvents all of that and you end up with vulnerabilities like this: http://benmmurphy.github.com/blog/2015/06/09/redis-hot-patch

Remember, anything you put on localhost can be reached by your browser (unless you use iptables with the pid owner check) and arbitrary webpages you are on can hit those endpoints in the background.

The solution you are offering is worse both from a security and a usability standpoint.


FYI- that link returns 404. "There isn't a GitHub Pages site here."


If you run tests through the loopback interface, you should be able to use any of the 127.0.0.0/8 addresses instead of just 127.0.0.1 (a.k.a. "localhost"). Each address has its own independent port space, so you would avoid collisions/contention.
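A quick Python sketch of the independent-port-space claim (this works out of the box on Linux; on other systems you may need to configure the extra loopback addresses first, as discussed below):

```python
import socket

# Let the kernel pick a free port on one loopback address...
a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.bind(("127.0.0.2", 0))
a.listen()
port = a.getsockname()[1]

# ...then bind the *same* port number on a different 127/8 address.
# No conflict, because each loopback address has its own port space.
b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
b.bind(("127.0.0.3", port))
b.listen()
print(a.getsockname(), b.getsockname())
```

Two test suites can therefore hard-code the same port as long as each is pointed at its own 127.x.y.z address.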


You can use the whole 127/8, but only if you've added the address to your loopback interface.

Unix sockets skip all of TCP though, so I'd recommend them over loopback if possible. I hear Linux short-circuits a lot of TCP for loopback, but there's probably still a bunch of connection state that doesn't need to be there. FreeBSD runs normal TCP on loopback, and you can get packet loss if you overrun the buffer, and congestion collapse and the whole nine yards. Great for validating the TCP stack, not the best for performance; better to skip it.


Hmm. I've not added 127.1.2.3 to the lo interface, but I can ping it, start a listening server, and connect to it. I didn't have to add anything in 127, because it's a /8.


I just tested this and it seems to work by default on Linux but not on FreeBSD. On both, the address is configured as 127.0.0.1/8, i.e. the whole block. Go figure.

This discussion [1] sheds some light on the matter, but does not fully explain the reasons for the differing behaviour across OSes.

[1]: https://serverfault.com/questions/293874/why-cant-i-ping-an-...


Interesting. Linux also does that with any other address/subnet added to "lo":

  # ping -c 1 10.10.10.10
  1 packets transmitted, 0 received

  # ip addr add 10.10.10.1/24 dev lo

  # ping -c 1 10.10.10.10
  64 bytes from 10.10.10.10: time=0.023 ms
  1 packets transmitted, 1 received


The 127 I get. This one seems odd though.


It seems to me (based on the reported behavior) that Linux is special-casing the loopback interface: when you add an address with a netmask to the loopback interface, it considers all of those addresses to be local addresses.

As opposed to a normal interface where only the specific address is a local address, but the rest of the network specified are accessible through that interface.

Maybe one /8 wasn't enough addresses for you, so you added more? It doesn't seem like an unreasonable way to behave, even if it's different from the BSD behavior; it certainly makes it easier to use lots of loopback addresses.


The annoying thing with Unix sockets is making sure to delete them and recreate them. IIRC, there’s no file system equivalent of SO_REUSEPORT.
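The usual workaround is the delete-first idiom: unlink any stale socket file before binding, and again on orderly shutdown. A Python sketch (the path is made up):

```python
import os
import socket

path = "/tmp/myapp.sock"  # hypothetical socket path

# A stale socket file from a previous run makes bind() fail with
# "Address already in use"; FileNotFoundError just means nothing was stale.
try:
    os.unlink(path)
except FileNotFoundError:
    pass

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.bind(path)
s.listen()

s.close()
os.unlink(path)  # clean up on orderly shutdown too
```

Note that close() alone does not remove the file, which is exactly the annoyance being described.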


UNIX domain sockets also support an abstract namespace, which is not part of the filesystem [1].

An excerpt of [2]:

> The abstract namespace socket allows the creation of a socket connection which does not require a path to be created. Abstract namespace sockets disappear as soon as all open instances of the socket are removed. This is in contrast to file-system paths, which need to have the remove API invoked in code so that previous instances of the socket connection are removed.

It worked quite well when I tried it (also in addition to using SO_PEERCRED for checking that the connecting user is the same as the user running the listener in question).

[1]: https://unix.stackexchange.com/a/206395/33652

[2]: https://www.hitchhikersguidetolearning.com/2020/04/25/abstra...
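A Python sketch of the combination described above: an abstract-namespace socket (Linux only; the address is a name prefixed with a NUL byte, and the name here is made up), plus a SO_PEERCRED check that the peer is the same user:

```python
import os
import socket
import struct

# Abstract namespace: a leading NUL byte instead of a filesystem path.
name = b"\0demo.abstract.socket"  # hypothetical name, Linux only

srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(name)  # nothing appears on the filesystem
srv.listen()

cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(name)
conn, _ = srv.accept()

# SO_PEERCRED returns struct ucred { pid_t pid; uid_t uid; gid_t gid; }.
creds = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                        struct.calcsize("3i"))
pid, uid, gid = struct.unpack("3i", creds)
same_user = (uid == os.getuid())
print(same_user)  # True here: the peer is this same process/user

cli.close(); conn.close(); srv.close()
# The abstract name vanishes as soon as the last open instance is closed;
# no unlink step is needed.
```

Since the kernel reclaims the name automatically, the stale-socket cleanup problem from the filesystem case disappears entirely.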


That’s fantastic. Too bad it’s Linux only though. Is there an alternative for other systems outside of adding a delete step first?


I second this, I have been using this in production for years.

It's reliable, and very convenient.





