Before the DNS: how yours truly upstaged the NIC's official HOSTS.TXT (2004) (iconia.com)
120 points by fanf2 on Feb 7, 2020 | 40 comments



And in 1995, when the Defense Information Systems Agency was building the classified SIPRnet, the principal network manager wanted to use HOSTS.TXT tables instead of the DNS, because they thought it would be easier. They also wanted to use random network numbers pulled out of their ass, for the same reason.

Fortunately, I got wind of this, and as the DISA.MIL Technical POC, I had a meeting with them. Ultimately, I was able to convince them of the folly of their ways, and to use real DNS servers as registered from the NIC, and to use real network numbers as registered from the NIC.

The kicker was that I knew they ultimately wanted to be able to connect the classified SIPRnet to the “unwashed masses”, through a mythical “multi level secure gateway” that the NSA was supposedly building which would theoretically keep the truly classified stuff from touching the unclassified stuff.

But how would you route packets from one side to the other, if you had colliding network numbers because one side just pulled random numbers out of their ass?

How would you connect through to the unclassified side, if you weren’t using the “real” DNS?

Yes, this was 1995, and they still thought HOSTS.TXT files were a good idea.

To those of you who have used SIPRnet, you’re welcome.


There are, unfortunately, people who still think this way.

I worked for someone a few years ago who wanted me to build tooling to manage the hosts file across several hundred devices.

When I said no, because that's exactly what DNS servers are for, they got shitty and went on about how DNS was unreliable and the cause of so many outages.


I wish I had a term for this sort of confusion. On the one hand, yes, DNS is the cause of many outages, and yes, in particular uncommon circumstances ("small networks" and "having an unusually good system for syncing text files that you're already keeping at five nines", etc.), syncing a hosts file might be reasonable. But most DNS outages aren't about DNS itself as a protocol, and won't be addressed by syncing the hosts file - they're about stale information in DNS, or incorrect information being stored in DNS, or something else like that. Whatever system you build to replace DNS is going to be prone to the same problems.

I have seen the same line of thinking in many other contexts - "Why don't we just replace problematic component X with other component Y" where X is, genuinely, not great but Y would need to be shaped just like X and our problems with X are actually its shape, and if you were able to reshape Y, you might as well just reshape X and not bother deploying Y. But I'm not sure this fallacy has a name.


> But most DNS outages aren't about DNS itself as a protocol, and won't be addressed by syncing the hosts file

Correct.

> I'm not sure this fallacy has a name.

In this case it was a mistrust of any kind of automation. The line of thinking was: if the automation can do all of this, what will they do?

DHCP was removed from the network and over a thousand devices were set to use static IPs because someone had plugged in a device with a DHCP server on it. It was DHCP's problem, not that the network switches weren't configured to block DHCP responses except from authorised hosts.

Every machine had its OS and software manually installed, because imaging 'never worked right'.

Active Directory wasn't used, because AD was a 'single point of failure', and so local accounts on everything.

Then when AD was eventually rolled out (and with it DNS/DHCP/etc)... Every machine had the printers manually configured, because group policies 'never work'.


But we're sitting here misunderstanding what the hosts people actually want from DNS and why they're frustrated with the current state of it.

Failure to contact the name server should not cause a failure in name resolution. It should just mean that the client isn't getting record changes.

Caches suck for this use-case. Great for the public internet but terrible for intranets. I want DNS clients to replicate all my internal zones from the masters and then serve their own queries. Records don't change that often so "bad things" would happen far less often if the failure mode was the client was behind on replication.

It's the same story with DHCP and AD. If DHCP worked so that leases were basically indefinite and servers would keep using the last address they were given we would end up with fewer total problems. All server addresses are reserved anyway and change on human time. Just propagate the changes when they happen instead of having the client ask every time.

This is what all the rsync people are ranting about. They want a DNS system where the server pushes record updates to the clients. We're over here building chat systems based on message brokering and WebSockets but doing it for name resolution is suddenly weird?


> "Records don't change that often so "bad things" would happen far less often if the failure mode was the client was behind on replication."

That's... an entirely environment-specific claim; whether it's true or not depends on the environment.

If you're doing modern CI/CD practices with k8s, etc - those records might be valid for minutes at a time.

In many situations it's better to specifically get a DNS Resolution failure than to get out of date records.

That at least should be noticeable and obvious to the application and anything/anyone monitoring/using it.
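
To make that concrete, here's a minimal sketch (Python, with a made-up internal hostname): a failed lookup raises an exception the application and its monitoring can see and act on, whereas a stale record would just hand back a plausible-looking wrong address and fail silently somewhere downstream.

    import socket

    def resolve_or_alert(name):
        try:
            return socket.gethostbyname(name)  # live lookup against the resolver
        except socket.gaierror as exc:
            # The loud failure: it can be logged, alerted on, and retried.
            # A stale record would never reach this branch; it just looks like success.
            print("DNS resolution failed for %s: %s" % (name, exc))
            raise

    resolve_or_alert("api.internal.example")  # hypothetical internal name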

> They want a DNS system where the server pushes record updates to the clients.

This is what Zone transfers between DNS servers are.
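
If you want the flavour of it, here's a rough sketch using dnspython to pull a full zone with AXFR and answer lookups from the local copy; the primary address and zone name are placeholders, and the primary has to be configured to allow transfers to this host.

    import dns.query
    import dns.zone

    PRIMARY = "192.0.2.53"    # hypothetical internal primary server
    ZONE = "corp.example."    # hypothetical internal zone

    # Pull the whole zone once (AXFR) and keep a local copy...
    zone = dns.zone.from_xfr(dns.query.xfr(PRIMARY, ZONE))

    # ...then answer lookups from the copy instead of asking the network each time.
    for name, ttl, rdata in zone.iterate_rdatas("A"):
        print(name, ttl, rdata)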

Running a DNS Server on each client in your network seems like an overly complicated situation but hey, if you want to do that - you can. I suspect it'll cause more problems than it solves by just having a set of well monitored/configured DNS servers.


> Better to get resolution failure.

But this is just as env specific! The point is that the “copy hosts” people are basically saying in their env that serving potentially stale records is the preferred failure mode.

> Zone xfer!

Yes! I just wish the software support was better/more mature for “caching servers” (i.e. clients) to act as slaves for zones rather than caching requests in-flight.

> Running a DNS server on each client...

This is what Ubuntu has done for ages with dnsmasq and now systemd-resolved. Every Linux server these days is running a DNS server.

This isn’t a “solving problems” thing. This is a “what should happen in the event of a failure of those well configured and monitored DNS servers” thing. You can’t just be like “just never fail” as a solution.


>But I'm not sure this fallacy has a name.

I've also noticed it, and I've attributed it to the fact that the devil is in the details - you see all the intricacies and problems in the details of the solution you know, but only the shiny facade of the other proposed solution. It's basically the same idea conveyed by "The grass is always greener on the other side of the fence" - that's not a very handy name though.


Yeah, it's partially a grass-is-greener thing, but partially a fundamental failure to realize that you're allergic to grass, and even if the grass really is greener, that won't actually solve your problem.



JWZ Problem? (now you have two problems...)

https://www.jwz.org/blog/2014/05/so-this-happened/


For federated names DNS has won. There’s no point arguing how good or bad DNS is because it’s a necessary evil at this point. I wouldn’t use it in a new protocol or on a fully internal network though and that’s because nearly everything is better than DNS.

For internal networks this is obvious. I had hundreds of machines synced up on my hosts file in the 1990s so global DNS outages didn’t affect my internal connectivity. Win.

If you make a mistake in DNS you either wait for caches to discover your correction or you ring up a bunch of other sysadmins and get them to restart named. If you rsync hosts files you just run rsync again. Easy.

If you are trying to diagnose a name problem, you check the hosts file. If you use DNS even for your internal name discovery, you have to check every resolv.conf, every listed nameserver, and compare with going to the roots. What’s the point? This runbook is long while running rsync is easy!

New protocols want IP mobility and cloud discovery - what does DNS do here? Causes problems, that’s what. What do you do if you want to address a service instead of a machine? Know any browsers using SRV records?

And what about the home user? My printer is dell12345.local but how does DNS know that? The dhcp server could’ve told it but what stops clients from lying? Who signs the SSL certificate for my printer? How does layering on one more piece of bullshit like DNS help anything?

So not only is DNS not an obvious choice for anything new now, it wasn’t an obvious choice for this new thing then (naming federation). Communication was slower then and tech/science has always been a certain amount of cargo cult/echo chamber. These guys don’t know what they’re doing but they’re exhausting and they’re going to do it anyway. So we end up with the worst thing that could possibly succeed: DNS.

So yeah. DNS problems are absolutely because of DNS and almost anything will work better for any specific use-case. That means part of the reason this “fallacy” doesn’t have a name is because it isn’t a fallacy. Some things just suck, and seeking out a workaround, sometimes any workaround, needs to be treated as a cry for help instead of browbeating people with how great DNS is and hoping they end up with Stockholm syndrome.


"If you are trying to diagnose a name problem, you check the hosts file. If you use DNS even for your internal name discovery you have to check every resolv.conf"

If you are using a distributed hosts file you have to check every hosts file on every host and that it's in sync.


Nonsense. You already have something that copies the hosts file to every host. It does that check as a product of copying the hosts file. rsync is old, and even before that we had rcp!

What you don't have, and have never had, is the ability to check all of your recursive resolvers from every machine and easily compare the results. rsh might've failed for other reasons that you have to check first. Nameservers might be ok, but ethernet hub might be broken. Might be temporary. Might be an active cache poisoning attack. Might be out of ports. You just don't know. DNS is always a moving target. A huge amount of energy is put into building monitoring and recording the results of monitoring and looking back on previous observations, and it's still not perfect: nagios can say everything is okay, and still service was down. Sometimes you never find out why.

A better way to think about it is: push don't poll. You know when the master hosts file changes and you know how long it takes to push it to all the machines. Polling the DNS servers just in case it changes is silly.
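
To make push-don't-poll concrete, a sketch (hypothetical host names and paths): push the master file with rsync, and the same pass verifies every host ended up with identical bytes.

    import hashlib
    import subprocess

    MASTER = "/etc/hosts.master"          # hypothetical master copy
    FLEET = ["web1", "web2", "db1"]       # hypothetical hosts

    want = hashlib.sha256(open(MASTER, "rb").read()).hexdigest()

    for host in FLEET:
        # Push; rsync only transfers the file if it changed.
        subprocess.run(["rsync", "-a", MASTER, host + ":/etc/hosts"], check=True)
        # Verify it landed: the check is a byproduct of the push.
        got = subprocess.run(["ssh", host, "sha256sum /etc/hosts"],
                             check=True, capture_output=True, text=True
                             ).stdout.split()[0]
        if got != want:
            print(host + " is out of sync")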


The entire Internet uses DNS for billions of hosts. To say it won't work for a tiny internal network seems a bit strange.

Also, if you can push files with rsync, you can write a script to SSH to every host and check its DNS settings. Pretty simple stuff.
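
Something like this, for instance; a sketch with made-up host names and resolver addresses: ssh to each host, read resolv.conf, and flag anything not pointing at the resolvers it should.

    import subprocess

    EXPECTED = {"10.0.0.53", "10.0.1.53"}   # hypothetical internal resolvers
    FLEET = ["web1", "web2", "db1"]         # hypothetical hosts

    for host in FLEET:
        conf = subprocess.run(["ssh", host, "cat /etc/resolv.conf"],
                              check=True, capture_output=True, text=True).stdout
        found = {line.split()[1] for line in conf.splitlines()
                 if line.startswith("nameserver")}
        if found != EXPECTED:
            print("%s: expected %s, found %s" % (host, sorted(EXPECTED), sorted(found)))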

DNS isn't "new" at this point. I remember configuring old SunOS boxes in the 90's, switching them from NIS to DNS. Exciting times.


> The entire Internet uses DNS for billions of hosts. To say it won't work for a tiny internal network seems a bit strange.

Ok. Then look at it this way. A DNS server effectively has to look at its own hosts file and publish the results in response to online queries.

Assuming the failure rate of getting answers from a hosts file is constant, why exactly do you think those online requests have a negative failure rate? That’s what would be required for DNS to beat hosts files!

If we’re talking about different networks then the hosts files do not have the same failure rate, and that’s the first time DNS (or something like it) could be better.
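
To spell out the arithmetic (assuming, as a simplification, that the file lookup and the network hop fail independently): if p_file is the chance the hosts-file lookup fails and p_net the chance the query to the server fails, then p_dns = 1 - (1 - p_file)(1 - p_net) >= p_file, with equality only when p_net = 0. The extra hop can add failure modes but never remove them.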

> DNS isn't "new" at this point. I remember configuring old SunOS boxes in the 90's, switching them from NIS to DNS. Exciting times.

Confessions from the cargo cult. Oh well. Hopefully the above makes it clear why this was a mistake.

If not, I’m not sure what else to say. In 1996 I was converting SunOS boxes back to hosts files and yp because the failure rate of DNS was higher and the previous guy left. What else could I do? If I’d told my boss we needed to keep using DNS because “it works for the Internet” I would’ve gotten laughed at. We had real work to do!


Your time would've been better spent fixing your DNS servers, adding a second one for redundancy. If you told me you couldn't make DNS work in 1996, I would've laughed, figured you were just inexperienced. If you told me you couldn't make it work today, I'd ask HR to get the paperwork ready.


Wow.

You would choose a failure rate of nonzero over a failure rate of zero and threaten a coworker with an HR trip for disagreeing with you?

I'm so glad I don't work with you.


Ok. Let's go back to 1985 and use host files. Who wants to join me?


Agree DNS has won today but I don’t think the current infrastructure will be what the world is using 10 years from now. i.e. we're trying to improve the security of the Internet by replacing Certificate Authorities with a distributed root of trust. DNS is currently centralized and controlled by a few organizations at the top of the hierarchy (namely ICANN) and easily censored by governments. Trust in HTTPS is delegated by CAs, but security follows a one-of-many model, where only one CA out of thousands needs to be compromised in order for your traffic to be compromised. We're building on top of a new protocol (https://handshake.org, just launched 4 days ago!!) to create an alternate root zone that's distributed. Developers can register their own TLDs and truly own them by controlling their private keys. In addition, they can pin TLSA certs to their TLD so that CAs aren't needed anymore. I wrote a more in-depth blog post here: https://www.namebase.io/blog/meet-handshake-decentralizing-d...


For internal networks, you set up a couple of local DNS servers. These servers are set up as slaves for all your local zones. DHCP provides those servers to the clients. Done. This is very simple and has worked for decades.


These are Internet hosts.

You don’t run dhcp and update your dns from a client supplied unauthenticated field unless you’ve got no idea what you’re doing.


That's good. DNS works with the Internet. ;)

I'm talking about providing the DNS server list to update the client's resolv.conf, not updating the DNS server's host entries dynamically from the clients.


> I'm talking about providing the DNS server list to update the client's resolv.conf, not updating the DNS server's host entries dynamically from the clients.

Why?

This isn't what anyone else is talking about.


Rsync doesn't replace DNS. It is a file copying mechanism, not an addressing system.

When an IP address changes, no amount of rsync will fix it. The IP address is simply out of band.


If someone can change your IP addresses on you, I have news for you: you’re not on the Internet but on someone else’s network who is on the Internet. I got my addresses from ARIN so my host numbers only ever changed when I thought it was convenient.

But whatever. You’re still wrong: Rsync is instant. DNS requires waiting for caches or an admin task to bounce the nameservers.


…did we work at the same place?


I worked at the same place too.


Unfortunately not, Corey.


Well, if you’re going to plan to do one impossible thing, you might as well plan to do n impossible things. :)

(As it shook out, despite their classified networks having physically separated hardware, BULLRUN still has its own Wikipedia page. :D)


I am part of a group that runs a small "private internet" -- that is, a tcp/ip network that adheres to most internet RFCs and provides most of the same services as are available on the internet, but operates independently of the internet itself.

Right now, we don't run a DNS -- we go old-school with a master hosts file as was done before DNS existed on the internet. For our situation, it's the easiest solution and is entirely manageable, since there aren't a ton of domain names and the hosts file rarely needs updating.

But it's clear that eventually this will no longer be sustainable. Kudos and thanks to the author for blazing this trail for us. It will make the eventual shift to a private DNS much easier.


>This was easy because in those early days the Network Control Program known as NCP (this was before TCP/IP) would broadcast messages called RSTs to every possible host address on the network when they booted.

Note that NCP used 8 bit host addresses, so it wasn't as if you couldn't write a program to connect to all 256 possible addresses and see if a computer answered. But that would be rude.

https://en.wikipedia.org/wiki/Network_Control_Program

The author's top level web page says:

https://iconia.com/

>from the keyboard of geoff goodfellow

I'm glad to learn that Geoffrey has finally upgraded his tty to a real keyboard! He used to always use this From address:

https://iconia.com/TELECOMDigestV2.33.txt

From: the tty of Geoffrey S. Goodfellow


As a one-time DNS dev, this story is at once heartwarming and terrifying. Still the best thing I've read in a while! LOL!


>I would then telnet or ftp to these nameless hosts and see what host name the operating system login prompt gave me or what host name the ftp server announced in its greeting. I would then plug this information into my systems host table.

So, you removed what little security there was at the time to make sure that the machine pointed to by a host name actually had some connection to what the hostname implied. This is just one step below making the HOSTS.TXT file world-editable. The NIC had a good reason for not adopting your method.


Remember at the time this was a closed network of fewer than a thousand computers, and there was no secrecy about what was connected to the network. For example, knowledgeable users were able to identify where a computer was from its name, because sites had distinctive naming schemes https://elists.isoc.org/pipermail/internet-history/2020-Febr...


Keep in mind these hostnames were generally quite descriptive of their network location, and people could vouch for that. There was no typosquatting or phishing going on.

This guy's hosts file became popular mostly because local sysadmins were too lazy to add their own hosts.


What're you implying? That someone would change their hostname to take over someone else's site?


Not really "taking over", more like password harvesting. You just report a hostname at login that you don't own, the maintainer will update the hosts file to point to your system. People will gladly connect to your system and type in their usernames and passwords.


Yes.


Author clearly says it was for their own use, and the rest of the Net just decided to prefer it.



