Making a DNS query in Ruby from scratch (jvns.ca)
184 points by guiambros on Nov 6, 2022 | 59 comments



Making an ICMP echo query in Python from scratch:

https://github.com/jaysoffian/eap_proxy/blob/78a058ffe67c253...

The dnspython package is pure python and it's a lot of code, but it supports pretty much everything related to DNS:

https://github.com/rthalley/dnspython


One thing that always drives me crazy about these is that it's not really a great example of how to actually write a query. "Copy values from wireshark" makes a bunch of assumptions, the main one being "the query you saw is the kind of query you want."

Just blindly copying flags off of a packet isn't really a good practice for software development. What if the packet was not recursive? I'm sure 99.99% of the people reading that have no idea that non-recursive packets were possible, or even what it means.

Instead, read RFC 1035 section 4.1.1 and figure out what the flags mean and why you'd use one or the other.

https://www.rfc-editor.org/rfc/rfc1035

And by reading the actual format document you can start looking for vulnerabilities. For example, what happens when the various COUNT numbers aren't correct in a response? What if you get an error response code but there is response data? What if the implementor used signed instead of unsigned numbers for COUNT?

This is a bit better than SO, but not by much. You're not writing a DNS query, you're hacking someone else's DNS query and sending it over the wire.
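
For what it's worth, a minimal sketch of building the header from the field definitions in RFC 1035 section 4.1.1, instead of copying bytes out of a capture, might look like this in Ruby. The method names `dns_header` and `parse_header` are made up for illustration and are not from the article; the bit positions are the ones given in the RFC.

```ruby
# Build the 12-byte DNS header from named fields (RFC 1035 section 4.1.1).
# Flags word layout, high bit first: QR(1) OPCODE(4) AA TC RD RA Z(3) RCODE(4)
def dns_header(id:, rd: true, opcode: 0, qdcount: 1)
  flags = 0
  flags |= (opcode & 0xF) << 11 # OPCODE 0 = standard query
  flags |= 1 << 8 if rd         # RD: ask the server to recurse on our behalf
  # QR, AA, TC, RA, Z and RCODE all stay 0 in a query.
  [id, flags, qdcount, 0, 0, 0].pack("n6")
end

# Split a received header back into fields. "n" is an unsigned 16-bit
# big-endian integer, so the COUNT fields can never come out negative.
def parse_header(bytes)
  id, flags, qd, an, ns, ar = bytes.unpack("n6")
  raise ArgumentError, "header shorter than 12 bytes" if ar.nil?

  {
    id: id,
    qr: (flags >> 15) & 1,
    opcode: (flags >> 11) & 0xF,
    aa: (flags >> 10) & 1,
    tc: (flags >> 9) & 1,
    rd: (flags >> 8) & 1,
    ra: (flags >> 7) & 1,
    rcode: flags & 0xF,
    qdcount: qd, ancount: an, nscount: ns, arcount: ar
  }
end
```

With `rd: false` the flags word comes out as 0x0000 (a non-recursive query); the default gives 0x0100. The parsed counts still have to be checked against the data actually present in the message before being trusted.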


I can't really read Ruby, but it seems to me that the code fragment for implementing domain name compression does not handle compression loops.


Yes, avoiding loops in DNS name decompression is the first thing I check in new DNS code. I think Julia updated this post after publication with a caveat about compression loops and other potential problems. (I think the only other missing thing is checking the name is not too long.) So from my point of view this tutorial passes with flying colours because it clearly says that name compression is the worst part of the DNS wire format, and that there are lurking dangers if you are writing something more serious than experimental code for learning.
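
A rough sketch of what those two checks could look like in Ruby, for anyone curious. This is not the article's code: `read_name` and both constants are names invented here, and capping the number of pointer jumps is only one way to stop a loop (insisting that every pointer points strictly backwards is another).

```ruby
MAX_NAME_LENGTH = 255   # RFC 1035 limit on an encoded name, in octets
MAX_POINTER_JUMPS = 127 # arbitrary cap chosen for this sketch

# Decode a possibly compressed name starting at `offset` in the binary
# string `msg`. Returns [dotted_name, offset_just_past_the_name].
def read_name(msg, offset)
  labels = []
  jumps = 0
  total = 0       # encoded length so far (length octets plus label bytes)
  resume_at = nil # where parsing continues after the first pointer

  loop do
    len = msg.getbyte(offset) or raise ArgumentError, "name runs past end of message"

    if (len & 0xC0) == 0xC0
      # Compression pointer: the low 14 bits are an offset into the message.
      low = msg.getbyte(offset + 1) or raise ArgumentError, "truncated pointer"
      resume_at ||= offset + 2
      jumps += 1
      raise ArgumentError, "too many compression pointers" if jumps > MAX_POINTER_JUMPS
      offset = ((len & 0x3F) << 8) | low
    elsif (len & 0xC0) != 0
      raise ArgumentError, "unsupported label type"
    elsif len.zero?
      # Root label ends the name.
      return [labels.join("."), resume_at || offset + 1]
    else
      label = msg.byteslice(offset + 1, len)
      raise ArgumentError, "truncated label" if label.nil? || label.bytesize < len
      total += len + 1
      raise ArgumentError, "name too long" if total + 1 > MAX_NAME_LENGTH
      labels << label
      offset += len + 1
    end
  end
end
```

A pointer that points at itself (`"\xC0\x00"` at offset 0) hits the jump cap and raises instead of spinning forever.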


Great post! I feel like Ruby can make a comeback if a lot more people use it for more than just Rails


I really, really like Ruby-the-language, and prefer it over python. I don't love that it's joined at the hip with Rails.

compared to Python, Ruby has:

  - first-class symbols (yes python has sys.intern but it would take a PEP giving them a pithy syntax to make them usable, plus python has 25 years of stdlib and libraries using "strings" or enums for constants instead of :symbols)
  - procs/blocks and better-than-python lambdas
  - "open classes" / monkey-patching of builtins (for better or for worse)
  - trivial metaprogramming with method_missing (for better or for worse)

some of these make fun one-off projects easier or faster, some of them would be less welcome in large, mature codebases.


So I'm a Ruby fan and I largely agree with you. I started dicking around with Stable Diffusion recently and was almost immediately reminded of so many things I dislike about Python.

But just to be a bit contrary:

- I don't see a huge value in symbols. In Ruby they are literally just static strings which means they use memory you'll never get back – potentially important if you're e.g. parsing something large into a hash and symbolizing the keys. If you have to put a non-alphanumeric character in a symbol you still need to use quotes.

- Procs, blocks, and lambdas – yes.

- Metaprogramming and monkey patching? dfjasdjldfjkdfjlkfdjldfoh4houfhufl. A double edged sword at best and 100% not something I'd want to see in a larger codebase. Javascript folks largely learned this lesson with the shift from Prototype to jQuery. You can do some really neat-o things but they're almost always unintuitive to the uninitiated.


I think monkey patching is a nice tool to have. I once had to deal with a vendor provided gem that assumed the product was configured in a specific way (it was not) deep in an internal method. Patching that one method was all I needed to do to get it working.
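
To make that concrete, here is a small sketch of that kind of targeted patch. `VendorGem::Client` and `default_host` are hypothetical stand-ins, not the actual gem; using `Module#prepend` rather than reopening the class is a choice made for this example, since it keeps the override in one named module.

```ruby
# Hypothetical stand-in for the vendor's class.
module VendorGem
  class Client
    def endpoint
      "https://#{default_host}/api"
    end

    private

    # The baked-in assumption buried in an internal method.
    def default_host
      "vendor.example"
    end
  end
end

# The patch: override only the one internal method.
module LocalHostOverride
  private

  def default_host
    "internal.example"
  end
end

# Prepending puts the override ahead of the class in method lookup,
# so every other method of Client is left untouched.
VendorGem::Client.prepend(LocalHostOverride)
```

After the `prepend`, `VendorGem::Client.new.endpoint` builds its URL from the overridden host.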


exactly. And when writing a test suite, I monkey-patch at will, which makes it sooo much easier to have self-contained tests without side effects or complicated dependencies.


You can dynamically create symbols and expect them to be garbage collected as of Ruby 2.2. They only become "immortal" if you use them to define a method, instance variable, or constant. So the common use case of hash keys is fine. One important advantage of symbols versus strings is that Symbol#== is constant time.
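
A tiny illustration of the comparison point, as a sketch (the variable names are just for this example): two equal strings are normally two separate objects whose contents have to be compared, while equal symbols are one and the same object.

```ruby
# Two equal strings built at runtime are distinct objects.
a = String.new("status")
b = String.new("status")

equal_contents     = (a == b)                   # compares the bytes
same_string_object = a.equal?(b)                # identity check on the strings
same_symbol_object = a.to_sym.equal?(b.to_sym)  # identity check on the symbols

puts "strings equal by content: #{equal_contents}"
puts "strings are one object:   #{same_string_object}"
puts "symbols are one object:   #{same_symbol_object}"
```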


Ah hah, you've discovered how long it's been since I've thought about Ruby performance.

> One important advantage of symbols versus strings is that Symbol#== is constant time.

At that point though why not use integer constants?


syntactic sugar- you don't have to pre-declare them or worry about conflicting values. And occasionally the symbol.to_s method is useful in logging or whatnot. That's pretty much it.


> A double edged sword at best and 100% not something I'd want to see in a larger codebase.

the real footgun is dependencies that monkey patch things you don't know about / don't expect, and then the order you require those dependencies becomes important. augughghhhhh I hate that part.


Monkey patching I agree with but IMO it's unfair to lump metaprogramming into that same bucket.


The ruby proc passing style seems so ergonomic in curly paren style languages that I just can't understand why I haven't seen it anywhere else.


Haven't used Ruby in years, the one thing I miss in Ruby is decorators.


Don’t call it a comeback—Ruby hasn’t gone anywhere.

I get it that the hype around Ruby and Rails has—thankfully—subsided but Ruby is even better today than it was then.


It's much better today.

Sure, when it was hyped a decade ago you'd get lots of flashy tools and libraries every other day, but a large part of it was really wonky.

Now that the dust settled, the tools that remain largely used are much higher quality.


That and it's still by far the language backing the most startups that made it big.

Ruby gets results.


> backing the most startups that made it big [10 years ago]

This is not the case today. And most of those startups also had to then migrate to Go/Java/etc. to scale.


There's an argument to be had that they would not have been successful enough to need to scale without the speed of development provided by Ruby.


Yep, also a lot just migrated some components to more suitable languages but for the most part remained Ruby.

You can scale anything if architected right.

You can't make languages fast and fun to write in if they aren't that to begin with. There's a lot to be said for dev morale and productivity.


Which? Apart from the twitter case, which was ages ago and largely overblown, there's no case that I can remember of ditching ruby for tech X because "for scale". There are lots of examples of companies using other tech rather than ruby, but usually for different reasons (python ML tooling, go for some performance bottleneck, Java for hiring).


Twitch, LinkedIn, Soundcloud, Grab, Parse, and Deliveroo. GitHub and Stripe are both transitioning off.

I.e https://news.ycombinator.com/item?id=20349004

"I worked at Twitch from 2014 to 2018... No more Ruby on Rails because no good way was found to scale it organizationally; almost everything is now Go"


I love that Ruby’s main goal is developer happiness. I feel like Matz has really delivered on that promise, even after 30 years.


Ruby never went away. It just used to have an extreme amount of hype and now is a mature and, dare I say, slightly "boring" language.


With Python being so much more common, Ruby would have to have something really remarkable in order to do that. Does it?


Ruby is the closest thing to Aspect Oriented Programming that I’ve seen. That's a primary reason the Gem ecosystem is so good.


But is it better enough to warrant a switch? Python has multiple inheritance (enabling “mixin” classes), metaclasses and decorators, all of which can be used to solve the problems which AOP aims to solve. Not to mention numerous modules to make AOP easy, if that is what you want. Again, it might be easier in Ruby, but is it easier enough?


I’ve read a few Python books and dove into it. All I can say is that I enjoy programming with Ruby. I keep coming back to it despite multiple other languages.

I love Elixir as a language but I still find myself coming back to Ruby frequently.

Python exists, but there’s nothing about the language that makes me want to use it. Quite the opposite. I find myself avoiding it whenever possible.

As a prominent Python dev told me, “It’s the okayest language out there.”


Could you tell me more about the "opposite" things here?


By far the biggest thing for me is package/environment management. All of the tools I've used just suck. Pip, virtualenv, conda. For me, at least, getting started with anything non-trivial in Python involves grinding my teeth and slogging through whatever unpleasantries. Recently I've run into problems where some stuff seems to not work between different minor versions of Python 3. Ruby is generally easier and more portable – that a large subset of Python folks have standardized on a model / management tool like Conda that's not portable is something I can't say anything civil about. I can't think of any other language that's done something so boneheaded.

Beyond that Python is opinionated. In a lot of ways this is an improvement over e.g. Perl. However enjoyment is largely predicated on liking the opinions, if you don't it's not fun. For instance I wanted to write a multi-line lambda recently (mostly to make it easier to read). With Ruby and Rust I can do this pretty easily. With Python? No dice. Sure, there are good reasons to not make a lambda a multi-line ordeal but sometimes I just want to.


The things that attracted me to Python were:

1. “One way of doing everything”

2. Had a “Rails like” framework with Django

3. If I wanted to dip my toes in an AI library it would be a good choice

4. Access to a lot of system level libraries.

These all fell apart for me after getting into it.

1. As the other commenter suggested, there’s not even one clear way to handle package management. It’s a mess. In other languages, I’ve never had a good development experience with Docker because native is usually so much smoother, but with Python native is a bit of a mess with everyone recommending different tooling. This was compounded because I got into it during the last year of Python 2 so library segmentation was a bit of an added problem.

2. I’m hugely opposed to the way Django handles database changes compared to standard Rails migrations. This is personal preference.

3. I never had time to get into it, but figured if I ever did I’d setup something more dedicated for it. Plus Elixir released so much tooling around this space that I’d probably just go that route anyway.

4. I haven’t found anything system level that I need that I can’t get from Ruby or just calling directly from the command line.

Those were the biggest items.

It’s just not for me. I say that knowing full well that there are plenty of people who don’t like Ruby.


Unfortunately no. There's been a slight increase in interest ever since Ruby 3 but something else is needed for a spark


See, Rails is the one thing that I have never actually used Ruby for despite using it for a decade. Ruby is a fantastic language for systems automation, gluing things together and CLI apps.


Useful article - thanks for writing.

A bit of a rant: what annoys me about "example" articles like this is that they have little to no error handling and don't even assume things may go wrong. Calling bind() in the hope it will always succeed, connecting without a timeout... For me as a newbie, it's important that examples cover such things as well.
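
As a sketch of what that could look like in Ruby (not the article's code): `udp_exchange` and `QueryTimeout` are names made up here, and waiting with `IO.select` is just one way to put a deadline on the reply.

```ruby
require "socket"

# Raised when no reply arrives before the deadline.
class QueryTimeout < StandardError; end

# Send one datagram and wait up to `timeout` seconds for a reply.
# Socket-level failures (bad hostname, unreachable network, ...) are left
# to propagate as SocketError / SystemCallError so the caller can decide
# what to do with them.
def udp_exchange(payload, host, port, timeout: 2.0, max_size: 512)
  sock = UDPSocket.new
  begin
    sock.connect(host, port)
    sock.send(payload, 0)

    # IO.select returns nil when nothing became readable in time.
    ready = IO.select([sock], nil, nil, timeout)
    raise QueryTimeout, "no reply from #{host}:#{port} within #{timeout}s" unless ready

    data, _sender = sock.recvfrom(max_size)
    data
  ensure
    sock.close # always release the descriptor, even on errors
  end
end
```

A caller would typically wrap this in `begin ... rescue QueryTimeout, SystemCallError, SocketError` and retry or report. On a connected UDP socket, a closed port on the other side can surface as `Errno::ECONNREFUSED` rather than as a timeout.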


"If you wish to make an apple pie from scratch you must first invent the universe.”

-Carl Sagan

With that said, it was cool to see a lower-level explanation than just calling a library.


Making multiple DoH requests in shell script from a list of domainnames.

  DOH_SERVER=eu1.lavate.ch
  (echo example.net;echo example.org;echo example.com) > 1.txt
  yy048 < 1.txt|yy049|yy050|sed "s,.*,https://$DOH_SERVER/dns-query?dns=&," \
  |yy025|openssl s_client -connect $DOH_SERVER:443 -ign_eof 

This uses HTTP/1.1 pipelining and only a single TCP connection. Good netiquette.

Instead of openssl one can use, e.g., stunnel, or an https proxy, plus netcat.

The sources for yy025, yy048, yy049 and yy050 are just short text files compiled with flex. If preferred, yy025 can be replaced with a little more shell script and tr(1).

No Ruby, no Python, no Go, or other large languages. Just shell utilities.

Parsing catenated responses can be done with ldns, e.g., drill.


Off-topic, but am I the only one that's annoyed by the lack of publish dates in blogs?


It's in the URL: https://jvns.ca/blog/2022/11/06/making-a-dns-query-in-ruby-f...

and the HTML source also includes a machine-readable element:

    <p class="meta">
      <time datetime="2022-11-06T08:31:53" pubdate="" data-updated="true"></time>
    </p>

For my own journal, I tuck human-readable metadata inside a <details> block (which defaults to hidden), with the title in the nested <summary> (which defaults to visible). Thus, it's available, if visitors activate the title to reveal it.


You can blame SEO. Old content is not ranked as well as new content so it’s better to remove the date and pretend the articles are recent.

The world may be a better place without SEO.


The search engine can't tell it already has it in its index? Also a sibling pointed out that it's in the source. That doesn't seem right.


It's only an issue because search engines use date as a proxy for if the page is still relevant instead of judging it by the content.


I'm the opposite and am annoyed at the obsession with dates causing people to ignore anything written more than a year or two ago. Wait until you find out how old DNS is!


Note: While educational, making a DNS query without DNSSEC verification in 2022 is like making an HTTP query without certificate verification (or without HTTPS support).


This is so wildly untrue I'm wondering if you wrote it just to prod someone to jump in here and start the DNSSEC argument. Less than 4% of North American names are signed. Virtually nobody uses DNSSEC.

Further, this code implements a stub resolver querying 8.8.8.8 --- in that scenario, there is no DNSSEC verification, as you know. For stub resolvers, the kind your browser or OS uses, DNSSEC condenses down to a single bit in the header that the server uses to say "trust me, I did DNSSEC".


(I don’t need to prod you to comment on DNSSEC; you seem to be able to find any and all mentions of DNSSEC here quite well on your own.)

> Further, this code implements a stub resolver

Fair enough, but…

> DNSSEC condenses down to a single bit in the header that the server uses to say "trust me, I did DNSSEC".

…they did not ask (in the query) for DNSSEC verification, nor did they check the bit in the response.


You ignored the part about nobody using it in the first place. There’s nothing to verify.


He said “Less than 4% of North American names are signed.”. Don’t you wonder why he specified North American names?


Because it's easy to grab that statistic and a lot more annoying to get the global one, especially because global deployment stats count "zones" and not delegations from TLDs. But there are almost twice as many signed domains in .COM (DNSSEC uptake: 1.6%) as there are in .NL, and the number of signed delegations drops rapidly after .NL (from 3.5MM to 1MM in .CH, to below 1MM in .BR; by the time we hit .UK, the graph is hard to read). My point being: adding up all the signed European names (which are signed automatically at registrars as security theater) isn't going to get you a more attractive uptake percentage.

It's possible that the reason I said "less than 4% of North American domains" is that I simply made a mistake, and should instead have said "less than 4% of all domains". Again: .COM has a 1.6% uptake. There are years in the last ~4 where DNSSEC uptake fell in .COM.

DNSSEC is moribund.


> DNSSEC is moribund.

For how many years have you been saying that? Meanwhile, from what I can tell, DNSSEC usage keeps going up.


Not so much, no. Now, could you acknowledge the comment I just wrote? It's less than 4% of all domains. So: what were you trying to imply when you pointed out that I'd said "North American domains"? And, now that I've corrected the comment, would you still have said it?


I can’t find any good statistics either, so I did not comment on any specifics. I am simply wary of overly specific qualifications with no obvious reason for their specificity; most often, these sorts of arguments are made in order to mislead readers. I don’t know what the actual numbers are.

All I can say is that from personal experience when working at a registrar and DNS service provider, the number of people asking about and requesting DNSSEC is increasing all the time, and shows no signs of decreasing. Also, all registries (i.e. TLDs) are pushing for registrars and DNS service providers to provide DNSSEC, so there is demand from both sides. Note: I do not have any financial incentive to push DNSSEC; in fact, strictly speaking, DNSSEC makes my job harder.

Also, as I have mentioned before, I have never seen anyone argue against DNSSEC with any persistence (in industry interest groups, at conferences, etc). Except you, here on HN. And you really seem to have it in for DNSSEC, even going so far as to keep making arguments against the crypto, not only while it was obvious that it could (and would) be fixed, but even making the same argument after it was actually fixed. You keep shifting your arguments, but keep arguing against DNSSEC with whatever you can find. This does not make you look credible. And your sole remaining argument, that DNSSEC has low usage, is not a very good one, if it is in fact the case that the usage is actually (on the whole) increasing.


> the number of people asking about and requesting DNSSEC is increasing all the time

The number of people not requesting it is increasing all the time too.


I'm not sure why I can't reply to the comment next to mine, but quite a few .gov sites use DNSSEC, so there's at least some point in using it.


It's not unusual to validate unconditionally in recursive resolvers, even for clients that did not set the AD bit or the DO bit.


Redirecting from HTTP to HTTPS is also quite common, but that does not make it OK to just make HTTP requests all the time.


I don't know much about DNSSEC, it's something I enable to get a green tick in whatever tool I use on the day to check that I configured DNS correctly.

But HTTP without HTTPS is useful if you need to go really fast, same goes for taking DNS out of the loop and going straight to the IP.

The way I see it, it's all about your threat model. Who/what are you defending against.


In a stub resolver like this, it’s setting a bit and checking a bit. Not worth arguing about threat models; just do it.
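
Roughly, as a sketch in Ruby (`query_flags` and `authenticated?` are names invented here, not from the article): the AD bit is 0x0020 in the header's flags word, so a stub can set it in the query and test it in the response.

```ruby
RD_BIT = 0x0100 # recursion desired
AD_BIT = 0x0020 # authentic data

# Flags word for a query: recursion desired, plus AD to signal that we
# want to know whether the resolver validated the answer.
def query_flags(want_ad: true)
  flags = RD_BIT
  flags |= AD_BIT if want_ad
  flags
end

# True when the response header has AD set. `response` is the raw reply
# as a binary string; the flags are the second 16-bit word.
def authenticated?(response)
  _id, flags = response.unpack("nn")
  !flags.nil? && (flags & AD_BIT) != 0
end
```

As noted upthread, for a stub this is the resolver saying "trust me, I did DNSSEC", so the bit is only as good as the path to that resolver.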



