Unrelated: I read the title of this link and I knew who wrote it before even checking the website. Love Julia's writing. Always something from scratch and explained really well.
I think a more typically structured recursive resolver is built around a read-through cache structure. Each step of the query asks the cache for more records, which you may or may not already have fetched; if you have, they're just served from the cache.
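Something like this, as a minimal sketch of that read-through shape (hypothetical names; assumes the usual sync and miekg/dns imports; a real resolver would key the cache by name and record type, honor TTLs, and not hold a lock across the network fetch):

type rrCache struct {
    mu      sync.Mutex
    records map[string][]dns.RR
}

// get serves records from the cache if a prior step already fetched them,
// and otherwise reads through to the supplied fetch function.
func (c *rrCache) get(name string, fetch func(string) []dns.RR) []dns.RR {
    c.mu.Lock()
    defer c.mu.Unlock()
    if rrs, ok := c.records[name]; ok {
        return rrs // already fetched by an earlier step of the query
    }
    rrs := fetch(name) // cache miss: go to the network
    if c.records == nil {
        c.records = map[string][]dns.RR{}
    }
    c.records[name] = rrs
    return rrs
}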
So "Q: how do you handle resolver loops?" is "A: you handle them."?
I mean, probably true and accurate for many things. But if so, that's not really a standard for how to handle it. It just means that, to be compliant, you need to figure it out and do something about it.
Er. Not with that explanation. The cache won't be hit until a result is found; instead, each request produces another request back through the cycle to try to satisfy it. Hence DoS.
In this implementation, yes. In a full implementation, the standards expect the client to detect the various scenarios that can result in loops and treat them as errors.
A query on a localhost name server is processed in a matter of milliseconds, so it's really not going to cause a DoS. When you put your name server on the public internet, it becomes a network address, which you can only get from an authoritative source. But even then, the response is going to be processed by a DNS query processor before it is handed over to your application layer, so it gets processed before it is ever presented to the end user.
Many years ago I wrote a "resolver" using only 1. a custom filter program that parses the domain name from stdin for its TLD and sends a query to the appropriate registry nameserver to get the authoritative nameserver for the domain, and 2. a series of tiny shell scripts, each about 385 characters or less. There are about 10 scripts in total. Each does a specific task.
What was interesting is that, for names not already held in a cache, this system was able to beat the speed of full-blown DNS resolvers like BIND, dnscache, pdns_recursor, unbound, etc.
The reason it worked is that DNS resolution tends to follow certain patterns. I discovered 43 different paths that DNS lookups take in the real world. Some of them are extremely common. Some of them are extremely rare. These paths are simply a reflection of how people configure DNS. The most common pattern is also the most efficient: it only requires three queries. The least common patterns require many more queries. Arguably some of them could be called evidence of "DNS misconfiguration".
The program "brute forces" resolution by trying the most common patterns first. If it succeeds, it exits. If it fails, it will retry using the next most common pattern.
When the root.zone exploded in size thanks to new gTLDs, the custom filter that parses the TLD from stdin became ridiculously large. That said, some of those new TLDs are rare to see and some are not even in active use. Is it even worth including them? The same question arises with all the different patterns a DNS lookup can follow. Some are so rare and so inefficient, why bother "supporting" them when writing a resolver?
The problem with DNS resolution IMO is that people believe there should be almost no rules in how someone configures DNS. If a website operator wants to force a user to make 18 DNS queries to get the IP address of their website, when it could easily be done in 3-5 queries, she can do that, and people expect a DNS resolver to be able to deal with that scenario. From a user perspective, this is stupid. Obviously these scenarios make for slower lookups. As the number of dependencies increases, i.e., the number of different nameservers involved, the likelihood of failure increases. Not to mention this makes writing a DNS resolver more complicated.
From comments I read on HN, it sometimes seems that some programmers yearn for complexity. Simple, linear things that work fast and reliably are not interesting. Lord only knows how much we all suffer as a result of that perversion. One HN commenter called it a complexity fetish.
The trick to beating the caches was in part that the filter program had the IP addresses of all the TLD nameservers hardcoded into it. That was one less query I had to make. A cold cache that did not have the TLD nameserver address would have to make that query.
I used to worry about the size of this filter program, but seeing how people routinely use double-digit-MB binaries today, including some very large Go ones for DNS stuff, I guess I could have carried on and ignored it. I could have hardcoded the full spectrum of new gTLDs, and it would probably be no larger than the large binaries people use today.
The filter program was generated directly from the root.zone file. I think root.zone only changed twice in the eight years I was using this method. Both times it was some rarely used ccTLD replacing a nameserver. Regenerating the filter program, i.e., compiling it, took about 60 seconds.
At the time, I was working on another filter that hardcoded the addresses of the "most common" registrar nameservers. Then people started adopting DoT and DoH; I lost interest in unencrypted authoritative DNS^1 and stopped using DNS altogether. Now I store selected bulk DNS data in the memory of a TLS forward proxy. I eliminated the variable delay of DNS when making HTTP requests, and I got network-wide "ad blocking" as a side benefit.
1. I always thought about starting a registrar that offers encrypted authoritative DNS via DNSCurve, using the CurveDNS forwarder. Why? Because no one else ever did it. I use CurveDNS in the "home lab" and it works flawlessly.
Simple Python version which does a bit more validation (checks the response matches the query) and also supports CNAMEs and different record types (37 lines, using the dnslib library) - though there is still a lot of logic missing (handling failed NSs, IPv6-only NSs, cycles, etc.)
from dnslib import DNSRecord, DNSError, QTYPE

ROOT_NS = '198.41.0.4'

def find_cname(a, qtype, cname):
    # Follow a CNAME chain within the same response; fall back to a
    # fresh resolution of the target if the chain leaves the response.
    for r in a.rr:
        if QTYPE[r.rtype] == qtype and r.rname == cname:
            return str(r.rdata)
        elif QTYPE[r.rtype] == 'CNAME' and r.rname == cname:
            return find_cname(a, qtype, str(r.rdata))
    return resolve(cname, qtype, ROOT_NS)

def resolve(name, qtype='A', ns=ROOT_NS):
    print(f"dig -r @{ns} {name} {qtype}")
    q = DNSRecord.question(name, qtype)
    a = DNSRecord.parse(q.send(ns))
    if q.header.id != a.header.id:
        raise DNSError('Response transaction id does not match query transaction id')
    for r in a.rr:
        if QTYPE[r.rtype] == qtype and r.rname == name:
            return str(r.rdata)
        elif QTYPE[r.rtype] == 'CNAME' and r.rname == name:
            return find_cname(a, qtype, str(r.rdata))
    for r in a.ar:
        # Glue record: an A record for a nameserver in the additional section
        if QTYPE[r.rtype] == 'A':
            return resolve(name, qtype, str(r.rdata))
    for r in a.auth:
        # Referral without glue: resolve the nameserver's name first
        if QTYPE[r.rtype] == 'NS':
            return resolve(name, qtype, resolve(str(r.rdata), 'A', ROOT_NS))
    raise ValueError("Can't resolve domain")

if __name__ == '__main__':
    import sys
    def get_arg(i, d):
        try: return sys.argv[i]
        except IndexError: return d
    print(resolve(get_arg(1, "google.com"), get_arg(2, "A"), get_arg(3, ROOT_NS)))
- Namespace system strongly favours single-word names
- Type embedding allows for single-dot access (not this.path.is.long())
- Zero values allow for elegant, zero-line initialization and existence checks, alleviating in part the need for safe navigation operators (see the sketch below).
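A quick illustration of those last two points (hypothetical types, not from the article; assumes fmt is imported):

type Logger struct{ prefix string }

func (l Logger) Log(msg string) { fmt.Println(l.prefix, msg) }

type Server struct {
    Logger                   // embedding: s.Log(...), not s.logger.Log(...)
    routes map[string]string // hypothetical
}

func main() {
    var s Server         // the zero value needs no constructor
    s.Log("starting")    // single-dot access via embedding
    if s.routes == nil { // a nil map is safe to read; allocate before writing
        s.routes = map[string]string{}
    }
    s.routes["/"] = "index"
}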
Also I'd argue that the lack of map and filter is a programmer nudge towards more elegant, compact solutions, but that's hard to prove. I've been annoyed with the lack of standard ways of doing things as well, but most often it turned out that I was simply doing it wrong, and Go was carefully designed to make the wrong way annoying to use. The compactness doesn't stem from using fewer lines to express simple constructs like if/else, but from the overall code organisation and structure that result from nudges like this one, as well as the encapsulation model Go uses (no classes, but interfaces and structs) and acyclic dependencies.
Generics will come out with the next version, though.
> No map or filter functions (meaning you have to implement these in like 5-6 lines of code)
I like that Go doesn't have these. Many programmers seem to want these big bloated programming languages. So that's what you end up with, a bunch of slow bloated programming languages. Personally I just want something like C, but with a couple of minimal extra features (maps, built-in package management, bounds checking). For me, Go is the closest thing I have found to C, without many of the C pitfalls.
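For reference, the "5-6 lines" version of filter in (pre-generics) Go, written for one concrete element type:

// filter keeps the elements of xs for which keep returns true.
func filter(xs []int, keep func(int) bool) []int {
    var out []int
    for _, x := range xs {
        if keep(x) {
            out = append(out, x)
        }
    }
    return out
}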
> Golang relies on built-in generator comments to help alleviate all of the typing.
I have been programming Go for a few years, and I have like one file in all of my code that uses a generator comment. That file is to interface with the Windows API. Regarding pure Go code, I don't use generator comments at all, so I think this comment is overblown.
> I like that Go doesn't have these. Many programmers seem to want these big bloated programming languages. So that's what you end up with, a bunch of slow bloated programming languages.
Those features are added to remove bloat from your source code.
I don't think you understand what bloat is. Just because an end-user program has fewer lines of code doesn't mean bloat is reduced. That code just moves into the API, or the language itself.
Well, obviously. GP used the phrase “compact and pretty”. It’s a bit of a stretch to interpret that as an objective statement. The charitable interpretation is that GP meant it to be subjective.
Edit: I got confused which post you were replying to. My apologies.
For anyone who wants something tiny, complete and secure, check out djbdns.
I use it for a 700 user org as a secondary DNS. Never goes above single digit CPU usage (on an AWS nano instance) and currently has 834 days of uptime.
FWIW, there are mechanisms to live-patch the kernel. There were some out-of-tree approaches (ksplice, kGraft, kpatch), but nowadays the kernel has native livepatch support that those (kGraft and kpatch, IIRC) now can use.
Most enterprise distros provide a service for that, as the actual work is creating the binary patch that fixes the security issues at hand; one cannot always just use the upstream fix, e.g., if it introduces internal ABI changes or changes locking (order), as then you'd need to patch X sites atomically at once to ensure nothing falls apart - it can be done, but it's hard to get right.
So yes, if you're willing to put in the money or work, you can have systems that run for years and are still just as secure as those that frequently reboot into new, updated kernels.
I know it starts like a shallow dismissal, but mentioning that the code uses another DNS package as a dependency (miekg/dns) has a point. I originally missed that, and I believe many others probably did too.
I don't think the author can claim "in 80 lines of code" when the code depends on a much bigger library than itself.
The original title was "Writing a DNS resolver in 80 lines of Go".
The comment might seem like it's being dismissive given the current title "A toy DNS resolver", but at the time of writing it was not just a shallow dismissal.
This is a surpassingly facile putdown. `miekg/dns` is being used here as a DNS request/response parser library. There's almost nothing interesting about parsing DNS messages. The interesting logic in the DNS is the recursive lookup, which is the part this post captures.
As far as I know (we use the same library for our authority servers) `miekg/dns` doesn't even do recursive lookups.
It's not just that, though. It's also that recursive lookups are probably the most mystifying aspect of DNS. DNS message parsers are straight-line code; there are lots of them, including in POSIX libc. What you don't have in POSIX libc is a function that does a complete from-the-root recursive lookup for a name; most DNS "client" implementations stop at "send the request to the local recursive cache".
To see that function implemented in such a tiny amount of code is itself super interesting.
Essentially, what you're saying is that the article disappointed you because it purported to be about making pizza, but didn't first specify how to make the oven.
I disagree. The entire client connection is being handled by the imported DNS library:
func dnsQuery(name string, server net.IP) *dns.Msg {
    fmt.Printf("dig -r @%s %s\n", server.String(), name)
    msg := new(dns.Msg)
    msg.SetQuestion(name, dns.TypeA)
    c := new(dns.Client)
    // Exchange does all the connection work: dial, send, receive, parse
    reply, _, _ := c.Exchange(msg, server.String()+":53")
    return reply
}

func main() {
    name := os.Args[1]
    if !strings.HasSuffix(name, ".") {
        name = name + "."
    }
    fmt.Println("Result:", resolve(name))
}
While I agree it's impressively simply laid out, and as an exercise it's good to see people getting hands-on with underpinning protocols (I'm a fan of this), it also does its job as a follow-up to her previous blog post on the subject.
People taking umbrage at the title as posted here (and the original site title) are IMO not wrong. The article title has been changed to "A toy DNS resolver" now, so obviously it was contentious enough that she thought to change it.
What do you think an "entire client connection" is? It's a read and a write from a net.Conn. Read client.go's (c *Client) exchange() (the lowercase one). It's 30 lines of code, and that includes stuff like EDNS0, TSIG, and read deadlines that have nothing to do with the functionality the author is demonstrating.
If the author took the time to write a 15 line roundTrip() function that called m.Pack(), udpConn.Write(), udpConn.SetDeadline(1s), and udpConn.Read(), would your argument here evaporate? Then it's not a very good argument, is it?
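For the curious, such a roundTrip() really is about that size. A sketch, assuming the usual net, time, and miekg/dns imports:

func roundTrip(m *dns.Msg, server string) (*dns.Msg, error) {
    conn, err := net.Dial("udp", server+":53")
    if err != nil {
        return nil, err
    }
    defer conn.Close()
    conn.SetDeadline(time.Now().Add(time.Second)) // the 1s deadline mentioned above
    packed, err := m.Pack() // struct -> wire bytes; this is all the codec does
    if err != nil {
        return nil, err
    }
    if _, err := conn.Write(packed); err != nil {
        return nil, err
    }
    buf := make([]byte, 512) // classic DNS-over-UDP message size
    n, err := conn.Read(buf)
    if err != nil {
        return nil, err
    }
    reply := new(dns.Msg)
    err = reply.Unpack(buf[:n]) // wire bytes -> struct
    return reply, err
}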
>Essentially, what you're saying is that the article disappointed you because it purported to be about making pizza, but didn't first specify how to make the oven.
No, it's more akin to him saying he made a pizza, when all that was done was import a pizza and then sprinkle some pepperoni on it.
Imagine if I said I made an HTTP server in 6 lines.
I think it would be reasonable for people to say that I didn't make an HTTP server in 6 lines.
Just as I used a library to handle the HTTP requests/responses for my server, the author of this article used a library to handle DNS requests/responses. The 80-lines figure for making a DNS resolver is misleading when there are 26k lines of Go code (according to cloc) for handling DNS that are just in a library instead of the main file.
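For the sake of the analogy, here is roughly such a "6 line HTTP server", written against Go's standard library rather than the commenter's (presumably Node/Express) example - the point being that net/http, not these lines, implements HTTP:

package main

import "net/http"

func main() {
    // All protocol handling lives in net/http; this only registers a handler.
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("hello"))
    })
    http.ListenAndServe(":8080", nil)
}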
You have managed simultaneously to radically underestimate the complexity of Node and Express's HTTP handling, while at the same time radically overestimating the complexity of what miekg/dns is being asked to do here.
The majority of the 20,000 lines of non-test code in miekg/dns are just handlers for record types the author of this article didn't need; much of the rest of it is stuff like zone file parsing and DNSSEC signatures, which again have nothing to do with what the author wrote.
What you conspicuously won't find in miekg/dns: a recursive lookup routine.
>What you conspicuously won't find in miekg/dns: a recursive lookup routine.
Sure, but it implements the building blocks that you need to create one. Those building blocks are where a good chunk of the code is going to be. (Yes, there is a lot of code in that library that will be unused.)
More than a little unfair when the following is in the article:
> I’m not going to write this completely from scratch – I think parsing DNS packets is really interesting, but it’s definitely more than 80 lines of code, and I find that it kind of distracts from the algorithm.
Which seems legit to me. This article was a fun read to refresh on exactly what goes on in resolving basic records.
I hereby do fully deny that there is anything the least bit clickbait-ish about this article, which is super interesting and all the more an achievement for its brevity. It demystifies what is the most annoying thing about trying to code the client side of DNS, and the fact that it uses the same DNS message parsing library every other Go programmer does isn't of the slightest import; if the article hadn't used miekg/dns, some other lowbrow comment would be taking the author to task for reinventing the message codec wheel.
The world needs more simple recursive lookup examples, and absolutely does not need more explication of DNS message parsing, any more than an article talking about etcd's Raft implementation would benefit from a hand-rolled implementation of Protobufs.
It's (sadly) telling that all the dismissals are focused on packet parsing, and not like, bailiwick checking, which is understandably missing, but also really important.
This is the kind of case where I wish HN allowed editorializing the title. Regardless of how good the article is (or isn't), when it has a title like this it's inevitable the conversation will be about how the title was overreaching. If people were encouraged to turn titles like this into something like "Making a simple DNS resolver with miekg/dns in Go", or even just "A DNS resolver in Go", when posting, we'd avoid this multiple-times-a-day recurring problem.
Or maybe it already is encouraged and people just aren't aware this counts as "misleading" or "linkbait" since it's not well described what the cutoff point is.
Either way, both titles like this one and the discussions of them have long gotten old.
Edit: looks like the article moved to "A toy DNS resolver", excellent choice of wording IMO.
"Making a simple DNS resolver with miekg/dns in Go" would have given me the false impression that the article was about how to use miekg/dns. "A DNS resolver in Go" would have given me the false impression that it was about a full-fledged DNS resolver meant for integrating with Go applications. The current title at least pushed me in the direction of understanding what the article is about, which is demonstrating how DNS resolution works using a short Go program.
That wouldn't be a false impression; this article is indeed about how to use miekg/dns to make a resolver. Your comments on the other title already apply to the current title as is; it's one of the things people hate about these titles, but at least the cut-down title avoids the line-count/"nothing is from scratch" debates.
More accurate titles are always welcome, of course, but one thing is for sure: the class of "<x> in <y lines>" titles is demonstrably a continuous problem, to the point that they get talked about instead of the articles they represent. They are not just minor wording nits one could come up with after staring at a title long enough.
Edit: I really like the title the article itself changed to: "A toy DNS resolver".
A small nit but it's actually important for the HN Rules of Titles: 'edit' and 'editorialize' are different things. You can sometimes edit a title, for instance if it's misleading or clickbait. You can't editorialize in titles no matter what.
The more general take I have seen is that it would give a "priority comment". That can result in problems like that, but more commonly it just causes the comments to all act like responses to the "main" comment rather than general discourse on the topic. I can definitely see that, though the poster already has a large amount of that control in most cases, since they get to pick the source they post.
I smiled at this, though the Sagan quote "If you wish to make an apple pie from scratch, you must first invent the universe" springs to mind. At some point to be reasonable you have to cut off the lower layers of abstraction or else it's madness.
Unbound leans on libresolv, so I don't think it's cheating to lean on external libraries - most of the hard part about resolvers is all the crufty RFCs you have to support anyway, not the basic nuts and bolts of resolving an A record.
I normally don't like comments like this but I think in this case your point stands. DNS is not 'easy' in that you have troves of RFCs and undocumented but expected behaviors to follow, etc. miekg/dns does all of the heavy lifting here.
Writing a resolver in it is a fun project, but bragging about the line count is silly.
miekg/dns literally only handles building/parsing the DNS packets. That may be a complicated activity, and perhaps for you it equals "doing all the heavy lifting", but it is absolutely not doing the "recursive resolve" algorithm at all here. I think parsing network packets is interesting, but it's not the point of the article, and saying that miekg/dns is doing all the heavy lifting doesn't give the article or its author enough credit.
Honestly, the comments of this type on this article sound more like you didn't actually read the article and are just knee-jerk putting someone down. I hope that is not the case, but maybe it would be good to examine why the message is coming off so distastefully.
miekg/dns is doing almost none of the heavy lifting here, which you can verify for yourself by counting the number of record types this code actually needs to parse to function, and comparing it to the number of record types miekg/dns supports and how much of the code exists just to dispatch to random record type handlers.
Correct; none of that is the heavy lifting of a recursive lookup. One way you know that is that similar code lives in every libc, but no libc I know of has a recursive resolver; the pattern of fobbing off complicated recursive queries to cache servers is so common that there's a name for library-style resolvers: "stub resolver".
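Go's own standard library is an example of that pattern - its lookup functions are a stub resolver:

package main

import (
    "fmt"
    "net"
)

func main() {
    // A stub resolver in action: the question is forwarded to the
    // system-configured recursive server (per resolv.conf or equivalent);
    // none of the from-the-root recursion happens here.
    ips, err := net.LookupIP("example.com")
    fmt.Println(ips, err)
}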
You're not going to win this argument, no matter how many synonyms for "encode" and "decode" you come up with.
You have one extra "your" in the final clause of that last sentence.
I'm not messing with you; as someone who does an unfortunate amount of DNS hacking, it is crazymaking to see so many people express the opinion that the hard part of doing DNS is just formatting the records. The argument you're presenting is a little like saying that "malloc" would be doing the heavy lifting in a C implementation of a graph minimum cost spanning tree.
This also isn't just an aesthetic argument. There is something profound about it. No language standard library I'm aware of includes a recursive lookup, despite the fact that you've pegged it as a "12 line for loop". They all in some way or other include DNS message codecs, but not the recursive lookup, despite the fact that it would be immensely useful to be able to write programs that did recursive lookups directly rather than relying on the system's configured recursive resolver. The reason for that is at least partly that recursive lookup is mystifying and spooks library implementors.
I've had the displeasure of writing both a series of DNS codecs and a recursive lookup routine. The codecs I've done throughout my career, going back to like 1997 with exploits for the Kashpureff cache poisoning bug. The recursor I finally got around to writing just a couple months ago, because recursive lookups are freaking complicated.
That the author got this recursive lookup so small that it broke everyone's brains is just more reason to be interested in this article. It's certainly not a reason to dismiss it. The reactions on this thread are pretty embarrassing.
I'm sorry to say this, but if the recursive lookup can be implemented with a for loop and a 3-way switch that a CS 101 student can write, it's really not doing the heavy lifting. It may be interesting to know about it, and it may be the case that multiple resolver libraries don't have the implementation, but it's a trivial piece of code; let's not idolize it.
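For concreteness, that for-plus-3-way-switch looks roughly like this - a sketch built on the dnsQuery function quoted upthread, with hypothetical firstA/firstNS helpers and nil/error handling omitted; the article's actual code differs in the details:

func resolve(name string) net.IP {
    server := net.ParseIP("198.41.0.4") // a.root-servers.net
    for {
        reply := dnsQuery(name, server)
        if ip := firstA(reply.Answer); ip != nil {
            return ip // 1: the answer itself
        } else if glue := firstA(reply.Extra); glue != nil {
            server = glue // 2: glue record with the next server's IP
        } else if ns := firstNS(reply.Ns); ns != "" {
            server = resolve(ns) // 3: nameserver name only; resolve it first
        } else {
            panic("stuck: no answer, no glue, no NS")
        }
    }
}

func firstA(rrs []dns.RR) net.IP {
    for _, rr := range rrs {
        if a, ok := rr.(*dns.A); ok {
            return a.A
        }
    }
    return nil
}

func firstNS(rrs []dns.RR) string {
    for _, rr := range rrs {
        if ns, ok := rr.(*dns.NS); ok {
            return ns.Ns
        }
    }
    return ""
}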
For me personally, having the power of hitting an endpoint and receiving useful information is really satisfying. Creating the request and parsing the reply are probably 90% of that process. And frankly, that was the first thing I was looking for when I skimmed the article: "Are DNS replies really that easy to parse?". The fact that I need to switch on the reply and potentially make a recursive request somewhere else is trivial once you can get the actual useful info from the remote.
I'm not criticising the article or the title. It's my opinion that parsing is the more important part; I'm not faulting the author. But I'm having a hard time accepting your arguments.
It's not my argument that CS 101 students can't write recursive resolvers. Everyone can, and more people should, just like more people should write emulators, compilers, database engines, TCP/IP stacks, ray tracers, and computer algebra systems. Not because writing them is a huge achievement for humanity, but because people writing them and then documenting the process is vitally important for demystifying these things, in an era where most programming is just piping data from one database to another and we've lost touch with so much basic computer science.
My argument is that DNS codecs are not the interesting or tricky part of writing recursive resolvers.
Having had the pleasure of writing a bunch of DNS codecs, I'm having trouble even conceptualizing what's interesting about writing one. I don't think most people look at DNS and think "I could do this, but for the difficulty of constructing an NS record".
I'd add to this that if you just want to demonstrate the encode/send/receive/decode part, that too can be done in a nearly trivial amount of code.
I know because I've done it (I wrote an authoritative DNS server from scratch for a registry platform we did for .name).
To write a full-fledged library for a resolver is complex (an authoritative server can take plenty of shortcuts, depending on purpose), but I absolutely agree that the codec basics you'd cover in a toy written to demonstrate resolution would be just a distraction.
Cannot second this enough. miekg/dns is probably tens of thousands of mature lines of code covering all aspects of the DNS protocol. This feels like bragging about creating a minimal HTTP client when really you just used libcurl to do the heavy lifting.
It is nothing at all like using libcurl to do an HTTP request, as libcurl handles the semantics of the HTTP protocol, and not just marshaling and unmarshaling HTTP requests and responses.
miekg/dns is also 20,000 lines of code most of which are not relevant to the task of doing a recursive lookup.
Seems more like writing an article where you explain how HTTP caching headers work by using libcurl to write a program to mirror a website. The purpose isn't showing off, it's education.