DNSSEC Interlude 1: Curiosities of Benchmarking DNS over Alternate Transports
Short version: DNS over TCP (or HTTP) is almost certainly not faster than DNS over UDP, for any definition of faster. There was some data that supported a throughput interpretation of speed, but that data is not replicating under superior experimental conditions. Thanks to Tom Ptacek for prodding me into re-evaluating my data.
Long version:
So one of the things I haven’t gotten a chance to write a diary entry about yet is the fact that, when implementing end-to-end DNSSEC, there will be environments in which arbitrary DNS queries just aren’t an option. In such environments, we will need to find a way to tunnel traffic.
Inevitably, this leads us to HTTP, the proverbial “Universal Tunneling Protocol”.
Now, I don’t want to write up this entire concern here. What I do want to do is discuss a particular criticism of it: that HTTP, being run over TCP, would necessarily be too slow to function as a DNS transport.
I decided to find out.
I am, at my core, an empiricist. It’s my belief that the security community runs a little too much on rumor and nowhere near enough on hard, repeatable facts. So, I have a strong bias towards actual empirical results.
One has to be careful, though — as they say, data is not information, information is not knowledge, and knowledge is not wisdom.
So, Phreebird is built on top of libevent, a fairly significant piece of code that makes it much easier to write fast network services. (Libevent essentially abstracts away the complex steps required to make modern kernel networking fast.) Libevent supports UDP, TCP, and HTTP transports fairly elegantly, and even simultaneously, so I built each endpoint into Phreebird.
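For the curious, here’s roughly what that looks like. This is a minimal sketch against libevent’s C API, not Phreebird’s actual code; the canned response, port numbers, and callback bodies are placeholders.

/* Minimal sketch: one libevent event_base serving a canned DNS answer
 * over UDP and HTTP at the same time. Not Phreebird's actual code. */
#include <event2/event.h>
#include <event2/http.h>
#include <event2/buffer.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string.h>

static const unsigned char canned_response[] = { 0x00 };  /* placeholder precompiled answer */

static void udp_cb(evutil_socket_t fd, short events, void *arg) {
    unsigned char query[512];
    struct sockaddr_storage from;
    socklen_t fromlen = sizeof(from);
    ssize_t n = recvfrom(fd, query, sizeof(query), 0,
                         (struct sockaddr *)&from, &fromlen);
    if (n < 12) return;                        /* too short to be a DNS query */
    /* A real server parses the question and builds an answer here. */
    sendto(fd, canned_response, sizeof(canned_response), 0,
           (struct sockaddr *)&from, fromlen);
}

static void http_cb(struct evhttp_request *req, void *arg) {
    /* A real server would base64-decode the DNS query out of the URI here. */
    struct evbuffer *out = evbuffer_new();
    evbuffer_add(out, canned_response, sizeof(canned_response));
    evhttp_send_reply(req, 200, "OK", out);
    evbuffer_free(out);
}

int main(void) {
    struct event_base *base = event_base_new();

    /* UDP listener */
    evutil_socket_t fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(53);                  /* placeholder port */
    bind(fd, (struct sockaddr *)&sin, sizeof(sin));
    struct event *udp_ev = event_new(base, fd, EV_READ | EV_PERSIST, udp_cb, NULL);
    event_add(udp_ev, NULL);

    /* HTTP listener, sharing the same event loop */
    struct evhttp *http = evhttp_new(base);
    evhttp_bind_socket(http, "0.0.0.0", 80);   /* placeholder port */
    evhttp_set_gencb(http, http_cb, NULL);

    event_base_dispatch(base);                 /* run both transports together */
    return 0;
}

Both transports share a single thread and a single event loop, which is why offering the HTTP endpoint alongside UDP was nearly free.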
Then, I ran a simple benchmark. From my Black Hat USA slides:
# DNS over UDP
./queryperf -d target2 -s 184.73.1.213 -l 10
…
Queries per second: 3278.676726 qps

# DNS over HTTP
ab -c 100 -n 10000 http://184.73.1.213/Rz8BAAABAAAAAAAAA3d3dwNjbm4DY29tAAABAAE=
…
Requests per second: 3910.13 [#/sec] (mean)
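That URL deserves a word of explanation: the blob at the end is nothing more than a wire-format DNS query, base64-encoded. It decodes to an ordinary query for www.cnn.com/A with recursion desired. As a rough illustration (and emphatically not the encoder Phreebird actually uses), building such a payload looks something like the sketch below; the base64 routine here is hand-rolled and not URL-safe, which a real client would need to address.

/* Sketch: build the base64 payload used in the benchmark URL above.
 * Illustrative only; a real client would randomize the ID and use a
 * URL-safe base64 variant. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static void b64_encode(const uint8_t *in, size_t len, char *out) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i, o = 0;
    for (i = 0; i < len; i += 3) {
        uint32_t v = (uint32_t)in[i] << 16;
        if (i + 1 < len) v |= (uint32_t)in[i + 1] << 8;
        if (i + 2 < len) v |= in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = (i + 2 < len) ? tbl[v & 63] : '=';
    }
    out[o] = '\0';
}

int main(void) {
    uint8_t q[512];
    size_t n = 0;

    /* 12-byte DNS header: ID 0x473f, RD set, one question, no other records. */
    const uint8_t header[12] = { 0x47, 0x3f, 0x01, 0x00, 0x00, 0x01,
                                 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
    memcpy(q, header, sizeof(header));
    n = sizeof(header);

    /* QNAME: length-prefixed labels for "www.cnn.com", terminated by a zero byte. */
    const char *labels[] = { "www", "cnn", "com" };
    for (int i = 0; i < 3; i++) {
        size_t len = strlen(labels[i]);
        q[n++] = (uint8_t)len;
        memcpy(q + n, labels[i], len);
        n += len;
    }
    q[n++] = 0x00;

    /* QTYPE = A (1), QCLASS = IN (1), both big-endian. */
    q[n++] = 0x00; q[n++] = 0x01;
    q[n++] = 0x00; q[n++] = 0x01;

    char b64[1024];
    b64_encode(q, n, b64);
    printf("http://184.73.1.213/%s\n", b64);   /* reproduces the URL above */
    return 0;
}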
Now, at Black Hat USA, the title of my immediate next slide was “Could be Wrong!”, with the very first bullet point being “Paul Vixie thinks I am”, and the second point being effectively “this should work well enough, especially if we can retrieve entire DNS chains over a single query”. But, aside from a few raised eyebrows, the point didn’t get particularly challenged, and I dropped the skepticism from the talk. I mean, there’s a whole bunch of crazy things going on in the DKI talk; I’ve got more important things to delve deeply into, right?
Heh.
Turns out there are a fair number of things wrong with the observation. First off, saying “DNS over HTTP is faster than DNS over UDP” doesn’t specify which definition of faster I’m referring to. There are two. When I say a web server is fast, it’s perfectly reasonable for me to say “it can handle 50,000 queries per second, which is 25% faster than its competition”. That is speed-as-throughput. But it’s also totally reasonable to say “responses from this web server return in 50ms instead of 500ms”, which is speed-as-latency.
Which way is right? Well, I’m not going to go mealy-mouthed here. Speed-as-latency isn’t merely a possible interpretation, it’s the normal interpretation. And of course, since TCP adds a round trip between client and server, whereas UDP does not, TCP couldn’t be faster (modulo hacks that reduce the number of round trips at the application layer, anyway). What I should have been saying was that “DNS servers can push more traffic over HTTP than they can over UDP.”
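Just to put rough numbers on why latency is a losing battle for TCP: a UDP lookup costs one round trip, while a TCP or HTTP lookup pays for the handshake before the query even goes out. A trivial back-of-envelope sketch, with a completely arbitrary 50ms round-trip time and server processing time ignored:

/* Back-of-envelope: minimum time to first answer over each transport.
 * The 50 ms RTT is an arbitrary illustrative figure. */
#include <stdio.h>

int main(void) {
    double rtt_ms = 50.0;
    double udp_ms = 1 * rtt_ms;   /* query out, answer back                   */
    double tcp_ms = 2 * rtt_ms;   /* SYN / SYN-ACK first, then query / answer */
    printf("UDP: %.0f ms   TCP/HTTP: %.0f ms\n", udp_ms, tcp_ms);
    return 0;
}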
Except that the throughput claim isn’t necessarily correct either.
Now, I happen to be lucky. I did indeed store the benchmarking data from my July 2010 experiment comparing UDP and HTTP DNS queries. So I can show I wasn’t pulling numbers out of my metaphorical rear. But I didn’t store the source code, which has long since been updated (changing the performance characteristics), and I didn’t try my tests on multiple servers of different performance quality.
I can’t go back in time and get the code back, but I can sure test Phreebird. Here’s what we see on a couple of EC2 XLarges, after I’ve modified Phreebird to return a precompiled response rather than building packets on demand:
# ./queryperf -d bench -s x.x.x.x -l 10
Queries per second: 25395.728961 qps
# ab -c 1000 -n 100000 'http://x.x.x.x/.well-known/dns-http?v=1&q=wn0BAAABAAAAAAAABWJlbmNoBG1hcmsAAAEAAQ=='
Requests per second: 7816.82 [#/sec] (mean)
DNS over HTTP here is running at about 30% of the performance of DNS over UDP. Perhaps if I try some smaller boxes?
# ./queryperf -d bench -s pb-a.org -l 10
Queries per second: 7918.464171 qps
# ab -c 1000 -n 100000 'http://pb-a.org/.well-known/dns-http?v=1&q=wn0BAAABAAAAAAAABWJlbmNoBG1hcmsAAAEAAQ=='
Requests per second: 3486.53 [#/sec] (mean)
Well, we’re up to 44% of the UDP speed, but that’s a far cry from the roughly 119% I measured back in 2010. What’s going on? Not sure; I didn’t store the original code. There are a few other things wrong:
- This is a conflated benchmark. I should have a very small, tightly controlled, correctly written test-case server and client. Instead, I’m borrowing demonstration code that’s trying to prove where DNSSEC is going, plus stock benchmarkers. (A minimal sketch of the sort of client I mean follows this list.)
- I haven’t checked whether ApacheBench (ab) is reusing sockets across multiple queries (I don’t think it is). If it is, it would be amortizing connection setup time across multiple queries, which would skew the data.
- I’m testing between two nodes on the same network (Amazon). I should be testing across a worldwide cluster. PlanetLab calls.
- I’m testing between two nodes, period. That means the embarrassingly parallelizable problem of opening up a tremendous number of TCP client sockets is borne by a single kernel instead of some huge number of them. I’m continuing to repeat this error in the new benchmarks as well.
- This was a seriously counterintuitive result, and as such it bore a particular burden to be released with hard, replicable data (including a repro script).
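For reference, here’s a sketch of the kind of tightly controlled client the first bullet is asking for: a single blocking loop that fires one fixed UDP query at a time and counts completed round trips. The query bytes are simply what the base64 payload in the URLs above decodes to (a question for bench.mark/A); the target address, port, and iteration count are placeholders. Note that with only one query in flight this measures per-query latency as much as server capacity, so a real harness would run many such loops in parallel, on many machines.

/* Sketch of a stripped-down UDP benchmark client: one query in flight at a
 * time, fixed payload, count completed round trips per second. */
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    /* Wire-format query for bench.mark/A, as encoded in the URLs above. */
    const unsigned char query[] = {
        0xc2, 0x7d, 0x01, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x05, 'b', 'e', 'n', 'c', 'h', 0x04, 'm', 'a', 'r', 'k', 0x00,
        0x00, 0x01, 0x00, 0x01
    };

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in to;
    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_port = htons(53);                          /* placeholder port   */
    inet_pton(AF_INET, "127.0.0.1", &to.sin_addr);    /* placeholder target */
    connect(fd, (struct sockaddr *)&to, sizeof(to));  /* fix the peer once  */

    const int iterations = 100000;                    /* placeholder count  */
    unsigned char reply[512];
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        send(fd, query, sizeof(query), 0);
        recv(fd, reply, sizeof(reply), 0);            /* blocks until answered */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double secs = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("%.0f queries per second\n", iterations / secs);
    close(fd);
    return 0;
}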
Now, does this mean the findings are worthless? No. First, there’s no intention of moving all DNS over HTTP, just those queries that cannot be serviced via native transport. Slower performance, by any definition, is superior to no performance at all. Second, there was at least one case where DNS over HTTP came out roughly 19% faster. I don’t know where that result came from, or what caused it. Tom Ptacek supposes that the source of HTTP’s advantage was that my UDP code was hobbled (specifically, through some half-complete use of libevent). That’s actually my operating assumption for now.
Finally, and most importantly, as Andy Steingruebl tweets:
Most tests will prob show UDP faster, but that isn’t the point. Goal is to make TCP fast enough, not faster than UDP.
The web is many things; optimally designed for performance is not one of them. There are many metrics by which we determine the appropriate technological solution. Raw cycle-for-cycle performance is a good thing, but it ain’t the only thing (especially considering the number of problems that are very noticeably not CPU-bound).
True as that all is, I want to be clear. A concept was misunderstood, and even my “correct interpretation” wasn’t backed up by a more thorough and correct repetition of the experiment. It was through the criticism of others (specifically, Tom Ptacek) that this came to light. Thanks, Tom.
That’s science, folks. It’s awesome, even when it’s inconvenient!