The DNSSEC Diaries, Ch. 6: Just How Much Should We Put In DNS?
Several years ago, I had some fun: I streamed live audio, and eventually video, through the DNS.
Heh. I was young, and it worked through pretty much any firewall. (Still does, actually.) It wasn’t meant to be a serious transport though. DNS was not designed to traffic large amounts of data. It’s a bootstrapper.
But then, we do a lot of things with protocols that we weren’t “supposed” to do. Where do we draw the line?
Obviously DNS is not going to become the next great CDN hack (though I had a great trick for that too). But there’s a real question: How much data should we be putting into the DNS?
Somewhere between “only IP addresses, and only a small number”, and “live streaming video”, there’s an appropriate middle ground. Hard to know where exactly that is.
There is a legitimate question of whether anything the size of a certificate should be stored in DNS. Here is the size of Hotmail’s certificate, at the time of writing:
-----BEGIN CERTIFICATE----- MIIHVTCCBj2gAwIBAgIKKykOkAAIAAHL8jANBgkqhkiG9w0BAQUFADCBizETMBEG CgmSJomT8ixkARkWA2NvbTEZMBcGCgmSJomT8ixkARkWCW1pY3Jvc29mdDEUMBIG CgmSJomT8ixkARkWBGNvcnAxFzAVBgoJkiaJk/IsZAEZFgdyZWRtb25kMSowKAYD VQQDEyFNaWNyb3NvZnQgU2VjdXJlIFNlcnZlciBBdXRob3JpdHkwHhcNMTAxMTI0 MTYzNjQ1WhcNMTIxMTIzMTYzNjQ1WjBuMQswCQYDVQQGEwJVUzELMAkGA1UECBMC V0ExEDAOBgNVBAcTB1JlZG1vbmQxEjAQBgNVBAoTCU1pY3Jvc29mdDEUMBIGA1UE CxMLV2luZG93c0xpdmUxFjAUBgNVBAMTDW1haWwubGl2ZS5jb20wggEiMA0GCSqG SIb3DQEBAQUAA4IBDwAwggEKAoIBAQDVCWje/SRDef6Sad95esXcZwyudwZ8ykCZ lCTlXuyl84yUxrh3bzeyEzERtoAUM6ssY2IyQBdauXHO9f+ZEz09mkueh4XmD5JF /NhpxdpPMC562NlfvfH/f+8KzKCPhPlkz4DwYFHsknzHvyiz2CHDNffcCXT+Bnrv G8eEPbXckfEFB/omArae0rrJ+mfo9/TxauxyX0OsKv99d0WO0AyWY2/Bt4G+lSuy nBO7lVSadMK/pAxctE+ZFQM2nq3G4o+L95HeuG4m5NtIJnZ/7dZwc8HuXuMQTluA ZL9iqR8a24oSVCEhTFjm+iIvXAgM+fMrjN4jHHj0Vo/o0xQ1kJLbAgMBAAGjggPV MIID0TALBgNVHQ8EBAMCBLAwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMB MHgGCSqGSIb3DQEJDwRrMGkwDgYIKoZIhvcNAwICAgCAMA4GCCqGSIb3DQMEAgIA gDALBglghkgBZQMEASowCwYJYIZIAWUDBAEtMAsGCWCGSAFlAwQBAjALBglghkgB ZQMEAQUwBwYFKw4DAgcwCgYIKoZIhvcNAwcwHQYDVR0OBBYEFNHp09bqrsUO36Nc 1mQgHqt2LeFiMB8GA1UdIwQYMBaAFAhC49tOEWbztQjFQNtVfDNGEYM4MIIBCgYD VR0fBIIBATCB/jCB+6CB+KCB9YZYaHR0cDovL21zY3JsLm1pY3Jvc29mdC5jb20v cGtpL21zY29ycC9jcmwvTWljcm9zb2Z0JTIwU2VjdXJlJTIwU2VydmVyJTIwQXV0 aG9yaXR5KDgpLmNybIZWaHR0cDovL2NybC5taWNyb3NvZnQuY29tL3BraS9tc2Nv cnAvY3JsL01pY3Jvc29mdCUyMFNlY3VyZSUyMFNlcnZlciUyMEF1dGhvcml0eSg4 KS5jcmyGQWh0dHA6Ly9jb3JwcGtpL2NybC9NaWNyb3NvZnQlMjBTZWN1cmUlMjBT ZXJ2ZXIlMjBBdXRob3JpdHkoOCkuY3JsMIG/BggrBgEFBQcBAQSBsjCBrzBeBggr BgEFBQcwAoZSaHR0cDovL3d3dy5taWNyb3NvZnQuY29tL3BraS9tc2NvcnAvTWlj cm9zb2Z0JTIwU2VjdXJlJTIwU2VydmVyJTIwQXV0aG9yaXR5KDgpLmNydDBNBggr BgEFBQcwAoZBaHR0cDovL2NvcnBwa2kvYWlhL01pY3Jvc29mdCUyMFNlY3VyZSUy MFNlcnZlciUyMEF1dGhvcml0eSg4KS5jcnQwPwYJKwYBBAGCNxUHBDIwMAYoKwYB BAGCNxUIg8+JTa3yAoWhnwyC+sp9geH7dIFPg8LthQiOqdKFYwIBZAIBCjAnBgkr BgEEAYI3FQoEGjAYMAoGCCsGAQUFBwMCMAoGCCsGAQUFBwMBMIGuBgNVHREEgaYw 
gaOCDyoubWFpbC5saXZlLmNvbYINKi5ob3RtYWlsLmNvbYILaG90bWFpbC5jb22C D2hvdG1haWwubXNuLmNvbYINaG90bWFpbC5jby5qcIINaG90bWFpbC5jby51a4IQ aG90bWFpbC5saXZlLmNvbYITd3d3LmhvdG1haWwubXNuLmNvbYINbWFpbC5saXZl LmNvbYIPcGVvcGxlLmxpdmUuY29tMA0GCSqGSIb3DQEBBQUAA4IBAQC9trl32j6J ML00eewSJJ+Jtcg7oObEKiSWvKnwVSmBLCg0bMoSCTv5foF7Rz3WTYeSKR4G72c/ pJ9Tq28IgBLJwCGqUKB8RpzwlFOB8ybNuwtv3jn0YYMq8G+a6hkop1Lg45d0Mwg0 TnNICdNMaHx68Z5TK8i9QV6nkmEIIYQ32HlwVX4eSmEdxLX0LTFTaiyLO6kHEzJg CxW8RKsTBFRVDkZQ4CtxpvSV3OSJEEoHiJ++RiLZYY/1XRafwxqESMn+bGNM7aoE NHJz3Uzu2/rSFQ5v7pmTpJokNcHl8hY1fCFs01PYkoWm0WXKYnDHL4+L46orvsyE GM1PAYpIdTSp -----END CERTIFICATE-----
Compared to the size of an IP address, this is a bit much. There are three things that give me pause about pushing so much in.
First — and, yes, this is a personal bias — every time we try to put this much in DNS, we end up creating a new protocol in which we don’t. There’s a dedicated RRTYPE for certificate storage, called CERT. From what I can tell, the PGP community used CERT, and then migrated away. There’s also the experience we can see in the X.509 realm, where they had many places where certificate chains were declared inline. For various reasons, these in-protocol declarations were farmed out into URLs in which resources could be retrieved as necessary. Inlining became a scalability blocker.
Operational experience is important.
Second, there’s a desire to avoid DNS packets that are too big to fit into UDP frames. UDP, for User Datagram Protocol, is basically a thin application based wrapper around IP itself. There’s no reliability and few features around it — it’s little more than “here’s the app I want to talk to, and here’s a checksum for what I was trying to say” (and the latter is optional). When using UDP, there are two size points to be concerned with.
The first is (roughly) 512 bytes. This is the traditional maximum size of a DNS packet, because it’s the traditional minimum size of a packet an IP network can be guaranteed to transmit without fragmentation en route.
The second is (roughly) 1500 bytes. This is essentially how much you’re able to move over IP (and thus UDP) itself before your packet gets fragmented — generally immediately, because it can’t even get past the local Ethernet card.
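To put rough numbers on those two size points, here’s a minimal Python sketch. It assumes an IPv4 header with no options (20 bytes), an 8-byte UDP header, and a 1500-byte Ethernet MTU. The function names are mine, purely for illustration, and the sizes fed in at the end are made up.

```python
import math

IP_HEADER = 20            # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8            # bytes
ETHERNET_MTU = 1500       # bytes, typical Ethernet MTU (assumption)
CLASSIC_DNS_LIMIT = 512   # bytes, traditional DNS-over-UDP message limit


def classify_dns_message(dns_len: int, mtu: int = ETHERNET_MTU) -> str:
    """Say which of the two size points a DNS message of dns_len bytes crosses."""
    max_unfragmented = mtu - IP_HEADER - UDP_HEADER  # 1472 with the defaults
    if dns_len <= CLASSIC_DNS_LIMIT:
        return "fits in a classic 512-byte DNS packet"
    if dns_len <= max_unfragmented:
        return "bigger than classic DNS, but still one unfragmented IP packet"
    return "will be fragmented"


def fragment_count(dns_len: int, mtu: int = ETHERNET_MTU) -> int:
    """Number of IP fragments needed to carry a UDP datagram with this payload."""
    # Every fragment except the last carries a multiple of 8 bytes of data.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((dns_len + UDP_HEADER) / per_fragment)


for size in (100, 1400, 4000):  # hypothetical response sizes
    print(size, classify_dns_message(size), fragment_count(size))
```

With these assumptions, anything past 1472 bytes of DNS payload no longer fits in a single packet on a standard Ethernet link.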
There’s a strong desire to avoid IP fragmentation, as it has a host of deleterious effects. If any IP fragment is dropped, the entire packet is lost. So every extra fragment is one more “swing at the bat” for a fatal drop. In the modern world of firewalls, fragmentation isn’t supported well, as a firewall can’t know whether to pass a packet until the entire thing has been reassembled. I don’t think any sites flat out block fragmented traffic, but it’s certainly not something that’s optimized for.
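The “lose one fragment, lose the packet” effect is easy to quantify. This is a back-of-the-envelope sketch in Python, and it assumes each fragment is dropped independently with the same probability, which real networks don’t promise. The 1% loss rate is a made-up figure, not a measurement.

```python
def packet_loss_probability(fragment_loss: float, fragments: int) -> float:
    """Chance the whole packet is lost when any one fragment going missing is fatal.

    Assumes independent, identical loss per fragment (a simplification).
    """
    return 1.0 - (1.0 - fragment_loss) ** fragments


# Hypothetical 1% per-fragment loss rate.
for n in (1, 2, 3):
    print(n, "fragment(s):", round(packet_loss_probability(0.01, n), 6))
```

Under that assumption, the loss rate grows almost linearly with the number of fragments while the per-fragment loss stays small.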
Finally — and more importantly than anyone admits — fragmented traffic is something of a headache to debug. You have to have reassembly in your decoders, and that can be slow.
However, we’ve already sort of crossed the Rubicon here. DNSSEC is many things, but small is not one of them. The increased size of DNSSEC responses is not by any stretch of the imagination fatal, but it does end the era of tiny traffic. This is made doubly true by the reality that, within the next five or so years, we really will need to migrate to key sizes greater than 1024 bits. I’m not sure we can afford 2048, though NIST is fairly adamant about that. But 1024 is definitely the new 512.
That’s not to say that DNSSEC traffic will always fragment at UDP. It doesn’t. But we’ve accepted that DNS packets will get bigger with DNSSEC, and fairly extensive testing has shown that the world does not come to an end.
It is a reasonable thing to point out, though, that while use of DNSSEC might lead to fragmentation, use of massive records in the multi-kilobyte range will, every time.
(There’s an amusing thing in DNSSEC which I’ll go so far as to say I’m not happy about. There’s actually a bit, called DO, that says whether the client wants signatures. Theoretically we could use this bit to only send DNSSEC responses to clients that are actually desiring signatures. But I think somebody got worried that architectures would be built to only serve DNSSEC to 0.01% of clients — gotta start somewhere. So now, 80% of DNS requests claim that they’ll validate DNSSEC. This is…annoying.)
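For the curious, here’s roughly where that DO bit lives on the wire. This is a minimal Python sketch using only `struct`. As I understand the wire format, the bit is the top bit of the flags field carried in the TTL slot of the EDNS0 OPT pseudo-record (type 41) in the additional section. The helper names, the query ID, and the 4096-byte advertised UDP size are my own illustrative choices, and `has_do_bit` only understands queries built by `build_query`.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = 1, want_dnssec: bool = True,
                udp_size: int = 4096, query_id: int = 0x1234) -> bytes:
    """Build a DNS query carrying an OPT record, optionally with DO set."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    flags = 0x8000 if want_dnssec else 0  # DO is the top bit of the flags
    # OPT: root name, TYPE=41, CLASS=UDP size, ext-rcode, version, flags, RDLEN=0
    opt = b"\x00" + struct.pack("!HHBBHH", 41, udp_size, 0, 0, flags, 0)
    return header + question + opt


def has_do_bit(query: bytes) -> bool:
    """Check DO on a query from build_query (OPT is the final 11 bytes)."""
    (flags,) = struct.unpack("!H", query[-4:-2])
    return bool(flags & 0x8000)


print(has_do_bit(build_query("example.com")))
print(has_do_bit(build_query("example.com", want_dnssec=False)))
```

A real parser would walk the message sections to find the OPT record rather than peeking at the tail, but the bit position is the point here.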
Now, there’s of course a better way to handle large DNS records than IP fragmenting: TCP. But TCP has historically been quite slow, both in terms of round trip time (there’s a setup penalty) and in terms of kernel resources (you have to keep sockets open). But a funny thing happened on the way to billions of hits a day.
TCP stopped being so bad.
The setup penalty, it turns out, can be amortized across multiple queries. Did you know that you can run many queries off the same TCP DNS socket, pretty much exactly like HTTP pipelines? It’s true! And as for kernel resources…
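What makes several queries on one socket possible is the framing: over TCP, each DNS message is preceded by a two-byte, big-endian length. Here’s a minimal Python sketch of just that framing, with no networking at all. The function names are hypothetical and the payloads are placeholder bytes, not real DNS messages.

```python
import struct


def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte length, as DNS over TCP does."""
    return struct.pack("!H", len(message)) + message


def split_frames(stream: bytes):
    """Split buffered TCP bytes into complete messages plus leftover bytes."""
    messages = []
    offset = 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        if offset + 2 + length > len(stream):
            break  # the rest of this message has not arrived yet
        messages.append(stream[offset + 2:offset + 2 + length])
        offset += 2 + length
    return messages, stream[offset:]


# Two placeholder "queries" written back to back on the same connection.
wire = frame(b"first-query") + frame(b"second-query")
print(split_frames(wire))
```

Because every message announces its own length, a client can write several queries back to back and pick the responses apart as the bytes arrive.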
TCP stacks are now fast. In fact — and, I swear, nobody was more surprised than me — when I built support for what I called “HTTP Virtual Channel” into Phreebird and LDNS, so that all DNS queries would be tunneled over HTTP, the performance impact wasn’t at all what I thought it would be:
DNS over HTTP is actually faster than DNS over UDP — by a factor of about 25%. Apparently, a decade of optimising web servers has had quite the effect. Of course, this requires a modern approach to TCP serving, but that’s the sort of thing libevent makes pretty straightforward to access. (UPDATE: TCP is reasonably fast, but it’s not UDP fast. See here for some fairly gory details.) And it’s not like DNSSEC isn’t necessitating pretty deep upgrades to our name server infrastructure anyway.
So it’s hard to argue we shouldn’t have large DNS records just because they’ll make the DNS packets bigger. That ship has sailed. There is of course the firewalling argument: you can’t depend on TCP, because clients may not support the transport. I have to say, if your client doesn’t support TCP lookup, it’s just not going to be DNSSEC compliant.
Just part of compliance testing. Every DNSSEC validator should in fact be making sure TCP is an available endpoint.
There’s a third argument, and I think it deserves to be aired. Something like 99.9% of users are behind a DNS cache that groups their queries with other users. On average, something like 90% of all queries never actually leave their host networks; instead they are serviced by data already in the cache.
Would it be fair for one domain to be monopolizing that cache?
This isn’t a theoretical argument. More than a few major Top 25 sites have wildcard generated names, and use them. So, when you look in the DNS cache, you indeed see huge amounts of throwaway domains. Usually these domains have a low TTL (Time to Live), meaning they expire from cache quickly anyway. But there are grumbles.
My sense is that if cache monopolization became a really nasty problem, then it would represent an attack against a name server. In other words, if we want to say that it’s a bug for http://www.foo.com to populate the DNS cache to the exclusion of all others, then it’s an attack for http://www.badguy.com to be able to populate said cache to the same exclusion. It hasn’t been a problem, or an attack I think, because it’s probably the least effective denial of service imaginable.
I could see future name servers tracking the popularity of cache entries before determining what to drop. Hasn’t been needed yet, though.
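To make that idea concrete, here’s a toy Python sketch of a cache that evicts its least-requested entry first. It’s purely hypothetical: I’m not describing how any existing name server behaves, the class and method names are invented, and it ignores TTLs entirely. The names and the address are placeholders.

```python
class PopularityCache:
    """Toy DNS cache that drops the least-requested name when full (no TTLs)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.records = {}  # name -> cached answer
        self.hits = {}     # name -> number of lookups served from cache

    def get(self, name: str):
        if name in self.records:
            self.hits[name] += 1
            return self.records[name]
        return None

    def put(self, name: str, record: str) -> None:
        if name not in self.records and len(self.records) >= self.capacity:
            # Evict whichever cached name has been asked for the least.
            victim = min(self.hits, key=self.hits.get)
            del self.records[victim]
            del self.hits[victim]
        self.records[name] = record
        self.hits.setdefault(name, 0)


cache = PopularityCache(capacity=2)
cache.put("www.foo.com", "192.0.2.1")
cache.get("www.foo.com")
cache.get("www.foo.com")
cache.put("a1.badguy.com", "192.0.2.66")
cache.put("a2.badguy.com", "192.0.2.66")  # evicts a1, not the popular name
print(sorted(cache.records))
```

In this toy version, a flood of throwaway names only ever displaces other throwaway names, since none of them accumulates hits.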
What this comes down to is: is it alright for DNS to require more resources? Like I wrote earlier, we’ve already decided it’s OK for it to. And frankly, no matter how much we shove into DNS, we’re never getting into the traffic levels that any other interesting service online is touching. Video traffic is…bewilderingly high.
So do we end up with a case for shoving massive records into the DNS, and justifying it because a) DNSSEC does it and b) they’re not that big, relative to, say, Internet video? I have to admit, if you’re willing to “damn the torpedoes” on IP fragmentation or upgrading to TCP, the case is there. And ultimately, the joy of a delegated namespace is that it’s your domain; you can put whatever you want in there.
My personal sense is that while IP fragmentation, TCP upgrading, and cache consumption aren’t the end of the world, they’re certainly worth optimizing against. More importantly, operational experience that says “this plan doesn’t scale” shows up all over the place, and if there’s one thing we need to do more of, it’s to identify, respect, and learn from what’s failed before and what we did to fix it.
A lot of protocols end up passing URLs to larger resources, to be retrieved via HTTP, rather than embedding those resources inline. In my next post, we’ll talk about what it might look like to push HTTP URLs into DNS responses. And that, I think, will segue nicely into why I was writing a DNS over HTTP layer in the first place.