DNSSEC Interlude 3: Cache Wars

DJB responds! Not in full — which I’m sure he’ll do eventually, and which I genuinely look forward to — but to my data regarding the increase in authoritative server load that DNSCurve will cause. Here is a link to his post:

List: djbdns
Subject: dnscurve load

What’s going on is as follows: I argue that, since intermediate caches see hit rates of 80-99%, losing those caches under DNSCurve will by necessity increase traffic to authoritative servers by at least 5x-100x. This is because traffic that was once serviced out of cache will now need to transit all the way to the authoritative server. (It’s “at least” because it appears each cache miss requires multiple queries to service NS records and the like.)
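The 5x-100x range follows directly from the hit rates. Here is a minimal sketch of the arithmetic, assuming every query that used to hit the intermediate cache becomes exactly one query upstream (the function name is mine, purely for illustration):

```python
def authoritative_load_multiplier(hit_rate: float) -> float:
    """Growth factor in upstream queries if a cache with this hit rate goes away.

    With the cache, only the misses (1 - hit_rate) go upstream.
    Without it, everything does, so the ratio is 1 / (1 - hit_rate).
    """
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    return 1.0 / (1.0 - hit_rate)


# 80% and 99% are the ends of the range quoted above; 93% is the figure
# from the large provider data mentioned later in the post.
for rate in (0.80, 0.93, 0.99):
    print(f"{rate:.0%} hit rate -> {authoritative_load_multiplier(rate):.1f}x")
```

So 80% works out to about 5x, 93% to about 14x, and 99% to about 100x. This is a floor, not an estimate: it ignores the extra NS-style queries that each miss appears to trigger.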

DJB replies, no, DNSCurve allows there to be a local cache on each machine, so the hit rates above won’t actually expand out to the full authoritative load increase.

Problem is, there’s already massive local caching going on, and 80-99% is still the hit rate at the intermediates! Most DNS lookups come from web browsers, effectively all of which cache DNS records. Your browser does not repeatedly hammer DNS for every image it retrieves! It is in fact these very DNS caches that one needs to work around in the case of DNS Rebinding attacks (which, by the way, still work against HTTP).

But, even if the browsers weren’t caching, the operating system caches extensively as well. Here’s Windows, displaying its DNS Client cache:

$ ipconfig /displaydns

Windows IP Configuration

linux.die.net
----------------------------------------
Record Name . . . . . : linux.die.net
Record Type . . . . . : 5
Time To Live . . . . : 5678
Data Length . . . . . : 8
Section . . . . . . . : Answer
CNAME Record . . . . : sites.die.net

So, sure, DNSCurve lets you run a cache locally. But that local cache gains you nothing against the status quo, because we’re already caching locally.
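To see why a local cache can’t stand in for a shared one, here is a toy model. Every number in it is made up for illustration, TTL expiry is ignored, and all users are assumed to look up the same set of names. Each host has its own cache that absorbs its own repeat lookups. What reaches the intermediate cache is only the first lookup per host per name, and that traffic is still almost entirely redundant across users.

```python
def simulate(users: int, names: int, repeats: int, shared_cache: bool) -> dict:
    """Count queries at each layer of a two-level cache hierarchy.

    Toy assumptions: no TTL expiry, every user asks for the same names.
    """
    shared = set()         # intermediate (cross-user) cache
    upstream = 0           # queries that leave a per-host cache
    authoritative = 0      # queries that reach the authoritative server
    for _ in range(users):
        local = set()      # per-host cache (browser or OS stub)
        for _ in range(repeats):
            for name in range(names):
                if name in local:
                    continue              # served from the local cache
                local.add(name)
                upstream += 1
                if shared_cache and name in shared:
                    continue              # served from the intermediate cache
                shared.add(name)
                authoritative += 1
    return {"upstream": upstream, "authoritative": authoritative}


with_shared = simulate(users=100, names=10, repeats=5, shared_cache=True)
without_shared = simulate(users=100, names=10, repeats=5, shared_cache=False)
print(with_shared, without_shared)
```

In this model the local caches cut 5,000 lookups down to 1,000 upstream queries in both runs. With the shared cache, 10 of those reach the authoritative server, a 99% hit rate at the intermediate even though every host caches locally. Without it, all 1,000 do. Real traffic is far less uniform than this, so treat the ratio as an illustration of the mechanism rather than a prediction.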

Look. I’ve been very clear. I think Curve25519 occupies an unserviced niche in our toolbox, and will let us build some very interesting network protocols. But in the case of DNS, using it as our underlying cryptographic mechanism means we lose the capability to assert the trustworthiness of data to anyone else. That precludes cross-user caching — at least, precludes it if we want to maintain end to end trust, which I’m just not willing to give up.

I can’t imagine DJB considers it optional either.

It’s probably worth taking a moment to discuss our sources of data. My caching data comes from multiple sources — the CCC network, XS4All, a major North American ISP, one of the largest providers to major ISPs (each of the five 93% records was backing 3-4M users!), and one of the biggest single resolver operators in the world.

DJB’s data comes from a lab experiment.

Now, don’t get me wrong. I’ve done lots of experiments myself, and I’ve drawn serious conclusions from them. (I’ve also been wrong. Like, recently.) But, I took a look at the experimental notes that DJB (to his absolute credit!) posted.

So, the 1.15x increase was relative to 140 stub resolvers. Not 140M. 140.

And he didn’t even move all 140 stubs to 140 caches. No, he moved them from 1 cache to 10 caches.

Finally, the experiment was run for all of 15 minutes.

[QUICK EDIT: Also, unclear if the stub caches themselves were cleared. Probably not, since anything that work intensive is always bragged about.]

I’m not saying there’s experimental bias here, though the document is a pretty strong position piece for running a local resolver. (The single best piece of data in the doc: There are situations where the cache is 300ms away while the authoritative is 10ms away. I did not know this.)

But I think it’s safe to say DJB’s 1.15x load increase claim is thoroughly debunked.

Categories: Security
  1. January 7, 2011 at 7:48 am

    But I think it’s safe to say DJB’s 1.15x load increase claim is thoroughly debunked.

    That’s not really true. It’s more correct to say that the load increase claim isn’t adequately tested.

    The correct way to resolve this debate is to perform a serious test. For those affected by the debate, it seems insane to me that such a test isn’t being designed and performed.

    Regardless, with the rollout of DNSSEC it seems that discussion of its alternatives is a bit like asking what color to paint a condemned building: unless DJB’s design addresses the failures of DNSSEC (failures, not just weaknesses), the conversation is moot, simply because no one has the political capital to make another major change to DNS so soon in the absence of overwhelmingly compelling evidence that DNSSEC won’t achieve its design goals.

    • January 8, 2011 at 12:41 am

      At this point, we have deeply credible data that intermediate caches do in fact prevent a tremendous amount of data from reaching authoritative servers.

      The only dataset that suggests they don’t, is completely non-credible (they don’t even eliminate intermediate caches! They go to 10 caches instead of 1!).

      You’re right that we haven’t destructively tested this. But at this point, there’s so little evidence in favor of DJB’s position that the burden has to be on him to provide some better evidence. I don’t think it’s possible.

      • January 8, 2011 at 7:57 am

        But at this point, there’s so little evidence in favor of DJB’s position that the burden has to be on him to provide some better evidence.

        100% agree. But more than that, he also needs to (a) find problems with DNSSEC that DNSCurve doesn’t have, and (b) produce compelling evidence of such problems.

        I really respect DJB as a practical mathematician and as a developer, but he’s still growing when it comes to network engineering.

  2. Lennie
    January 7, 2011 at 10:19 am

    There is one type of DNS traffic where caching does not add as much at the per-RR level: mail servers and their DNS blacklists. You receive connections from many different IP addresses from different botnets, and you look up an IP address in the blacklist once and cache it but almost never ask for it again. Also, spammers use many new domains. Where I work I see a difference from 70%+ to 40%+ when comparing cache hits between a ‘normal’ cache and a cache which is used by mail servers that do these kinds of lookups.

    • January 8, 2011 at 12:24 am

      Lennie: Those RBL servers would buckle under the crypto load from dnscurve.

      Short TTLs would address those blacklist replies by dropping them from the cache quickly.

      • January 8, 2011 at 12:38 am

        If the RBL data varies enough, DNSSEC (w/ online signing) ain’t going to be much fun for them either.

  3. bill manning
    January 8, 2011 at 4:50 am

    Either Dan Kaminsky got it wrong, or perhaps DJB was disingenuous. DJB did -NOT- run the experiment referenced here: http://www.cisco.com/web/about/ac123/ac147/archived_issues/ipj_12-2/122_dns.html… That was something I ran. And yes, it was a fairly small subset with limited OS and SW version diversity, as well as a limited suite of target lookups. So yes, the dataset is biased. The points remain: more caching closer to the application reduces the DNS attack surface, and many authoritative servers have enough overprovisioning to absorb the increased query load.

    • Lennie
      January 11, 2011 at 9:41 am

      I just read your article, and the first thing I noticed was a small style issue. It says that stub resolvers send priming queries. That sounds kind of strange, because in DNS nomenclature priming queries are the queries sent by a cache to request the nameservers of the root. I think I know what you meant, but it is a bit confusing. Anyway, I would have called the cache a recursor, because “cache” is a very general term.

      This brings me to the part I really missed in the article: how many of these 140 stub resolvers had their own cache? For example, every version of Windows since Windows 2000 has its own cache in the stub resolver. And did these also get cleared?

      For example, because all the machines are turned off at night and you cleared the recursor cache every morning before turning on the machines with the stub resolvers.
