Towards The Next DNS Fix
Ultimately, I can’t at all complain about armchair engineering. The whole point of Source Port Randomization as an interim fix was to get things to the level that we could all have the big messy discussion about what to do now, without being illuminated by the actively burning state of the DNS infrastructure.
Now. When it comes to fixing DNS, we have to operate under the same constraint as when we suggest fixes to web browsers. Just as you’re not allowed to break the web, you’re not allowed to break DNS. There are indeed many things we could do to make the web a safer place, “if only a bunch of people would re-code their web sites”. That is, unfortunately, a naive approach that doesn’t actually lead to things getting any safer. If nobody will deploy the fix, it’s just as if the fix didn’t happen.
We needed this DNS fix to happen.
As I’ve said a couple of times, Dan Bernstein was right. Source Port Randomization (SPR) is not perfect — I’m pretty embarrassed that we didn’t recognize how common interactions with firewalls would be — but it’s a remarkably flexible and thorough improvement to the status quo. When I said in my talk that there are fifteen ways around the TTL, I wasn’t kidding. From magic query types that are uncached by a recursive server, to nonexistent query types that are ignored by an authoritative server, there may not be a TTL to override. Or perhaps the attacker actually provides records for 1.google.com, 2.google.com, 3.google.com, and so on. In other words, the attacker might not even try to overwrite the NS for a domain — he may just want to get a domain in. How would this be useful? Consider the web security model, and Mike Perry’s research on cookies. 1.google.com will collect the cookie for Google just fine.
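To see why merely getting 1.google.com poisoned is enough, here's a minimal sketch of the cookie domain-matching rule browsers apply. The helper name is mine, and this is a simplification of the real policy, not code from any browser:

```python
def cookie_sent_to(cookie_domain: str, request_host: str) -> bool:
    """Simplified browser rule: a cookie scoped to ".google.com"
    is sent along with requests to ANY subdomain of google.com."""
    if cookie_domain.startswith("."):
        return (request_host == cookie_domain[1:]
                or request_host.endswith(cookie_domain))
    return request_host == cookie_domain

# A poisoned 1.google.com receives the Google-wide cookie just fine:
print(cookie_sent_to(".google.com", "1.google.com"))   # True
print(cookie_sent_to(".google.com", "www.google.com")) # True
print(cookie_sent_to(".google.com", "evil.com"))       # False
```

So the attacker never needs to touch www.google.com's records at all; owning any name under the domain is enough to steal its domain-wide cookies.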
Or perhaps, as in the case of Google Analytics and Facebook and most large, CDN hosted sites, the actual TTL to override needs to be small, for reliability and scaling purposes.
In all of these situations, Source Port Randomization — a solution forged in 1999, long before we recognized all these problematic variant attacks — poses a significant barrier to attack. It’s not a panacea, but it was never said to be one. The hope, and it’s not unreasonable, is that it’s a lot easier for secondary defenses to detect and correct for a flood of billions of packets, than a couple of thousand. SPR’s purpose was to provide a safer environment for an active discussion that would hopefully yield better fixes. And that’s what it’s doing!
So, let’s finally start talking about the better fixes that are emerging. Specifically, the problem is — how do we stop the blind attacker who’s willing to send us four billion packets in order to pollute a name? Four major strategies are, at least from what I’ve seen, making real strides towards a better fix.
1) DNSSEC. Say what you will about the perceived technical and political impossibility of this actually happening, but wow there’s been progress these last few weeks: Besides lots of excited chatter that the roots are finally going to get signed, .GOV seems to be throwing some pretty serious resources at making DNSSEC happen. I’m neutral thus far on all the post-SPR solutions, and I’m really, aggressively neutral on DNSSEC. The reality is there’s no harder task in all of IT than building a PKI, and the inescapable reality is that DNSSEC is a new identity infrastructure on the order of X.509. It does solve the problems though, at least for the authoritative servers that opt into it, and the side benefits of having the system fixed in this particular way are rather compelling.
2) Layered Point Fixes. This is the approach Nominum is taking: Basically, they’re bundling every point fix they can, and actively getting themselves into the position with their customers that as new bypasses are discovered, they can react quickly. For example, when Nominum receives a packet with an incorrect TXID, they switch to TCP for that particular query. This constrains an attacker in two ways: First, they must force as many lookups as there are fake responses. In other words, instead of being able to send 99.8% fake responses for each forced request, the attacker must send 50% requests, 50% responses. Second, the attacker is constrained to the query rate that Nominum will actually send queries to a particular domain.
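That per-query TCP fallback can be sketched roughly like this. All the names here are illustrative — this is the shape of the technique, not Nominum's actual code:

```python
import secrets
from types import SimpleNamespace

def resolve(query, send_udp, send_tcp):
    """Sketch: retry over TCP the moment a UDP reply arrives
    with a TXID that doesn't match the one we actually sent."""
    txid = secrets.randbelow(65536)   # 16-bit DNS transaction ID
    reply = send_udp(query, txid)
    if reply.txid != txid:
        # Someone is guessing TXIDs at us. TCP's three-way handshake
        # makes the attacker echo our real sequence numbers, which a
        # blind spoofer can't do -- so this query is now safe.
        return send_tcp(query)
    return reply
```

The constraint this imposes is exactly the one described above: every batch of forged responses burns one forced lookup, and once the mismatch is spotted, that query is out of the attacker's reach entirely.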
That alone, is not enough. A slightly less efficient attack does not a fix make. And so they port randomize. But that too, is not enough — at least not for the long term. And so they’re systematically building filters that attempt to detect as many weird variants as possible and attempt to address them on an attack-by-attack basis.
It’s certainly my preference to have a comprehensive fix. But, pragmatically, I can’t deny that Nominum’s approach is yielding an increasingly harder target.
3) Attack Mode. I’ll admit, this one appeals to me — that’s a change, I used to be a pretty staunch opponent, as I expect many people to still be. But bear with me for a second. Probably the most consistent signal of a blind cache poisoning attack is a spike in the number of responses received per second with an incorrect TXID (and, if you’re monitoring the network, incorrect destination port). Even with a fully non-responsive upstream name server, this signal still survives, as the attacker needs to guess transaction IDs and ports and is going to guess wrong for a very long time. This appears to hold true for all variants, known and even suspected. Now, the concept of the SPR interim defense is that the brute force will either go too slow to be relevant for an attacker, or fast enough that the raw traffic levels will be noticed by even trivial network monitoring.
We can do better monitoring of DNS traffic with an IDS rather than just a traffic monitor, but you know who’s in a really good position to notice this attack? The name server itself. There’s no reason, inside the name server, that we can’t adapt to the attack — and change our posture to compensate.
Imagine for a moment that we monitored the absolute number of packets received with at least the wrong TXID. (Depending on how we manage sockets, we might not see all the packets with the wrong source port. We may not need to, or if we do, we can do so fairly trivially with libpcap filtering for source port 53.) Assuming we were indeed receiving too many packets with the wrong transaction ID, we could deem ourselves…under attack. What now?
I’ll tell you what we probably shouldn’t do: Rate limit, either for all IP addresses, or for those that are specifically being spoofed. (Remember, DNS servers enforce source address on incoming packets so they can correctly calculate bailiwicks — whether a particular server is allowed to speak for a given name in the first place.) The problem with rate limiting is that, while it works very well to slow an attacker down, it also provides an attacker with a very consistent way to implement targeted denial of service attacks against DNS infrastructure. Just flood bad replies, and the real reply will consistently get dropped.
A lot of security people are willing to tolerate DoS, in lieu of data corruption. On one level, yes, it’s true, I’d rather have no service than corrupted service. On the other, no service is in and of itself bad for business. A trivial DoS that takes out Google for an ISP is more than just a problem — it’s a deployment blocker.
Again. If nobody deploys your fix, it’s like you didn’t even write it.
That being said, DNS is a cruel mistress. Due to the chained nature of DNS, reliable DoS attacks actually enable data corruption, by allowing an attacker to break the chain. This has already been shown to cause headaches when an IPS blocks traffic to an authoritative server (mentioned earlier, and described in depth in my 2005 Black Ops talk). But there are also implications to DNS clients, who will themselves now end up with nothing in their cache because a rate limited server couldn’t collect the data in the first place.
So, we shouldn’t drop traffic. What can we do? Perhaps, switch to TCP during the attack? We know Nominum does this, at least on a per-query basis, when it detects an attack for that particular query. So there’s some precedent. But the resistance and nervousness around anything that allows you to force large numbers of servers to switch to TCP, for any reason, is significant. It’s also impossible to ignore that a decent portion of recursive name servers cannot get 53/tcp out of their network, and that there are even a good number of authoritative name servers that refuse to host their DNS records over TCP.
There’s much less fear around debouncing — at least, well scoped debouncing. This is just the technical way of saying, if you’re not sure about something, look it up twice. You do need to make sure you get the same answer back both times — or else an attacker just forces you to debounce, and hopes he gets his contrary answer in both times. And there remain interesting questions about what to do when the answers legitimately differ, because they come from a CDN that shuffles responses on a per-response basis for load balancing. What now? I’d like to avoid TCP, and triple and quadruple querying is only a little more likely to generate multiple queries with the same reply. One option is to make use of a trick from a neat new nameserver Paul Vixie showed me — I can’t find it right now, but I’ll put a link up once I do. The idea was to wait around a few hundred milliseconds, to see whether a real server would show up with another, conflicting reply. If one does, there’s an attack. Now, that nameserver did this all the time, so it was killing performance on DNS for all users of the protocol (again, deployment blocker). But we’d only be doing this in attack mode.
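The core of debouncing fits in a few lines. A sketch, with the caveat noted above baked in — if the two answers disagree (whether from an attacker or a shuffling CDN), we don't accept either, and fall back to something slower like the wait-for-a-duplicate-reply trick:

```python
def debounce(query, lookup):
    """Sketch of scoped debouncing: look the name up twice and only
    accept the answer if both replies agree. A blind attacker must
    now win two independent TXID/port races back to back."""
    first = lookup(query)
    second = lookup(query)
    if first == second:
        return first
    # Disagreement: either an attack, or a CDN shuffling answers.
    # Don't cache anything; escalate to a slower verification path.
    return None
```

The nice property is that the attacker's odds get squared: a 1-in-N guess becomes 1-in-N² per forced lookup, without dropping a single legitimate packet.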
Yes, I think Akamai would accept slightly slower DNS resolution during an active attack against their particular names, on the particular name server that’s being attacked.
There is one funny variant we’d need to handle, if we were to depend on the real name server exposing the fake reply. What if the real name server is non-responsive, for whatever reason? I think the answer here is to handle situations where no answer comes back, by then and only then refusing to accept any packets from that IP address for ten seconds. In other words, if a query fails, and nobody replies successfully, blackhole that server for ten seconds. Legitimate servers have an easy way around this DoS — actually respond to that first query — so I think it’s the one DoS I can accept.
One matter that hadn’t really come up was scope. There are three scopes we can defend against: Per-query, per-NS, and global. In other words, we can apply attack mode logic, whatever it may be, to one specific query, all queries to a name server that we see under attack, or all queries in the world. My suspicion is that unless we actively detect attacks against just an absurd number of name servers (in other words, if the absolute number of incorrect TXIDs is not accounted for by any particular NS, thus meaning an attacker who doesn’t care which names he poisons as long as he gets someone), then per-NS scope is good.
I don’t like per-query, due to variants that it’s just not going to cover. There’s some controversy here too, though: “query-fate-sharing” scares people a little.
So, in summary, all this ends up collapsing to some variant of:
Monitor the absolute rate of packets received with the wrong TXID, and possibly Port. (BIND already does this — check the stats code.)
If the rate of packets exceeds some threshold — possibly dynamically set by the number of outstanding queries per second — start tracking which IPs are “sending” packets with the wrong TXID/Port.
If there are too many NSes to track, go into global attack mode. Otherwise, go into per-NS attack mode for those NSes, for ten seconds. Hold this attack mode open as long as the spawning incorrect TXID/Port behavior continues, plus twenty seconds. (This prevents twiddling attack mode on and off really fast, which defeats the purpose.)
During attack mode, debounce within the scope of that attack mode. If two answers are received that disagree, issue a single query, and make sure one and only one reply comes back. If no replies come back, suppress queries to that address for some small number of seconds.
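The detection half of the summary above might collapse into something like this. The thresholds and names are invented for illustration, and a real implementation would decay the counters every second rather than let them grow forever:

```python
import time
from collections import defaultdict

WRONG_TXID_THRESHOLD = 100   # bad packets before we call it an attack (invented)
HOLD_OPEN = 20.0             # seconds to hold attack mode after noise stops
MAX_TRACKED_NS = 1000        # beyond this, assume a scattershot attacker

class AttackMonitor:
    """Sketch of per-NS vs. global attack-mode tracking."""
    def __init__(self):
        self.wrong_txid = defaultdict(int)  # NS address -> bad-packet count
        self.attack_until = {}              # NS address -> attack-mode expiry
        self.global_until = 0.0             # global attack-mode expiry

    def saw_wrong_txid(self, ns_addr):
        now = time.monotonic()
        self.wrong_txid[ns_addr] += 1
        if self.wrong_txid[ns_addr] >= WRONG_TXID_THRESHOLD:
            # Extend the window while the bad packets keep coming,
            # so attack mode can't be twiddled on and off rapidly.
            self.attack_until[ns_addr] = now + HOLD_OPEN
        if len(self.wrong_txid) > MAX_TRACKED_NS:
            # No single NS accounts for the noise: attacker doesn't
            # care which names he poisons, so go global.
            self.global_until = now + HOLD_OPEN

    def in_attack_mode(self, ns_addr):
        now = time.monotonic()
        return now < self.global_until or now < self.attack_until.get(ns_addr, 0.0)
```

The resolver would then consult `in_attack_mode(ns)` before each outbound query, and debounce only within that scope — everyone else keeps full-speed DNS.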
The actual thresholds and constants would need to be figured out, but that’s roughly something I’m liking right now. Sure, it looks complicated, but amusingly it’s still the simplest of the solutions listed thus far!
4) Case Sensitive DNS Responses (or ‘0x20’). This is David Dagon’s concept, and it’s interesting. The concept is that DNS ignores case (www.foo.com matches wWw.FOO.coM) but preserves case (if you ask for wWw.fOO.coM, you’ll get back wWw.fOO.coM). So if we want more bits of randomness — if we want to get past 4 billion packets into more-packets-than-have-ever-been-sent-in-history — maybe we can use this trait. As mentioned earlier, the problem with 0x20 is that an attacker can select names that don’t have enough case-sensitive characters to add entropy. Specifically, you can have numbers in a DNS name! And so, when an attacker forces lookups for names made up almost entirely of digits,
0x20 can only provide one additional bit of entropy — and it’s not clear that the one ‘a’ is even required (it’s there to deal with the complaint ‘well, we’ll just detect completely numeric domains’). And since all such names have to be queried against the root servers, whoever corrupts those names gets to include whatever extra records he wants, because they’re all in bailiwick. This is the exact problem that DNSSEC has — securing www.foo.com doesn’t just require securing foo.com, you also have to secure com and the roots themselves. (XQID thought they got around this. So close, but no. I’ll post why later — this post is about fixes.)
Bottom line, 0x20 can’t secure the roots when there aren’t enough characters to add sufficient entropy.
That being said, almost all real world names do have enough characters in them to add lots of entropy. In fact, of all the non-DNSSEC solutions, 0x20 is the one that can not only work for the common case, but survive without source port randomization. (The attack mode above just doesn’t work well enough when the attacker has a 1/65K chance of winning.) It does need some coverage in those synthetic cases where there’s not enough entropy, or even in the real world cases of very short domains (ibm.com, for example).
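The 0x20 trick itself is easy to sketch: randomize the case of each letter on the way out, and demand the exact same casing on the way back. The entropy gained is exactly one bit per alphabetic character, which is also precisely where the short-name weakness comes from (function names below are mine):

```python
import secrets

def encode_0x20(name: str) -> str:
    """Randomize the case of every letter in an outgoing query name.
    Each letter contributes one extra bit of entropy beyond the TXID."""
    return "".join(
        c.upper() if c.isalpha() and secrets.randbits(1) else c.lower()
        for c in name
    )

def matches_0x20(sent: str, received: str) -> bool:
    """A legitimate reply echoes our exact casing, not just the name."""
    return sent == received

def extra_entropy_bits(name: str) -> int:
    """How much 0x20 actually buys for a given name."""
    return sum(1 for c in name if c.isalpha())

print(extra_entropy_bits("www.example.com"))  # 13 extra bits
print(extra_entropy_bits("123.com"))          # only 3 -- digit-heavy names are weak
```

Thirteen extra bits on top of 16 bits of TXID and ~16 bits of source port is what pushes the attack past four billion packets into absurdity; three extra bits barely moves the needle.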
Well, we have an entire debouncing framework described for Attack Mode. Could we debounce when we don’t get enough entropy from the name? Or perhaps only when we detect 0x20 is under attack, or when it’s deployed on a network that, on either the authoritative or recursive side, canonicalizes away the case variation?
I’m not sure what the exact fix looks like. But what’s clear to me is now is what I was pretty sure of back in March: The real fix, the comprehensive fix, is not going to be trivial. It may be DNSSEC, it may not be, but it’s not going to be a one-character call-it-a-day point fix. Say what you will about Source Port Randomization — conceptually, it’s several orders of magnitude cleaner than everything that’s yielding fruit now. Dan Bernstein’s solution is good. Doing better — by crypto, by filtering, by defending ourselves, or by another entropy source — will be hard.
Not impossible, but not the sort of thing 16 engineers in a room could pragmatically hope to accomplish.