Details

I thought it would be helpful, given things going on out there, to write up a guide to the attack that people could provide to their management. A lot of people are going to have to violate procedure and work extra hours. Maybe this will get a few pizzas approved 🙂

==

DNS is a system for, among other things, finding out what number to use when “calling” somebody on the Internet. Since there’s lots of people, in lots of places, there can’t just be one directory. Often, when you ask one server for a number, it tells you to go somewhere else. And when you go there, you might be sent to a third destination. This process — “recursion” — is repeated over and over, until you finally have the number for that name.

Of course, on the Internet, you aren’t really going anywhere. What’s actually happening is that you’re sending messages out, and receiving replies back. What prevents a bad guy from providing his own replies, with his own fake numbers for whatever you were looking for?

Not much — but not nothing.

DNS can be thought of as a race: A request is sent. A good guy and a bad guy both want their replies to be trusted. The good guy has an advantage: He sees the request, and inside of it he can find a secret number, somewhere between zero and sixty-five thousand. The race is not won until someone crosses the finish line with the secret number, and while the bad guy could guess the number, he has only a 1/65,536 chance of guessing correctly. Worse, the winner of the race gets to say how long it will be until the next race! The numbers can work out that it would take months, even years for the bad guy to finally win a race.

However, there are three problems. The first two were somewhat known. The third is very new.

First, the bad guy holds the starter pistol. He decides when the request goes out — meaning, he may not know *what* the secret number is, but he actually knows the race has started before the good guy does.

Second, the bad guy is not alone. He can have as many “runners” in the race as he likes; the race is only over when someone arrives with the correct secret number. The bad guy can try wrong number after wrong number, and until the good guy shows up with the right one, he can keep trying again and again. If he can squeeze a hundred numbers in, his odds improve from one in sixty-five thousand to about one in six hundred fifty-five.
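For anyone who wants to check that arithmetic, here is a rough sketch in Python. It assumes the secret number is a uniformly random 16-bit value and that every forged reply carries a different guess. The name `single_race_odds` is made up for this illustration.

```python
from fractions import Fraction

ID_SPACE = 2 ** 16  # 65,536 possible secret numbers, 0 through 65,535


def single_race_odds(runners: int) -> Fraction:
    """Chance that one of `runners` distinct guesses matches the secret number."""
    if runners < 0:
        raise ValueError("runners must be non-negative")
    # Guessing more distinct values than exist cannot push the odds past 1.
    return Fraction(min(runners, ID_SPACE), ID_SPACE)


for runners in (1, 100):
    odds = single_race_odds(runners)
    print(f"{runners} runner(s): about 1 in {float(1 / odds):.0f}")
```

This only models one race. It says nothing about how many forged replies actually fit in before the real one arrives, which depends on the network.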

But those are still long odds, and if he loses, he might have to wait a day to try again.

Or he might not.

What’s new is that the bad guy doesn’t actually have to wait to start another race. DNS is actually more of a relay race than a sprint. Remember, you send a request to a server, and you might get a reply that says “www.foobar.com? Sure, here’s the IP address to use.” Or, you might get a message that says, “www.foobar.com? I don’t know, ask ns1.foobar.com, here’s its address.” That’s recursion. It’s not a bug, or a rarely used feature. DNS is always sending you to different servers to find a record — this is how the servers that run .com work.

Now, there is a limit: not just any other name will work. Otherwise, I could return to you “www.foobar.com? Oh, that’s hosted at www.google.com, and here’s its address”, and you’d believe me. (Eleven years ago, that actually worked.) But names near www.foobar.com, such as 1.foobar.com, 2.foobar.com, and 3.foobar.com, are referred to as “in-bailiwick”. A referral to a name in-bailiwick must be trusted.
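The in-bailiwick idea can be sketched as a label-by-label suffix comparison. This is a simplified illustration in Python, not the rule set any particular name server uses. Real resolvers apply more involved trust rules, and the function name `in_bailiwick` is my own.

```python
def in_bailiwick(name: str, zone: str) -> bool:
    """True if `name` equals `zone` or sits underneath it (case-insensitive)."""
    name_labels = name.lower().rstrip(".").split(".")
    zone_labels = zone.lower().rstrip(".").split(".")
    if zone_labels == [""]:
        # The root zone contains every name.
        return True
    # Compare whole labels so "notfoobar.com" is not mistaken for "foobar.com".
    return name_labels[-len(zone_labels):] == zone_labels


print(in_bailiwick("1.foobar.com", "foobar.com"))
print(in_bailiwick("www.google.com", "foobar.com"))
```

Comparing whole labels rather than raw string suffixes is the important design choice here. A plain `endswith("foobar.com")` would wrongly accept `notfoobar.com`.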

And so, the attack. If someone’s trying to attack www.foobar.com, he doesn’t pull out the starter pistol for that particular name. After all, the server might not be willing to go out looking for www.foobar.com for hours. No, he declares races for 1.foobar.com, 2.foobar.com, 3.foobar.com, and so on.

The bad guy will probably lose these races. The odds, even with a hundred-to-one advantage in the number of “runners”, are against him.

But he can run as many races as he wants. And eventually, he’ll win one of them. And when he does win, when the bad guy guesses the secret number from 0 to 65535, he won’t just provide an answer for the random name that won. He’ll simply feign ignorance: “83.foobar.com? I don’t know, ask www.foobar.com, here’s its address. Oh, and remember this for the next week.”

He won the race. He gets his say.
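How quickly does “eventually” arrive? A back-of-the-envelope sketch in Python, assuming every race is independent, uses a fresh uniformly random 16-bit secret number, and lets the bad guy land 100 distinct guesses. Those assumptions and the function names are mine, for illustration only.

```python
import math

ID_SPACE = 2 ** 16  # 65,536 possible secret numbers


def win_probability(races: int, runners: int = 100) -> float:
    """Chance of winning at least one of `races` independent races."""
    p = min(runners, ID_SPACE) / ID_SPACE
    return 1.0 - (1.0 - p) ** races


def races_needed(target: float, runners: int = 100) -> int:
    """Smallest number of races giving at least `target` chance of one win."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    p = min(runners, ID_SPACE) / ID_SPACE
    if p >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


for target in (0.5, 0.9, 0.99):
    print(f"{target:.0%} chance of a win after {races_needed(target)} races")
```

Under these assumptions, a few hundred to a few thousand races are enough, and nothing forces the bad guy to wait between them.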

Now, there have been some problematic DNS attacks in the past. Amit Klein was able to guess the secret number the good guy would return with. Joe Stewart was able to cause many secret numbers to be accepted. But neither of the attacks could override a race that had already been won. Once a name server is storing — caching — the number for a given name, it simply won’t run another race for that name. Why should it? It knows the number!

Joe’s attack needs another race for www.foobar.com. Amit’s attack needs another race for www.foobar.com.

In my attack, we never race for www.foobar.com. We race for another name entirely. It’s a problem. It required a lot of work to address.
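To make the caching point concrete, here is a toy model in Python of a name server that only “runs a race” for names it has not cached. It is a sketch under my own assumptions, not how any real resolver is written. `ToyCache`, its `lookup` callback, and the 192.0.2.1 documentation address are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class ToyCache:
    """Toy caching name server: cached names never trigger a new race."""

    def __init__(self, lookup: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup  # stands in for asking upstream (a "race")
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.races_run = 0

    def resolve(self, name: str) -> str:
        entry = self._entries.get(name)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # still cached: no race, nothing to win
        self.races_run += 1
        address, ttl = self._lookup(name)
        # Whoever answered also chose how long the answer is remembered.
        self._entries[name] = (address, self._clock() + ttl)
        return address


cache = ToyCache(lambda name: ("192.0.2.1", 86400.0))
cache.resolve("www.foobar.com")  # first request: a race is run
cache.resolve("www.foobar.com")  # answered from cache, no race
cache.resolve("1.foobar.com")    # never-seen name: a new race
print(cache.races_run)
```

The second request for www.foobar.com is answered from memory, while a never-before-seen name like 1.foobar.com always starts a fresh race.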

===

Incidentally, some people wanted more details on the numbers. Here’s what I can say:

1) Sweeping the net’s open recursive name servers — yeah, that ain’t great. But if nobody’s using ’em, nobody’s vulnerable. And if it’s an open recursive name server on the Internet, there’s a good chance nobody’s managed it for several years. I’m working on load measurement hacks for these.

2) Lots and lots of important places haven’t patched.

3) I still haven’t gotten my testing script to correctly handle iptables and pf randomization. This is getting worked on — damn you creative people and your tricks! 🙂

4) From July 8th to July 9th, 4242 of 5000 tests actively run by users behind unique name servers showed that server to be vulnerable. That’s about 85%. Today, July 25th, the last 5000 tests (about the last six hours) from unique name servers show only 2503 of 5000 vulnerable, just above 50%. Now, I’m not going to deny it. There’s selection bias. It’s a limited sample. There are tons and tons of unpatched ISPs. This is all true.

You know what? A lot of people did a lot of work to make that number drop. More needs to be done, but 13 days made a difference, and it’s awesome to see it.
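For the record, the percentages above work out as follows. This is plain arithmetic in Python on the two samples quoted in item 4, and the helper name `vulnerable_share` is just for illustration.

```python
def vulnerable_share(vulnerable: int, tested: int) -> float:
    """Percentage of tested name servers that showed up as vulnerable."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * vulnerable / tested


early = vulnerable_share(4242, 5000)  # tests from July 8th to July 9th
later = vulnerable_share(2503, 5000)  # last 5000 tests on July 25th
print(f"July 8-9: {early:.2f}% vulnerable")
print(f"July 25:  {later:.2f}% vulnerable")
print(f"Drop:     {early - later:.2f} percentage points")
```

The same caveats apply as in the text: both samples are self-selected, so the drop is indicative rather than a measurement of the whole Internet.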

Categories: Security
  1. July 24, 2008 at 9:32 pm

    Thanks for the checking tool!

  2. Andrew
    July 24, 2008 at 10:36 pm

    Thanks for the pizza!

  3. July 24, 2008 at 10:50 pm

    Is there anything I can do to patch this on the client-side? That would be grand (if possible).

  4. July 24, 2008 at 11:13 pm

    Sure, the use of a random source port reduces vulnerability for this problem. But why does the resolver not drop the request after the first answer with a false sequence number? Are there valid situations where a resolver gets an answer from the right IP address for the right domain with a wrong sequence number that justify further waiting instead of returning NXDOMAIN?

  5. otmar
    July 24, 2008 at 11:20 pm

    Chel, if you do that, you open yourself up to the mother of all denial of service attacks.

    Regarding patches being applied: have a look at http://www.cert.at/static/cert.at-0802-DNS-patchanalysis.pdf

  6. Jesse Cantu
    July 24, 2008 at 11:27 pm

    @Dan

    I respect your work very much, although I am not so sure about the disclosure method you chose, but surely you’ve heard that a thousand times by now.

    I was reading Jon Longoria’s article on the drama recently erupting around this dns finding at http://thereformed.org/2008/07/23/unintentional-betrayal-or-faux-ignorance/ and it raised a question for me.

    Did you ever consider using the US Code to your advantage in regards to the commitment made to you by Thomas Ptacek of Matasano Chargen not to reveal the details of your finding? I reference the scenario as described in Jon Longoria’s article at http://thereformed.org/2007/10/25/us-code-gives-twenty-for-free/ which was linked in the aforementioned article. Would that be/have been a waste of your time and resources considering the potential damage it might have caused to your partners or blind marketing/promotion campaign with the vulnerability?

  7. JMer
    July 24, 2008 at 11:50 pm

    Why does the tool say that the server is vulnerable if you run it a second time, while the first time it said the server appears to be non-vulnerable?

    Thanks.

  8. tom
    July 25, 2008 at 12:05 am

    Hi Chel:

    “But why does the resolver not drop the request after the first answer with a false sequence number? Are there valid situations where a resolver gets an answer from the right IP address for the right domain with a wrong sequence number that justify further waiting instead of returning NXDOMAIN?”

    Because if you do that, and I send your name server bad responses a few times a second “from” google’s authoritative name servers you’ll never see anything from google again. My bad response will always get to you before the real response and then you’ll drop the request, ignoring the real response.

    If source port randomization is running then it’s more feasible, but in general it’s a good way to leave yourself open to a DOS.

  9. July 25, 2008 at 1:20 am

    This is a great article, however I am having difficulty finding information about the patch. I downloaded a file from Microsoft for my local DNS server. It appeared to install in just a few seconds and offered no further instructions. The Check My DNS button on this website is saying my server is still vulnerable. If that is correct, and if this is any indication of the end-user diagnostic experience, then I foresee a lot of confusion on the horizon.

    Robert Chapin
    Chapin Information Services

  10. July 25, 2008 at 2:24 am

    I just wanted to say thanks for finding the exploit and then for getting people to do something about it. I know that there’s been a lot of whining, but hopefully you also know that there are many more grateful people.

    Thanks also for the checking tool.

    Jeff

  11. July 25, 2008 at 2:26 am

    Many thanks for all info you provide!

  12. Smee Jenkins
    July 25, 2008 at 3:01 am

    You should also mention the recommended fix: the random source port. You can explain this by saying that it lowers the chances of the bad guy winning the race from one chance in 65536 to roughly one chance in 4 billion, if the full 16-bit port range is used.

  13. July 25, 2008 at 3:32 am

    Very interesting article…very well explained and easy to visualize.

  14. Roy Arends
    July 25, 2008 at 3:51 am

    Three more observations:

    1) The prize is not just www.foobar.com, but foobar.com. With this attack it is possible to re-route all traffic for the foobar.com domain to an attacker’s nameserver. If that works for foobar.com, it works for com as well.

    2) If the race is won, indeed the attacker sets the new TTL. To make sure the race will not run again, the attacker requests information once per TTL, which causes a lookup to the hacker’s DNS servers (see the previous point), which resets the TTL again. Win one race, win forever.

    3) I imagine for very popular domains, hackers will not just race the real authoritative server, but also each other.

    As for Chel’s comment: For the attack it doesn’t matter if the resolver drops the request upon a false reply. There will be a new race a second later. The solution for the resolver is to add more entropy to its request.

  15. Martin Gerner
    July 25, 2008 at 4:06 am

    Great work! It’s just sad that it got out early – but at least those 13 days were enough for many (if not all).

  16. July 25, 2008 at 4:08 am

    Well, just like banks that lock accounts for too many invalid password attempts, that would then introduce a denial of service vulnerability. You could spoof a packet (or several) from a legitimate name server’s IP address and make the resolver not trust that name server any more.

  17. Felix
    July 25, 2008 at 4:13 am

    @Chel: That would make the resolver vulnerable to DoS attacks. You could just spam it with answers to foo.com and no one would be able to access foo.com because the resolver is returning NXDOMAIN.

    Right?

    Felix

  18. July 25, 2008 at 5:15 am

    I don’t know why you are still trying to hold out details. The cat is out of the bag. Forget BlackHat just post your details.

  19. Ray
    July 25, 2008 at 5:54 am

    It’s confusing. I think matasano’s explanation is better.

  20. July 25, 2008 at 6:22 am

    Thanks for the stats. Hopefully you’ll be able to keep us updated as the number of unpatched servers continues to fall.

    At what point should we consider naming and shaming ISPs that are failing to update?

    Also, am I correct in thinking that just because you’ve patched your clients and servers, that doesn’t make you safe? If your (or your ISP’s) DNS server doesn’t have the answer and sends the request on to an unpatched (hacked) server, then presumably you could still be returned a false reply (and worse, your caching server would also cache that false reply)?

  21. July 25, 2008 at 7:51 am

    That’s a useful description for people (like me) who know the basics. But for explaining the DNS basics to non-technicals, I try this:

    1. The DNS is like the “white pages” of the Internet. Your computer uses the DNS to look up the phone numbers of computers from their names. There are 4 billion computer phone numbers. Every computer has a “phone number” called an IP address.

    2. The DNS (white pages) is distributed all over the Internet, a page here, a page there, and there’s a system for finding the right page for each name you want to look up. Your browser (etc.) uses this look-up system to recursively locate the right “page” of the DNS.

    3. When the right page of the DNS is found, it is cached in a computer somewhere so that you can look it up more quickly next time. Typically that cache will be managed by your ISP or by your IT staff. It’s “in the middle”, between your computer and the computer you are trying to reach.

    Of course, then cache poisoning is fairly easy to explain in those terms. Just a thought….

  22. James Van Artsdalen
    July 26, 2008 at 10:08 pm

    […]Or, you might get a message that says, “www.foobar.com? I don’t know, ask ns1.foobar.com, here’s its address.”[…]

    It seems one could ignore the FQDN ns1.foobar.com and just use the address given, resulting in a single poisoned FQDN (the one being looked up, not some arbitrary FQDN of the attacker’s choice). That’s not a fix but it’s a further improvement yet.

    In other words, don’t cache the nameserver FQDNs unless there actually is a downstream query on a nameserver FQDN.

  23. kme
    July 28, 2008 at 2:12 am

    The fix could be improved by making the cache records purpose-specific.

    That is, instead of the nameserver just adding “www.foobar.com A 6.6.6.6” to its cache, it adds “[to be used for connecting to the name server for 86.foobar.com only] www.foobar.com A 6.6.6.6” to its cache.

    Yes, this would result in some additional query traffic.

  24. Link
    July 28, 2008 at 12:04 pm

    Correction for you.

    “This process — “recursion” — is repeated over and over, until you finally have the number for that name.”

    According to RFC 1034, this process is called “Iteration”. Recursion is when the DNS server makes a query to another DNS server on behalf of the client.

  25. me
    July 28, 2008 at 1:34 pm

    So if I run this from inside the don’t-want-to-identify-my-company’s networked system, will the test correctly identify a problem?

    It says:

    Your name server, at nn.nnn.nnn.nnn, appears vulnerable to DNS Cache Poisoning.
    All requests came from the following source port: nnnn

    I’d like to think this is a protected internal machine they don’t need to fix, and there’s another DNS machine it talks to outside.

    But I’d like to know more than they’re telling me.

  26. kme
    July 29, 2008 at 1:16 am

    me: Yes, the test will correctly identify a problem.

    If it says that all the queries are coming from one point, then the last server in the chain of DNS servers that your machine is using is vulnerable – so _you_ are vulnerable.

  27. Ian B
    July 30, 2008 at 3:41 am

    I think this tool is of questionable value. I have tried testing from work and from home, and get results that I am using name servers that are in no way related to the ones I have entered in my system configurations. I have checked with my ISP and with our DNS server admins at work, they have patched their systems. This tells me that the tool is somehow producing bad results.

  28. andar909
    August 10, 2008 at 3:26 pm

    hi, andar here, i just read your post. i like very much. agree to you, sir.
