An Astonishing Collaboration
Wow. It’s out. It’s finally, finally out.
So there’s a bug in DNS, the name-to-address mapping system at the core of most Internet services. DNS goes bad, every website goes bad, and every email goes…somewhere. Not where it was supposed to. You may have heard about this — the Wall Street Journal, the BBC, and some particularly important people are reporting on what’s been going on. Specifically:
1) It’s a bug in many platforms
2) It’s the exact same bug in many platforms (design bugs, they are a pain)
3) After an enormous and secret effort, we’ve got fixes for all major platforms, all out on the same day.
4) This has not happened before. Everything is genuinely under control.
I’m pretty proud of what we accomplished here. We got Windows. We got Cisco IOS. We got Nominum. We got BIND 9, and when we couldn’t get BIND 8, we got Yahoo, the biggest BIND 8 deployment we knew of, to publicly commit to abandoning it entirely.
It was a good day.
CERT has details up, and there’s a full-on interview between myself and Rich Mogull up on Securosis. For the non-geeks in the audience, you might want to tune out here, but this is my personal blog and I do have some stuff to mention to the crew.
There’s something very important about what we accomplished here.
We. Because there’s absolutely no way I could have pulled this off by myself.
Paul Vixie is an institution. Having long maintained the Internet’s most popular DNS server, Paul simply knows everybody. Paul was absolutely instrumental in pulling together the engineers we needed to solve this problem. We needed Florian Weimer there, all the way from Europe. We needed David Dagon, and Jinmei Tatuya, and Wouter Wijngaards. We needed Microsoft, Cisco, Nominum, Neustar, and OpenDNS.
And we really needed CERT.
It was an interesting discussion, with lots of disagreement, but ever-growing consensus. After evaluating several options, one approach was clear — and, I must admit, somewhat embarassing to Paul.
DJB was right. All those years ago, Dan J. Bernstein was right: Source Port Randomization should be standard on every name server in production use.
There is a fantastic quote that guides a lot of the work I do: Luck is the residue of design. Dan Bernstein is a notably lucky programmer, and that’s no accident. The professor lives and breathes systems engineering in a way that my hackish code aspires to one day experience. DJB got “lucky” here — he ended up defending himself against an attack he almost certainly never encountered.
Such is the mark of excellent design. Excellent design protects you against things you don’t have any information about. And so we are deploying this excellent design to provide no information.
To translate the fix strategy into a more familiar domain, imagine large chunks of Windows RPC went from Anonymous to Authenticated User only, or even all the way to Admin Only. Or wait, just remember Windows XPSP2🙂 This is a sledgehammer, by design. It cuts off attack surface, without necessarily saying why. Astonishingly subtle bugs can be easily hidden, or even rendered irrelevant, by a suitably blunt fix.
Of course, it remains obvious that something new is up, and that something will be found eventually. But there’s a lot of buggy systems out there, vulnerable not just to new bugs but bugs that have been known for years. If all this effort ever accomplished was sweeping old and crusty BIND8 off the Internet, if we could finally fully eliminate Joe Stewart’s (edit: Originally Vagner Sacramento’s, thanks Joe!) Birthday Attacks from 2002, if we started doing something about Amit Klein’s Transaction ID Randomness finds (even the deeply underrated client vulns) from last year, and yes, if the static port assignments DJB warned us about ages ago were finally shut down — then this would still be a huge win.
There are reasons why the new issue is particularly severe, but I think reasonable people can agree that anything that could scrub even the old bugs would be a boon to the security of the Internet. And so, I ask the open research community…assume I found nothing! Assume this is nothing but a stunt, to finally get people to take Joe and Amit and DJB seriously, and to give network scanners a crystal clear fingerprint of what a trustable recursive server looks like.
Joe and Amit and especially DJB have done some incredible work. I’d look terrible at the end of it, but their bugs would finally get fixed, and stay fixed. As for me, I dunno. Go back to graphics🙂 Mmmm…SIGGRAPH…
For those of you who won’t make that assumption, I have a request. It is very unusual, and maybe unreasonable. But I have to ask.
I want you to explore DNS. I want you to try to build off the same bugs I did to figure out what could possibly go wrong. Maybe I missed something — I want you to help me find out if I did, so we can deal with it now instead of later. I do want all this. But I also want my family to be able to use the Internet in peace. I’m not asking for forever. I am asking about thirty days. I’ve done everything in my power to get the patches available, no matter the platform. But the code doesn’t (always) install itself. While I’m out there, trying to get all these bugs scrubbed — old and new — please, keep the speculation off the @public forums and IRC channels. We’re a curious lot, and we want to know how things break. But the public needs at least a chance to deploy this fix, and from a blatantly selfish perspective, I’d kind of like my thunder not to be completely stolen in Vegas🙂
Now, if you do figure it out, and tell me privately, you’re coming on stage with me at Defcon. So I can at least offer that🙂