So some people want me to say a few more things about Heartbleed.
Meanwhile, this happened.
My immediate reaction, of course, was that there was no way IE shipped that early. Windows 95 didn’t even have TCP/IP, the core protocols of the Internet, enabled by default! Colbert had no idea what he was Tolkien about.
Nope. Turns out, the Colbert Report crew did their homework. But, wait. What? In what universe has yet another browser bug become fodder for late night?
Hacking has become a spectator sport. Really can’t say it’s surprising — how many people play sports every day? Now how many people look at a big glowing rectangle?
We make the glowing box do some very strange things.
I mentioned earlier that one of the things that made Heartbleed so painful is that it was a bug where we least expected one to be. That is not the situation with browsers. Despite genuinely herculean efforts, any security professional worth their salt completely expects web browser vulnerabilities to be found, and exploited, from time to time. The simple explanation is that web browsers expose a tremendous amount of attack surface to relatively anonymous attackers.
Let’s get a bit beyond the simple explanation.
There’s a reason we have very different codebases, but very similar bugs.
So why all the noise? Why now? In this particular case, this is the first bug after Microsoft’s genuinely unprecedented campaign to announce the end of XP support. The masses were marketed to, and basically told in no uncertain terms “It’s time to upgrade, the next bug is going to burn you if you’re still on XP.” Well, here’s the next bug, and there’s Microsoft keeping their word.
(Update: MS is issuing an XP patch after all. Actual attacks in the field generally do trump everything else. Really, the story of XP is tragic. Microsoft finally makes the first consumer OS that doesn’t crash when you look at it funny…and then it crashes when I look at it funny. Goalposts with rockets on em…)
This is also the first bug after Heartbleed, whose response somehow metastasized into the world being told to freak out and change all their passwords.
Neither of these events has anything to do with the quality of the browser itself, but they’re certainly driving the noise. Yes, IE’s got some somewhat unique issues. Back in the day, when Microsoft argued that they couldn’t remove IE from Windows because it was too integrated: yep, pretty much. Internet Explorer is basically Windows: The Remix feat. The Internet. Which makes sense, because at the end of the day browsers have long been the new operating systems. So of course, Microsoft would basically have lots of stuff for the browser lying about. None of that stuff was ever designed to be executed by untrusted parties, so when it ended up rigged up for remote scripting…sometimes just through the magic of COM…bad things happened.
Boom. And this is pretty common. As the great poet of our modern era, The Grugq, wrote recently:
At this point, you may be feeling rather sad about the state of security in general. The three major browsers — IE, Firefox, and Chrome — are some of the most well funded and deeply audited ongoing development efforts in the world. If they all fail, in similar ways even, what hope do we have?
There is hope on the horizon (and through this path, we finally get back to Heartbleed). While the browsers remain imperfect, that they work at all — let alone with ever increasing performance and usability — is nothing short of miraculous. There is literally nothing else where it’s conceivable that you’d just wander around, executing random chunks of code from random suppliers, and not get compromised instantaneously. (Sandboxes are things children walk into and out of with relative ease. I’ve always wondered why we called them that.) And there is motion towards making useful languages that are both memory safe and fast enough to do the heavy lifting browsers require — Rust being the prime example.
It’s not enough to be secure. Hardest lesson anyone in security can ever learn. Some never do.
There’s hope, but it comes at some price. We don’t know what solutions will actually work, and we shouldn’t assume any of them will ever reach 100% security. We absolutely should not assume we knew how to do security decades ago, and just “forgot” or got lazy or whatnot. For a while it seemed like the answer was obviously Java, or C#/.NET. Useful languages for many problem sets, sure. But Microsoft once tried to rewrite chunks of Windows in .NET. It did not go well. To this day, “Longhorn” will bring chills to old-school MS’ers (“Rosebud…”), and there is genuine excitement around the .NET to C++ compiler.
There are things that looked like they worked in the past, but it was a mirage. There are things that were a mirage in the past, but technology or other factors have changed. (Anyone remember DHTML? Great idea, but it took a few generations before client side interactivity really became a thing.)
Who knows. Maybe Google will someday make a sandbox that impresses even Pinkie Pie. And perhaps I have something up my sleeve…
But expecting no bugs is like expecting no crime, nobody to die in the ER, no cars to crash, no businesses to fail. It’s not just unreasonable. It’s also kind of awkward to see it become a spectator sport.
(Splitting the Heartbleed commentary to a second post.)
Oh, Information Disclosure vulnerabilities. Truly the Rodney Dangerfield of vulns, people never quite know what their impact is going to be. With Memory Corruption, we’ve basically accepted that a sufficiently skilled attacker always has enough degrees of freedom to at least unreliably achieve arbitrary code execution (and from there, by the way, to leak arbitrary information like private keys). With Information Disclosure, even the straight up finder of Heartbleed has his doubts:
So, can Heartbleed leak private keys in the real world or not? The best way to resolve this discussion is PoC||GTFO (Proof of Concept, or you can figure out the rest). CloudFlare threw up a challenge page to steal their key. It would appear Fedor Indutny and Ilkka Mattila have successfully done just that. Now, what’s kind of neat is that because this is a crypto challenge, Fedor can actually prove he pulled this off, in a way everyone else can verify but nobody else can duplicate.
I’m not sure if key theft has already occurred, but I think this fairly conclusively establishes key theft is coming. Let’s do a walkthrough:
First, we retrieve the certificate from http://www.cloudflarechallenge.com.
$ echo "" | openssl s_client -connect www.cloudflarechallenge.com:443 -showcerts | openssl x509 > cloudflare.pem
depth=4 C = SE, O = AddTrust AB, OU = AddTrust External TTP Network, CN = AddTrust External CA Root
verify error:num=19:self signed certificate in certificate chain
This gives us the certificate, with lots of metadata about the RSA key. We just want the key.
$ openssl x509 -pubkey -noout -in cloudflare.pem > cloudflare_pubkey.pem
Now, we take the message posted by Fedor, and Base64 decode it:
$ wget https://gist.githubusercontent.com/indutny/a11c2568533abcf8b9a1/raw/1d35c1670cb74262ee1cc0c1ae46285a91959b3f/1.bash
--2014-04-11 18:14:48-- https://gist.githubusercontent.com/indutny/a11c2568533abcf8b9a1/raw/1d35c1670cb74262ee1cc0c1ae46285a91959b3f/1.bash
Resolving gist.githubusercontent.com... 184.108.40.206
Connecting to gist.githubusercontent.com|220.127.116.11|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: `1.bash'
[ <=> ] 456 --.-K/s in 0s
2014-04-11 18:14:49 (990 KB/s) - `1.bash' saved
$ cat 1.bash
> echo "Proof I have your key. email@example.com" | openssl sha1 -sign key.pem -sha1 | openssl enc -base64
$ cat 1.bash | grep -v fedor | openssl enc -d -base64 > fedor_signed_proof.bin
So, what do we have? A message, a public key, and a signature linking that specific message to that specific public key. At least, we’re supposed to. Does OpenSSL agree?
$ echo "Proof I have your key. firstname.lastname@example.org" | openssl dgst -verify cloudflare_pubkey.pem -signature fedor_signed_proof.bin -sha1
Ah, but what if we tried this for some other RSA public key, like the one from Google?
$ echo "" | openssl s_client -connect www.google.com:443 -showcerts | openssl x509 > google.pem
depth=2 C = US, O = GeoTrust Inc., CN = GeoTrust Global CA
verify error:num=20:unable to get local issuer certificate
$ openssl x509 -pubkey -noout -in google.pem > google_pubkey.pem
$ echo "Proof I have your key. email@example.com" | openssl dgst -verify google_pubkey.pem -signature fedor_signed_proof.bin -sha1
Or what if I changed things around so it looked like it was me who cracked the code?
$ echo "Proof I have your key. firstname.lastname@example.org" | openssl dgst -verify cloudflare_pubkey.pem -signature fedor_signed_proof.bin -sha1
Nope, it’s Fedor or bust :) Note that it’s a common mistake, when testing cryptographic systems, to not test the failure modes. Why was “Verified OK” important? Because “Verification Failure” happened when our expectations weren’t met.
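The habit of testing the failure modes alongside the success case can be sketched outside OpenSSL as well. What follows is a toy, not what `openssl dgst` does internally: it uses textbook RSA with no padding, SHA-256 instead of SHA-1, and well-known Mersenne primes in place of real key material. The names `make_key`, `sign`, and `verify` are hypothetical.

```python
import hashlib

E = 65537  # common RSA public exponent


def make_key(p, q):
    """Build a toy RSA key pair from two primes (illustration only)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return {"n": n, "e": E}, {"n": n, "d": d}


def digest_as_int(message, n):
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, private):
    return pow(digest_as_int(message, private["n"]), private["d"], private["n"])


def verify(message, signature, public):
    recovered = pow(signature, public["e"], public["n"])
    return recovered == digest_as_int(message, public["n"])


# Assumption: Mersenne primes stand in for real, secret primes.
signer_pub, signer_priv = make_key(2**61 - 1, 2**89 - 1)
other_pub, _ = make_key(2**107 - 1, 2**127 - 1)

message = b"Proof I have your key."
signature = sign(message, signer_priv)

print(verify(message, signature, signer_pub))              # the success case
print(verify(message, signature, other_pub))               # wrong public key
print(verify(message + b" (me)", signature, signer_pub))   # altered message
```

Only the first check should pass. A verifier that also passed either of the other two would be telling you nothing.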
There is, of course, one mode I haven’t brought up. Fedor could have used some other exploit to retrieve the key. He claims not to have, and I believe him. But that this is a failure mode is also part of the point — there have been lots of bugs that affect something that ultimately grants access to SSL private keys. The world didn’t end then and it’s not ending now.
I am satisfied that the burden of proof for Heartbleed leaking private keys has been met, and I’m sufficiently convinced that “noisy but turnkey solutions” for key extraction will be in the field in the coming weeks (it’s only been ~3 days since Neel told everyone not to panic).
Been getting emails asking what the appropriate response to Heartbleed is. My advice is pretty exclusively for system administrators and CISOs, and is something akin to “Patch immediately, particularly the systems exposed to the outside world, and don’t just worry about HTTP. Find anything moving SSL, particularly your SSL VPNs, prioritizing on open inbound, any TCP port. Cycle your certs if you have them, you’re going to lose them, you may have already, we don’t know. But patch, even if there’s self signed certs, this is a generic Information Leakage in all sorts of apps. If there is no patch and probably won’t ever be, look at putting a TLS proxy in front of the endpoint. Pretty sure stunnel4 can do this for you.”
QUICK EDIT: One final note. The bug was written at the end of 2011, so devices that are older than that and have not been patched at all remain invulnerable (at least to Heartbleed). The guidance is really to identify systems that are exposing OpenSSL 1.0.1 SSL servers (and eventually clients) w/ heartbeat support, on top of any protocol, and get ’em patched up.
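As a rough triage aid, here is a minimal sketch that flags upstream version strings in the affected range. It assumes the upstream numbering where 1.0.1 through 1.0.1f carry the bug and 1.0.1g fixes it. It does not handle pre-release strings, and it cannot see distro backports that patch the bug without changing the version letter, so treat a hit as "go look", not as proof. The function name is hypothetical.

```python
import re

# Assumption: upstream releases 1.0.1 through 1.0.1f are affected, 1.0.1g is fixed.
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)([a-z]*)$")


def heartbleed_affected(version):
    """Return True if an upstream OpenSSL version string is in the affected range."""
    match = _VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, fix, letter = match.groups()
    if (int(major), int(minor), int(fix)) != (1, 0, 1):
        return False  # 0.9.8 and 1.0.0 branches never had the heartbeat code
    return letter < "g"  # "" (plain 1.0.1) and "a".."f" sort before "g"


for candidate in ("0.9.8y", "1.0.0l", "1.0.1", "1.0.1f", "1.0.1g"):
    print(candidate, heartbleed_affected(candidate))
```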
Yes, I’m trying to get focus off of users and passwords and even certificates and onto stemming the bleeding, particularly against non-obvious endpoints. For some reason a surprising number of people think SSL is only for HTTP and browsers. No, it’s 2014. A lot of protocol implementations have basically evolved to use this one environment middleboxes can’t mess with.
A lot of corporate product implementations, by the way. This has nothing to do with Open Source, except to the degree that Open Source code is just basic Critical Infrastructure now.
Abstract: Heartbleed wasn’t fun. It represents us moving from “attacks could happen” to “attacks have happened”, and that’s not necessarily a good thing. The larger takeaway actually isn’t “This wouldn’t have happened if we didn’t add Ping”, the takeaway is “We can’t even add Ping, how the heck are we going to fix everything else?”. The answer is that we need to take Matthew Green’s advice, start getting serious about figuring out what software has become Critical Infrastructure to the global economy, and dedicating genuine resources to supporting that code. It took three years to find Heartbleed. We have to move towards a model of No More Accidental Finds.
You know, I’d hoped I’d be able to avoid a long form writeup on Heartbleed. Such things are not meant to be. I’m going to leave many of the gory technical details to others, but there are a few angles that haven’t been addressed and really need to be. So, let’s talk. What to make of all this noise?
First off, there’s been a subtle shift in the risk calculus around security vulnerabilities. Before, we used to say: “A flaw has been discovered. Fix it, before it’s too late.” In the case of Heartbleed, the presumption is that it’s already too late, that all information that could be extracted, has been extracted, and that pretty much everyone needs to execute emergency remediation procedures.
It’s a significant change, to assume the worst has already occurred.
It always seems like a good idea in security to emphasize prudence over accuracy, possible risk over evidence of actual attack. And frankly this policy has been run by the privacy community for some time now. Is this a positive shift? It certainly allows an answer to the question for your average consumer, “What am I supposed to do in response to this Internet ending bug?” “Well, presume all your passwords leaked and change them!”
I worry, and not merely because “You can’t be too careful” has not been an entirely pleasant policy in the real world. We have lots of bugs in software. Shall we presume every browser flaw not only needs to be patched, but has already been exploited worldwide, and you should wipe your machine any time one is discovered? This OpenSSL flaw is pernicious, sure. We’ve had big flaws before, ones that didn’t just provide read access to remote memory either. Why the freak out here?
Because we expected better, here, of all places.
There’s been quite a bit of talk, about how we never should have been exposed to Heartbleed at all, because TLS heartbeats aren’t all that important a feature anyway. Yes, it’s 2014, and the security community is complaining about Ping again. This is of course pretty rich, given that it seems half of us just spent the last few days pinging the entire Internet to see who’s still exposed to this particular flaw. We in security sort of have blinders on, in that if the feature isn’t immediately and obviously useful to us, we don’t see the point.
In general, you don’t want to see a protocol designed by the security community. It won’t do much. In return (with the notable and very appreciated exception of Dan Bernstein), the security community doesn’t want to design you a protocol. It’s pretty miserable work. Thanks to what I’ll charitably describe as “unbound creativity”, the fast and dumb and unmodifying design of the Internet has given way to a hodgepodge of proxies and routers and “smart” middleboxes that do who knows what. Protocol design is miserable, nothing is elegant. Anyone who’s spent a day or two trying to make P2P VoIP work on the modern Internet discovers very quickly why Skype was worth billions. It worked, no matter what.
Anyway, in an alternate universe TLS heartbeats (with full ping functionality) are a beloved security feature of the protocol as they’re the key to constant time, constant bandwidth tunneling of data over TLS without horrifying application layer hacks. As is, they’re tremendously useful for keeping sessions alive, a thing I’d expect hackers with even a mild amount of experience with remote shells to appreciate. The Internet is moving to long lived sessions, as all Gmail users can attest to. KeepAlives keep long lived things working. SSH has been supporting protocol layer KeepAlives forever, as can be seen:
The takeaway here is not “If only we hadn’t added ping, this wouldn’t have happened.” The true lesson is, “If only we hadn’t added anything at all, this wouldn’t have happened.” In other words, if we can’t even securely implement Ping, how could we ever demand “important” changes? Those changes tend to be much more fiddly, much more complicated, much riskier. But if we can’t even securely add this line of code:
if (1 + 2 + payload + 16 > s->s3->rrec.length)
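For readers who don’t live in C, here is a minimal Python sketch of what that one comparison guards against. It models a heartbeat message as one type byte, a two-byte length field, the payload, and at least 16 bytes of padding. The function names and the `process_memory` parameter are hypothetical stand-ins; this is a model of the logic, not OpenSSL’s code.

```python
import struct

MIN_PADDING = 16  # minimum padding that follows the payload


def heartbeat_response(record):
    """Echo the payload, or return None if the claimed length doesn't fit."""
    if len(record) < 1 + 2 + MIN_PADDING:
        return None  # too short to hold even an empty heartbeat
    _msg_type, payload_len = struct.unpack("!BH", record[:3])
    # The check: type (1) + length field (2) + payload + padding must fit the record.
    if 1 + 2 + payload_len + MIN_PADDING > len(record):
        return None  # silently discard
    return record[3:3 + payload_len]


def heartbeat_response_unchecked(record, process_memory):
    """Pre-patch behavior: trust the length field and copy that many bytes."""
    _msg_type, payload_len = struct.unpack("!BH", record[:3])
    buffer = record + process_memory  # models reading past the end of the record
    return buffer[3:3 + payload_len]


honest = struct.pack("!BH", 1, 4) + b"ping" + b"\x00" * MIN_PADDING
lying = struct.pack("!BH", 1, 64) + b"ping" + b"\x00" * MIN_PADDING

print(heartbeat_response(honest))
print(heartbeat_response(lying))
print(heartbeat_response_unchecked(lying, b"SECRET KEY MATERIAL"))
```

The lying record claims 64 bytes of payload but carries 4. With the check it is dropped; without it, whatever sits next in memory rides along in the reply.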
I know Neel Mehta. I really like Neel Mehta. It shouldn’t take absolute heroism, one of the smartest guys in our community, and three years for somebody to notice a flaw when there’s a straight up length field in the patch. And that, I think, is a major and unspoken component of the panic around Heartbleed. The OpenSSL dev shouldn’t have written this (on New Year’s Eve, at 1AM apparently). His coauthors and release engineers shouldn’t have let it through. The distros should have noticed. Somebody should have been watching the till, at least this one particular till, and it seems nobody was.
Nobody publicly, anyway.
If we’re going to fix the Internet, if we’re going to really change things, we’re going to need the freedom to do a lot more dramatic changes than just Ping over TLS. We have to be able to manage more; we’re failing at less.
There’s a lot of rigamarole around defense in depth, other languages that OpenSSL could be written in, “provable software”, etc. Everyone, myself included, has some toy that would have fixed this. But you know, word from the Wall Street Journal is that there have been all of $841 in donations to the OpenSSL project to address this matter. We are building the most important technologies for the global economy on shockingly underfunded infrastructure. We are truly living through Code in the Age of Cholera.
Professor Matthew Green of Johns Hopkins University recently commented that he’s been running around telling the world for some time that OpenSSL is Critical Infrastructure. He’s right. He really is. The conclusion is resisted strongly, because you cannot imagine the regulatory hassles normally involved with traditionally being deemed Critical Infrastructure. A world where SSL stacks have to be audited and operated against such standards is a world that doesn’t run SSL stacks at all.
And so, finally, we end up with what to learn from Heartbleed. First, we need a new model of Critical Infrastructure protection, one that dedicates real financial resources to the safety and stability of the code our global economy depends on – without attempting to regulate that code to death. And second, we need to actually identify that code.
When I said that we expected better of OpenSSL, it’s not merely that there’s some sense that security-driven code should be of higher quality. (OpenSSL is legendary for being considered a mess, internally.) It’s that the number of systems that depend on it, and then expose that dependency to the outside world, is considerable. This is security’s largest contributed dependency, but it’s not necessarily the software ecosystem’s largest dependency. Many, maybe even more systems depend on web servers like Apache, nginx, and IIS. We fear vulnerabilities significantly more in libz than libbz2 than libxz, because more servers will decompress untrusted gzip over bzip2 over xz. Vulnerabilities are not always in obvious places – people underestimate just how exposed things like libxml and libcurl and libjpeg are. And as HD Moore showed me some time ago, the embedded space is its own universe of pain, with 90’s bugs covering entire countries.
If we accept that a software dependency becomes Critical Infrastructure at some level of economic dependency, the game becomes identifying those dependencies, and delivering direct technical and even financial support. What are the one million most important lines of code that are reachable by attackers, and least covered by defenders? (The browsers, for example, are very reachable by attackers but actually defended pretty zealously – FFMPEG public is not FFMPEG in Chrome.)
Note that not all code, even in the same project, is equally exposed. It’s tempting to say it’s a needle in a haystack. But I promise you this: Anybody patches Linux/net/ipv4/tcp_input.c (which handles inbound network for Linux), a hundred alerts are fired and many of them are not to individuals anyone would call friendly. One guy, one night, patched OpenSSL. Not enough defenders noticed, and it took Neel Mehta to do something.
We fix that, or this happens again. And again. And again.
No more accidental finds. The stakes are just too high.
Reality is what refuses to go away when you stop believing in it.
The reality – the ground truth – is that Aaron Swartz is dead.
Brinksmanship is a terrible game, that all too many systems evolve towards. The suicide of Aaron Swartz is an awful outcome, an unfair outcome, a radically out of proportion outcome. As in all negotiations to the brink, it represents a scenario in which all parties lose.
Aaron Swartz lost. He paid with his life. This is no victory for Carmen Ortiz, or Steve Heymann, or JSTOR, MIT, the United States Government, or society in general. In brinksmanship, everybody loses.
Suicide is a horrendous act and an even worse threat. But let us not pretend that a set of charges covering the majority of Aaron’s productive years is not also fundamentally noxious, with ultimately a deeply similar outcome. Carmen Ortiz (and, presumably, Steve Heymann) are almost certainly telling the truth when they say they had no intention of demanding thirty years of imprisonment from Aaron. This did not stop them from, in fact, demanding thirty years of imprisonment from Aaron.
Brinksmanship. It’s just negotiation. Nothing personal.
Let’s return to ground truth. MIT was a mostly open network, and the content “stolen” by Aaron was itself mostly open. You can make whatever legalistic argument you like; the reality is there simply wasn’t much offense taken to Aaron’s actions. He wasn’t stealing credit card numbers, he wasn’t reading personal or professional emails, he wasn’t extracting design documents or military secrets. These were academic papers he was ‘liberating’.
What he was, was easy to find.
I have been saying, for some time now, that we have three problems in computer security. First, we can’t authenticate. Second, we can’t write secure code. Third, we can’t bust the bad guys. What we’ve experienced here, is a failure of the third category. Computer crime exists. Somebody caused a huge amount of damage – and made a lot of money – with a Java exploit, and is going to get away with it. That’s hard to accept. Some of our rage from this ground truth is sublimated by blaming Oracle. But some of it turns into pressure on prosecutors, to find somebody, anybody, who can be made an example of.
There are two arguments to be made now. Perhaps prosecution by example is immoral – people should only be punished for their own crimes. In that case, these crimes just weren’t offensive enough for the resources proposed (prison isn’t free for society). Or perhaps prosecution by example is how the system works, don’t be naïve – well then.
Aaron Swartz’s antics were absolutely annoying to somebody at MIT and somebody at JSTOR. (Apparently someone at PACER as well.) That’s not good, but that’s not enough. Nobody who we actually have significant consensus for prosecuting, models himself after Aaron Swartz and thinks “Man, if they go after him, they might go after me”.
The hard truth is that this should have gone away, quietly, ages ago. Aaron should have received a restraining order to avoid MIT, or perhaps some sort of fine. Instead, we have a death. There will be consequences to that – should or should not doesn’t exist here, it is simply a statement of fact. Reality is what refuses to go away, and this is the path by which brinksmanship is disincentivized.
My take on the situation is that we need a higher class of computer crime prosecution. We, the computer community in general, must engage at a higher level – both in terms of legislation that captures our mores (and funds actual investigations – those things ain’t free!), and operational support that can provide a critical check on who is or isn’t punished for their deeds. Aaron’s law is an excellent start, and I support it strongly, but it excludes faux law rather than including reasoned policy. We can do more. I will do more.
The status quo is not sustainable, and has cost us a good friend. It’s so out of control, so desperate to find somebody – anybody! – to take the fall for unpunished computer crime, that it’s almost entirely become about the raw mechanics of being able to locate and arrest the individual instead of about their actual actions.
Aaron Swartz should be alive today. Carmen Ortiz and Steve Heymann should have been prosecuting somebody else. They certainly should not have been applying a 60x multiple between the amount of time they wanted, and the degree of threat they were issuing. The system, in all of its brinksmanship, has failed. It falls on us, all of us, to fix it.
[Obligatory disclosures — I’ve consulted for Microsoft, and had been doing some research on Mouse events myself.]
So one of the more important aspects of security reporting is what I’ve been calling Actionable Intelligence. Specifically, when discussing a bug — and there are many, far more than are ever discovered let alone disclosed — we have to ask:
What can an attacker do today, that he couldn’t do yesterday, for what class attacker, to what class victim?
Spider.IO, a fraud analytics company, recently disclosed that under Internet Explorer attackers can capture mouse movement events from outside an open window. What is the Actionable Intelligence here? It’s moderately tempting to reply: We have a profound new source of modern art.
(Credit: Anatoly Zenkov’s IOGraph tool)
I exaggerate, but not much. The simple truth is that there are simply not many situations where mouse movements are security sensitive. Keyboard events, of course, would be a different story — but mouse? As more than a few people have noted, they’d be more than happy to publish their full movement history for the past few years.
It is interesting to discuss the case of the “virtual keyboard”. There has been a movement (thankfully rare) to force credential input via mouse instead of keyboard, to stymie keyloggers. This presupposes a class of attacker that has access to keyboard events, but not mouse movements or screen content. No such class actually exists; the technique was never protecting much of anything in the first place. It’s just pain-in-the-butt-as-a-feature. More precisely, it’s another example of Rick Wash’s profoundly interesting turn of phrase, Folk Security. Put simply, there is a belief that if something is hard for a legitimate user, it’s even harder for the hacker. Karmic security is (unfortunately) not a property of the universe.
(What about the attacker with an inline keylogger? Not only does he have physical access, he’s not actually constrained to just emulating a keyboard. He’s on the USB bus, he has many more interesting devices to spoof.)
That’s not to say spider.io has not found a bug. Mouse events should only come from the web frame over which script has dominion, in much the same way CNN should not be receiving Image Load events from a tab open to Yahoo. But the story of the last decade is that bugs are not actually rare, and that from time to time issues will be found in everything. We don’t need to have an outright panic when a small leak is found. The truth is, every remote code execution vulnerability can also capture full screen mouse motion. Every universal cross site scripting attack (in which CNN can inject code into a frame owned by Yahoo) can do similar, though perhaps only against other browser windows.
I would like to live in a world where this sort of very limited overextension of the web security model warrants a strong reaction. It is in fact nice that we do live in a world where browsers effectively expose the most nuanced and well developed (if by fire) security model in all of software. Where else is the proper scope of mouse events even a comprehensible discussion?
(Note that it’s a meaningless concept to say that mouse events within the frame shouldn’t be capturable. Being able to “hover” on items is a core user interface element, particularly for the highly dynamic UIs that Canvas and WebGL enable. The depth of damage one would have to inflict on the browser usability model, to ‘secure’ activity in what’s actually the legitimate realm of a page, would be profound. When suggesting defenses, one must consider whether the changes required to make them reparable under actual assault ruin the thing being defended in the first place. We can’t go off destroying villages in order to save them.)
So, in summary: Sure, there’s a bug here with these mouse events. I expect it will be fixed, like tens of thousands of others. But it’s not particularly significant. What can an attacker do today, that he couldn’t do yesterday? Not much, to not many. Spider.io’s up to interesting stuff, but not really this.
“The generation of random numbers is too important to be left to chance.”
—Robert R. Coveyou
“One out of 200 RSA keys in the field were badly generated as a result of standard dogma. There’s a chance this might fail less.”
[Note: There are times I write things with CIOs in mind. This is not one of those times.]
So, I’ve been playing with userspace random number generation, as per Matt Blaze and D.P. Mitchell’s TrueRand from 1996. (Important: Matt Blaze has essentially disowned this approach, and seems to be honestly horrified that I’m revisiting it.) The basic concept is that any system with two clocks has a hardware random number generator, since clocks jitter relative to one another based on physical properties, particularly when one is operating on a slow scale (like, say, a human hitting a keyboard) while another is operating on a fast scale (like a CPU counter cycling at nanosecond speeds). Different tolerances on clocks mean more opportunities for unmodelable noise to enter the system. And since the core lie of your computer is that it’s just one computer, as opposed to a small network of independent nodes running on their own time, there should be no shortage of bits to mine.
At least, that’s the theory.
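To make the two-clock idea concrete, here is a minimal sketch that times a slow clock (the OS timer behind `time.sleep`) with a fast one (`time.perf_counter_ns`). This is not DakaRand’s code and makes no claim about how much entropy the samples hold; the function name and the 1 ms interval are assumptions chosen for illustration.

```python
import time


def jitter_samples(count, interval=0.001):
    """Collect `count` 32-bit integers by timing a slow clock with a fast one."""
    samples = []
    for _ in range(count):
        start = time.perf_counter_ns()   # fast clock: nanosecond counter
        time.sleep(interval)             # slow clock: OS timer wakes us "about" then
        elapsed = time.perf_counter_ns() - start
        samples.append(elapsed & 0xFFFFFFFF)  # keep the low 32 bits
    return samples


samples = jitter_samples(8)
print(samples)
```

Run it a few times and the low digits wander. Whether that wandering is unpredictable to an attacker is exactly the open question.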
As announced at Defcon 20 / Black Hat, here’s DakaRand 1.0. Let me be the first to say, I don’t know that this works. Let me also be the first to say, I don’t know that it doesn’t. DakaRand is a collection of modes that tries to convert the difference between clocks into enough entropy that, whether or not it survives academic attack, would certainly force me (as an actual guy who breaks stuff) to go attack something else.
A proper post on DakaRand is reserved, I think, for when we have some idea that it actually works. Details can be seen in the slides for the aforementioned talk; what I’d like to focus on now is recommendations for trying to break this code. The short version:
1) Download DakaRand, untar, and run “sh build.sh”.
2) Run dakarand -v -d out.bin -m [0-8]
3) Predict out.bin, bit for bit, in less than 2^128 work effort, on practically any platform you desire with almost any level of active manipulation you wish to insert.
The slightly longer version:
- DakaRand essentially tries to force the attacker into having no better attack than brute force, and then tries to make that work effort at least 2^128. As such, the code is split into generators that acquire bits, and then a masking sequence of SHA-256, Scrypt, and AES-256-CTR that expands those bits into however much is requested. (In the wake of Argyros and Kiayias’s excellent and underreported “I Forgot Your Password: Randomness Attacks Against PHP Applications”, I think it’s time to deprecate all RNGs with invertible output. At the point you’re asking whether an RNG should be predictable based on its history, you’ve already lost.) The upshot of this is that the actual target for a break is not the direct output of DakaRand, but the input to the masking sequence. Your goal is to show that you can predict this particular stream, with perfect accuracy, at less than 2^128 work effort. Unless you think you can glean interesting information from the masking sequence (in which case, you have more interesting things to attack than my RNG), you’re stuck trying to design a model of the underlying clock jitter.
- There are nine generators in this initial release of DakaRand. Seriously, they can’t all work.
- You control the platform. Seriously — embedded, desktop, server, VM, whatever — it’s fair game. About the only constraint that I’ll add is that the device has to be powerful enough to run Linux. Microcontrollers are about the only things in the world that do play the nanosecond accuracy game, so I’m much less confident against those. But, against anything ARM or larger, real time operation is simply not a thing you get for free, and even when you pay dearly for it you’re still operating within tolerances far larger than DakaRand needs to mine a bit. (Systems that are basically cycle-for-cycle emulators don’t count. Put Bochs and your favorite ICE away. Nice try though!)
- You seriously control the platform. I’ve got no problem with you remotely spiking the CPU to 100%, sending arbitrary network traffic at whatever times you like, and so on. The one constraint is that you can’t already have root — so, no physical access, and no injecting yourself into my gathering process. It’s something of a special case if you’ve got non-root local code execution. I’d be interested in such a break, but multitenancy is a lie and there’s just so many interprocess leaks (like this if-it’s-so-obvious-why-didn’t-you-do-it example of cross-VM communication).
- Virtual machines get special rules: You’re allowed to suspend/restore right up to the execution of DakaRand. That is the point of atomicity.
- The code’s a bit hinky, what with globals and a horde of dependencies. If you’d like to test on a platform that you just can’t get DakaRand to build on, that makes things more interesting, not less. Email me.
- All data generated is mixed into the hash, but bits are only “counted” when Von Neumann debiasing works. Basically, generators return integers between 0 and 2^32-1. Every integer is mixed into the keying hash (which is why you have to predict out.bin bit for bit). However, each integer is also measured for the number of 1’s it contains. An even number yields a 0; an odd number, a 1. Bits are only counted when two sequential numbers yield either a 10 or a 01, and as long as there are fewer than 256 bits counted, the generator will continue to be called. So your attack needs to model the absolute integers returned (which isn’t so bad), the number of generator calls it takes for a Von Neumann transition to occur, and whether the transition is a 01 or a 10 (since I put that value into the hash too).
- I’ve got a default “gap” between generator probes of just 1000us (a millisecond). This is probably not enough for all platforms; my assumption is that, if anything has to change, this has to become somewhat dynamic.
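If you want to model the bit-counting rule described above, here is a rough Python sketch of it. To be clear, this is not DakaRand’s actual code. The names `parity` and `gather` are made up for illustration, the byte encoding fed to the hash is a guess, and I’m assuming the classic Von Neumann arrangement of non-overlapping pairs, which the description above doesn’t pin down. The generator is whatever callable you hand it, and the masking sequence is left out entirely.

```python
import hashlib
from typing import Callable, List, Tuple


def parity(value: int) -> int:
    """Return 0 if the 32-bit value has an even number of 1 bits, 1 if odd."""
    return bin(value & 0xFFFFFFFF).count("1") & 1


def gather(generator: Callable[[], int],
           target_bits: int = 256) -> Tuple[bytes, List[int], int]:
    """Call generator until target_bits Von Neumann bits have been counted.

    Returns (hash digest, counted bits, number of generator calls).
    Assumption: samples are taken in non-overlapping pairs.
    Assumption: each sample is hashed as 4 big-endian bytes, and each
    counted transition is hashed as a single byte (0 for "01", 1 for "10").
    """
    hasher = hashlib.sha256()
    counted: List[int] = []
    calls = 0
    while len(counted) < target_bits:
        first = generator() & 0xFFFFFFFF
        second = generator() & 0xFFFFFFFF
        calls += 2
        # Every integer goes into the hash, whether or not it yields a bit.
        hasher.update(first.to_bytes(4, "big"))
        hasher.update(second.to_bytes(4, "big"))
        a, b = parity(first), parity(second)
        if a != b:
            # A "01" pair counts as a 0 and a "10" pair as a 1;
            # "00" and "11" pairs are discarded for counting purposes.
            counted.append(a)
            hasher.update(bytes([a]))
    return hasher.digest(), counted, calls
```

Feed `gather` a generator of your own and it returns the digest, the counted bits, and how many calls it took, which are the three things an attacker would have to predict. A generator that never produces a parity transition will loop forever in this sketch, so don’t hand it a constant.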
Have fun! Remember, “it might fail somehow somewhere” just got trumped by “it actually did fail all over the place”, so how about we investigate a thing or two that we’re not so sure in advance will actually work?
(Side note: Couple other projects in this space: Twuewand, from Ryan Finnie, has the chutzpah to be pure Perl. And of course, Timer Entropyd, from Folkert Van Heusden. Also, my recommendation to kernel developers is to do what I’m hearing they’re up to anyway, which is to monitor all the interrupts that hit the system on a nanosecond timescale. Yep, that’s probably more than enough.)
Here’s my slides from Black Hat and Defcon for 2012. Pile of interesting heresies — should make for interesting discussion. Here’s what we’ve got:
1) Generic timing attack defense through network interface jitter
2) Revisiting Random Number Generation through clock drift
3) Suppressing injection attacks by altering variable scope and per-character taint
4) Deployable mechanisms for detecting censorship, content alteration, and certificate replacement
5) Stateless TCP w/ payload retrieval
I hate saying “code to be released shortly”, but I want to post the slides and the code’s pretty hairy. Email me if you want to test anything, particularly if you’d like to try to break this stuff or wrap it up for release. I’ll also be at Toorcamp, if you want to chat there.