DakaRand 1.0: Revisiting Clock Drift For Entropy Generation
“The generation of random numbers is too important to be left to chance.”
—Robert R. Coveyou
“One out of 200 RSA keys in the field were badly generated as a result of standard dogma. There’s a chance this might fail less.”
[Note: There are times I write things with CIO’s in mind. This is not one of those times.]
So, I’ve been playing with userspace random number generation, as per Matt Blaze and D.P. Mitchell’s TrueRand from 1996. (Important: Matt Blaze has essentially disowned this approach, and seems to be honestly horrified that I’m revisiting it.) The basic concept is that any system with two clocks has a hardware number generator, since clocks jitter relative to one another based on physical properties, particularly when one is operating on a slow scale (like, say, a human hitting a keyboard) while another is operating on a fast scale (like a CPU counter cycling at nanosecond speeds). Different tolerances on clocks mean more opportunities for unmodelable noise to enter the system. And since the core lie of your computer is that it’s just one computer, as opposed to a small network of independent nodes running on their own time, there should be no shortage of bits to mine.
At least, that’s the theory.
As announced at Defcon 20 / Black Hat, here’s DakaRand 1.0. Let me be the first to say, I don’t know that this works. Let me also be the first to say, I don’t know that it doesn’t. DakaRand is a collection of modes that tries to convert the difference between clocks into enough entropy that, whether or not it survives academic attack, would certainly force me (as an actual guy who breaks stuff) to go attack something else.
A proper post on DakaRand is reserved, I think, for when we have some idea that it actually works. Details can be seen in the slides for the aforementioned talk; what I’d like to focus on now is recommendations for trying to break this code. The short version:
1) Download DakaRand, untar, and run “sh build.sh”.
2) Run dakarand -v -d out.bin -m [0-8]
3) Predict out.bin, bit for bit, in less than 2^128 work effort, on practically any platform you desire with almost any level of active manipulation you wish to insert.
The slightly longer version:
- DakaRand essentially tries to force the attacker into having no better attack than brute force, and then tries to make that work effort at least 2^128. As such, the code is split into generators that acquire bits, and then a masking sequence of SHA-256, Scrypt, and AES-256-CTR that expands those bits into however much is requested. (In the wake of Argyros and Kiayias’s excellent and underreported “I Forgot Your Password: Randomness Attacks Against PHP Applications“, I think it’s time to deprecate all RNG’s with invertable output. At the point you’re asking whether an RNG should be predictable based on its history, you’ve already lost.) The upshot of this is that the actual target for a break is not the direct output of DakaRand, but the input to the masking sequence. Your goal is to show that you can predict this particular stream, with perfect accuracy, at less than 2^128 work effort. Unless you think you can glean interesting information from the masking sequence (in which case, you have more interesting things to attack than my RNG), you’re stuck trying to design a model of the underlying clock jitter.
- There are nine generators in this initial release of DakaRand. Seriously, they can’t all work.
- You control the platform. Seriously — embedded, desktop, server, VM, whatever — it’s fair game. About the only constraint that I’ll add is that the device has to be powerful enough to run Linux. Microcontrollers are about the only things in the world that do play the nanosecond accuracy game, so I’m much less confident against those. But, against anything ARM or larger, real time operation is simply not a thing you get for free, and even when you pay dearly for it you’re still operating within tolerances far larger than DakaRand needs to mine a bit. (Systems that are basically cycle-for-cycle emulators don’t count. Put Bochs and your favorite ICE away. Nice try though!)
- You seriously control the platform. I’ve got no problem with you remotely spiking the CPU to 100%, sending arbitrary network traffic at whatever times you like, and so on. The one constraint is that you can’t already have root — so, no physical access, and no injecting yourself into my gathering process. It’s something of a special case if you’ve got non-root local code execution. I’d be interested in such a break, but multitenancy is a lie and there’s just so many interprocess leaks (like this if-it’s-so-obvious-why-didn’t-you-do-it example of cross-VM communication).
- Virtual machines get special rules: You’re allowed to suspend/restore right up to the execution of DakaRand. That is the point of atomicity.
- The code’s a bit hinky, what with globals and a horde of dependencies. If you’d like to test on a platform that you just can’t get DakaRand to build on, that makes things more interesting, not less. Email me.
- All data generated is mixed into the hash, but bits are “counted” when Von Neumann debiasing works. Basically, generators return integers between 0 and 2^32-1. Every integer is mixed into the keying hash (thus, you having to predict out.bin bit for bit). However, each integer is also measured for the number of 1’s it contains. An even number yields a 0; an odd number, a 1. Bits are only counted when two sequential numbers have either a 10 or a 01, and as long as there’s less than 256 bits counted, the generator will continue to be called. So your attack needs to model the absolute integers returned (which isn’t so bad) and the amount of generator calls it takes for a Von Neumann transition to occur and whether the transition is a 01 or a 10 (since I put that value into the hash too).
- I’ve got a default “gap” between generator probes of just 1000us — a millisecond. This is probably not enough for all platforms — my assumption is that, if anything has to change, that this has to become somewhat dynamic.
Have fun! Remember, “it might fail somehow somewhere” just got trumped by “it actually did fail all over the place”, so how about we investigate a thing or two that we’re not so sure in advance will actually work?
(Side note: Couple other projects in this space: Twuewand, from Ryan Finnie, has the chutzpah to be pure Perl. And of course, Timer Entropyd, from Folkert Van Heusden. Also, my recommendation to kernel developers is to do what I’m hearing they’re up to anyway, which is to monitor all the interrupts that hit the system on a nanosecond timescale. Yep, that’s probably more than enough.)