You Set The Rules

Stopped by AT&T ThreatTraq and talked about, you know.  Infosec.  Good times!  Here’s video and a transcript.

====

You set the rules and you get to CHEAT (with Dan Kaminsky)
AT&T ThreatTraq #143

Brian Rexroad: Hello. Welcome to AT&T ThreatTraq for May 12th, 2015. This program provides network security highlights, discussion, and countermeasures for cyber threats. Today, we’re joined by Dan Kaminsky. Dan, welcome. You know, you’re one that practically needs no introduction, but I understand you’re Chief Scientist at White Ops.

Dan Kaminsky: Mm-hmm.

Brian: And can you tell us a little more about White Ops, and what you do?

Dan: We make sure that people on the Internet are actually people, because sometimes they’re not. Sometimes they’re just machines that have been programmed to run around. We always wondered why all these machines were getting broken into, like how interesting can Grandma’s email be? Well, it turns out you hack a million grandmas, you click a billion ads, you make a million dollars. So we’re in the business of cleaning up the advertising ecosystem, and dealing with other situations where these automated machines, known as bots, run around and do bad things.

Brian: Right. You know, we’ve talked about click fraud a number of times on this program. And I guess, so that’s really kind of the underpinnings of the work that you’re doing.

Dan: When you rob a bank, the man gets pretty angry. When you rob advertisers, they’re like oh, the numbers are up.

Brian: Right. So it’s a little bit strange the way that – you know, I remember, and I don’t even know what the advertisement was about. But this guy, he’s out on the market. He’s buying clicks. Can I get some clicks? I just need to get through the next quarter.

Dan: I know, right? There’s this great thing by Adobe, just need a few more, just need a few more.

Brian: Right, so that whole notion, it’s true. There’s like perverse motivation that’s built in there, if there isn’t some sort of enforcement mechanism. And that’s what you’re out to do.

Dan: Yeah, we’ve really been changing the market. We built the largest ad fraud study that had ever been done. It was called the Bot Baseline. It’s at whiteops.com/botfraud. And we really found there’s going to be about five or six billion dollars’ worth of this fraud this year. I mean, this is real money. A lot of money is going, not to people who make actual content that people like, but instead just going to outright fraudsters, who just steal. And we’re fixing that.

Brian: So, welcome. We’re glad to have you here, and we’ll be able to talk about some other discussions here today, too.

Dan: All right.

Brian: So, let’s go on. Matt, Matt Keyser’s here. Welcome, Matt.

Matt Keyser: How’s it going?

Brian: And we have online, Jim Clausing. Welcome, Jim.

Jim Clausing: Hey Brian, hey guys.

Brian: I hear it’s been a little hot in Ohio. Did you say a little hot?

Jim: Yeah, it was in the nineties a couple of days. And now, it’s not going to be quite so hot for a couple of days, yeah.

Brian: Right, okay, well good. I’m Brian Rexroad and welcome. And so, what we’ll do first here, Dan, is talk a little bit about what you think some of the security trends are that are coming up.

Dan: All right. The big trend that I think is going to start up is, God, you can just lose all your data really quickly. We keep trying to make it so that no one ever gets anything, but if they get in, it’s the end of the world. And the first big trend that I think is going to start is that’s going to slow down a bit. We are going to figure out architectures that lose money at a – or lose data or money at a “not all of it at once” rate.

And I think we’re going to see that become a real trend in information security.
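One common way to get a "not all of it at once" loss rate is a token bucket that caps how much value can leave a system per unit of time. The sketch below is only an illustration of that idea, not anything described in the interview; the class name, the capacity and the refill rate are all hypothetical.

```python
import time


class WithdrawalLimiter:
    """Token bucket: caps how much value can leave per unit of time."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity                    # largest burst allowed
        self.refill_per_second = refill_per_second  # sustained outflow rate
        self.clock = clock                          # injectable for testing
        self.tokens = capacity
        self.last = clock()

    def allow(self, amount):
        """Return True and spend tokens if `amount` fits the current budget."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False
```

An attacker who gets in still gets something, but only at the refill rate, which leaves time to notice and respond.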

Brian: So, more and more, I think you’re absolutely right. And you mentioned sort of like ATMs. You’re limited to what you can take from an ATM. But that next step is well, what if you can go to a hundred ATMs at the same time, and before those actual transactions go through? So, I think it’s going to the next step.

Dan: Which of course was the thing that happened.

Brian: Yes.

Dan: You know, we had a major ATM thing, where I think it was a couple hundred ATMs were hit within seconds of each other. And the information, that they shouldn’t all provide the information, ended up getting not distributed fast enough. So it’s like a $17 million loss? And, attackers are quite clever. You have to be able to adapt.

Brian: Yep, be able to adapt. Agility and security is one of the main themes to that capability.

Dan: Yeah, this is not a thing where you’re one and done. No, you’ve got a cat and mouse thing. When we’re out dealing with these ad fraud guys, it is constantly cat and mouse.

Brian: Right. I don’t think it was related to the same subject, but we were chatting a little bit earlier. And you said something about it took hours to find the problem, and then six months to solve it.

Dan: To the point where at the end of it, I’m like man, I didn’t do much at all, as the attacker. That was like a distant memory.

Brian: Yeah, so that’s one of the challenges that we deal with is that the attackers really kind of tend to have the advantage, and it takes a lot more effort to try to solve it, without having derogatory impact in the long run.

Dan:  It still has to be performant. It still has to be reliable. It still has to be usable. It still has to be maintainable. It still has to be debuggable. All of these other engineering constraints don’t exist on the offense side, and they take a tremendous amount of work on the defense side.

Brian: Right, right.

Dan: Just a hard problem, but that’s what we signed up for here, so let’s play.

Brian: Yep, good. So what other kinds of trends do you expect?

Dan:  Well, we’ve never really been taking all that test code, all those test processes, and merging it in production. Well, in a world of continuous deployment, in a world of repeatedly updating and modifying, and fixing and developing software, test and monitoring is going live as well. And that information stream is turning out to be tremendously useful for security work. There are things that only happen when you’re under attack. There are code paths that are only exposed when there are vulnerabilities.

Not that can be found during test or in isolation, but when actual real world production data starts flowing through. It’s like when you run water through a pipe, you see the water start leaking. So, companies like Prevoty and Signal Sciences, and these guys are actually really starting to see, hey, there’s a lot of data to be extracted from our architecture. Let’s go ahead and use it to basically build more secure operational systems.

Brian: So, are you referring to threat analytics? Or is this like really kind of a nuance of that?

Dan: Just, I think that the actual systems that we run. The code that we deploy to make our companies go, is going to have a lot more of its test and monitoring internals exposed to the Ops guys. And that exposure is going to have a real security tint to it, slant to it.

Brian: So it’s really beyond security. It’s really just a broader set of instrumentation on the systems, so that we have some visibility into what’s going on with them.

Dan: Security is part of operations.

Brian: It is.

Dan: And one of the major customer needs now is it needs to not leak data.

Brian: Right.

Dan: So, but what I see is we will start getting better signals that we’re going to leak data, that we are leaking data, and especially that we did leak data. Getting monitoring and shrinking that time between compromise and loss is what’s going to happen.

Brian:  Why not kind of tie those things together? Now, do you have any thoughts on sort of the tradeoff? One of the cardinal sins that I’ve heard of in the past with software is you leave debugging mode on. And so, there are all kinds of indicators that are in there, and perhaps back doors that are built in. You mentioned the number of lines of code. Adding more lines of code potentially adds more vulnerability. Any thoughts on that?

Dan: Nothing comes for free. It is absolutely the case that as we build out our debugging infrastructure, bugs in the debugging infrastructure can go ahead and hit us. It’s part of the tradeoff. But you know, every copy of Windows in the world sends data back to Microsoft. And you could make the case, you could make the argument, oh my God, look at all this data that Microsoft is taking. What if that data has exploits? You know what? That’s the tradeoff. There might be issues in the data feeds that come back. But in return, they get to know what bugs are in the world and fix them. And it’s really part of how you create an ecosystem that is, if not self-healing, repairable.

Brian: Right, right.

Jim: I mean, we get buried in data as it is sometimes. And if we want to instrument our code better, we’re going to be creating more data.

Dan:  Let’s find some guns and see if there’s some smoke coming out of them. Better instrumentation can really go ahead and take right now what’s an ugly problem, and just give you the clean answers. Zane Lackey, who now runs Signal Sciences, and used to run security over at Etsy has this great set of slides. It’s called Attack-Driven Defense, one of my favorite decks of all time. And he’s basically showing, look, here are errors that only happen in the time period after an attacker has found a vulnerability, but before they’ve successfully exploited it.

Brian: Right.

Dan: These bugs, they only happen when the SQL engine is breaking. If this bug happens, file the bug. They got it to the point where they had Splunk auto-filing critical bugs and it was always accurate. That’s where the world is going towards.
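The pattern Dan describes, treating certain database errors as attack signals and filing a bug automatically, can be sketched in a few lines. This is not the Etsy or Splunk setup; it is a hypothetical stand-in in which the "tracker" is a plain list and the two signatures are the familiar MySQL and PostgreSQL syntax-error messages.

```python
import re

# Assumed signatures: database errors that ordinary traffic should never cause.
SQL_ERROR_PATTERNS = [
    re.compile(r"You have an error in your SQL syntax", re.IGNORECASE),  # MySQL
    re.compile(r"unterminated quoted string", re.IGNORECASE),            # PostgreSQL
]


def find_injection_signals(log_lines):
    """Return the log lines that match any SQL error signature."""
    return [line for line in log_lines
            if any(p.search(line) for p in SQL_ERROR_PATTERNS)]


def file_bugs(log_lines, tracker):
    """Append one critical bug per matching line to `tracker` (a list here)."""
    for line in find_injection_signals(log_lines):
        tracker.append({"severity": "critical", "evidence": line})
    return tracker
```

The value comes from choosing signatures that are essentially never produced by legitimate use, so every hit is worth a human's attention.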

Brian: Right, right. You know, and see if I’m interpreting this correctly. I think one of the things that we’ve been finding is that the more and more you can direct your analysis toward anticipated actual attacks, and even understanding the motivation or the types of things that attackers are doing, you know, trending in the environment, will help you to understand what data is really valuable and what’s perhaps just a bit of junk.

Dan: You need to realize you’re playing a game. This is player versus player programming. But guess what? It’s your network. You get to set the rules of the game, and you get to cheat. You get to say hey, you guys are playing on my battleground. You’re in my environment. You get to make those rules. So, make them.

Brian: Yeah, very good. Make the rules. So, what are your thoughts on threat data sharing?

Dan:  It’s just I think as a trend, where I think we’re going to go, is I think we’re going to go towards a lot more distribution and openness. The data that’s going to be out there about threats is just, we’re just going to have to accept it [being out there]. In some instances, the bad guys know that we know. Because it is worse that the right good guys don’t know. And that is really what we do when we’re talking about open disclosure of vulnerabilities.

You could have a world where we found the five or ten most interesting parties that had a particular vulnerability. And we’re like, you really need to patch your SSL stack. We could do that, and the odds that we would get enough of the SSL stacks are zero. We wouldn’t. If you want to actually fix certain bugs, sometimes you just got to talk openly about it. So I really think that we’re going to see a trend towards what in the past were going to be forms of threat data sharing that we shied away from.

Certainly in the ad space we’ve been talking about, look, there’s some domains. There’s just bots there, and we’re just going to tell you who they are. And at one point, we were like maybe we don’t ever want to share [any of] that. And now, now there’s a realization, we need to start talking about our problems. You can’t manage what you can’t measure, and interestingly, you can’t predict who needs to be able to do the measuring.

Brian: Yep, I think you’re right. You know, one of the reasons we do this program is partly because we feel the need to try to get a larger audience, in terms of understanding what the threats are, what activities are taking place. And you’re right. There’s a tradeoff between making it publicly open versus trying to keep it closed to a closed group.

But as I think you’re pointing out, your opportunity for distribution, if you’re trying to do it in a closed community is much more limited, and the attackers know. They know what attacks they’re performing, and they know what things block them. And so ultimately, they’ve got the insights. We need to try to get the good guys with more insights.

Dan:  It doesn’t mean that everything needs to be a big loud hooah about whatever, because sometimes it’s great to just fix things quietly. Let me tell ya’, I’ve done the big thing. I like the little thing, too. But I really do think particularly with threat data, we need to really, really start evaluating when we have it, could there be more good done if we were open with it?

Brian: You know, you mentioned – go ahead, Matt.

Matt: Well, I was going to step in and say, what are your thoughts on – if the data is truly open, as you’re saying, and it’s more of a distributed, where everyone can potentially be a source of that threat data, my concern would be vetting of that threat data. I’ve seen good intel, and I’ve seen terrible intel.

Dan: Oh, it’s true.

Matt: And if everybody’s feeding from those, all of the pools at once, you’re going to have your SOC, you know, flipping their lid over the number of false positives you keep tripping.

Dan: So raw – I’m mostly referring to raw intelligence –

Matt: Okay.

Dan: – being anything that’s more open. The absolute problem when – you’re entirely correct – when there’s too many sources. There’s too many opportunities for people to inject bad data. There’s a really fun attack class, where what you do – people are saying oh, you know, there’s lots of IPs on the Internet. So if a few of them are attacking us, let’s just block them outright and you know, whatever, they can go away. And so, what someone does, they pretend to be the Internet’s DNS root servers. The Internet’s DNS root servers are attacking you, so they get blocked. And in eighteen hours, the network goes down. So you’re absolutely right.

It’s just the ability to develop vetted intelligence is seeded on there being the availability of raw intelligence. I can’t tell you how many – there are entire attack classes that are not public. And no one can even start addressing defenses for them, until they become at least somewhat public.
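The spoofed-root-server attack suggests one simple guard for any automatic blocklist: check a "never block" list before acting on raw intelligence. The sketch below is an assumption about how such a guard might look, not a description of any product. The single allowlisted address, 198.41.0.4, is a.root-servers.net; a real list would have to cover every address the network cannot afford to lose.

```python
import ipaddress

# Illustrative allowlist only (a.root-servers.net). Contents are an assumption.
NEVER_BLOCK = [ipaddress.ip_network("198.41.0.4/32")]


def should_block(source_ip, never_block=NEVER_BLOCK):
    """Return False for allowlisted addresses, however hostile they look."""
    addr = ipaddress.ip_address(source_ip)
    return not any(addr in net for net in never_block)
```

Because source addresses on UDP traffic can be forged, an apparent attack from critical infrastructure is better treated as a reason to investigate than as a reason to block.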

Brian:  So that that piece of threat intelligence, you know, you may report, there’s a command and control server at this IP address. But it only worked for that one attack, and that one time, in that one place. And for everybody else, it’s something different. That information is really useless in the threat intelligence world. So those kinds of things, you’re absolutely right.

Dan: Or maybe that means they don’t do the attack in the first place. Maybe they’re afraid they’ll get caught, because they’re seeing themselves getting caught.

Brian: Creating a deterrence is a very positive thing, absolutely. So, bringing it out to the public I think is a very helpful aspect of this. So, what do you think is the solution? How do we get a more open sharing environment?

Brian: You have a good point, yeah. We haven’t tackled that healing challenge yet.

Dan:  And it is the kind of thing where I don’t even know if market forces are going to be sufficient in order to fund this. But the value I think to the global economy of really funding, hey, let’s run a defense for six months, and run it on a similar population, only have it be absent. Let’s have a placebo – a control group and an experimental group. Come back in six months and see if there’s a difference in infection rates.

This kind of work is actually a good thing to do, and it is not the kind of thing we’re doing in information security. So I think the path towards any of this stuff working is actually investing in finding out what works and what doesn’t, and it’s going to be expensive. It’s going to be really expensive.
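The control-group comparison Dan proposes reduces to comparing two infection rates. A minimal sketch, assuming simple counts per group and using a standard two-proportion z statistic; the function name is hypothetical and any numbers fed to it would be invented for illustration, not data from the interview.

```python
import math


def infection_rate_difference(control_infected, control_total,
                              treated_infected, treated_total):
    """Return (rate difference, z statistic) for control vs. defended group."""
    p_control = control_infected / control_total
    p_treated = treated_infected / treated_total
    # Pooled rate under the assumption that the defense makes no difference.
    pooled = (control_infected + treated_infected) / (control_total + treated_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_total + 1 / treated_total))
    z = (p_control - p_treated) / se if se > 0 else 0.0
    return p_control - p_treated, z
```

A large positive z suggests the defended group really was infected less often; a value near zero suggests the defense made no measurable difference over the trial period.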

Brian: Yeah. You know, we’ll talk about this, I think in sort of a broader context in a little bit here. So, let’s take a little quick break here. We’ll move over to Jim. And Jim – actually Dan, you had mentioned some work, you were with Microsoft. And Jim’s going to tell us a little bit about some changes in the patching processes. So, tell us about it Jim.

Jim: And he talked about how Microsoft with Windows 10, which is due out later this year, is going to change their patching processes a little bit.

It’s not going to be one big Patch Tuesday every month. For home users, they’re going to start making the patches available as soon as they’re ready, not holding them until the second Tuesday of the month.

So businesses will be able to set their own date, when within the month they want to apply patches. And they can wait a little bit after they’ve been tested out on the guinea pig home users. It’s kind of an interesting change in the way they go about doing things. We’ll have to see how well it works. I mean, it seems to work okay for the Linux distros these days. They release their patches whenever they’ve got them ready.

And one – I think it was the Register article that was explaining this said, this new policy looks sort of like apt-get update and apt-get dist-upgrade. This is a similar kind of thing. You can automate this and do it fairly quickly. So it’s going to be interesting to see how this works out.

As I said, it’s kind of appropriate that we’re mentioning this on Patch Tuesday. Because as I said, there were a whole bunch of new patches again this month, and three of them that Microsoft called critical. And a few more that I probably would have called critical, because they’re remote code exploit. But we’ve discussed that in previous months, so.

Brian: Okay. What are your thoughts, Dan, about the changes in the practices area? Do you think this is a good thing?

Dan: Sometimes, patches break a whole bunch of stuff.

Brian: Sometimes, they do. That’s certain.

Dan: If we can at least get to the point where there’s like the bleeding edge, the normal business, and the factory floor that absolutely never needs to change, that’s at least three. And that’s less than there might otherwise be.

Well, I’m going to be honest. It’s a hard problem to patch software, because there’s just so many moving parts. Google got into a ton of trouble when they had made a dependency in Chrome on a new feature in the Linux kernel. But the default Ubuntu kernel didn’t have that feature. So Chrome just stopped working on Ubuntu. And as far as I know, that state continues. So this is the difficulty of software. We are constantly putting things together, and hoping, dreaming, assuming it’s going to work after.

Chrome and Firefox have actually done a very good job of showing that yeah, you can actually really keep updating things. But remember, the dependency in the browser world is you got to work with the latest browsers. There’s an entire team at every major website that makes sure stuff still works. And let me tell you, when stuff is broken, yeah the browser guys try not to, but the web guys go ahead and are there to fix it.

What happens when it’s a business that has, like the IT guy, who’s maintaining some old binary code? There is no source around. That’s the kind of guy who’s like hey, I don’t want any moving parts in my operating system that are surprising me.

Brian: Yeah. So that’s an interesting dynamic, because when you start getting into the business aspects of it, it seems like it’s more going toward the needs of the many kind of thing. Where the needs of the few are kind of – I’m using the Star Trek thing here. Where the needs of the many start to get [managed] but the needs of the few start to get belittled. And so it’s that case where, are there really that many Ubuntu users of Chrome that needed to be out there? Is that really a priority? Or, are they really just satisfying the majority of the users?

Dan: And a lot of companies have tried to go without, and eventually – this stuff comes in waves.

QA does well, but it’s slow. People are like, well let’s just get rid of it. And find out in the field, move fast and break things. And then, they move fast and break things. And things are broken, and it’s really bad. So, we’ll see exactly what ends up happening here. There are processes and procedures where the code you end up putting out, more likely to work in the first place. Or, you put it out in waves. Certainly, one of the big ways that Chrome – you may not notice it, but you are randomly running random future builds of Chrome all the time.

Brian: Right, right.

Dan: And that’s how they find out before they do a production release, is this something that’s going to go break everything? They actually get telemetry back. It really all does come back to telemetry. This is, you know, security engineering problems are, in very serious ways, just more engineering problems.
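Rolling a build out in waves and letting telemetry decide whether to widen it can be sketched as follows. This is not how Chrome actually does it; the hashing scheme, the function names and the 1% crash threshold are all assumptions made for illustration.

```python
import hashlib


def in_rollout(client_id, build, percent):
    """Deterministically place roughly `percent` percent of clients in the early cohort."""
    digest = hashlib.sha256(f"{build}:{client_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return bucket < percent


def should_widen(crash_reports, sessions, max_crash_rate=0.01):
    """Widen the rollout only if telemetry shows an acceptable crash rate."""
    return sessions > 0 and crash_reports / sessions <= max_crash_rate
```

Hashing the build name together with the client ID means a different slice of users goes first on each release, so the same people are not always the test population.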

Brian: And if they don’t get complaints or they don’t get that negative telemetry back, then they can do a broader thing. And they’re not waiting for monthly cycles to do that.

Dan: Microsoft is right.

Brian: Yep, absolutely.

Matt: And maybe you’ll work, and you’re going to come out on the other side okay, or maybe you won’t.

Brian: Well, and in some cases, they can have a volunteer group that does that. You can sign up for, would you like the beta releases of something to evaluate them? But my suspicion is in most circumstances – I can’t speak for Microsoft in this case. But my suspicion is in most cases, it’s like well, let’s try it. And well, you know, if – maybe I need to reboot or something and then –

Dan: Well, it’s a specific style of engineering. Where if there’s a failure, you actually have like local rollback. It’s like hey, this was tried. It didn’t work. Don’t do any damage. And you got to be really careful when you do it, and it makes your patching and it makes your testing more expensive. But the reality is, is someone’s going to be the hamster.

Matt: True.

Dan: You need to have the ability to update problems.

You can have an infrastructure that can survive your 1 out of n, where n is unknown, but not impossible. One out of n times, a patch is going to break things. Figure out how to survive it. That was one of the big reasons why Windows Update changed the world. It took Windows from a thing where attackers could assume that a bug today was always going to have a large population.

Brian: Right.

Dan: To one where it was like bugs had a timeline. And once they were going to go, that was like, they’re going to go. And it made things better. It made things a lot better.

Brian: I’m glad you brought that up. I’m not sure I’ve ever mentioned it on this show. But I absolutely agree with you. I think Microsoft really changed things when they did the automatic update. They weren’t the inventors. They were following, I think. But the –

Dan: I think they did it – they were the first ones to do it right at scale. And by that, I mean it wasn’t – updating systems is hard. And forget all the stability issues, although they’re pretty significant. Just secure –

Brian: A large diversity of different systems, yeah.

Dan: And sometimes they’ll be secure, unless there’s a bad guy, unless someone blocks the secure side. Then goes well, I need a patch, so let me get this random code. Oh, look at this, you know.

Brian: That must be better, because it’s not what I’m running now, right?

Dan: I wish you were joking, but that’s totally the design assumption.

Brian: I’m pretending to joke here.

Dan: Of course.

Brian: [So, Paul Vixie.]  You’ve worked with him quite a bit, huh?

Dan: Yeah, he jokes, he spent six months in a well with me.

Brian: With a positive outcome.

Dan: We fixed DNS. We fixed a big part of it, to the degree it could be.

Brian: Yeah, that’s good. So, he made a recent proposal. And Matt, maybe you can tell us a little about it, and we’ll talk about it some more.

Matt:  And it’s either going to be a thirty – could be anywhere from thirty minutes to an hour, to a week. The time period is something that I think people would still be debating for a long time. But the idea is if anybody can give reasons why this domain should not be able to be used, it would be denied. But it also has – that cool down period means that no one can use it in that time period. So anybody who’s registering large numbers of domains, and immediately using them and throwing them away, will no longer have this advantage.

But, I feel like there’s always edge cases in a system like this, where if you throw a monkey wrench into the flow of things, it will have a bigger impact that maybe you haven’t quite thought about yet. I’m not saying he hasn’t thought about it. But I’m saying I don’t know what it is yet, personally.

Brian: Well, I’m going to ask Jim’s opinion on this. Because I think it was just a week or two ago, Jim, that you talked a little bit about a domain name generator algorithm that was – what was he using? The exchange rates as one of the feeders into –

Jim: Right. He was using European Central Bank euro exchange rates in their algorithm, yeah.

Brian: So, it sounds to me like this is really a proposal to try to put a deterrence against that sort of thing. That is, if there is a domain name generator, you would have to have some type of a way to predict what it’s going to be, so that you could get past that wait period, before the domain name could be actually activated. In which case, hopefully somebody else has some knowledge of it, and you’d be able to sway its potential use in that malicious activity. So, I don’t know. Do you have any thoughts on this, Dan?

Dan: When Paul says, there’s really not a legitimate use for a lot of these domains that have only been around for thirty seconds, he’s probably right.

Now, third-level domains, people are generating random third-level domains all the time, because there’s all these interesting reasons why you want to have randomized or data contained inside of a DNS label. That has a bunch of legitimate stuff. But second layer stuff, he’s right. You know, when you have something where 99.99999 percent of uses are illegitimate, you got to kind of look askance and say hey, you know, maybe this is where we put some pressure.

Brian: Yeah, this is a terrible analogy but I can’t help but think. When I read it, I was thinking this is what has to happen when you go to buy a weapon. You know, you got to buy a weapon and they say well, we want to make sure you’re not mad, and buying a weapon. Or having some malicious intent planning to rob a bank or something, and buying a weapon. So, but DNS is not a weapon, obviously. But it’s a case that can be – it’s a tool that can be used in nefarious ways.

Dan: Like think about how much money that has gotten paid for like Amazon and Apple, and Microsoft to have their domain names, versus like how much money they made for them. Like, nano-pennies on the dollar.

And that wasn’t going to be the way for AOL or Minitel or all the pre-Internets. But on this Internet, it’s very inexpensive to go on. You don’t pay a gatekeeper tax, and that’s really part of the heart of the success of this Internet. Where things get to be a bit of a headache is low friction for good honest providers is also low friction for fraudsters. And so, a real observation is that the fraudsters are trying to leverage the availability of the DNS, because it is the most available thing available.

They’re leveraging that availability. They’re using stolen funds to buy all these domain names. And maybe there’s an argument that they don’t necessarily have to work quite that quickly. That those who wish to defend themselves should be able to use the age of the domain. And in fact, the domain should take a little while to age into legitimacy.
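Using the age of a domain as a defensive signal comes down to one comparison. The sketch below assumes the registration time is already known (for example from registry data) and picks a 24-hour cool-down purely for illustration; as Matt notes, the right period is still being debated.

```python
from datetime import timedelta


def is_aged_in(registered_at, now, cooldown=timedelta(hours=24)):
    """Return True once a domain has existed for at least `cooldown`."""
    return now - registered_at >= cooldown
```

A resolver or mail filter could refuse, delay or simply score down any name for which this returns False, which removes the advantage of registering a domain and using it within seconds.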

Brian: Right. Like any good liquor, right?

Dan: Yeah, right?

Brian: Okay.

Dan: That’s right. Jack Daniel’s security.

Brian: Well hopefully, you don’t have to wait twenty years for it, right? I think they age at three years or four years or something. So anyway, we’ve talked a lot about DNS. We’ll go take it a step further here. You’ve been a big proponent of DNSSEC and, but it’s not quite there yet. I don’t know if you know this, but we’ve got a little back and forth. I have sort of some reservations about DNSSEC. But by the same token, let’s talk about it a little bit. Where do we need to go?

Dan: Because we have a law called HIPAA that says if you can’t communicate securely, you can’t communicate at all.

Brian: Right, right.  So we haven’t tackled – we really haven’t managed the key management activities.

Dan: It lets you get key material as easy as you get basic connectivity.

Brian: So it sounds like you’re going beyond DNS per se. It’s not necessarily just looking for domain names, but perhaps to use it for key material distribution.

Dan: The whole point of DNSSEC, the real point, is to get security as functional as connectivity. Like, it’s not a coincidence that our lack of DNS in security – like it’s not a coincidence that we have no DNS in security, and we don’t have security that scales. It’s a consequence. That’s why it’s not scaling. You look at what the world would look like for IP connectivity if you didn’t have DNS, and it looks exactly like the nightmare of key management.

And the nightmare of key management is very specifically that it is very difficult to automate. We have to get significant automation in security if we want any hope of solving a lot of our problems with the resources we have available. And where that’s going to ultimately go is we’re going to use the DNS as our cross organizational key store. This is what’s going to happen.
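One existing mechanism for publishing key material in DNS is the DANE TLSA record (RFC 6698), which relies on DNSSEC to authenticate the answer. The sketch below only shows how the record name and data are formed for the "3 0 1" parameter choice; it is an illustration of the general idea, not necessarily the design Dan has in mind, and the function names are hypothetical.

```python
import hashlib


def tlsa_owner_name(host, port=443, proto="tcp"):
    """DNS name under which a service's TLSA record is published."""
    return f"_{port}._{proto}.{host}."


def tlsa_rdata(cert_der):
    """TLSA data: usage 3 (DANE-EE), selector 0 (full cert), type 1 (SHA-256).

    `cert_der` is the server certificate as DER-encoded bytes.
    """
    return f"3 0 1 {hashlib.sha256(cert_der).hexdigest()}"
```

A client that trusts the signed DNS answer can then check the certificate it is offered against the published hash, with no out-of-band key exchange between the two organizations.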

Brian: So, I agree with you thoroughly. We need to do something to improve the security on DNS. And I guess, where my reservation comes in is completely separate from that point. I think it has more to do with the way we went about implementing security for DNS. That is some of the fundamental issues that we deal with on DNS – we were talking a little bit earlier about reflective attacks and the opportunity for using UDP-based protocols in a nefarious way. It’s really just any UDP protocol that has this problem.

Dan: DNS because there’s this record or that record. Who cares what records are? The point is, is the underlying IP layer, and our underlying ability to trace DDoS floods have a problem. That’s where we need to fix.

Brian: Is there a time when we should, if we suspect something, switch to TCP?

Dan: It’s weird about what do we do about the fact that in UDP-based protocols – the problem is, is that with UDP-based protocols, there’s no evidence that the other side actually wants to talk to you.

Brian: Right.

Dan: We used to have a thing in IP called source quench, where the thing in a generic way could say hey, stop talking to me.

Brian: And it might listen.

Dan:  And the answer is to actually start investing in mechanisms for pushback, where we get automation throughout the traceback flow, throughout the shutdown flow, throughout firewalling. And it’s doable, but we got to do it. But protocol design is a mess right now. The real world is like: Hi. The first thing you get to do is route everything over HTTP and probably HTTPS, because there’s something in the middle that might mess with you. Protocol design is sausage engineering in 2015. You really don’t want to know.

Brian: Yeah, to your point. One of my slogans has been that on the Internet, there are no rules. There are generally guidelines.

Dan: Right.

Brian: And I think that’s fundamentally what we have to sort of overcome. As we really need to kind of lay some groundwork on what are really good practices, and have some means to enforce that. And I think that was one of the topics that you kind of had here is that, you know, how are we going to fix this? Is there a way to really improve our situation from a security standpoint?

Dan: One of the quotes [large financial institutions] told me is we don’t compete on security. Because if any of us get hit, we’re all getting hit.

Brian: Right.

Dan: There’s significant tooling that everybody needs to exist. And that in some ways as us, and professionals in information security, we’re the only people actually directly exposed to the problem. We’re the people in the muck, to deal with all this stuff. The tools we build to start dealing with it, that stuff needs to be shared a lot more than I think it already is.

Brian: Yeah, sharing a lot more. You know, and in one respect, I think it may be just fundamental information overload for the, you know, your practical human being. That is us, as practitioners in security, we’re paying attention to the security aspects. But for the folks that are not practitioners in security, it’s an overwhelming amount of information that needs to be comprehended, in order to do a completely separate activity.

Dan: What do we do for them? You know, there’s one thing about like building hard things that are hard. But there’s another thing – there’s like building hard things, so that the next guy, it’s easy.

Brian: That’s exactly – build the modularity, so it’s fool-proof.

Dan: You know what?  There’s a lot of people out there really who can see a crash, but have no idea, what do I do with the crash? I’ve got 100 crashes. Which are the ones that I need to go ahead and prioritize in the bug database? Because it’s a problem. And Microsoft said, fine, here’s a tool. Type this, it will tell you. And that is the path to follow. How do we find our problems and figure out what makes them easy to solve? That can be open source that’s out there. That can be even just releasing reviews and experiences of commercial products. Like it has to be the stuff that makes things better is widely known to make things better.

Brian: Yeah, you know, an analogy that just came to mind is I wanted to build a shed to store my junk on my property. And I am terrible with a hammer. My solution? Bought a nail gun.

Dan: There you go.

Brian: It’s a tool that made the job easy, and it was only a couple hundred dollars. I love it. Just don’t try to nail on your own.

Matt: But if you had known that you can buy sheds down at Home Depot for around $70, you might have gone that route. But you didn’t have the information yet.

Brian: I bought a kit. So Matt, tell us a little bit about a new kind of rootkit? Is it a new?

Matt:  But, you know, it depends on what machine you’ve got. But the thing about it is, a GPU, is it’s a standalone processor that you slide into your machine. It handles graphics functions, but it can also be used for other functions. It is a full on processor. People often use them for doing bitcoin mining, or hash cracking, or other computationally intensive stuff. So someone has written code that runs entirely in there, stores itself to the memory on this card, and is effectively invisible to most antivirus.

So this is a rootkit, so it has the ability to hide other code. So you might use it in order to hide your malware, which is still running on the CPU. And you can access system memory using DMA, Direct Memory Access. So it’s interesting. Like I was saying, this exists in other forms as well. People have written code that runs entirely on the controller of a hard disk. So again, if it’s running on a separate machine, and it is a fully separate machine, it doesn’t have the same kind of – your antivirus is not going to be looking for it. Or at least, today’s antivirus is not going to be looking for this.

Brian: It’s almost like an IoT thing, an Internet of Things thing, but it’s just not a network interface. It’s like a PCI interface, for example.

Matt: Standalone GPU’s don’t exist in all hardware. And I think – in all PC’s. And I think that if you want to have malware that’s truly successful, and spreads widely and runs on most platforms, you wouldn’t necessarily limit yourself to hardware that you’re on the fence, as to whether or not most of your targets will have it.

Brian: Would this be kind of specialized to particular GPU’s as well?

Matt: I’m not actually sure about that. And I guess it depends on the architecture of the GPU, and I’m not an expert on them. I would defer to somebody else.

Dan: I’m used to [this question].

Brian: Okay.

Dan:  Granted it’s the big one, but all those other ones mutually trust each other. See, the way it works when you’re doing computer engineering is, it’s like man, you know, making the CPU’s spend all this time dealing with this fiddly problem is really inefficient. Let’s take that problem and put it on a dedicated device. And then it’ll just like access memory, and send events saying, I did the job. So you compromise the external device, and you get all the access, and you don’t have to deal with all that pesky inspection.

So, it’s of course not limited to GPU’s. Most likely, it’ll have to be customized. There are two things you’re trying to do when you operate off the main CPU. One, you’re trying to evade detection during that particular boot. Potentially, there’s dedicated memory that no one can see you’re running, that you’re pulling, that you’re doing stuff.

You’re also trying to achieve persistence. There’s a reason why there are facilities. And if a machine is compromised, you throw it out. It’s a very expensive solution to the problem, but it’s also the only way to be sure.

Brian:  You know, rsync is basically a tool to be able to synchronize files between two systems. It’s oftentimes used for a backup tool, or to be able to basically redundant systems. There’s a good possibility that some activities are taking place across the Internet. It’s actually a single source in China that’s doing most of this probing activity. They’re also probing a variety of other ports. I didn’t try to enumerate those here, but a number of other ports.

It would be indicative of trying to perform penetration activities against systems. So, keep an eye out. If you are using rsync over the Internet, you’ll want to pay attention to that. And even if it’s not intended to be on the Internet, you might want to make sure that it’s not exposed to the Internet.

Jim: The timing of that is interesting, because as I recall, I think back in September or so, there was a vulnerability in rsync on some load balancers. That I don’t know exactly what the timeframe was that that scanning started, but it looked like it might have been back in that vicinity.

Brian: We’re showing 120 days of an activity, and it was actually in the beginning of March that we saw sort of an uptick in this activity.

Well, it turns out that Jordan Wright had done a couple of blogs on this particular topic. One where he was – and this is actually just from yesterday, May 11th, where he had been tracking, over 60 days, watching hackers attack Elasticsearch. In fact, he had found a vulnerability that perhaps is associated with this particular activity. Dan, you had taken a little bit of a look at this. Any comments?

Dan: Really?

Brian: Yeah, really.

Dan: No, no. Like, there’s remote code execution vulnerabilities. And then where there’s just a field, that’s like, please place the code in that you would like us to execute, to run across this search. And at some point fairly recently, they’re like oh, maybe we should put that into the Java sandbox, which is basically a discredited sandbox. It’s very clear, this thing that’s easy to break out of.

Brian: Yeah, I had a lot of problems .

Dan: .

Brian: Right. You know, to have a feature like this in a closed environment. You know, we talk oftentimes about having layers of defense. And if your only layer of defense is a sandbox, it’s probably not a good defense.

Dan: So it’s a very good point, that this feature’s totally fine if you are running the code as kind of a local thing. But by default, it wasn’t installed on a local thing. It listened on all interfaces on port 9200. And then shockingly now, people are scanning everything on the Internet on 9200. So hopefully, this is getting managed.

Brian: And there is the potential, you know, like we saw with the Bash vulnerability. Where there is the potential that perhaps there is a frontend interface to an Elasticsearch system. Perhaps they’re not scanning port 9200 here, but it could be a web interface that would potentially expose this vulnerability as well, I presume.

Dan:  I always like to talk about the million most important lines of code, that are being exposed to attackers, that would cause problems across the global Internet. It should be the same for your organization. It’s okay to use open source. Everyone does and it’s really good stuff. But when there’s problems like this, especially as you say, you know, what you’re doing is doing what John Lambert at Microsoft calls, not thinking in terms of lists, but thinking in terms of graphs. It’s not just what’s exposed on 9200. It’s what’s exposed on 80 that forwards stuff to 9200. Because that is how you find really good attacks.

Brian: Good, from an attacker’s point of view.

Dan: Well, yes.

Brian: Okay. This next – go ahead.

Jim: Yeah, well and the guy, Jordan Wright, who did the blog post that you were looking at a minute ago, also released a Honeypot, Elastichoney that I’m going to throw up on one of our honeypots, and see what we can get out of that. That sits on 9200 and pretends to be Elasticsearch.

Brian: Yeah, very cool. We’ll look forward to some results from that. Next item here is we have flows, packets, and bytes that were off the charts, relatively speaking, on port 53/udp. That’s DNS. We talked a lot about DNS today. We’re showing 30 days of activity. And really what this amounts to, this was actually a reflection attack. And it turned out that this was a reflection using NTP, so the source port is 123. The destination port happened to be 53. So they were targeting –

Dan: Can’t they just pick one?

Brian: Looking at the top ten most probed ports. At the top of the list here, we have port 80. Look at that, port 80 is through the roof. That’s rather unusual here.

And probing by the way, this is looking for sources that are making connections to lots of different addresses on a common port or a handful of ports. And so, we track that activity as probing or scanning activity on the Internet, and it helps us to identify this sort of activity. Port 80 is normally probed quite a bit. It usually shows in the top ten, but not at this proportion. So this is a little bit of an anomaly, a big anomaly that we’ll take a little closer look at. Followed by port 22/tcp, 23/tcp, no surprises there. Port 445, can you believe it? Still Conficker on the Internet.

Dan: No!

Brian: They appear to be actually sort of a SYN flood against a block of addresses. So it’s actually about, like a slash 23 address block.

And so you see lots of flows from each of these source addresses being thrown to those, on the order of tens of millions of those, of course. So it appears to be a SYN flood against a block of addresses that are located in China. They appear to be associated with video game hosting. So it appears that perhaps somebody has a bit of a beef against them. We don’t have intimate details of that however. Interesting, we’ll call it false positive in the class of probing activity.

Next one here is probes on port 23/tcp. That’s Telnet, and we do have an increase in that. We’re showing 90 days of activity here. And over the last week or so, you can see that there’s been an uptick in that activity. We’re going to take a look at that, in terms of the number of sources doing that probing in a couple of minutes here. And then looking at the – in fact, in a couple of seconds – most sources doing the probing, port 23 at the top of the list. It’s clearly far and above the others, and moved up a couple of places relative to last week. Followed by port 445/tcp.

And then we also have some other ports. We’re going to take a look at port 23 and port 17788 a little bit more closely.

You know, we had identified this as being very indicative of BitTorrent activity, and it appears to be associated with some pirated video content, basically being distributed toward China. The reason I brought this up again, we’ve reported on this a couple of times is that, it appears that whatever activity here, they had a little bit of a disruption in service. And that seems to be a pretty typical – even the folks that are doing bad things have these reliability issues that they have to deal with.

So in any case, that’s our show for today. We’d like to thank you for joining us. And if you’d like to get in touch with us, you can email us at threattraq@list.att.com. And you can find ThreatTraq on the AT&T Tech channel. It’s att.com/threattraq. It’s on YouTube and on iTunes. You can follow us on Twitter. Our handle is @ATTSecurity. And Dan, your Twitter handle.

Dan: @dakami, D-A-K-A-M-I.

Brian: All right, so we really appreciate your feedback. If you’d like to share your thoughts or questions, we look forward to hearing that. I’d like to thank you, Dan, for joining us today. Very much a pleasure. I really enjoyed speaking with you here today.

Dan: This was a lot of fun.

Brian: Thank you, Matt. Thanks Jim. I’m Brian Rexroad. We’ll be back next week with a new episode. And until then, keep your network safe.

Categories: Security

The Little MAC Attack

May 7, 2015 3 comments

THIS IS NOT A BREAK OF HMAC.  THIS IS NOT A BREAK OF HMAC.  THIS IS NOT A BREAK OF HMAC.

That being said:

Let bz=blocksize(h), k=a[0:bz]^(0x36*bz): If h(a)==h(b) and a[0:bz]==b[0:bz], then hmac(k, a[bz:])==hmac(k, b[bz:])

Given a hash function with a collision and a key either known or controlled by an attacker, it’s trivially possible to generate an HMAC collision.  The slightly less quick and dirty steps are:

1) Start with a file at least 64 bytes long
2) Generate a collision that can append to that file.
3) XOR the first 64 bytes of your file with 0x36’s.  Make that your HMAC key.
4) Concatenate the rest of your file with your colliding blocks.  Make those your HMAC messages.
5) HMAC(key, msg1+anything) will equal HMAC(key, msg2+anything)
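
Steps 3 and 4 can be sketched in Python 3 (the sessions in this post are Python 2). This is only an illustration: the helper name little_mac_split is made up, and the sample file is filler rather than a real collision. What it shows is that once the key is the first block XORed with 0x36, HMAC's inner hash is exactly the plain MD5 of the whole file, so the message only reaches the outer hash through that one digest.

```python
import hashlib
import hmac

BLOCK = 64  # MD5 block size in bytes

def xor_bytes(data, mask):
    """XOR every byte of data with a one-byte mask."""
    return bytes(b ^ mask for b in data)

def little_mac_split(file_bytes):
    """Hypothetical helper: turn a file into an (hmac_key, hmac_msg) pair.

    Step 3: key = first block XOR 0x36.  Step 4: msg = everything else.
    """
    if len(file_bytes) < BLOCK:
        raise ValueError("need at least one full block")
    return xor_bytes(file_bytes[:BLOCK], 0x36), file_bytes[BLOCK:]

# Filler data standing in for "prefix + colliding blocks" (NOT a real collision).
demo_file = b"f" * BLOCK + b"any trailing content; colliding blocks would go here"
key, msg = little_mac_split(demo_file)

# HMAC's inner hash sees (key XOR 0x36) + msg, which is the original file again.
inner = hashlib.md5(xor_bytes(key, 0x36) + msg).digest()
assert inner == hashlib.md5(demo_file).digest()

# The outer hash only ever sees the key and that 16-byte inner digest.
outer = hashlib.md5(xor_bytes(key, 0x5C) + inner).hexdigest()
assert outer == hmac.new(key, msg, hashlib.md5).hexdigest()
```

Any two files that share a first block and an MD5 digest therefore produce the same key and the same inner digest, which is the whole trick.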

Here’s the quick demo:

>>> f
'ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff"Fresh Prince Of Bel-Air (Theme Song)"\n\nNow, this is a story all about how\nMy life got flipped-turned upside down\nAnd I\'d like to take a minute\nJust sit right there\nI\'ll tell you how I became the prince of a town called Bel Air\n\nIn west Philadelphia born a'
>>> key=f[0:64]
>>> key
'ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff'
>>> trans_36 = bytearray((x ^ 0x36) for x in range(256))
>>> key=key.translate(trans_36)
>>> key
'PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP'
>>> msg=f[64:]
>>> collision0==collision1 # calculated w/ md5coll, thanks Stach!
False
>>> hmac.new(key, msg+collision0).hexdigest()
'917e8efe9a5c692e50b86e9581c509f3'
>>> hmac.new(key, msg+collision1).hexdigest()
'917e8efe9a5c692e50b86e9581c509f3'
>>> hmac.new(key, msg+collision0+"can also append arbitrary content").hexdigest()
'b8a879d7cc7a89ed59d0768fdfec3351'
>>> hmac.new(key, msg+collision1+"can also append arbitrary content").hexdigest()
'b8a879d7cc7a89ed59d0768fdfec3351'

How cool is that???

We might have different definitions of cool.  Particularly since — to aggressively not bury the lede — there really shouldn’t be any security impact here.  HMAC depends on a secret.  Obviously if the attacker knows the secret it’s not a secret!  And in what universe would it be HMAC’s responsibility to provide collision resistance for its contained hash?  So this is in no way a break of HMAC (at least in any sane use of the construction, though sadly sanity is not in fact universal).

And yet.  This is novel: I believe Little MAC is the first applied “attack” against HMAC, in any form.  And importantly, it’s simple and elegant, something that can be explained.  And it’s fun!  Remember when we did things for fun?  So, let’s talk about this Little MAC attack and how it works.

====

We’ll start with the basics.  Hash functions exist to take fingerprints of data, like so:

$ python
 Python 2.7.8 (default, Jul 25 2014, 14:04:36)
 [GCC 4.8.3] on cygwin
 Type "help", "copyright", "credits" or "license" for more information.
 >>> from md5 import md5
 >>> md5('1').hexdigest()  # hash of "1"
 'c4ca4238a0b923820dcc509a6f75849b'
 >>> md5('2').hexdigest() # hash of "2"
 'c81e728d9d4c2f636f067f89cc14862c'

The files can be of practically any size, like people can, but their fingerprints (or “hashes”) end up the same size (128 bits, or 32 hexadecimal characters for the MD5 hash).  It’s supposed to be unrealistic for anyone to make two files with the same hash.  This allows all sorts of useful security properties, like it being safe to retrieve a file over an insecure channel because you know the hash it’s supposed to end up with.

Supposed to.  Weasel words if there ever were.  In 2004, Xiaoyun Wang of Shandong University showed MD5 doing the thing it really wasn’t supposed to do (that, to be fair, Hans Dobbertin made clear was inevitable back in ’96):

>>> from md5 import md5
 >>> from binascii import unhexlify
 >>> a = """
 ... d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89
 ... 55ad340609f4b30283e488832571415a 085125e8f7cdc99fd91dbdf280373c5b
 ... d8823e3156348f5bae6dacd436c919c6 dd53e2b487da03fd02396306d248cda0
 ... e99f33420f577ee8ce54b67080a80d1e c69821bcb6a8839396f9652b6ff72a70
 ... """
 >>> b = """
 ... d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89
 ... 55ad340609f4b30283e4888325f1415a 085125e8f7cdc99fd91dbd7280373c5b
 ... d8823e3156348f5bae6dacd436c919c6 dd53e23487da03fd02396306d248cda0
 ... e99f33420f577ee8ce54b67080280d1e c69821bcb6a8839396f965ab6ff72a70
 ... """
 >>> a=unhexlify(a.replace(" ","").replace("\n","")) # translate from hex to binary
 >>> b=unhexlify(b.replace(" ","").replace("\n",""))
 >>> a==b
 False
 >>> md5(a).hexdigest()
 '79054025255fb1a26e4bc422aef54eb4'
 >>> md5(b).hexdigest()
 '79054025255fb1a26e4bc422aef54eb4'
 >>> md5(a).hexdigest()==md5(b).hexdigest()
 True

And that is how you get a standing ovation at Crypto 2004.  Over the next few years, various people (myself included) showed how to extend this result to compromise real world systems.  Probably the best work was by Stevens and Sotirov et al with their work actually compromising the processes of a real world Certificate Authority, acquiring extensive powers over the web using a very elegant MD5 attack.  And so, slowly but surely, industry has learned not to use MD5.

But what about HMAC-MD5?

===

HMACs (Hash-based Message Authentication Codes) solve a slightly different problem than pure hashes.  Let’s say Bob is validating that some data matches some hash.  Why should he trust that hash?  Maybe it didn’t come from Alice.  Maybe it came from everybody’s favorite hacker, Mallory.  If only Alice and Bob had some secret they could use, to “mix in” with that hash, so that Alice had enough information to provide a “keyed hash” but Mallory did not.

There are lots of ways of doing this, many of which are entertainingly ill-advised.  One way that is not, is HMAC, generally considered the standard construction (method of putting cryptographic primitives together in a way that does something useful, hopefully securely) for keyed hashes.  The first thing to note about HMAC is, at least at first, it does not seem to suffer from the Wang collision:

>>> import hmac
 >>> hmac.new("",a).hexdigest()
 'ad0b4561611f06292377ea66c7867db5'
 >>> hmac.new("",b).hexdigest()
 'c5c472f9a5ecf134c93aa09169cb8756'

So, blank keys mixed with Wang’s ‘vectors’ do not collide.  This research comes out of some conversations with Marsh Ray, who commented that HMAC with known keys derives to MD5.  It’s not that easy :).  You have to jump through at least a few hoops, that require looking inside of HMAC itself.

So what is HMAC?

Basically, it’s double hashing with a key and some cleverness, optimized for speed (for example, the data isn’t hashed twice, which would be slow, instead the outer hash runs across the results of the inner hash.)  Since the attacker isn’t supposed to know the key, they can’t generate new keyed hashes that are valid.  (They can sure replay old keyed hashes, but you know, that’s some other layer’s problem.)

Mathematically, HMAC is hash((key XOR 5c5c5c…) + hash((key XOR 363636…) + msg)), with the key first padded out to the blocksize of the hash (specifically, how many bytes it operates on at a time, generally 64).  HMAC-MD5 means the hash is MD5, unsurprisingly.  HMAC represented in Python looks like:

from md5 import md5

trans_5C = bytearray((x ^ 0x5c) for x in range(256))
trans_36 = bytearray((x ^ 0x36) for x in range(256))
blocksize = md5().block_size # 64

def hmac_md5(key, msg):
    if len(key) > blocksize:
        key = md5(key).digest()  # over-long keys get hashed down first
    key = key + bytearray(blocksize - len(key))  # zero-pad to one full block
    o_key_pad = key.translate(trans_5C)
    i_key_pad = key.translate(trans_36)
    return md5(o_key_pad + md5(i_key_pad + msg).digest())
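
For readers on Python 3, where the old md5 module is gone, roughly the same construction can be sketched with hashlib. This is a minimal sketch and not the standard library's implementation; the name hmac_md5_py3 is invented here, and the loop at the bottom simply checks it against the stdlib hmac module for a few arbitrary keys.

```python
import hashlib
import hmac

BLOCK = hashlib.md5().block_size  # 64 bytes

def hmac_md5_py3(key, msg):
    """Hand-rolled HMAC-MD5 (illustrative name), returning the raw digest."""
    if len(key) > BLOCK:
        key = hashlib.md5(key).digest()      # over-long keys are hashed first
    key = key.ljust(BLOCK, b"\x00")          # then zero-padded to one block
    i_key_pad = bytes(b ^ 0x36 for b in key)
    o_key_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.md5(i_key_pad + msg).digest()   # the only hash that sees msg
    return hashlib.md5(o_key_pad + inner).digest()  # sees just key + 16 bytes

# Compare against the standard library for short, block-sized and long keys.
for key in (b"short key", b"k" * 64, b"k" * 100):
    expected = hmac.new(key, b"some message", hashlib.md5).digest()
    assert hmac_md5_py3(key, b"some message") == expected
```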

Now, I see this, and I go — oh!  There’s two hashes, and only one of them actually sees the full message!  If I can get the two inside hashes (two messages, two inside hashes) to return the same value — thus destroying any information about differences between the messages — who cares about the outer hash?

You know, nobody blogs about when they get something wrong.   That’s me, being totally wrong.

MD5 collisions actually depend on initial conditions.  Even if MD5(a)==MD5(b), in general MD5(x+a) != MD5(x+b).  (!= is nerd for Not Equal.)  And even with a blank HMAC key, x starts out as 64 bytes of 0’s, XOR’ed with 0x36, leaving a first MD5 block of 36363636…

Not what Wang’s collision expected to deal with — it thinks it’s going to be at the beginning of MD5 processing, not one block in.  So I think, ah!  I know!  MD5 wants to get the Wang vectors unmodified.  It doesn’t care if the bytes come in via one blob (a) or two blobs (x+a) or whatever, it’s just a stream of bytes to MD5.  So let’s split those vectors into the “key” component and the “msg” component — 64 bytes, and whatever’s left.  Now, of course HMAC is going to XOR that first 64 bytes with 36’s, but you know, XOR is reversible.  We could just XOR them first, and then HMAC will undo the damage, like this:

>>> 1 ^ 36
 37
 >>> 2 ^ 36
 38
 >>> 3 ^ 36
 39
 >>> 37 ^ 36
 1
 >>> 38 ^ 36
 2
 >>> 39 ^ 36
 3

This should at least take care of the Inner hashing, and indeed, it does:

>>> akey=a[0:64] # split the first 64 bytes...
 >>> bkey=b[0:64]
 >>> amsg=a[64:] # from the rest...
 >>> bmsg=b[64:]
 >>> akey_xored=akey.translate(trans_36) # 'damage' the keys
 >>> bkey_xored=bkey.translate(trans_36)
 >>> md5(akey_xored.translate(trans_36) + amsg).hexdigest() # let HMAC reverse the damage
 '79054025255fb1a26e4bc422aef54eb4'
 >>> md5(bkey_xored.translate(trans_36) + bmsg).hexdigest()
 '79054025255fb1a26e4bc422aef54eb4'

Huzzah!  We’ve gotten collisions in the inner hash of HMAC-MD5, using nothing but decade old test vectors.  So, we’re done, right?

>>> hmac_md5(akey_xored, amsg).hexdigest()
 'd9a57219479c530d0ae6d1a699a8e8b0'
 >>> hmac_md5(bkey_xored, bmsg).hexdigest()
 '87be91e4dfe3538589b965a912872432'

It…doesn’t work.  HMAC knows we’re up to something.  Well, of course.  While HMAC doesn’t run over your data twice, it sure does run over your keys twice, with two different XORs even.  Remember, those keys come from the first 64 bytes of two files that are not identical.  So HMAC sees the different data in the inner hash (where it’s compensated for), and in the outer hash too (where it’s not — can’t XOR defend yourself against both 36 and 5C, and this isn’t an accident).
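
To see exactly where the two keys give the game away, here is a small Python 3 sketch (the sessions in this post are Python 2). It re-enters Wang's vectors from the hex dump earlier in the post; the helper and variable names are just for illustration. It lists the byte offsets where the two derived keys differ, then checks that the inner hashes collide while the outer key pads, and therefore the final HMACs, do not.

```python
import hashlib
import hmac

# Wang's 2004 collision pair, copied from the hex dumps earlier in this post.
A_HEX = """
d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a 085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1e c69821bcb6a8839396f9652b6ff72a70
"""
B_HEX = """
d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a 085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1e c69821bcb6a8839396f965ab6ff72a70
"""
a = bytes.fromhex("".join(A_HEX.split()))
b = bytes.fromhex("".join(B_HEX.split()))

def xor_bytes(data, mask):
    """XOR every byte of data with a one-byte mask."""
    return bytes(x ^ mask for x in data)

# Same split as above: pre-XORed first block as key, the rest as message.
akey, amsg = xor_bytes(a[:64], 0x36), a[64:]
bkey, bmsg = xor_bytes(b[:64], 0x36), b[64:]

# Byte offsets inside the first block where the two keys disagree.
diff = [i for i in range(64) if akey[i] != bkey[i]]
print("keys differ at offsets:", diff)

# Inner hash: the 0x36 mask cancels, so MD5 sees the original vectors.
inner_a = hashlib.md5(xor_bytes(akey, 0x36) + amsg).digest()
inner_b = hashlib.md5(xor_bytes(bkey, 0x36) + bmsg).digest()
print("inner hashes collide:", inner_a == inner_b)

# Outer hash: the differing key bytes come back in under the 0x5C mask.
print("outer key pads equal:", xor_bytes(akey, 0x5C) == xor_bytes(bkey, 0x5C))
hmac_a = hmac.new(akey, amsg, hashlib.md5).hexdigest()
hmac_b = hmac.new(bkey, bmsg, hashlib.md5).hexdigest()
print("HMACs collide:", hmac_a == hmac_b)
```

The differing bytes all live in the first 64 bytes as well as the second, so there is no way to keep them away from the outer hash with these particular vectors.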

Now, could we simultaneously generate a collision dealing with both 36 and 5C XOR masks?

Maybe?  It’s possible, but we certainly have no idea how to do it right now.  Such attacks are known as Related Key attacks, and they’re pretty rare.  There’s no established research I’m aware of here, in the context of MD5.  So I guess HMAC wins?

Ha, no.  We just need to be a little more creative.

===

Funny crypto story.  When Xiaoyun Wang first announced her collisions to the world, they actually didn’t work, and she took a bit of heat.  “Oh”, she apologized.  “I misread a few of the numbers in the MD5 specification.  Here’s the correct collisions.”

That was a few hours later, thus demonstrating her attack could be executed in hours and not, say, years.

>>> Xiaoyun_Wang=="Baller"
 True
 >>> Xiaoyun_Wang_Wishes_She_Was_A_Little_Bit_Taller
 False

Let’s talk about exactly how MD5 actually works, and what it means to generate two files with the same hash.  MD5 uses what’s called a Merkle-Damgard construction, meaning it:

1) Starts with some initial values, A, B, C, D
2) Takes 64 bytes from a message
3) Uses those 64 bytes to shuffle those values in various interesting ways
4) Ends up with a new A, B, C, D
5) Goes back to Step 2 until a) the message is complete and b) padding plus 8 extra bytes have been included describing the length of what was hashed (“MD-Hardening”)
6) Returns A, B, C, and D as a series of bytes representing the hash of the data
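
The six steps above can be sketched as a compact, from-scratch MD5 in Python 3. This is a teaching sketch, not something to use in place of hashlib, and the names (compress, md5_states, md5_hex) are chosen here for illustration. It keeps every intermediate A, B, C, D so the chaining is visible, and it checks itself against hashlib.

```python
import hashlib
import math
import struct

# Per-step rotation amounts and additive constants from the MD5 specification.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
CONSTS = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)  # step 1: initial A, B, C, D

def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def compress(state, block):
    """Steps 3 and 4: mix one 64-byte block into (A, B, C, D)."""
    a, b, c, d = state
    m = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + CONSTS[i] + m[g]) & 0xFFFFFFFF
        a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & 0xFFFFFFFF
    # The new chaining value is the old one plus the shuffled one, word by word.
    return tuple((x + y) & 0xFFFFFFFF for x, y in zip(state, (a, b, c, d)))

def md5_states(msg, state=IV):
    """Return the chaining value after every block, padding block included."""
    padded = msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64)
    padded += struct.pack("<Q", (8 * len(msg)) & 0xFFFFFFFFFFFFFFFF)  # step 5: length in bits
    states = []
    for off in range(0, len(padded), 64):  # step 2: 64 bytes at a time
        state = compress(state, padded[off:off + 64])
        states.append(state)
    return states

def md5_hex(msg):
    """Step 6: serialize the final A, B, C, D as little-endian bytes."""
    return struct.pack("<4I", *md5_states(msg)[-1]).hex()

for sample in (b"", b"abc", b"f" * 320):
    assert md5_hex(sample) == hashlib.md5(sample).hexdigest()
print(len(md5_states(b"f" * 320)))  # number of chaining values for a 320-byte input
```

Passing a different starting state to md5_states is all it takes to hash "from the middle", which is the property the rest of this post leans on.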

There are two interesting elements from this design.  First, every 64 bytes, there’s a new A, B, C, and D, and they’re supposed to represent whatever might have been different in what came before.  If there’s ever a collision — if bytes 0 through 127 collide with some other bytes 0 through 127 — anything tacked on will have the information about the difference destroyed.  So, you can have something like:

>>> md5(a).hexdigest()==md5(b).hexdigest()
 True
 >>> ping = "// "
 >>> md5(a+ping).hexdigest()==md5(b+ping).hexdigest()
 True

More interestingly, and more critically, collisions don’t have to start from the first block.  Wang’s corrected ones did.  They came from these “Initialization Vectors”, as extracted from this Pure Python implementation:

# Load magic initialization constants.
 self.A = 0x67452301L
 self.B = 0xefcdab89L
 self.C = 0x98badcfeL
 self.D = 0x10325476L

But as you remember, she had just as easy a time colliding against basically completely incorrect vectors too.  The colliders don’t really care, they just need to know what A, B, C, and D to start with.  Since those values are changed by whatever blocks you want, you can start your collision at any block you want.

So that’s the key to getting collisions in HMAC:  Yeah, it can detect differences in the first block.  So don’t collide in the first block.  Collide in the second, or wherever you want.  How do you do that?  Glad you asked!

First, you need some data to make a collision with.  I’m feeling fresh.

>>> f
'ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff"Fresh Prince Of Bel-Air (Theme Song)"\n\nNow, this is a story all about how\nMy life got flipped-turned upside down\nAnd I\'d like to take a minute\nJust sit right there\nI\'ll tell you how I became the prince of a town called Bel Air\n\nIn west Philadelphia born a'

Second, we’ve slightly modified that Pure Python MD5 library to emit intermediate states, with this line:

print int(A), int(B), int(C), int(D)

Now, when we use puremd5, we see:

>>> import puremd5
 >>> puremd5.md5(f).hexdigest()
1245699185 1715851451 2228545190 3605688132
3587913948 1041697412 534724617 3149578532
2537882128 3367935924 570866200 4260972837
4196602073 370791185 3701485921 977618867
2646787607 1303232542 1363560139 1938082240
2027711712 2342103113 1309181583 4220729378
'e06cdc7849a8998b8f86084e223893fb'

Funny thing.  That’s a 320 byte file, with a 64 byte block size, but there’s six intermediate states.  What gives?  Well, again, MD5 needs another 8 bytes to express that it’s hashed 320 bytes.  So we get a sixth round, consisting exclusively of the contribution from this length encoding.

We’re going to skip that round, and take the intermediate values after our actual Fresh Prince lyrics — so, round five, where A == 2646787607.  What we want is two data sets that, when appended to the data that generates these intermediate values, collides.  To do this, we’re going to use Patrick Stach’s md5coll, like so:

$ ./md5coll.exe 2646787607 1303232542 1363560139 1938082240
block #1 done
block #2 done
unsigned int m0[32] = {
0x7f774ef0, 0xd082709d, 0x0c2e2dc4, 0xad45fdb8, 
0x60245a90, 0xa45e0fe4, 0x9709a7ba, 0xd9f19db7, 
0x0533ed55, 0xbff4a606, 0x7dcec2c3, 0x4ff04d6c, 
0x2e27c907, 0x66dac8f3, 0x8ab60422, 0xbff4808d, 
0xa31a0e9c, 0x9231c038, 0x0bf22863, 0x98847dd4, 
0x1678780b, 0x1cc891c5, 0x466a894a, 0x0076194a, 
0xa336e005, 0x87b55e66, 0xa4265f42, 0x560ee046, 
0x36636c97, 0x2a5fc139, 0x90b895d1, 0x38d03d76, 
};

unsigned int m1[32] = {
0x7f774ef0, 0xd082709d, 0x0c2e2dc4, 0xad45fdb8, 
0xe0245a90, 0xa45e0fe4, 0x9709a7ba, 0xd9f19db7, 
0x0533ed55, 0xbff4a606, 0x7dcec2c3, 0x4ff0cd6c, 
0x2e27c907, 0x66dac8f3, 0x0ab60422, 0xbff4808d, 
0xa31a0e9c, 0x9231c038, 0x0bf22863, 0x98847dd4, 
0x9678780b, 0x1cc891c5, 0x466a894a, 0x0076194a, 
0xa336e005, 0x87b55e66, 0xa4265f42, 0x560e6046, 
0x36636c97, 0x2a5fc139, 0x10b895d1, 0x38d03d76, 
};

In simple terms, if we put these bytes after our Fresh Prince lyrics, md5coll says the results will collide.  Let’s see this in md5 first:

>>> m0 = (0x7f774ef0, 0xd082709d, 0x0c2e2dc4, 0xad45fdb8,
... 0x60245a90, 0xa45e0fe4, 0x9709a7ba, 0xd9f19db7,
... 0x0533ed55, 0xbff4a606, 0x7dcec2c3, 0x4ff04d6c,
... 0x2e27c907, 0x66dac8f3, 0x8ab60422, 0xbff4808d,
... 0xa31a0e9c, 0x9231c038, 0x0bf22863, 0x98847dd4,
... 0x1678780b, 0x1cc891c5, 0x466a894a, 0x0076194a,
... 0xa336e005, 0x87b55e66, 0xa4265f42, 0x560ee046,
... 0x36636c97, 0x2a5fc139, 0x90b895d1, 0x38d03d76, )
>>> m1 = (0x7f774ef0, 0xd082709d, 0x0c2e2dc4, 0xad45fdb8,
... 0xe0245a90, 0xa45e0fe4, 0x9709a7ba, 0xd9f19db7,
... 0x0533ed55, 0xbff4a606, 0x7dcec2c3, 0x4ff0cd6c,
... 0x2e27c907, 0x66dac8f3, 0x0ab60422, 0xbff4808d,
... 0xa31a0e9c, 0x9231c038, 0x0bf22863, 0x98847dd4,
... 0x9678780b, 0x1cc891c5, 0x466a894a, 0x0076194a,
... 0xa336e005, 0x87b55e66, 0xa4265f42, 0x560e6046,
... 0x36636c97, 0x2a5fc139, 0x10b895d1, 0x38d03d76, )
>>> import struct
>>> packed_m0 = struct.pack("<32I", *m0)
>>> packed_m1 = struct.pack("<32I", *m1)
>>> import md5
>>> md5.md5(f+packed_m0).hexdigest()
'ddebd8f9209fc7311687d7f9ff61a869'
>>> md5.md5(f+packed_m1).hexdigest()
'ddebd8f9209fc7311687d7f9ff61a869'

Does this work for HMAC?

>>> import hmac
>>> f0 = f+packed_m0
>>> f1 = f+packed_m1
>>> hmac.new(f0[0:64], f0[64:]).hexdigest()
'd372df68e47fd619089c2c795144e572'
>>> hmac.new(f1[0:64], f1[64:]).hexdigest()
'4f94497f947e424e378d30853bd6e421'

No, not yet.  Still need to compensate for the inside hash, and split our colliding message across a key (first 64 bytes) and a message (the rest).

>>> trans_36 = bytearray((x ^ 0x36) for x in range(256))
>>> f0key=f0[0:64] # the first 64 bytes become the key...
>>> f1key=f1[0:64]
>>> f0key.translate(trans_36)
'PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP'
>>> f0key=f0key.translate(trans_36)
>>> f1key=f1key.translate(trans_36)
>>> f0key==f1key
True
>>> f0msg=f0[64:] # ...and the rest becomes the message
>>> f1msg=f1[64:]
>>> hmac.new(f0key, f0msg).hexdigest()
'917e8efe9a5c692e50b86e9581c509f3'
>>> hmac.new(f1key, f1msg).hexdigest()
'917e8efe9a5c692e50b86e9581c509f3'

Tadah!  Inside hash is mollified by XOR’ing with 36’s, and outside hash is happy because actually those two keys are identical.  The bytes that are different, yet still collide as per MD5, are only visible to the inside hash.  By the time the outside hash gets around to running, it’s already too late.  And of course we can in fact append arbitrary content, as such:

>>> hmac.new(f0key, f0msg+"hello world").hexdigest()
'7bad230645867d9802829db78289878c'
>>> hmac.new(f1key, f1msg+"hello world").hexdigest()
'7bad230645867d9802829db78289878c'

Presumably Marc Stevens’ HashClash / Chosen Prefix attacks will work just as well inside HMAC. And since this is a generic chosen key attack (there are no constraints on your key), it’s also a known key attack: one where you know the key but don’t control it. Take the known key, XOR with 0x36’s, make that the prefix to your message, generate your collision. Simple.
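
To see why the XOR trick works, it can help to rebuild HMAC-MD5 by hand.  What follows is only a sketch, in Python 3 rather than the Python 2 of the session above.  The helper names xor_bytes and hmac_md5_by_hand are made up for illustration, the 64 f’s are inferred from the 64 P’s printed earlier, and the key is assumed to be exactly one MD5 block long so no key padding or key hashing gets involved.

```python
import hashlib
import hmac

BLOCK = 64  # MD5 block size in bytes


def xor_bytes(data: bytes, value: int) -> bytes:
    """XOR every byte of data with a single byte value."""
    return bytes(b ^ value for b in data)


def hmac_md5_by_hand(key: bytes, msg: bytes) -> bytes:
    """HMAC-MD5 for a key of exactly one block (illustrative helper)."""
    assert len(key) == BLOCK
    inner = hashlib.md5(xor_bytes(key, 0x36) + msg).digest()   # inside hash
    return hashlib.md5(xor_bytes(key, 0x5C) + inner).digest()  # outside hash


prefix = b"f" * BLOCK           # block we want the inside hash to start with
key = xor_bytes(prefix, 0x36)   # comes out as 64 P's
msg = b"hello world"

# HMAC XORs the key with 0x36 again, so the inside hash sees prefix + msg.
inner_input = xor_bytes(key, 0x36) + msg
print(inner_input == prefix + msg)
print(hmac_md5_by_hand(key, msg) == hmac.new(key, msg, hashlib.md5).digest())
```

The point of the exercise: whatever collides under plain MD5 after that prefix also collides inside HMAC, because the inside hash is fed exactly the same bytes.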

Hope you enjoyed!  This almost certainly doesn’t have any security impact, but I’m happy(ish) to be proved wrong.

Categories: Security

Not Safe For Not Working On

September 3, 2014

There’s an old Soviet saying:

If you think it, don’t say it.
If you say it, don’t write it.
If you write it, don’t be surprised.

It’s not a pleasant way to live.  The coiner of this quote was not celebrating his oppression.  The power of the Western brand has long been associated with its outright rejection of this sort of thought policing.

In the wake of a truly profound compromise of sensitive photographs of celebrities, those of us in Information Security find ourselves called upon to answer what this all means – to average citizens, to celebrities, to a global economy that has found itself transformed in ways not necessarily planned.  Let’s talk about some things we don’t normally discuss.

Victim Shaming Goes Exponential
https://twitter.com/houseofwachs/status/506442903883763712

Dumdum?  Really?

We shouldn’t entirely be surprised.  Victim shaming is par for the course in Infosec, and more than Infosec, for uncomfortably similar reasons.  When social norms are violated, a society that cannot punish the transgressor will punish the victim.  If the victim is not made a monster, their unpunished victimization today could become our unpunished victimization tomorrow.  And that’s such a dark and scary conclusion that it’s really quite tempting to say –

No, it’s OK.  Only these celebrities got hacked, not me, because they were so stupid they took sexy photos.  It attracted the hackers.

As if the hackers knew there had to be such photos in the first place, and only stole the photos.  As if we don’t all have private lives, with sensitive bits, that could be or already have been acquired by parties unknown.  We’ve all got something to lose.  And frankly, it may already be lost.

You Don’t Necessarily Know When You’ve Been Hit, Let Alone What’s Gone

There’s a peculiar property of much criminality in the real world:  You notice.  A burgled home is missing things, an assaulted body hurts.  These crimes still occur, but we can start responding to them immediately.  If there’s one thing to take away from this compromise, it’s that when it comes to information theft you might find out quickly, or you may never find out at all.  Consider this anonymous post, forwarded by @OneTrueDoxbin to the surprisingly insightful @SwiftOnSecurity:

[Image: anontheory]

It is a matter of undisputable fact that “darknet” information trading networks exist.  People collected stamps, after all, and there’s way rarer stuff floating around out there than postal artifacts.  This sort of very personal imagery is the tip of a very large and occasionally deeply creepy iceberg.  The most interesting aspects of Swift’s research have centered on the exceptions – yes, he found, a significant majority of the files came from iPhones, implicating some aspect of the Apple iCloud infrastructure.  But not all – there’s some JVC camcorder data here, a Motorola RAZR EXIF extension there – and there’s a directory structure that no structured database might have but a disorganized human whose photo count doesn’t number in the billions would absolutely use.  The exceptions show more of a network and less of a lone operator.

The key element of a darknet is, of course, staying dark.  It’s hard to do that if you’re taunting your victims, and so generally they don’t.  Some of the images they found in his research went back years.  A corollary of not discovering one attack is not detecting many, extending over many victims and coming from multiple attackers.

Of course, darknets have operational security risks, same as anyone, and eventually someone might come in to game the gamers.  From someone who claims to be the original leaker:

“People wanted shit for free. Sure, I got $120 with my bitcoin address, but when you consider how much time was put into acquiring this stuff (i’m not the hacker, just a collector), and the money (I paid a lot via bitcoin as well to get certain sets when this stuff was being privately traded Friday/Saturday) I really didn’t get close to what I was hoping.”

Real?  Fake?  Can’t really know.  Pretty risky, trying to draw together a narrative less than a hundred hours since the initial compromise was detected.  It’s the Internet, people lie, even more so anonymously.  It fits with my personal theory that the person who acquired these images isn’t necessarily the person who’s distributing them (I figured hacker-on-hacker theft), but heh.  It’s amazingly easy to have your suspicions confirmed.

One reporter asked me how it was possible that J.P. Morgan could immediately diagnose and correct their extended infection, while Apple couldn’t instantaneously provide similar answers.  As I told them, J.P. Morgan knew without question they were hit, and had the luxury of deciding their disclosure schedule (with some constraints); this particular case simply showed up on 4Chan during Labor Day Weekend when presumably half the people who would investigate were digging their way back from Burning Man.  Internal discoveries and external surprises just follow different schedules.

I’ve personally been skeptical that an account brute forcing bug that happened to be discovered around now, was also used for this particular attack.  There’s only so many days in the year and sometimes multiple events happen close in time just randomly.  As it happens, Apple has confirmed at least some of these celebrity raids have come via account attacks, but whether brute forcing was required hasn’t been disclosed.  It does seem that this exploit has been used in the field since at least May, however, lending some credibility.

We have, at best, a partial explanation.  Much as we desperately would like this to be a single, isolated event, with a nice, well defined and well funded defender who can make sure this never happens again – that’s just not likely to be the case.  We’re going to learn a lot more about how this happened, and in response, there will be improvements.  But celebrity (and otherwise) photo exploitation will not be found to be an isolated attack and it won’t be addressed or ended with a spot fix to password brute forcing.

So there’s a long investigation ahead, quite a bit longer than a single press cycle.

Implications For Cloud Providers

Are we actually stuck right now at another password debate?  Passwords have failed us yet again, let’s have that tired conversation once more?  Sam Biddle, who otherwise did some pretty good research in this post, did have one somewhat amusing paragraph:

To fix this, Apple could have simply forced everyone to use two-factor verification for their accounts. It’s easy, and would have probably prevented all of this.

Probably the only time “simply”, “easy”, and “two-factor verification” have ever been seen in quite such proximity, outside of marketing materials anyway.  There’s a reason we use that broken old disco tech.

Still, we have to do better.  So-called “online brute-forcing” – where you don’t have a database of passwords you want to crack, but instead have to interact with a server that does – is a far slower, and far noisier affair.

But noise doesn’t matter if nobody is listening.  Authentication systems could probably do more to detect brute force attacks across large numbers of accounts.  And given the wide variety of systems that interface with backend password stores, it’s foolish to expect them all to implement rate limiting correctly.  Limits need to exist as close as possible to the actual store, independent of access method.
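
Here is a rough sketch of what "limits at the store" could look like.  Everything in it is hypothetical: the GuardedStore class, its thresholds, and the trivial salting are stand-ins for illustration, not anyone’s production design.  The idea is that every front end (web, mobile API, legacy sync protocol) has to go through check(), so no access method can skip the counting, and failures are tallied both per account and across all accounts.

```python
import hashlib
import hmac
import time
from collections import defaultdict, deque


class GuardedStore:
    """Toy credential store that counts failures itself (hypothetical)."""

    def __init__(self, per_account=5, across_accounts=100, window=60.0,
                 clock=time.monotonic):
        self._hashes = {}
        self._failures = defaultdict(deque)  # account -> failure timestamps
        self._all_failures = deque()         # failures across every account
        self.per_account = per_account
        self.across_accounts = across_accounts
        self.window = window
        self._clock = clock

    @staticmethod
    def _hash(user, password):
        # Salting with the username keeps the sketch short; not a recommendation.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), user.encode(), 10_000)

    def add_user(self, user, password):
        self._hashes[user] = self._hash(user, password)

    def _prune(self, stamps, now):
        while stamps and now - stamps[0] > self.window:
            stamps.popleft()

    def check(self, user, password):
        """Return 'ok', 'denied', or 'throttled'."""
        now = self._clock()
        self._prune(self._failures[user], now)
        self._prune(self._all_failures, now)
        if (len(self._failures[user]) >= self.per_account
                or len(self._all_failures) >= self.across_accounts):
            return "throttled"
        expected = self._hashes.get(user)
        candidate = self._hash(user, password)
        if expected is not None and hmac.compare_digest(expected, candidate):
            return "ok"
        self._failures[user].append(now)
        self._all_failures.append(now)
        return "denied"


# Fake clock so the demo does not have to sleep.
now = [0.0]
store = GuardedStore(per_account=3, window=60.0, clock=lambda: now[0])
store.add_user("alice", "correct horse battery staple")

guesses = [store.check("alice", "Password%d" % i) for i in range(3)]
locked = store.check("alice", "correct horse battery staple")  # right password, still refused
now[0] = 61.0                                                  # window expires
later = store.check("alice", "correct horse battery staple")
print(guesses, locked, later)
```

The across_accounts counter is the part aimed at the "one guess each against a million accounts" pattern, which a purely per-account limit never notices.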

Sam’s particularly right about the need to get past personal entropy.  Security questions are time bombs in a way even passwords aren’t.  In a world of tradeoffs, I’m beginning to wonder if voice prints across sentences aren’t better than personal information widely shared.  Yes, I’m aware of the downsides, but look at the competition.

[Image: iamgroot]

OK.  It’s time to ban Password1.  Many users like predictable passwords.  Few users like their data being compromised.  Which wins?  Presently, the former.  Perhaps this is the moment to shift that balance.  Service providers (cloud and otherwise) are burying their heads in the sand and going with password policies that can only be called inertial.   Defenders are using simple rules like “doesn’t have an uppercase letter” and “not enough punctuation” to block passwords while attackers are just straight up analyzing password dumps and figuring out the most likely passwords to attempt in any scenario.  Attackers are just way ahead.  That has to change.  Defenders have password dumps too now.  It’s time we start outright blocking passwords common enough that they can be online brute forced, and it’s time we admit we know what they are.
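
A minimal sketch of the difference between the two approaches, with loud caveats: the five-entry COMMON_PASSWORDS set is a made-up stand-in (a real list would be derived from actual password dumps and be far longer), and both function names are hypothetical.

```python
# Illustrative stand-in; a real blocklist would come from breach data.
COMMON_PASSWORDS = {"password1", "password", "123456", "qwerty", "letmein"}


def passes_composition_rules(password):
    """The inertial policy: long enough, one uppercase letter, one digit."""
    return (len(password) >= 8
            and any(c.isupper() for c in password)
            and any(c.isdigit() for c in password))


def is_blocked(password, blocklist=COMMON_PASSWORDS):
    """Reject anything on the list, ignoring case."""
    return password.lower() in blocklist


candidate = "Password1"
print(passes_composition_rules(candidate), is_blocked(candidate))
```

Password1 sails through the composition rules and is still one of the first things an attacker would try, which is the whole argument for checking against what attackers actually guess.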

We’re not quite ready to start generating passwords for users, and post-password initiatives like FIDO are still some of the hardest things we’re working on in all of computer engineering.  But there’s some low hanging fruit, and it’s probably time to start addressing it.

And we can’t avoid the elephant in the room any longer.

In Which The Nerd Has To Talk About Sex Because Everyone Else Won’t

It’s not all victim shaming. At least some of the reaction to this leak of celebrity nudity can only be described as bewilderment.  Why are people taking these photos in the first place?  Even with the occasional lack of judgment… there’s a sense of surprise.  Is everybody taking and sending sexy photos?

No.  Just everyone who went through puberty owning a cell phone.

I’m joking, of course.  There are also a number of people who grew up before cell phones who nonetheless discovered that a technology able to move audio, still images, and videos across the world in an instant could be a potent enabler of Flirting-At-A-Distance.  This tends to reduce distance, increasing…happiness.

Every generation thinks it invents sex, but to a remarkable degree generations don’t talk to each other about what they’ve created.  It’s rarely the same, and though older generations can (and do) try, there is nothing in all of creation humans take less advice about than mating rituals.

So, yeah.  People use communication technologies for sexy times.  Deal with it.

Interestingly, to a very limited extent, web browsers actually do.  You may have noticed that each browser has Porn Mode.  Oh, sure, that particular name never makes it through Corporate Branding.  It gets renamed “InPrivate Browsing” or “Incognito Mode” or “The I’m Busy I’ll Be Out In A Minute Window”.  The actual feature descriptions are generally hilarious parallel constructions about wanting to use a friend’s web browser to buy them a gift, but not having the nature of the gift show up in their browser cache.  But we know the score and so do browser developers, who know the market share they’d lose if they didn’t support safer consumption of pornography (at least in the sense that certain sites don’t show up on highly visible “popular tabs” pages during important presentations).

I say all this because of a tweet that is accurate, and needs to be understood outside the context of shaming the victim:

Technology can do great, wonderful, and repeatedly demanded things, and still have a dark side.  That’s not limited to sexy comms.  That applies to the cloud itself.

True story:  A friend of mine and I are at the airport in Abu Dhabi a few years back.  We get out of the taxi, she drops her cell phone straight into a twenty foot deep storm drain.  She starts panicking:  She can’t leave her phone.  She can’t lose her phone.  She’s got pictures of her kids on that phone that she just can’t lose.  “No problem, I get police” says the taxi driver, who proceeds to drive off, with all our stuff.

We’re ten thousand miles away from home, and our flight’s coming.  Five minutes go by. Ten.  Fifteen…and the taxi driver returns, with a police officer, who rounds up some airport workers who agree to help us retrieve the phone (which, despite being submerged for twenty minutes, continued to work just fine).

Probably the best customer service I’ve ever received while traveling, but let me tell you, I’d have rather just told my friend to pull the photos off the cloud.

The reality is that cell phone designers have heard for years what a painful experience it is to lose data, and have prioritized the seamless recovery of those bits best they can. It’s awful to lose your photos, your emails, your contacts.  No, really, major life disruption.  Lot of demand to fix that.  But in all things there are engineering tradeoffs, and data that is stored in more than one location can be stolen from more than one location.  98% of the time, that’s OK, you really don’t want to lose photos of your loved ones.  You don’t care if the pics are stolen, you’re just going to post them on Facebook anyway.

2% of the time, those pictures weren’t for Facebook.  2% of the time, no it’s cool, those can go away, you can always take more selfies.  Way better to lose the photos than see them all over the Internet.  Terrible choice to have to make, but not generally a hard decision.

So the game becomes, separate the 98% from the 2%.

[Image: 3_inbox_screen_new-bgg]

So, actual concrete advice.  Just like browsers have porn mode for the personal consumption of private imagery, cell phones have applications that are significantly less likely to lead to anyone else but your special friends seeing your special bits.  I personally advise Wickr, an instant messaging firm that develops secure software for iPhone and Android.  What’s important about Wickr here isn’t just the deep crypto they’ve implemented, though it’s useful too.  What’s important in this context is that with this code there’s just a lot fewer places to steal your data from.  Photos and other content sent in Wickr don’t get backed up to your desktop, don’t get saved in any cloud, and by default get removed from your friend’s phone after an amount of time you control.  Wickr is of course not the only company supporting what’s called “ephemeral messaging”; SnapChat also dramatically reduces the exposure of your private imagery (with the caveat that with SnapChat, unlike Wickr, SnapChat itself gets an unencrypted copy of your imagery and messaging so you have to hope they’re not saving anything.  Better for national intelligence services, worse for you).

Sure, you can hunt down settings that reduce your exposure to specific cloud services.  You’ve still got to worry about desktop backups, and whatever your desktop is backing up to.  But really, if you can keep your 2% far away from your 98%, you’ll be better off.

In the long run, I think the standard photo applications will get a UI refresh, to allow “sensitive photographs” (presumably of surprise birthday parties in planning) to be locked to the device, allowed through SMS only in directly authenticated circumstances, and not backed up.  Something like a lock icon on the camera screen.  I don’t think the feature request could have happened before this huge leak.  But maybe now it’s obvious just how many people require it.

Categories: Security

Never Let A Good Crisis Go To Waste

(This post is something of a follow-up to a previous post on browser security, but that went long enough that I decided to split the posts.)

I’ll leave most of what’s been said about Heartbleed to others.  In particular, I enjoyed Adam Cecchetti’s Heartbleed: A Beat In Time presentation.  But, a few people have been poking me for comments in response to recent events.  A little while ago, I called for a couple responses to Heartbleed:

  1. Accepting that some software has become Critical Infrastructure
  2. Supporting that software, with both money and talent
  3. Identifying that software: let’s find the most important million lines of code that would be the most dangerous to have a bug, and actively monitor them

Thanks to the Linux Foundation and a host of well respected companies…this is actually happening.  Whoa.  That was fast.  I’m not even remotely taking credit — this has been a consensus, a long time brewing.  As Kennedy opined, “Victory has a thousand fathers.”  Right after the announcement, I was asked what I thought about this.  Here’s my take:

The Linux Foundation has announced their Core Infrastructure project, with early supporters including Amazon, Cisco, Dell, Facebook, Fujitsu, Google, IBM, Intel, Microsoft, NetApp, Qualcomm, Rackspace, and VMware.  $100K from each, for three years, to identify and support the Open Source projects we depend on.

This is fantastic.

The Internet was not our first attempt at building the Internet.  Many other systems, from Minicom to AOL, came first.  What we call the Internet today was just the first time a global computer telecom infrastructure really took hold, pushing data and global communications onto every desk, into every home, onto mobile phones the world over.  Open Source software played, and continues to play, a tremendous role in the continuing success of the Internet, as the reason this platform worked was people could connect without hindrance from a gatekeeper.

It’s hard for some people to believe, but the phone company used to own your telephone.  Also, long distance calls used to be an event, because they were so very expensive.

We do live in a different world than the one in which the fundamental technologies of the Internet were created, and even the one in which they were deployed.  Bad actors abound, and the question is:  How do we respond?  What do we do, to make the Internet safe, without threatening its character?  It’s a baby and the bathwater scenario.

I am profoundly grateful to see the Core Infrastructure project pledging real money and real resources to Open Source projects.  There are important things to note here.  First, yes, Core Infrastructure is a great name that avoids the baggage of Critical Infrastructure while expressing the importance of attention.  Second, we are seeing consensus that we must have the conversation about exactly what it is we depend on, if only to direct funds to appropriate projects.  Third, there’s an actual stable commitment of money, critical if there’s to be full time engineers hired to protect this infrastructure.

The Core Infrastructure project is not the only effort going on to repair or support Open Source.  The OpenBSD team has started a major fork of OpenSSL, called LibreSSL.  It will take some time to see what that effort will yield, but everyone’s hopeful.  What’s key here is that we are seeing consensus that we can and should do more, one that really does stay within the patterns that helped all these companies be the multi-billion dollar concerns they are now.  The Internet grew their markets, and in many cases, created them.  This isn’t charity.  It’s just very wise business.

In summary, I’m happy.  This is what I had hoped to see when I wrote about Heartbleed, and I am impressed to see industry stepping up so quickly to fill the need to identify and secure Core Infrastructure.

Making the Internet a safer place is not going to be easy.  But I’m seeing a world with a Core Infrastructure Project, the Internet Bug Bounty, the BlueHat Prize (which never gets enough attention), and even real discussion around when the US Government should disclose vulnerabilities.

That’s...not the world I’m used to.  If this world also has Stephen Colbert cracking wise about IE vulns…OK.

(Quick, vaguely subversive thought:  We always talk about transparency in things like National Security Letters and wiretaps.  What about vulnerability reports that lead to fixes?  Government is not monolithic.  Maybe transparency can highlight groups that defend the foundations our new economies are built upon, wherever they happen to be.  Doing this without creating some really perverse incentives would be…interesting.)

Categories: Security

Of Expectations And Rewards

So some people want me to say a few more things about Heartbleed.

Meanwhile, this happened.

[Image: colbert]

My immediate reaction, of course, was that there was no way IE shipped that early.  Windows 95 didn’t even have TCP/IP, the core protocols of the Internet, enabled by default!  Colbert had no idea what he’s Tolkien about.

Nope.  Turns out, the Colbert Report crew did their homework.   But, wait.  What?  In what universe has yet another browser bug become fodder for late night?

Hacking has become a spectator sport.  Really can’t say it’s surprising — how many people play sports every day?  Now how many people look at a big glowing rectangle?

We make the glowing box do some very strange things.

I mentioned earlier that one of the things that made Heartbleed so painful, is that it was a bug where we least expected one to be.  That is not the situation with browsers.  Despite genuinely herculean efforts, any security professional worth their salt completely expects web browser vulnerabilities to be found, and exploited, from time to time. The simple explanation is that web browsers expose a tremendous amount of attack surface to relatively anonymous attackers.

Let’s get a bit beyond the simple explanation.

It’s important to realize that web browsers, in general, are particularly vulnerable creations.  Despite the three major platforms (IE, Firefox, Chrome) being developed essentially independently, they’re all implementing the same specifications, leading to something akin to convergent evolution:  A slow language (like HTML, CSS, or JavaScript) plays puppeteer to a fast language’s object model (C/C++) via some sort of formalized translation layer (IDL for COM/XPCOM/WebkitIDL).  Take a look at this analysis of the gory details of browser internals.  See how often they just refer to browsers, in general?

There’s a reason we have very different codebases, but very similar bugs.

So why all the noise?  Why now?  In this particular case, this is the first bug after Microsoft’s genuinely unprecedented campaign to announce the end of XP support.  The masses were marketed to,  and basically told in no uncertain terms “It’s time to upgrade, the next bug is going to burn you if you’re still on XP.”  Well, here’s the next bug, and there’s Microsoft keeping their word.

(Update:  MS is issuing an XP patch after all.  Actual attacks in the field generally do trump everything else.  Really, the story of XP is tragic.  Microsoft finally makes the first consumer OS that doesn’t crash when you look at it funny…and then it crashes when I look at it funny.  Goalposts with rockets on em…)

This is also the first bug after Heartbleed, whose response somehow metastasized into the world being told to freak out and change all their passwords.

Neither of these events has anything to do with the quality of the browser itself, but they’re certainly driving the noise.  Yes, IE’s got some somewhat unique issues.  Back in the day, Microsoft argued that they couldn’t remove IE from Windows because it was too integrated.  Yep, pretty much.  Internet Explorer is basically Windows: The Remix feat. The Internet.  Which makes sense, because at the end of the day browsers have long been the new operating systems.  So of course, Microsoft would basically have lots of stuff for the browser lying about.  None of that stuff was ever designed to be executed by untrusted parties, so when it ended up rigged up for remote scripting…sometimes just through the magic of COM…bad things happened.

Of course, I said somewhat unique.  Firefox and its predecessors implemented various amounts of themselves not at the C++ layer, but in JavaScript itself via a language called XUL and various interesting things reachable via the Components object.  Leaks into this trusted code have happened, with unpleasant side effects.

Where all the browsers uniformly struggle, though, is with object memory management.  Everything on a web page takes memory, and there’s only so much to go around.  Eventually, you’ve got to clear out some old objects you’re not viewing any more, to make room for more pictures of cats.  So, when can you do that?  What it comes down to is that JavaScript is really flexible, while C++ is really fast.  We bridge the two to get the best of both worlds: the flexibility (and security!) of the former, the speed of the latter.  What happens when they disagree?  What happens when the language exposed to the developer (for various values of “developer”) still thinks there’s a picture somewhere, while C++ (yes, I know, the heap allocator, I’m trying to simplify this) has long since destroyed that picture and reused that particular memory for a video file?
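
The disagreement can be modeled in a few lines.  This is a toy, not browser internals: the Arena class and its slot reuse are invented for illustration, and Python is itself memory safe, so all it shows is the bookkeeping going wrong.  In real C++ the reused memory gets interpreted as the wrong type, which is where things stop being merely confusing and start being exploitable.

```python
class Arena:
    """Toy stand-in for the fast side's heap: slots that get reused."""

    def __init__(self):
        self.slots = []
        self.free_list = []

    def alloc(self, obj):
        # Prefer a previously freed slot, the way an allocator reuses memory.
        if self.free_list:
            index = self.free_list.pop()
            self.slots[index] = obj
        else:
            index = len(self.slots)
            self.slots.append(obj)
        return index

    def free(self, index):
        self.slots[index] = None
        self.free_list.append(index)

    def read(self, index):
        return self.slots[index]


heap = Arena()
picture = heap.alloc({"type": "picture", "name": "cat.jpg"})
script_reference = picture   # the scripting side remembers the handle

heap.free(picture)           # the fast side decides the picture is gone
video = heap.alloc({"type": "video", "name": "clip.mp4"})  # slot gets reused

stale = heap.read(script_reference)
print(stale["type"])         # the script still believes this is its picture
```

The script asked for its picture and got handed the video, because both handles now name the same slot.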

Boom.  And this is pretty common.  As the great poet of our modern era, The Grugq wrote recently:

At this point, you may be feeling rather sad about the state of security in general.  The three major browsers — IE, Firefox, and Chrome — are some of the most well funded and deeply audited ongoing development efforts in the world.  If they all fail, in similar ways even, what hope do we have?

There is hope on the horizon (and through this path, we finally get back to Heartbleed).  While the browsers remain imperfect, that they work at all — let alone with ever increasing performance and usability — is nothing short of miraculous.  There is literally nothing else where it’s conceivable that you’d just wander around, executing random chunks of code from random suppliers, and not get compromised instantaneously.  (Sandboxes are things children walk into and out of with relative ease.  I’ve always wondered why we called them that.)  And there is motion towards making useful languages that are both memory safe and fast enough to do the heavy lifting browsers require — Rust being the prime example.

It’s not enough to be secure.  Hardest lesson anyone in security can ever learn.  Some never do.

Interesting things happen as JavaScript becomes a fast language — particularly the hyper-optimizable subset of JavaScript known as asm.js.  Ultimately, most virtual machines are like most sandboxes.  Not the JS VM’s — there’s been an ongoing battle to lock those down for a decade.  A VM with less to attack, that can still leverage all the optimization knowledge being absorbed into LLVM, is Interesting.

There’s a crazy project to try to run all of Webkit (the engine in Chrome and Safari, more or less) inside of JavaScript, including JavaScript itself.  Yo dawg.

There’s hope, but it comes at some price.  We don’t know what solutions will actually work, and we shouldn’t assume any of them will ever reach 100% security.  We absolutely should not assume we knew how to do security decades ago, and just “forgot” or got lazy or whatnot.  For a while it seemed like the answer was obviously Java, or C#/.NET.  Useful languages for many problem sets, sure.  But Microsoft once tried to rewrite chunks of Windows in .NET. It did not go well.  To this day, “Longhorn” will bring chills to old-school MS’ers (“Rosebud…”), and there is genuine excitement around the .NET to C++ compiler.

There are things that looked like they worked in the past, but it was a mirage.  There are things that were a mirage in the past, but technology or other factors have changed.  (Anyone remember DHTML?  Great idea, but it took a few generations before client side interactivity really became a thing.)

Who knows.  Maybe Google will someday make a sandbox that impresses even Pinkie Pie.  And perhaps I have something up my sleeve…

But expecting no bugs is like expecting no crime, nobody to die in the ER, no cars to crash, no businesses to fail.  It’s not just unreasonable.  It’s also kind of awkward to see it become a spectator sport.

(Splitting the Heartbleed commentary to a second post.)

Categories: Security

Bloody Cert Certified

April 12, 2014

Oh, Information Disclosure vulnerabilities.  Truly the Rodney Dangerfield of vulns, people never quite know what their impact is going to be.  With Memory Corruption, we’ve basically accepted that a sufficiently skilled attacker always has enough degrees of freedom to at least unreliably achieve arbitrary code execution (and from there, by the way, to leak arbitrary information like private keys).  With Information Disclosure, even the straight up finder of Heartbleed has his doubts:

So, can Heartbleed leak private keys in the real world or not?  The best way to resolve this discussion is PoC||GTFO (Proof of Concept, or you can figure out the rest).  CloudFlare threw up a challenge page to steal their key.  It would appear Fedor Indutny and Ilkka Mattila have successfully done just that.  Now, what’s kind of neat is that because this is a crypto challenge, Fedor can actually prove he pulled this off, in a way everyone else can verify but nobody else can duplicate.

I’m not sure if key theft has already occurred, but I think this fairly conclusively establishes key theft is coming.  Let’s do a walkthrough:

First, we retrieve the certificate from http://www.cloudflarechallenge.com.

$ echo "" | openssl s_client -connect www.cloudflarechallenge.com:443 -showcerts | openssl x509 > cloudflare.pem
depth=4 C = SE, O = AddTrust AB, OU = AddTrust External TTP Network, CN = AddTrust External CA Root
verify error:num=19:self signed certificate in certificate chain
verify return:0
DONE

This gives us the certificate, with lots of metadata about the RSA key.  We just want the key.

$ openssl x509 -pubkey -noout -in cloudflare.pem > cloudflare_pubkey.pem

Now, we take the message posted by Fedor, and Base64 decode it:

$ wget https://gist.githubusercontent.com/indutny/a11c2568533abcf8b9a1/raw/1d35c1670cb74262ee1cc0c1ae46285a91959b3f/1.bash
--2014-04-11 18:14:48-- https://gist.githubusercontent.com/indutny/a11c2568533abcf8b9a1/raw/1d35c1670cb74262ee1cc0c1ae46285a91959b3f/1.bash
Resolving gist.githubusercontent.com... 192.30.252.159
Connecting to gist.githubusercontent.com|192.30.252.159|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: `1.bash'

[ <=> ] 456 --.-K/s in 0s

2014-04-11 18:14:49 (990 KB/s) - `1.bash' saved [456]

$ cat 1.bash
> echo "Proof I have your key. fedor@indutny.com" | openssl sha1 -sign key.pem -sha1 | openssl enc -base64
aKmd7OEXbTvlh6CQNczA+XYVsru4b4t1RMcHCqvjQOW3sxrLFX0laO47fks1G90c
6CtcQwYO06uVxwa9XAr8TAXbmZZj+kGoSNkRzpaNeeqqNkcvVUEv5dV5wy4Q2Mfr
aL6YQSVCfbGGlU0SwBYKdI3cssfPPiClM+i9psi655zDYcgYDdZIvl3/XpGBHeKu
CH9gmt9R9ey448IdZ2uCF+oX+KAogcl9AgpX7GPmJNXZWN13ASCjIFxPdQUguj5p
Z1e8sv6DmenKABaWGmMtQDf/JAzHDCl8CaYlReXQW1wja3I4crD2tDqxq6+Hbajn
SOUFr2XOSpuj5wGxlo20KA==

$ cat 1.bash | grep -v fedor | openssl enc -d -base64 > fedor_signed_proof.bin
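
For anyone without openssl handy, the same decode step can be sketched in Python 3.  The Base64 lines are copied from the 1.bash listing above; SIGNATURE_B64 and signature are just names picked for this example.

```python
import base64

# Signature lines from 1.bash; the "> echo ..." line is left out, which is
# what `grep -v fedor` accomplishes in the shell version.
SIGNATURE_B64 = """
aKmd7OEXbTvlh6CQNczA+XYVsru4b4t1RMcHCqvjQOW3sxrLFX0laO47fks1G90c
6CtcQwYO06uVxwa9XAr8TAXbmZZj+kGoSNkRzpaNeeqqNkcvVUEv5dV5wy4Q2Mfr
aL6YQSVCfbGGlU0SwBYKdI3cssfPPiClM+i9psi655zDYcgYDdZIvl3/XpGBHeKu
CH9gmt9R9ey448IdZ2uCF+oX+KAogcl9AgpX7GPmJNXZWN13ASCjIFxPdQUguj5p
Z1e8sv6DmenKABaWGmMtQDf/JAzHDCl8CaYlReXQW1wja3I4crD2tDqxq6+Hbajn
SOUFr2XOSpuj5wGxlo20KA==
"""

# b64decode skips the newlines by default.
signature = base64.b64decode(SIGNATURE_B64)
print(len(signature), "bytes =", len(signature) * 8, "bits")
```

If the lines were copied correctly this decodes to 256 bytes, which is the size you would expect for a signature made with a 2048-bit RSA key.  That is an inference from the length, not something the challenge page is quoted as saying here.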

So, what do we have?  A message, a public key, and a signature linking that specific message to that specific public key.  At least, we’re supposed to.  Does OpenSSL agree?

$ echo "Proof I have your key. fedor@indutny.com" | openssl dgst -verify cloudflare_pubkey.pem -signature fedor_signed_proof.bin -sha1
Verified OK

Ah, but what if we tried this for some other RSA public key, like the one from Google?

$ echo "" | openssl s_client -connect www.google.com:443 -showcerts | openssl x509 > google.pem
depth=2 C = US, O = GeoTrust Inc., CN = GeoTrust Global CA
verify error:num=20:unable to get local issuer certificate
verify return:0
DONE

$ openssl x509 -pubkey -noout -in google.pem > google_pubkey.pem

$ echo "Proof I have your key. fedor@indutny.com" | openssl dgst -verify google_pubkey.pem -signature fedor_signed_proof.bin -sha1
Verification Failure

Or what if I changed things around so it looked like it was me who cracked the code?

$ echo "Proof I have your key. dan@whiteops.com" | openssl dgst -verify cloudflare_pubkey.pem -signature fedor_signed_proof.bin -sha1
Verification Failure

Nope, it’s Fedor or bust 🙂  Note that it’s a common mistake, when testing cryptographic systems, to not test the failure modes.  Why was “Verified OK” important?  Because “Verification Failure” happened when our expectations weren’t met.
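
The same pattern of one positive test and two negative tests can be sketched in a few lines of Python. This is a toy, assuming textbook RSA over small Mersenne primes with no padding; it is not how OpenSSL signs anything and must never be used for real signatures. The helpers make_key, sign, and verify are hypothetical names, and the two keys merely stand in for the CloudFlare and Google keys above.

```python
import hashlib

E = 65537  # common public exponent

def make_key(p, q):
    """Build a toy textbook-RSA key from two primes (illustration only)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # private exponent; needs Python 3.8+
    return {"n": n, "e": E, "d": d}

def digest_as_int(message, n):
    """SHA-1 the message and reduce it below the modulus (no padding: toy!)."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % n

def sign(message, key):
    """Sign with the private exponent."""
    return pow(digest_as_int(message, key["n"]), key["d"], key["n"])

def verify(message, signature, key):
    """Check a signature using only the public half of the key."""
    return pow(signature, key["e"], key["n"]) == digest_as_int(message, key["n"])

# Assumption: well-known Mersenne primes stand in for real key material.
target_key = make_key(2**61 - 1, 2**89 - 1)
other_key = make_key(2**107 - 1, 2**127 - 1)

message = b"Proof I have your key. fedor@indutny.com\n"
signature = sign(message, target_key)

# The success case only means something next to the two failure cases.
print("right key, right message:", verify(message, signature, target_key))
print("wrong key:               ", verify(message, signature, other_key))
print("wrong message:           ",
      verify(b"Proof I have your key. dan@whiteops.com\n", signature, target_key))
```

If either of the last two lines ever printed True, the first line would be worthless as evidence.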

There is, of course, one mode I haven’t brought up.  Fedor could have used some other exploit to retrieve the key.  He claims not to have, and I believe him.  But that this is a failure mode is also part of the point — there have been lots of bugs that affect something that ultimately grants access to SSL private keys.  The world didn’t end then and it’s not ending now.

I am satisfied that the burden of proof for Heartbleed leaking private keys has been met, and I’m sufficiently convinced that “noisy but turnkey solutions” for key extraction will be in the field in the coming weeks (it’s only been ~3 days since Neel told everyone not to panic).

Been getting emails asking for what the appropriate response to Heartbleed is.  My advice is pretty exclusively for system administrators and CISOs, and is something akin to “Patch immediately, particularly the systems exposed to the outside world, and don’t just worry about HTTP.  Find anything moving SSL, particularly your SSL VPNs, prioritizing on open inbound, any TCP port.  Cycle your certs if you have them, you’re going to lose them, you may have already, we don’t know.  But patch, even if there’s self signed certs, this is a generic Information Leakage in all sorts of apps.  If there is no patch and probably won’t ever be, look at putting a TLS proxy in front of the endpoint.  Pretty sure stunnel4 can do this for you.”

QUICK EDIT:  One final note – the bug was written at the end of 2011, so devices that are older than that and have not been patched at all remain invulnerable (at least to Heartbleed).  The guidance is really to identify systems that are exposing OpenSSL 1.0.1 SSL servers (and eventually clients) w/ heartbeat support, on top of any protocol, and get ’em patched up.

Yes, I’m trying to get focus off of users and passwords and even certificates and onto stemming the bleeding, particularly against non-obvious endpoints.  For some reason a surprising number of people think SSL is only for HTTP and browsers.  No, it’s 2014.  A lot of protocol implementations have basically evolved to use this one environment middleboxes can’t mess with.

A lot of corporate product implementations, by the way.  This has nothing to do with Open Source, except to the degree that Open Source code is just basic Critical Infrastructure now.

Categories: Security

Be Still My Breaking Heart

April 10, 2014

Abstract:  Heartbleed wasn’t fun.  It represents us moving from “attacks could happen” to “attacks have happened”, and that’s not necessarily a good thing.  The larger takeaway actually isn’t “This wouldn’t have happened if we didn’t add Ping”, the takeaway is “We can’t even add Ping, how the heck are we going to fix everything else?”.  The answer is that we need to take Matthew Green’s advice, start getting serious about figuring out what software has become Critical Infrastructure to the global economy, and dedicating genuine resources to supporting that code.  It took three years to find Heartbleed.  We have to move towards a model of No More Accidental Finds.

======================

You know, I’d hoped I’d be able to avoid a long form writeup on Heartbleed.  Such things are not meant to be.  I’m going to leave many of the gory technical details to others, but there’s a few angles that haven’t been addressed and really need to be.  So, let’s talk.  What to make of all this noise?

First off, there’s been a subtle shift in the risk calculus around security vulnerabilities.  Before, we used to say:  “A flaw has been discovered.  Fix it, before it’s too late.”  In the case of Heartbleed, the presumption is that it’s already too late, that all information that could be extracted, has been extracted, and that pretty much everyone needs to execute emergency remediation procedures.

It’s a significant change, to assume the worst has already occurred.

It always seems like a good idea in security to emphasize prudence over accuracy, possible risk over evidence of actual attack.  And frankly this policy has been run by the privacy community for some time now.  Is this a positive shift?  It certainly allows an answer to the question for your average consumer, “What am I supposed to do in response to this Internet ending bug?”  “Well, presume all your passwords leaked and change them!”

I worry, and not merely because “You can’t be too careful” has not at all been an entirely pleasant policy in the real world.  We have lots of bugs in software.  Shall we presume every browser flaw not only needs to be patched, but has already been exploited worldwide, and you should wipe your machine any time one is discovered?  This OpenSSL flaw is pernicious, sure.  We’ve had big flaws before, ones that didn’t just provide read access to remote memory either.  Why the freak out here?

Because we expected better, here, of all places.

There’s been quite a bit of talk, about how we never should have been exposed to Heartbleed at all, because TLS heartbeats aren’t all that important a feature anyway.  Yes, it’s 2014, and the security community is complaining about Ping again.  This is of course pretty rich, given that it seems half of us just spent the last few days pinging the entire Internet to see who’s still exposed to this particular flaw.  We in security sort of have blinders on, in that if the feature isn’t immediately and obviously useful to us, we don’t see the point.

In general, you don’t want to see a protocol designed by the security community.  It won’t do much.  In return (with the notable and very appreciated exception of Dan Bernstein), the security community doesn’t want to design you a protocol.  It’s pretty miserable work.  Thanks to what I’ll charitably describe as “unbound creativity”, the fast and dumb and unmodifying design of the Internet has given way to a hodgepodge of proxies and routers and “smart” middleboxes that do who knows what.  Protocol design is miserable, nothing is elegant.  Anyone who’s spent a day or two trying to make P2P VoIP work on the modern Internet discovers very quickly why Skype was worth billions.  It worked, no matter what.

Anyway, in an alternate universe TLS heartbeats (with full ping functionality) are a beloved security feature of the protocol as they’re the key to constant time, constant bandwidth tunneling of data over TLS without horrifying application layer hacks.  As is, they’re tremendously useful for keeping sessions alive, a thing I’d expect hackers with even a mild amount of experience with remote shells to appreciate.  The Internet is moving to long lived sessions, as all Gmail users can attest to.  KeepAlives keep long lived things working.  SSH has been supporting protocol layer KeepAlives forever.

The takeaway here is not “If only we hadn’t added ping, this wouldn’t have happened.”  The true lesson is, “If only we hadn’t added anything at all, this wouldn’t have happened.”  In other words, if we can’t even securely implement Ping, how could we ever demand “important” changes?  Those changes tend to be much more fiddly, much more complicated, much riskier.  But if we can’t even securely add this line of code:

if (1 + 2 + payload + 16 > s->s3->rrec.length)
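
For readers who don’t live in C, here is a rough Python model of what that one comparison guards against. It is a sketch, not OpenSSL’s code: build_heartbeat_response, its arguments, and the idea of modelling neighbouring heap memory as a byte string are all assumptions made for illustration. The 1 + 2 + payload + 16 arithmetic comes straight from the line above (one type byte, two length bytes, the payload, and sixteen bytes of padding).

```python
import struct
from typing import Optional

HEARTBEAT_RESPONSE = 2
PADDING_LEN = 16  # minimum padding, the "16" in the check above

def build_heartbeat_response(record: bytes, adjacent_memory: bytes,
                             check_length: bool) -> Optional[bytes]:
    """Echo a heartbeat payload, with or without the patched length check."""
    # One type byte, then a two-byte big-endian *claimed* payload length.
    _msg_type, claimed_len = struct.unpack("!BH", record[:3])

    # The fix: type + length + claimed payload + padding must fit in the record.
    if check_length and 1 + 2 + claimed_len + PADDING_LEN > len(record):
        return None  # silently discard the malformed request

    # Without the check, the claimed length is trusted and the copy runs past
    # the end of the record into whatever happens to sit next to it.
    memory = record + adjacent_memory
    payload = memory[3:3 + claimed_len]
    return (struct.pack("!BH", HEARTBEAT_RESPONSE, claimed_len)
            + payload + bytes(PADDING_LEN))

honest = struct.pack("!BH", 1, 5) + b"hello" + bytes(PADDING_LEN)
lying = struct.pack("!BH", 1, 64)          # claims 64 bytes, sends none
secrets = b"SECRET-KEY-MATERIAL" * 4       # stand-in for adjacent process memory

print(build_heartbeat_response(honest, secrets, check_length=True))
print(build_heartbeat_response(lying, secrets, check_length=False))  # leaks
print(build_heartbeat_response(lying, secrets, check_length=True))   # dropped
```

The whole difference between the second and third calls is that single comparison.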

I know Neel Mehta.  I really like Neel Mehta.  It shouldn’t take absolute heroism, one of the smartest guys in our community, and three years for somebody to notice a flaw when there’s a straight up length field in the patch.  And that, I think, is a major and unspoken component of the panic around Heartbleed.  The OpenSSL dev shouldn’t have written this (on New Year’s Eve, at 1AM apparently).  His coauthors and release engineers shouldn’t have let it through.  The distros should have noticed.  Somebody should have been watching the till, at least this one particular till, and it seems nobody was.

Nobody publicly, anyway.

If we’re going to fix the Internet, if we’re going to really change things, we’re going to need the freedom to do a lot more dramatic changes than just Ping over TLS.  We have to be able to manage more; we’re failing at less.

There’s a lot of rigamarole around defense in depth, other languages that OpenSSL could be written in, “provable software”, etc.  Everyone, myself included, has some toy that would have fixed this.  But you know, word from the Wall Street Journal is that there have been all of $841 in donations to the OpenSSL project to address this matter.  We are building the most important technologies for the global economy on shockingly underfunded infrastructure.  We are truly living through Code in the Age of Cholera.

Professor Matthew Green of Johns Hopkins University recently commented that he’s been running around telling the world for some time that OpenSSL is Critical Infrastructure.  He’s right.  He really is.  The conclusion is resisted strongly, because you cannot imagine the regulatory hassles normally involved with traditionally being deemed Critical Infrastructure.  A world where SSL stacks have to be audited and operated against such standards is a world that doesn’t run SSL stacks at all.

And so, finally, we end up with what to learn from Heartbleed.  First, we need a new model of Critical Infrastructure protection, one that dedicates real financial resources to the safety and stability of the code our global economy depends on – without attempting to regulate that code to death.  And second, we need to actually identify that code.

When I said that we expected better of OpenSSL, it’s not merely that there’s some sense that security-driven code should be of higher quality.  (OpenSSL is legendary for being considered a mess, internally.)  It’s that the number of systems that depend on it, and then expose that dependency to the outside world, is considerable.  This is security’s largest contributed dependency, but it’s not necessarily the software ecosystem’s largest dependency.  Many, maybe even more systems depend on web servers like Apache, nginx, and IIS.  We fear vulnerabilities significantly more in libz than libbz2 than libxz, because more servers will decompress untrusted gzip over bzip2 over xz.  Vulnerabilities are not always in obvious places – people underestimate just how exposed things like libxml and libcurl and libjpeg are.  And as HD Moore showed me some time ago, the embedded space is its own universe of pain, with 90’s bugs covering entire countries.

If we accept that a software dependency becomes Critical Infrastructure at some level of economic dependency, the game becomes identifying those dependencies, and delivering direct technical and even financial support.  What are the one million most important lines of code that are reachable by attackers, and least covered by defenders?  (The browsers, for example, are very reachable by attackers but actually defended pretty zealously – FFMPEG public is not FFMPEG in Chrome.)

Note that not all code, even in the same project, is equally exposed.  It’s tempting to say it’s a needle in a haystack.  But I promise you this:  Anybody patches Linux/net/ipv4/tcp_input.c (which handles inbound network for Linux), a hundred alerts are fired and many of them are not to individuals anyone would call friendly.  One guy, one night, patched OpenSSL.  Not enough defenders noticed, and it took Neel Mehta to do something.

We fix that, or this happens again.  And again.  And again.

No more accidental finds.  The stakes are just too high.


On Brinksmanship

January 18, 2013

Reality is what refuses to go away when you stop believing in it.

The reality – the ground truth – is that Aaron Swartz is dead.

Now what.

Brinksmanship is a terrible game, that all too many systems evolve towards.  The suicide of Aaron Swartz is an awful outcome, an unfair outcome, a radically out of proportion outcome.  As in all negotiations to the brink, it represents a scenario in which all parties lose.

Aaron Swartz lost.  He paid with his life.  This is no victory for Carmen Ortiz, or Steve Heymann, or JSTOR, MIT, the United States Government, or society in general.  In brinksmanship, everybody loses.

Suicide is a horrendous act and an even worse threat.  But let us not pretend that a set of charges covering the majority of Aaron’s productive years is not also fundamentally noxious, with ultimately a deeply similar outcome.  Carmen Ortiz (and, presumably, Steve Heymann) are almost certainly telling the truth when they say they had no intention of demanding thirty years of imprisonment from Aaron.  This did not stop them from, in fact, demanding thirty years of imprisonment from Aaron.

Brinksmanship.  It’s just negotiation.  Nothing personal.

Let’s return to ground truth.  MIT was a mostly open network, and the content “stolen” by Aaron was itself mostly open.  You can make whatever legalistic argument you like; the reality is there simply wasn’t much offense taken to Aaron’s actions.  He wasn’t stealing credit card numbers, he wasn’t reading personal or professional emails, he wasn’t extracting design documents or military secrets.  These were academic papers he was ‘liberating’.

What he was, was easy to find.

I have been saying, for some time now, that we have three problems in computer security.  First, we can’t authenticate.  Second, we can’t write secure code.  Third, we can’t bust the bad guys.  What we’ve experienced here, is a failure of the third category.  Computer crime exists.  Somebody caused a huge amount of damage – and made a lot of money – with a Java exploit, and is going to get away with it.  That’s hard to accept.  Some of our rage from this ground truth is sublimated by blaming Oracle.  But some of it turns into pressure on prosecutors, to find somebody, anybody, who can be made an example of.

There are two arguments to be made now.  Perhaps prosecution by example is immoral – people should only be punished for their own crimes.  In that case, these crimes just weren’t offensive enough for the resources proposed (prison isn’t free for society).  Or perhaps prosecution by example is how the system works, don’t be naïve – well then.

Aaron Swartz’s antics were absolutely annoying to somebody at MIT and somebody at JSTOR.  (Apparently someone at PACER as well.)  That’s not good, but that’s not enough.  Nobody who we actually have significant consensus for prosecuting, models himself after Aaron Swartz and thinks “Man, if they go after him, they might go after me”.

The hard truth is that this should have gone away, quietly, ages ago.  Aaron should have received a restraining order to avoid MIT, or perhaps some sort of fine.  Instead, we have a death.  There will be consequences to that – should or should not doesn’t exist here, it is simply a statement of fact.  Reality is what refuses to go away, and this is the path by which brinksmanship is disincentivized.

My take on the situation is that we need a higher class of computer crime prosecution.  We, the computer community in general, must engage at a higher level – both in terms of legislation that captures our mores (and funds actual investigations – those things ain’t free!), and operational support that can provide a critical check on who is or isn’t punished for their deeds.  Aaron’s law is an excellent start, and I support it strongly, but it excludes faux law rather than including reasoned policy.  We can do more.  I will do more.

The status quo is not sustainable, and has cost us a good friend.  It’s so out of control, so desperate to find somebody – anybody! – to take the fall for unpunished computer crime, that it’s almost entirely become about the raw mechanics of being able to locate and arrest the individual instead of about their actual actions.

Aaron Swartz should be alive today.  Carmen Ortiz and Steve Heymann should have been prosecuting somebody else.  They certainly should not have been applying a 60x multiple between the amount of time they wanted, and the degree of threat they were issuing.  The system, in all of its brinksmanship, has failed.  It falls on us, all of us, to fix it.


Actionable Intelligence: The Mouse That Squeaked

December 14, 2012

[Obligatory disclosures — I’ve consulted for Microsoft, and had been doing some research on Mouse events myself.]

So one of the more important aspects of security reporting is what I’ve been calling Actionable Intelligence. Specifically, when discussing a bug — and there are many, far more than are ever discovered let alone disclosed — we have to ask:

What can an attacker do today, that he couldn’t do yesterday, for what class attacker, to what class victim?

Spider.IO, a fraud analytics company, recently disclosed that under Internet Explorer attackers can capture mouse movement events from outside an open window. What is the Actionable Intelligence here? It’s moderately tempting to reply: We have a profound new source of modern art.

[Image: a mouse-movement trace rendered as abstract line art.  Credit: Anatoly Zenkov’s IOGraph tool]

I exaggerate, but not much. The simple truth is that there are simply not many situations where mouse movements are security sensitive. Keyboard events, of course, would be a different story — but mouse? As more than a few people have noted, they’d be more than happy to publish their full movement history for the past few years.

It is interesting to discuss the case of the “virtual keyboard”. There has been a movement (thankfully rare) to force credential input via mouse instead of keyboard, to stymie keyloggers. This presupposes a class of attacker that has access to keyboard events, but not mouse movements or screen content. No such class actually exists; the technique was never protecting much of anything in the first place. It’s just pain-in-the-butt-as-a-feature.  More precisely, it’s another example of Rick Wash’s profoundly interesting turn of phrase, Folk Security. Put simply, there is a belief that if something is hard for a legitimate user, it’s even harder for the hacker. Karmic security is (unfortunately) not a property of the universe.

(What about the attacker with an inline keylogger? Not only does he have physical access, he’s not actually constrained to just emulating a keyboard. He’s on the USB bus, he has many more interesting devices to spoof.)

That’s not to say spider.io has not found a bug. Mouse events should only come from the web frame over which script has dominion, in much the same way CNN should not be receiving Image Load events from a tab open to Yahoo. But the story of the last decade is that bugs are not actually rare, and that from time to time issues will be found in everything. We don’t need to have an outright panic when a small leak is found. The truth is, every remote code execution vulnerability can also capture full screen mouse motion. Every universal cross site scripting attack (in which CNN can inject code into a frame owned by Yahoo) can do similar, though perhaps only against other browser windows.

I would like to live in a world where this sort of very limited overextension of the web security model warrants a strong reaction. It is in fact nice that we do live in a world where browsers effectively expose the most nuanced and well developed (if by fire) security model in all of software. Where else is the proper scope of mouse events even a comprehensible discussion?

(Note that it’s a meaningless concept to say that mouse events within the frame shouldn’t be capturable. Being able to “hover” on items is a core user interface element, particularly for the highly dynamic UIs that Canvas and WebGL enable. The depth of damage one would have to inflict on the browser usability model, to ‘secure’ activity in what’s actually the legitimate realm of a page, would be profound. When suggesting defenses, one must consider whether the changes required to make them reparable under actual assault ruin the thing being defended in the first place. We can’t go off destroying villages in order to save them.)

So, in summary: Sure, there’s a bug here with these mouse events. I expect it will be fixed, like tens of thousands of others. But it’s not particularly significant.  What can an attacker do today, that he couldn’t do yesterday?  Not much, to not many.  Spider.io’s up to interesting stuff, but not really this.


DakaRand 1.0: Revisiting Clock Drift For Entropy Generation

August 15, 2012

“The generation of random numbers is too important to be left to chance.”
–Robert R. Coveyou

“One out of 200 RSA keys in the field were badly generated as a result of standard dogma.  There’s a chance this might fail less.”
–Me

[Note:  There are times I write things with CIO’s in mind.  This is not one of those times.]

So, I’ve been playing with userspace random number generation, as per Matt Blaze and D.P. Mitchell’s TrueRand from 1996.  (Important:  Matt Blaze has essentially disowned this approach, and seems to be honestly horrified that I’m revisiting it.)  The basic concept is that any system with two clocks has a hardware random number generator, since clocks jitter relative to one another based on physical properties, particularly when one is operating on a slow scale (like, say, a human hitting a keyboard) while another is operating on a fast scale (like a CPU counter cycling at nanosecond speeds).  Different tolerances on clocks mean more opportunities for unmodelable noise to enter the system.  And since the core lie of your computer is that it’s just one computer, as opposed to a small network of independent nodes running on their own time, there should be no shortage of bits to mine.

At least, that’s the theory.
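
As a rough illustration of the two-clock idea (and not one of DakaRand’s actual generators), the sketch below counts how many iterations of a tight loop fit inside a short timer interval. The function name, the 1 ms gap, and the use of Python’s time.perf_counter are all assumptions made for illustration; whether the low bits of such counts carry real entropy is exactly the open question.

```python
import time

def clock_race_sample(gap_seconds=0.001):
    """Spin a counter until the timer says gap_seconds have elapsed.

    The loop runs at whatever pace the CPU and scheduler allow while the
    deadline comes from the system timer, so the final count reflects the
    relative drift between the two.
    """
    deadline = time.perf_counter() + gap_seconds
    count = 0
    while time.perf_counter() < deadline:
        count += 1
    return count & 0xFFFFFFFF  # keep the sample within 32 bits

samples = [clock_race_sample() for _ in range(8)]
print("raw counts:", samples)
print("low bits:  ", [s & 1 for s in samples])
```

Running it a few times on different machines, loaded and idle, gives a feel for how much the counts wander. Eyeballing them proves nothing about unpredictability, which is why the challenge further down is framed as a prediction problem.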

As announced at Defcon 20 / Black Hat, here’s DakaRand 1.0.  Let me be the first to say, I don’t know that this works.  Let me also be the first to say, I don’t know that it doesn’t.  DakaRand is a collection of modes that tries to convert the difference between clocks into enough entropy that, whether or not it survives academic attack, would certainly force me (as an actual guy who breaks stuff) to go attack something else.

A proper post on DakaRand is reserved, I think, for when we have some idea that it actually works.  Details can be seen in the slides for the aforementioned talk; what I’d like to focus on now is recommendations for trying to break this code.  The short version:

1) Download DakaRand, untar, and run “sh build.sh”.
2) Run dakarand -v -d out.bin -m [0-8]
3) Predict out.bin, bit for bit, in less than 2^128 work effort, on practically any platform you desire with almost any level of active manipulation you wish to insert.

The slightly longer version:

  1. DakaRand essentially tries to force the attacker into having no better attack than brute force, and then tries to make that work effort at least 2^128.  As such, the code is split into generators that acquire bits, and then a masking sequence of SHA-256, Scrypt, and AES-256-CTR that expands those bits into however much is requested.  (In the wake of Argyros and Kiayias’s excellent and underreported “I Forgot Your Password:  Randomness Attacks Against PHP Applications”, I think it’s time to deprecate all RNG’s with invertible output.  At the point you’re asking whether an RNG should be predictable based on its history, you’ve already lost.)  The upshot of this is that the actual target for a break is not the direct output of DakaRand, but the input to the masking sequence.  Your goal is to show that you can predict this particular stream, with perfect accuracy, at less than 2^128 work effort.  Unless you think you can glean interesting information from the masking sequence (in which case, you have more interesting things to attack than my RNG), you’re stuck trying to design a model of the underlying clock jitter.
  2. There are nine generators in this initial release of DakaRand.  Seriously, they can’t all work.
  3. You control the platform.  Seriously — embedded, desktop, server, VM, whatever — it’s fair game.  About the only constraint that I’ll add is that the device has to be powerful enough to run Linux.  Microcontrollers are about the only things in the world that do play the nanosecond accuracy game, so I’m much less confident against those.  But, against anything ARM or larger, real time operation is simply not a thing you get for free, and even when you pay dearly for it you’re still operating within tolerances far larger than DakaRand needs to mine a bit.  (Systems that are basically cycle-for-cycle emulators don’t count.  Put Bochs and your favorite ICE away.  Nice try though!)
  4. You seriously control the platform.  I’ve got no problem with you remotely spiking the CPU to 100%, sending arbitrary network traffic at whatever times you like, and so on.  The one constraint is that you can’t already have root – so, no physical access, and no injecting yourself into my gathering process.  It’s something of a special case if you’ve got non-root local code execution.  I’d be interested in such a break, but multitenancy is a lie and there are just so many interprocess leaks (like this if-it’s-so-obvious-why-didn’t-you-do-it example of cross-VM communication).
  5. Virtual machines get special rules:  You’re allowed to suspend/restore right up to the execution of DakaRand.  That is the point of atomicity.
  6. The code’s a bit hinky, what with globals and a horde of dependencies.  If you’d like to test on a platform that you just can’t get DakaRand to build on, that makes things more interesting, not less.  Email me.
  7. All data generated is mixed into the hash, but bits are “counted” when Von Neumann debiasing works.  Basically, generators return integers between 0 and 2^32-1.  Every integer is mixed into the keying hash (thus, you having to predict out.bin bit for bit).  However, each integer is also measured for the number of 1’s it contains.  An even number yields a 0; an odd number, a 1.  Bits are only counted when two sequential numbers have either a 10 or a 01, and as long as there are fewer than 256 bits counted, the generator will continue to be called.  So your attack needs to model the absolute integers returned (which isn’t so bad) and the number of generator calls it takes for a Von Neumann transition to occur and whether the transition is a 01 or a 10 (since I put that value into the hash too).
  8. I’ve got a default “gap” between generator probes of just 1000us – a millisecond.  This is probably not enough for all platforms – my assumption is that, if anything has to change, this has to become somewhat dynamic.
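
The counting rule in item 7 can be sketched roughly as follows. This is a Python paraphrase of the description above, not DakaRand’s C source. The names gather_seed and parity, the use of non-overlapping pairs, the byte layout fed to the hash, and the max_calls safety valve are all assumptions made for illustration.

```python
import hashlib
import itertools
import struct

def parity(value):
    """Return 1 if value has an odd number of 1 bits, else 0."""
    return bin(value).count("1") & 1

def gather_seed(generator, target_bits=256, max_calls=1_000_000):
    """Mix every sample into SHA-256; count a bit only on a 01 or 10 pair."""
    keying_hash = hashlib.sha256()
    counted = 0
    total_calls = 0
    calls_since_transition = 0
    while counted < target_bits:
        if total_calls >= max_calls:  # assumption: give up on a stuck source
            raise RuntimeError("generator never produced enough transitions")
        first = generator() & 0xFFFFFFFF
        second = generator() & 0xFFFFFFFF
        total_calls += 2
        calls_since_transition += 2
        # Every integer goes into the hash, whether or not it "counts".
        keying_hash.update(struct.pack("!II", first, second))
        p1, p2 = parity(first), parity(second)
        if p1 != p2:
            # A Von Neumann transition: also mix in how long it took and
            # which way it went (p1 == 0 means 01, p1 == 1 means 10).
            keying_hash.update(struct.pack("!IB", calls_since_transition, p1))
            counted += 1
            calls_since_transition = 0
    return keying_hash.digest(), counted, total_calls

# Usage with a deliberately predictable stand-in generator (0, 1, 2, ...).
counter = itertools.count()
seed, counted, calls = gather_seed(lambda: next(counter))
print(len(seed), counted, calls)
```

A counter is obviously not a source of entropy, and that is the point of the usage example: the debiasing step only decides when to stop gathering. Everything about unpredictability still rests on the generator itself, which is what the challenge asks you to model.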

Have fun!  Remember, “it might fail somehow somewhere” just got trumped by “it actually did fail all over the place”, so how about we investigate a thing or two that we’re not so sure in advance will actually work?

(Side note:  Couple other projects in this space:  Twuewand, from Ryan Finnie, has the chutzpah to be pure Perl.  And of course, Timer Entropyd, from Folkert Van Heusden.  Also, my recommendation to kernel developers is to do what I’m hearing they’re up to anyway, which is to monitor all the interrupts that hit the system on a nanosecond timescale.  Yep, that’s probably more than enough.)
