Archive

Archive for the ‘Security’ Category

From 0day to 0data: TelnetD

December 29, 2011 Leave a comment

Recently, it was found that BSD-derived Telnet implementations had a fairly straightforward vulnerability in their encryption handler. (Also, it was found that there was an encryption handler.) Telnet was the de facto standard protocol for remote administration of everything but Windows systems, so there’s been some curiosity in just how nasty this bug is operationally.

So, with the help of Fyodor and the NMAP scripting team, I set out to collect some data on just how widespread this bug is. First, I set out to find all hosts listening on tcp/23. Then, with Fyodor’s code, I checked for encryption support. Presence of support does not necessarily imply vulnerability, but absence does imply invulnerability.

The scanner’s pretty rough, and I’m doing some sampling here, but here’s the bottom line data:

Out of 156113 confirmed telnet servers randomly distributed across the Internet, only 429 exposed encryption support. By far my largest source of noise is the estimate of just how many telnet servers (as opposed to just TCP listeners on 23, or transient services) exist across the entire Internet. My conservative numbers place that count at around 7.8M. This yields an estimate of ~13,785 to ~21,546 (or 15,600 to 23,500, at 95% confidence — thanks, David Horn!) potentially vulnerable servers.
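For the curious, the scaling is simple arithmetic. Here’s a back-of-the-envelope sketch in Python using a plain binomial approximation for the interval; it reproduces the upper end of the estimate (scaling against the 7.8M figure), but won’t exactly match the numbers above, which presumably account for more sources of uncertainty.

import math

sampled = 156113      # confirmed telnet servers scanned
exposed = 429         # of those, how many exposed encryption support
population = 7.8e6    # conservative estimate of telnet servers Internet-wide

p = exposed / sampled
estimate = p * population

# Naive 95% interval on the sampled proportion (normal approximation).
stderr = math.sqrt(p * (1 - p) / sampled)
low = (p - 1.96 * stderr) * population
high = (p + 1.96 * stderr) * population

print(f"~{estimate:,.0f} potentially vulnerable servers "
      f"({low:,.0f} to {high:,.0f} at 95% confidence)")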

Patched servers will still look vulnerable unless they remove encryption support entirely, so this isn’t something we can watch get fixed (well, unless we’re willing to crash servers, which I’m not, even if it’d be safe because these things are launched from inetd).

Here is a command line that will scan your networks. You may need nmap from svn.


/usr/local/bin/nmap -p23 -PS23 --script telnet-encryption -T5 -oA telnet_logs -iL list_of_ips -v

So, great bug, just almost certainly not widely distributed. I’ll document the analysis process when I’m not at con.

Categories: Security

Exploring ReU: Rewriting URLs For Fun And XSRF/HTTPS

October 20, 2011 1 comment

So I’ve been musing lately about a scheme I call ReU. It would allow client side rewriting of URLs, to make the addition of XSRF tokens much easier. As a side effect, it should also make it much easier to upgrade the links on a site from http to https. Does it work? Is it a good idea? I’m not at all sure. But I think it’s worth talking about:

===

There is a consistent theme to both XSS and XSRF attacks: XS. The fact that foreign sites can access local resources, with all the credentials of the user, is a fundamental design issue. Random web sites on the Internet should not be able to retrieve Page 3 of 5 of a shopping cart in the same way they can retrieve this blog entry! The most common mechanism recommended to prevent this sort of cross-site chicanery is to use a XSRF token, like so:

http://foo.com/bar?session_id=12345

The token is randomized, so now links from the outside world fail despite having appropriate credentials in the cookie.
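For concreteness, here’s a minimal server-side sketch of that sort of check. It’s my own illustration rather than any particular framework’s API, and a slight variation on the URL above: the token is a separate parameter derived from the session, rather than the session id itself.

import hmac, hashlib

SERVER_SECRET = b"keep-this-out-of-the-page"   # hypothetical server-side secret

def xsrf_token(session_id):
    # Bind the token to the session, so a token lifted from one user is useless for another.
    return hmac.new(SERVER_SECRET, session_id.encode(), hashlib.sha256).hexdigest()

def check_request(session_id, submitted_token):
    # Constant-time comparison; any state-changing request without the right token is refused.
    return hmac.compare_digest(xsrf_token(session_id), submitted_token or "")

# A local link emitted to the user would then look like:
#   http://foo.com/bar?session_id=12345&token=<xsrf_token("12345")>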

In some people’s minds, this is quite enough. It only needs to be possible to deploy secure solutions; cost of deployment is so irrelevant, it’s not even worth being aware of. The problem is, somebody’s got to fix this junk, and it turns out that retrofitting a site with new URLs is quite tricky. As Martin Johns wrote:

…all application local URLs have to be rewritten for using the randomizer object. While standard HTML forms and hyperlinks pose no special challenge, prior existing JavaScript may be harder to deal with. All JavaScript functions that assign values to document.location or open new windows have to be located and modified. Also all existing onclick and onsubmit events have to be rewritten. Furthermore, HTML code might include external referenced JavaScript libraries, which have to be processed as well. Because of these problems, a web application that is protected by such a solution has to be examined and tested thoroughly.

Put simply, web pages are assembled from pieces developed across sprints, teams, companies, and languages. No one group knows the entire story — but if anyone screws up and leaves the token off, the site breaks.

The test load required to prevent those breakages from recurring is a huge part of why delivering secure solutions is such a slow and expensive process.

So I’ve been thinking: If upgrading the security of URLs is difficult server side, because the code there tends to be fragmented, could we do something on the client? Could the browser itself be used to add XSRF tokens to URLs in advance of their retrieval or navigation, in a way that basically came down to a single script embedded in the HEAD segment of an HTML file?

Would not such a scheme be useful not only for adding XSRF tokens, but also changing http links to https, paving the way for increasing the adoption of TLS?

The answer seems to be yes. A client side rewriter — ReU, as I call it — might very well be a small, constrained, but enormously useful request of the browser community. I see the code looking something like:

var session_token = "aaa241ee1298371";
var session_domain = "foo.com";

function analyzeURL(url, domnode, flags)
{
   var domain = getDomain(url);
   // only add the token to (and force HTTPS on) local resources
   if (isInside(session_domain, domain)) {
      url = addParam(url, "token", session_token);
      url = forceHTTPS(url);
   }
   return url;
}

window.registerURLRewriter(analyzeURL);

Now, something that could totally kill this proposal is if browsers that didn’t support the extension suddenly broke — or, worse, if users with new browsers couldn’t experience any of the security benefits of the extension, lest older users have their functionality broken. Happily, I think we can avoid this trap by encoding in the stored cookie whether the user’s browser supports ReU. Remember, for the XSRF case we’re basically trying to figure out in what situation we want to ignore a browser’s request despite the presence of a valid cookie. So if the cookie reports to us that it came from a user without the ability to automatically add XSRF tokens, oh well. We let that subset through anyway.
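A rough sketch of that server-side decision follows, with the caveat that every name in it (the cookie flag, the helpers) is hypothetical:

import hmac

def reu_capable(cookie):
    # Set at login time, when the browser does (or does not) register a ReU rewriter.
    return cookie.get("reu") == "1"

def request_allowed(cookie, submitted_token, expected_token):
    if not reu_capable(cookie):
        # Legacy browser: let the cookie-only request through rather than break the user.
        return True
    # ReU-capable browser: a missing or wrong token means the request gets ignored.
    return hmac.compare_digest(expected_token, submitted_token or "")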

As long as the bad guys don’t have the ability to force modern browsers to appear as insecure clients, we’re OK.

Now, there’s some history here. Jim Manico pointed me at CSRFGuard, a project from the OWASP guys. This also tries to “upgrade” URLs through client side inspection. The problem is that real world frameworks really do assemble URLs dynamically, or go through paths that CSRFGuard can’t interrupt. These aren’t obscure paths, either:

No known way to inject CSRF prevention tokens into setters of document.location (ex: document.location="http://www.owasp.org";)
Tokens are not injected into HTML elements dynamically generated using JavaScript “createElement” and “setAttribute”

Yeah, there’s a lot of setters of document.location in the field. Ultimately, what we’re asking for here is a new security boundary for browsers — a direct statement that, if the client’s going somewhere or grabbing something, this inspector is going to be able to take a look at it.

There’s also history with both the Referer and Origin headers, which I spoke about (along with the seeds of this research) in my Interpolique talk. The basic concept with each is that the navigation source of a request — say http://www.google.com — would identify itself in some manner in the HTTP header of the subsequent request. The problem for Referer is that there are just so many sources of requests, and it’s best effort whether they show up or not (let alone whether they show up correctly).

Origin was its own problem. Effectively, they added this entire use case where images and links on a shared forum wouldn’t get access to Origin headers. It was necessary for their design, because their security model was entirely static. But for everything that’s *not* an Internet forum, the security model is effectively random.

With ReU, you write the code (or at least import the library) that determines whether or not a token is applied. And as a bonus, you can quickly upgrade to HTTPS too!

There is at least one catastrophic error case that needs to be handled. Basically, those pop-under windows that advertisers use are actually really useful for all sorts of interesting client side attacks. See, those pop-unders usually retain a connection to the window they came from, and (here’s the important part) can renavigate those windows on demand. So there’s a possible scenario in which you go to a malicious site, it spawns the pop-under, you go to another site, you log in, and then the malicious page drags your real logged-in window to a malicious URL at the otherwise protected site.

It’s a navigation event, and so a naive implementation of ReU would fire, adding the token. Yikes.

The solution is straightforward — guarantee that, at least by default, any event that fired the rewriter came from a local source. But that brings up a much trickier question:

Do we try to stop clickjacking with this mechanism as well? IFrames are beautiful, but they are also the source of a tremendous number of events that come from rather untrusted parties. It’s one thing to request a URL rewriter that can special case external navigation events. It’s possibly another to get a default filter against events sourced from actions that really came from an IFrame parent.

It is very, very possible to make technically infeasible demands. People think this is OK, because it counts as “trying”. The problem is that “trying” can end up eating your entire budget for fixes. So I’m a little nervous about making ReU not fire when the event that triggered it came from the parent of an IFrame.

There are a few other things of note — this approach is rather bookmark friendly, both because the URLs still contain tokens, and because a special “bookmark handler” could be registered to make a long-lived bookmark of sorts. It’s also possible to allow other modulations for the XSRF token, which (as people note) does leak into Referer headers and the like. For example, HTTP headers could be added, or the cookie could be modified on a per request basis. Also, for navigations offsite, secrets could be removed from the referring URL before being handed to foreign sites — this could close a longstanding concern with the Referer header too.

Finally, for debuggability, it’d be necessary to have some way of inspecting the URL rewriting stack. It might be interesting to allow the stack to be locked, at least in order of when things got patched in. Worth thinking about.

In summary, there’s lots of quirks to noodle on — but I’m fairly convinced that we can give web developers a “one stop shop” for implementing a number of the behaviors we in the security community believe they should be exhibiting. We can make things easier, more comprehensive, less buggy. It’s worth exploring.

Categories: Security

These Are Not The Certs You’re Looking For

August 31, 2011 16 comments

It began near Africa.

This is not, of course, the *.google.com certificate, seen in Iran, signed by DigiNotar, and announced on August 29th with some fanfare.

This certificate is for Facebook. It was seen in Syria, signed by Verisign (now Symantec), and was routed to me out of the blue by a small group of interesting people — including Stephan Urbach, “The Doctor”, and Meredith Patterson — on August 25th, four days prior.

It wasn’t the first time this year that we’d looked at a mysterious certificate found in the Syrian wild. In May, the Electronic Frontier Foundation reported that it “received several reports from Syrian users who spotted SSL errors when trying to access Facebook over HTTPS.” But this certificate would yield no errors, as it was legitimately signed by Verisign (now Symantec).

But Facebook didn’t use Verisign (now Symantec). From the States, I witnessed Facebook with a DigiCert cert. So too did Meredith in Belgium, with a third strike from Stephan in Belgium: Facebook’s CA was clearly DigiCert.

The SSL Observatory dataset, in all its thirty five gigabytes of glorious detail, agreed completely. With the exception of one Equifax certificate, Facebook was clearly identifiable by DigiCert alone.

And then Meredith found the final nail:

VeriSign intermediary certificate currently used for VeriSign Global certificates. Chained with VeriSign Class 3 Public Primary Certification Authority. End of use in May 2009.

So here we were, looking at a Facebook certificate that couldn’t be found anywhere on the net — by any of us, or by the SSL Observatory — signed by an authority that apparently was never used by the site, found in a country that had recently been found interfering with Facebook communication, using a private key that had been decommissioned for years. Could things get any sketchier?

Well, they could certainly get weirder. That certificate that was killed in 2009 was the certificate killed by us! That’s the MD2-based certificate that Meredith, the late Len Sassaman, and I warned about in our Black Hat 2009 talk (Slides / Paper)! Somehow, the one cryptographic function that we actually killed before it could do any harm (at the CA, in the browser, even via RFC) was linked to a Facebook certificate issued on July 13th, 2011, two years after its death?

This wasn’t just an apparently false certificate. This was overwhelmingly, almost cartoonishly bogus. Something very strange was happening in Syria.

Facebook was testing some new certificates on one of their load balancers.

WOT

I’ve got a few friends at Facebook. So, with the consent of everyone else involved, I pinged them — “Likely Spoofed Facebook Cert in Syria”. Thought it would get their attention.

It did.

We’re safe.

Turns out we just ordered new (smaller) certs from Verisign and deployed them to a few test LBs on Monday night. We’re currently serving a mixture of digicert and verisign certificates.

One of our test LBs with the new cert: a.b.c.d:443

And thus, what could have been a flash of sound and fury, ultimately signifying nothing, was silenced. There was no attack. Facebook had the private key matching the cert — something not even Verisign would have had, in the case of an actual Syrian break.

In traditional security fashion, we dropped our failed investigation and made no public comment about it — perhaps it’d be a funny story for a future conference, but it was certainly nothing to discuss with urgency.

And then, four days later, the DigiNotar break of *.google.com went public. Totally unrelated, and obviously in progress while our false alarm was ringing. The fates are strange sometimes.

ACTIONABLE INTELLIGENCE

The only entity that can truly assert the cryptographic keys of Facebook, is Facebook. We, as outside observers with direct expertise, detected practically every possible sign that this particular Facebook cert was malicious. It wasn’t. It just wasn’t. That such a comically incorrect certificate might actually be real, has genuine design consequences.

It means you have to put an error screen, with bypass, up for the user. No matter what the outside observers say, they may very well be wrong. The user must be given an override. And the data on what users do, when given an SSL error screen, isn’t remotely subtle. Moxie Marlinspike, probably the smartest guy in the world right now on SSL issues, did a study a few years ago on how many Tor users — not even regular users, but Tor users, clearly concerned about their privacy and possessed with some advanced level of expertise — would notice SSL being disabled and refuse to browse their desired content.

Moxie didn’t find a single user who resisted disabled security. His tool, sslsniff, worked against 100% of the sample set.

To be fair, Moxie’s tool didn’t pop an error — it suppressed security entirely. And, who knows, maybe Tor users think they’ve “done enough” for security and just consider their security budget spent after that. Like most things in security, our data is awful.

But, you know, this UI data ain’t exactly encouraging.

Recently, Moxie’s been presenting some interesting technology, in the form of his Convergence platform. There’s a lot to be said about Convergence, which frankly, I don’t have time to write about right now. (Disclosure: I’m going to Burning Man in about twelve hours. Honestly, I was hoping to postpone this entire discussion until N00ter’s first batch of data was ready. Ah well.) But, to be clear, I was fooled, Meredith was fooled, Moxie would have been fooled…I mean, look at this certificate! How could it possibly be right?

A SMALL TECHNICAL PREVIEW

It is true that the only entity that can truly assert being Facebook, is Facebook. Does that mean everybody should just host self-signed certificates, where they assert themselves directly? Well, no. It doesn’t scale, and just ends up repeatedly asking the user whether a key (first seen, or changed) should be trusted.

Whether a key should be trusted is a question to which users only reply yes. Nice for blaming users, I suppose, but it doesn’t move the ball.

What’s better is a way to go from users having to trust everybody, to users having to trust only a small subset of organizations with which they have a business relationship. Then, that subset of organizations can enter into business relationships with the hordes of entities that need their identities asserted…

…and we just reinvented Certificate Authorities.

There’s a lot to say about CA’s, but one thing my friend at Verisign (there were a number of friendly mails exchanged) did point out is that the Facebook certificate passed a check via OCSP — the Online Certificate Status Protocol. Designed because CRLs just do not scale, OCSP allows a browser to quickly determine whether a given certificate is valid or not.

(Not quickly enough — browsers only enforce OCSP checks under the rules of Extended Validation, and *maybe* not even then. I’ve heard rumors.)

CA’s are special, in that they have an actual business relationship with the named parties. For a brief moment, I was excited — perhaps a way for everybody to know if a particular cert was valid, was to ask the certificate authority that issued it! Maybe even we could do something short of the “Internet Death Penalty” for certificates — browsers could be forced to check OCSP before accepting any certificate from a questioned authority.

Alas, there was a problem — and not just “the only value people are adding is republishing the data from the CA”. No, this concept doesn’t work at all, because OCSP assumes a CA never loses control of its keys. I repeat, the system in place to handle a CA losing its keys, assumes the CA never loses the keys.

Look at this.

$ openssl ocsp -issuer vrsn_issuer_cert.pem -cert Cert.pem -url http://ocsp.verisign.com -resp_text
OCSP Response Data:
OCSP Response Status: successful (0x0)
Response Type: Basic OCSP Response
Version: 1 (0x0)
Responder Id: O = VeriSign Trust Network, OU = "VeriSign, Inc.", OU = VeriSign International Server OCSP Responder - Class 3, OU = Terms of use at http://www.verisign.com/rpa (c)03
Produced At: Aug 30 09:18:44 2011 GMT
Responses:
Certificate ID:
Hash Algorithm: sha1
Issuer Name Hash: C0FE0278FC99188891B3F212E9C7E1B21AB7BFC0
Issuer Key Hash: 0DFC1DF0A9E0F01CE7F2B213177E6F8D157CD4F6
Serial Number: 092197E1C0E9CD03DA35243656108681
Cert Status: good
This Update: Aug 30 09:18:44 2011 GMT
Next Update: Sep 6 09:18:44 2011 GMT

Signature Algorithm: sha1WithRSAEncryption
d0:01:89:71:4c:f9:81:e8:5f:bf:d0:cb:46:e5:ba:34:30:bf:
a0:3a:33:f9:7f:47:51:f8:9b:1d:7d:bc:bc:95:b8:6f:64:49:
e5:b4:34:d9:eb:75:ff:6a:a2:01:b4:0a:ae:d5:f0:1d:a2:bf:
56:45:f2:42:27:44:ef:9b:5f:0d:ba:47:60:ac:1b:eb:9a:45:
f8:c8:3d:27:ad:0e:cb:ae:2a:03:7c:95:40:2b:5f:5b:5a:be:
ad:ea:60:54:15:39:a2:07:37:ac:11:b0:a6:42:2e:03:14:7e:
ff:0e:7c:87:30:f7:63:01:4c:fa:e3:42:54:bc:e8:f3:81:4c:
ba:66
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
78:3a:4f:aa:ce:5f:ab:d6:8f:bd:60:9d:35:53:d4:e9
Signature Algorithm: sha1WithRSAEncryption
Issuer: O=VeriSign Trust Network, OU=VeriSign, Inc., OU=VeriSign International Server CA - Class 3, OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign
Validity
Not Before: Aug 4 00:00:00 2011 GMT
Not After : Nov 2 23:59:59 2011 GMT
Subject: O=VeriSign Trust Network, OU=VeriSign, Inc., OU=VeriSign International Server OCSP Responder - Class 3, OU=Terms of use at http://www.verisign.com/rpa (c)03
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
RSA Public Key: (1024 bit)
Modulus (1024 bit):
00:ef:a4:35:c9:29:e0:66:ca:2c:4e:77:95:55:55:
e8:60:46:5e:f3:8d:2c:c9:d2:83:01:f7:9c:2b:29:
b6:75:f2:e2:7e:45:f6:e9:29:ff:1a:54:d6:40:d2:
72:ef:78:12:7f:18:5c:c0:dc:ee:ac:c3:7c:97:9b:
fb:d5:b1:96:ab:07:bd:5e:e0:b0:de:0c:e9:03:ae:
c3:9f:36:dd:ad:74:97:8a:9e:81:4e:66:85:17:e6:
24:fa:a7:b6:ce:fd:56:f3:d4:c9:f2:36:3b:8d:dc:
fe:ae:0b:9c:25:23:06:8b:88:52:44:94:2c:4a:0c:
50:1a:c1:3f:ff:b3:17:c4:ab
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Basic Constraints:
CA:FALSE
X509v3 Certificate Policies:
Policy: 2.16.840.1.113733.1.7.23.3
CPS: https://www.verisign.com/CPS
User Notice:
Organization: VeriSign, Inc.
Number: 1
Explicit Text: VeriSign's CPS incorp. by reference liab. ltd. (c)97 VeriSign

X509v3 Extended Key Usage:
OCSP Signing
X509v3 Key Usage:
Digital Signature
OCSP No Check:

X509v3 Subject Alternative Name:
DirName:/CN=OCSP8-TGV7-123
Signature Algorithm: sha1WithRSAEncryption
52:c1:39:cf:65:8e:52:b1:a2:3c:38:dc:e2:d1:3e:bc:85:41:
de:c6:88:61:91:82:10:39:cb:63:62:8b:02:8c:7e:ce:dc:df:
4d:2f:44:d1:f0:89:e1:64:8d:5e:57:5b:a5:71:80:29:28:0b:
b9:c0:8b:68:1c:29:b6:5d:d7:99:d8:b8:e3:a3:11:61:8f:c6:
fb:f3:ad:3b:35:bb:fc:30:b9:0f:69:06:19:9f:9f:72:6d:a3:
f7:06:41:8f:8c:2a:26:94:ba:8e:cf:f6:76:75:b0:35:57:fb:
92:ac:79:03:91:17:24:46:09:88:ca:44:03:92:ed:e5:b8:76:
92:28
-----BEGIN CERTIFICATE-----
MIIEDDCCA3WgAwIBAgIQeDpPqs5fq9aPvWCdNVPU6TANBgkqhkiG9w0BAQUFADCB
ujEfMB0GA1UEChMWVmVyaVNpZ24gVHJ1c3QgTmV0d29yazEXMBUGA1UECxMOVmVy
aVNpZ24sIEluYy4xMzAxBgNVBAsTKlZlcmlTaWduIEludGVybmF0aW9uYWwgU2Vy
dmVyIENBIC0gQ2xhc3MgMzFJMEcGA1UECxNAd3d3LnZlcmlzaWduLmNvbS9DUFMg
SW5jb3JwLmJ5IFJlZi4gTElBQklMSVRZIExURC4oYyk5NyBWZXJpU2lnbjAeFw0x
MTA4MDQwMDAwMDBaFw0xMTExMDIyMzU5NTlaMIGwMR8wHQYDVQQKExZWZXJpU2ln
biBUcnVzdCBOZXR3b3JrMRcwFQYDVQQLEw5WZXJpU2lnbiwgSW5jLjE/MD0GA1UE
CxM2VmVyaVNpZ24gSW50ZXJuYXRpb25hbCBTZXJ2ZXIgT0NTUCBSZXNwb25kZXIg
LSBDbGFzcyAzMTMwMQYDVQQLEypUZXJtcyBvZiB1c2UgYXQgd3d3LnZlcmlzaWdu
LmNvbS9ycGEgKGMpMDMwgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBAO+kNckp
4GbKLE53lVVV6GBGXvONLMnSgwH3nCsptnXy4n5F9ukp/xpU1kDScu94En8YXMDc
7qzDfJeb+9WxlqsHvV7gsN4M6QOuw5823a10l4qegU5mhRfmJPqnts79VvPUyfI2
O43c/q4LnCUjBouIUkSULEoMUBrBP/+zF8SrAgMBAAGjggEZMIIBFTAJBgNVHRME
AjAAMIGsBgNVHSAEgaQwgaEwgZ4GC2CGSAGG+EUBBxcDMIGOMCgGCCsGAQUFBwIB
FhxodHRwczovL3d3dy52ZXJpc2lnbi5jb20vQ1BTMGIGCCsGAQUFBwICMFYwFRYO
VmVyaVNpZ24sIEluYy4wAwIBARo9VmVyaVNpZ24ncyBDUFMgaW5jb3JwLiBieSBy
ZWZlcmVuY2UgbGlhYi4gbHRkLiAoYyk5NyBWZXJpU2lnbjATBgNVHSUEDDAKBggr
BgEFBQcDCTALBgNVHQ8EBAMCB4AwDwYJKwYBBQUHMAEFBAIFADAmBgNVHREEHzAd
pBswGTEXMBUGA1UEAxMOT0NTUDgtVEdWNy0xMjMwDQYJKoZIhvcNAQEFBQADgYEA
UsE5z2WOUrGiPDjc4tE+vIVB3saIYZGCEDnLY2KLAox+ztzfTS9E0fCJ4WSNXldb
pXGAKSgLucCLaBwptl3Xmdi446MRYY/G+/OtOzW7/DC5D2kGGZ+fcm2j9wZBj4wq
JpS6js/2dnWwNVf7kqx5A5EXJEYJiMpEA5Lt5bh2kig=
-----END CERTIFICATE-----
Cert.pem: good
This Update: Aug 30 09:18:44 2011 GMT
Next Update: Sep 6 09:18:44 2011 GMT

I’ve spent years saying X.509 is broken, but seriously, this is it for me. I copy this in full because the full reply needs to be parsed, and I don’t see this as dropping 0day because it’s very clearly an intentional part of the design. Here is what OCSP asserts:

Certificate ID:
Hash Algorithm: sha1
Issuer Name Hash: C0FE0278FC99188891B3F212E9C7E1B21AB7BFC0
Issuer Key Hash: 0DFC1DF0A9E0F01CE7F2B213177E6F8D157CD4F6
Serial Number: 092197E1C0E9CD03DA35243656108681
Cert Status: good

To translate: “A certificate with the serial number 092197E1C0E9CD03DA35243656108681, using sha1, signed by a name with the hash of C0FE0278FC99188891B3F212E9C7E1B21AB7BFC0, using a key with the hash of 0DFC1DF0A9E0F01CE7F2B213177E6F8D157CD4F6, is good.”

To translate even further, please imagine a scene from one of my favorite shows, The IT Crowd.


Roy: Moss…did you…did you sign this certificate?
Moss: I did indeed sign a certificate…with that serial number.
Roy: Yes, but did you sign this particular certificate?
Moss: Yes, I did indeed sign a particular certificate…with that serial number.
Roy: Moss? Did YOU sign, with YOUR key, this particular certificate?
Moss: Roy. I DID sign, with MY key, a certificate….with that serial number.
Roy: Yes, but what if the certificate you signed isn’t the certificate that I have? What if your certificate is for Moe’s Doner Shack, and mine is for *.google.com?
Moss: I would never sign two certificates with the same serial number.
Roy: What if you did?
Moss: I would never sign two certificates with the same serial number.
Roy: What if somebody else did, got your keys perhaps?
Moss: I would never sign two certificates with the same serial number.
Roy: Moss?
Moss: I would never sign two certificates with the same serial number.

Long and short of it is that OCSP doesn’t tell you if a certificate is good, it just tells you if the CA thinks there’s a good certificate out there with that serial number. An attacker who can control the serial number — either by compromising the infrastructure, or just getting the keys — wins.
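To make that concrete, here’s a toy sketch of the lookup, using nothing but plain hashing (real OCSP hashes the DER-encoded issuer fields, but the point survives the simplification): two entirely different certificates that share an issuer and a serial number generate the identical query, and so receive the identical “good”.

import hashlib

def ocsp_cert_id(issuer_name, issuer_key, serial):
    # Everything OCSP identifies a certificate by. Note what is missing:
    # the subject name, the subject key, the validity dates, the cert body itself.
    return (hashlib.sha1(issuer_name).hexdigest(),
            hashlib.sha1(issuer_key).hexdigest(),
            hex(serial))

issuer_name = b"...issuer distinguished name bytes..."   # placeholder
issuer_key  = b"...issuer public key bytes..."           # placeholder
serial      = 0x092197E1C0E9CD03DA35243656108681

legit_query  = ocsp_cert_id(issuer_name, issuer_key, serial)
forged_query = ocsp_cert_id(issuer_name, issuer_key, serial)  # different cert body, same serial

print(legit_query == forged_query)   # True: the responder cannot tell them apart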

(This is very similar to the issue Kevin S. McArthur (and Marsh Ray?) and I were musing about on Twitter a few months back during Comodogate, in which it became clear that CRL’s also only secure serial numbers. The difference is that CRL’s are an ancient technology that we knew didn’t work, while OCSP was designed with full awareness of threats and really went out of its way to avoid hashing the parts of the certificate that actually mean something, like Subject Name and whether the certificate is a God Mode Intermediate.)

Now, that’s not to say that another certificate validation endpoint couldn’t be exposed by the CAs, one that actually reported whether a particular certificate was valid. The problem is that CAs can lie — they can claim the existence of a business relationship, when none exists. The CA’s I deal with are actually quite upstanding and hard working — let me call out Comodo and say unequivocally that they took a lot of business risk for doing the right thing, and it couldn’t have been an easy decision to make. The call to push for a browser patch, and not to pretend that browser revocation checking was meaningful, was a pretty big deal. But with over 1500 known entities with the ability to declare certificates for everyone, the trust level is unfortunately degraded.

1500 is too big. We need fewer, and more importantly, we need to reward those that implement better products. We need a situation where there is at least a chain of business (or governmental — we’ll talk about this more next week) relationships between the identity we trust, and the entities that help us recognize that trust — and that those outside that trust (DigiNotar) can be as easily excluded as those inside the trust (Verisign/Symantec) are included.

DNSSEC doesn’t do everything. Specifically, it can’t assert brands, which are different things entirely from domain names. But the degree to which it ties *business relationships* to *key management* is unparalleled — there’s a reason X.509 certificates, for all the gunk they coded into Subject Names, ended up running straight to DNS names as “Common Names” the moment they needed to scale. That’s what every other application does, when it needs to work outside the org! From the registrar that inserts a domain into DNS, to the registry that hosts only the top portion of the namespace, to the one root which is guarded from so much technical and bureaucratic influence, DNSSEC aligns trust with extant relationships.

And, in no uncertain terms, when DNSSEC tells you a certificate is valid, it won’t just be talking about the serial number.

More later. For now, Burning Man. ISPs, a reminder…I’m serious, I genuinely would prefer N00ter not find anything problematic.

A SMALL EPILOGUE

Curious how Facebook could have a certificate that chained to a dead cert signed with a broken hashing function? So was I, until I remembered — roots within browsers are not trusted because of their self signature (best I can tell, just a cheap trick to make ASN.1 parsers happy). They are trusted because the code is made to explicitly trust them. The public key in a certificate otherwise self-signed by MD2 is no less capable of signing SHA-1 blobs, which is what’s happening in the Facebook cert. If you want to see the MD2 kill actually breaking stuff, see this post by the brilliant Eric Lawrence.

It IS a little weird, but man, that’s ops. Ops is always a little weird.

Categories: Security

Black Ops of TCP/IP 2011

August 5, 2011 4 comments

This year’s Black Hat and Defcon slides!

Man, it’s nice to be playing with packets again!

People seem to be rather excited (Forbes, Dark Reading, Search Security) about the Neutrality Router I’ve been working on. It’s called N00ter, and in a nutshell, it normalizes your link such that any differences in performance can’t be coming from different servers taking different routes, and have to instead be caused at the ISP. Here’s a summary of what I posted to Slashdot, explaining more succinctly what N00ter is up to.

Say Google is 50ms slower than Bing. Is this because of the ISP, or the routers and myriad server and path differentials between the ISP and Google, vs. the ISP and Bing? Can’t tell, it’s all conflated. We have to normalize the connection between the two sites, to measure if the ISP is using policy to alter QoS. Here’s how we do this with n00ter.

Start with a VPN that creates an encrypted link from a Client to a broker/concentrator. An IP at the Broker talks plaintext with Google and Bing, who reply to the Broker. The Broker now encrypts the traffic back to the Client.

Policy can’t differentiate Bing traffic from Google traffic, it’s all encrypted.

Now, let’s change things up — let’s have the Broker push the response traffic from Google and Bing completely in the open. In fact, let’s have it go so far as to spoof traffic from the original sources, making it look like there isn’t even a Broker in place. There are just nice clean streams from Google and Bing.

If traffic from the same host, being sent over the same network path, but looking like Google, arrives faster (or slower) than traffic that looks like it came from Bing, then there’s policy differentiating Google from Bing.

Now, what if the policy is only applied to full flows, and not half flows? Well, in this case, we have one session that’s a straight normal download from Bing. Then we have another, where the entire client->server path is tunneled as before, but the Broker immediately emits the tunneled packets to Bing *spoofing the Client’s IP address*. So basically we’re now comparing the speed of a full legitimate flow to Bing, with a half flow. If QoS differs — as it would, if policy is only applied to full flows — then once again the policy is detected.

I call this client->server spoofing mode Roto-N00ter.

There’s more tricks, but this is what N00ter’s up to in a nutshell. It should work for anything IP based — if you want to know if XBox360 traffic routes faster than PS3 traffic, this’ll tell you.
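N00ter itself isn’t released yet (see below), so to be clear, this is not its code. It’s just a conceptual sketch, using Scapy, of the core trick: the broker re-emits the server’s response bytes toward the client, stamped with the server’s own source address, so the ISP sees what looks like a direct flow from Google or Bing.

from scapy.all import IP, TCP, Raw, send  # emitting raw packets requires root

def relay_as_server(server_ip, client_ip, client_port, seq, ack, payload):
    # Push the response data toward the client, spoofing the origin server's address,
    # so any per-site policy keyed on "traffic from Google" or "traffic from Bing" fires.
    pkt = (IP(src=server_ip, dst=client_ip) /
           TCP(sport=80, dport=client_port, flags="PA", seq=seq, ack=ack) /
           Raw(load=payload))
    send(pkt, verbose=False)

# Timing the two normalized streams (one "from" Google, one "from" Bing) over the
# identical path is then what exposes any per-destination policy.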

Also, I’ve been doing some interesting things with BitCoin. (Len, we’ll miss you.) A few weeks ago, I gave a talk at Toorcon Seattle on the subject. Here are those slides as well.

Where’s the code? Well, two things are slowing down Paketto Keiretsu 3.0 (do people even remember scanrand?). First, I could use a release manager. I swear, packing stuff up for release is actually harder in many ways than writing the code! I do admit to knowing TCP rather better than Autoconf.

Secondly, and I know this is going to sound strange — I’m really not out to bust anyone with N00ter. Yes, I know it’s inevitable. But if a noxious filter is quietly removed with the knowledge that its presence is going to be provable in a court of law, well, all’s well that ends well, right?

So, give me a week or two. I have to get back from Germany anyway (the Black Ops talk will indeed be presented in a nuke-hardened air bunker, overlooking MiGs on the front lawn. LOL.)

Categories: Security

New York Times coming out against DNS filtering

This is rather cool: The New York Times is citing our paper and agreeing that filtering DNS for intellectual property violations would be both ineffective and counterproductive.

Some of the remedies are problematic. A group of Internet safety experts cautioned that the procedure to redirect Internet traffic from offending Web sites would mimic what hackers do when they take over a domain. If it occurred on a large enough scale it could impair efforts to enhance the safety of the domain name system.

This kind of blocking is unlikely to be very effective. Users could reach offending Web sites simply by writing the numerical I.P. address in the navigator box, rather than the URL. The Web sites could distribute free plug-ins to translate addresses into numbers automatically.

The bill before the Senate is an important step toward making piracy less profitable. But it shouldn’t pass as is. If protecting intellectual property is important, so is protecting the Internet from overzealous enforcement.

Categories: Security

On The RSA SecurID Compromise

June 9, 2011 7 comments

TL;DR: Authentication failures are getting us owned. Our standard technology for auth, the password, fails repeatedly — but its raw simplicity compared to competing solutions drives its continued use. SecurID is the most successful post-password technology, with over 40 million deployed devices. It achieved its success by emulating passwords as closely as possible. This involved generating key material at RSA’s factory, a necessary step that nonetheless created the circumstances that allowed a third party compromise at RSA to affect customers like Lockheed Martin. There are three attack vectors that these hackers are now able to leverage — phishing attacks, compromised nodes, and physical monitoring of traveling workers. SecurID users, however, are at no greater risk of sharing passwords, a significant source of compromise. I recommend replacing devices in an orderly fashion, possibly while increasing the rotation rate of PINs. I dismiss concerns about source compromise on the grounds that both hardware and software are readily reversed, and anyway we didn’t change operational behavior when Windows or IOS source leaked. RSA’s communications leave a lot to be desired, and even though third party security dependencies are tolerable, it’s unclear (even with claims of a “shiny new HSM”) whether this particular third party security dependency should be tolerated.


Table Of Contents:

INTRODUCTION
MINIMIZING CHANGE
MANDATING RECOVERY
ACTIONABLE INTELLIGENCE
RAMPANT SPECULATION
STRANGE BEHAVIOR
ARE THINGS BETTER?

INTRODUCTION
There’s an old story, that when famed bank robber Willie Sutton was asked why he robbed banks, he replied: “Because that’s where the money is.” Why would hackers hit RSA Security’s SecurID authenticators?

Because that’s where the money is.

The failure of authentication technologies is one of the three great issues driving the cyber security crisis (the other two being broken code and invulnerable attackers). The data isn’t subtle: According to Verizon Business’ Data Breach Investigative Report, over half of all compromised records were lost to attacks in which a credential breach occurred. Realistically, that means passwords failed us in some way yet again.

No passwords.
Default passwords.
Shared passwords.
Stolen passwords.

By almost any metric, passwords are a failed authentication technology. But there’s one metric in which they win.

Passwords are really simple to implement — and through their simplicity, they’re incredibly robust to real world circumstances. Those silly little strings of text can be entered by users on any device, they can be tunneled inside practically any protocol across effectively any organizational boundary, and even a barely conscious programmer can compare string equality from scratch with no API calls required.

Password handling can be more complex. But it doesn’t have to be. The same can’t be said about almost any of the technologies that were pegged to replace passwords, like X.509 certificates or biometrics.

It is a hard lesson, but ops — operations — is really the decider of what technologies live or die, because they’re the guys who put their jobs on the line with every deployment. Ops loves simplicity. While “secure in the absence of an attacker” is a joke, “functional in the presence of a legitimate user” is serious business. If you can’t work when you must, you won’t last long enough to not work when you must not.

So passwords, with all their failures, persist in all but the most extreme scenarios. There’s however been one technology that’s been able to get some real world wins against password deployments: RSA SecurIDs. To really understand their recent fall, it’s important to understand the terms of their success. I will attempt to explain that here.


MINIMIZING CHANGE
Over forty million RSA SecurID tokens have been sold. By a wide margin, they are the most successful post-password technology of all time. What are they?

Password generators.

Every thirty seconds, RSA SecurID tokens generate a fresh new code for the user to identify himself with. They do require a user to maintain possession of the item, much as he might retain a post-it note with a hard to remember code. And they do require the server to eventually query a backend server for the correctness of the password, much as it might validate against a centralized password repository.

Everything between the user and the server, however, stays the same. The same keys are used to enter the same character sets into the same devices, which use the same forms and communication protocols to move a simple string from client A to server B.

Ops loves things that don’t change.

It gets better, actually. Before, passwords needed to be cycled every so often. Now, because these devices are “self-cycling”, the rigamarole of managing password rotation across a large number of users can be dispensed with (or kept, if desired, by rotating PIN numbers).

Ops loves the opportunity to do less work.

When things get interesting is when you ask: How do these precious new devices get configured? Passwords are fairly straightforward to provision — give the user something temporary and hideous, then pretend they’ll generate something similar under goofy “l33tspeak” constraints. Providing the user with a physical object is in fact slightly more complex, but not unmanageably so. It’s not like new users don’t need to be provisioned with other physical objects, like computers and badges. No, the question is how much work needs to go into configuring the device for a user.

There are no buttons on a SecurID token. That doesn’t just mean fewer things for a user to screw up and call help desk over, annoying Ops. It also means less work for Ops to do. Tapping configuration data into a hundred stupid little tokens, or worse, trying to explain to a user how to tap a key in, is expensive at best and miserable at worst. So here’s what RSA hit on:

Just put a random key, called a seed, on each token at the factory.
Have the device combine the seed, with time, to create the randomized token output.
Link each seed to a serial number.
Print the serial number on the device.
Send Ops the seeds for the serial numbers in their possession.

Now, all Ops needs to know is that a particular user has the device with a particular serial number, and they can tie the account to the token in possession. It’s as low configuration as it comes.
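RSA’s actual token algorithm is proprietary, so the following is only a generic sketch of the seed-plus-time idea (closer in spirit to a TOTP than to a real SecurID), but it captures the essential property: possession of the seed is possession of every code the token will ever display.

import hashlib, hmac, struct, time

def token_code(seed, at=None, interval=30, digits=6):
    # Mix the factory-installed seed with the current time window. Anyone holding
    # the seed can compute the code for any window, past or future.
    window = int((time.time() if at is None else at) // interval)
    digest = hmac.new(seed, struct.pack(">Q", window), hashlib.sha256).digest()
    return str(int.from_bytes(digest[:4], "big") % 10**digits).zfill(digits)

# Factory: keeps (serial -> seed). Ops: ties serials to users. Server: recomputes
# token_code(seed) for the user's seed and compares it against what was typed.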

There are three interesting quirks of this deployment model:

First, it’s not perfect. SecurID tokens are still significantly more complicated to deploy than passwords, even independent of the cost. That they’re the most successful post-password technology does not make them a universal success, by any stretch of the imagination.

Second, the resale value of the SecurID system is impacted — anyone who owned your system previously owns your systems today, because they have the seeds to your authenticators and you can’t change them. This is not a bug (if you’re RSA), it’s a feature.

Third, and more interestingly, users of SecurID tokens don’t actually generate their own credentials. A major portion, the seed for each token, comes from RSA itself. What’s more, these seeds — or a “Master Key” able to regenerate them — were stored at RSA. It is clear that the seeds were lost to hackers unknown.

It seems obvious that storing the seeds to customer devices was a bug. I argue it was a feature as well.


MANDATING RECOVERY
I fly, a lot. Maybe too much. For me, the mark of a good airline is not how it functions when things go right. It’s what it does when they don’t.

There’s far more room for variation when things go wrong.

Servers die. It happens. There’s not always a backup. That happens too. The loss of an authentication service is about as significant an event as can be, because (by definition) an outage is created across the entire company (or at least across entire userbases).

Is it rougher when a SecurID server dies, than when a password server dies?

Yes, I argue that it is. For while users have within their own memories backups of their passwords, they have no knowledge of the seeds contained within their tokens. They’ve got a token, it’s got a serial number, they’ve got work to do. The clock is ticking. But while replacing passwords is a matter of communication, replacing tokens is a matter of shipping physical items — in quantities large enough that nobody would have them simply lying around.

Ops hates anything that threatens a multi-day outage. That sort of stuff gets you fired.

And so RSA retained the ability to regenerate seed files for all their customers. Perhaps they did this by storing truly randomly generated secrets. Most likely though they combined a master secret with a serial number, creating a unique value that could nonetheless be restored later.
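If that speculation is right, the recovery story is essentially a key derivation function. A sketch under that assumption, and emphatically not RSA’s actual scheme, might look like:

import hashlib, hmac

def seed_for_serial(master_secret, serial_number):
    # Unique per serial, yet recoverable forever from the master secret alone,
    # which is exactly why a compromise of that secret compromises every token.
    return hmac.new(master_secret, serial_number.encode(), hashlib.sha256).digest()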

The ability to recover from a failure has been a thorn in the side of many security systems, particularly Full Disk Encryption products. It’s much easier to have no capacity to escrow keys, or to survive bad blocks. But then nobody deploys your code.

Over the many years that SecurID has been a going concern, these sorts of recoveries were undoubtedly far more common than deep compromises of RSA’s deepest secrets.

Things change. RSA finally got broken into, because that’s where the money was. And we’re seeing the consequences of the break being admitted at Lockheed Martin, who recently cycled the credentials of every user in their entire organization.

That ain’t cheap.


ACTIONABLE INTELLIGENCE
It’s important to understand how we got here. There have been some fairly serious accusations of incompetence, and I think they reflect a simple lack of understanding regarding why the SecurID system was built as it was. But where do we go from here? That requires actionable intelligence, which I define as knowing:

What can an attacker do today, that he couldn’t do yesterday, for what class attacker, to what class customer?

The answer seems to be:

An attacker with a username, a PIN, and at least one output from a SecurID token (but possibly a few more), can extend their limited access window indefinitely through the generation of arbitrary token outputs. The attacker must be linked to the hackers who stole the seed material from RSA. The customer must have deployed RSA SecurID.
This extends into three attack scenarios:

1) Phishing. Attackers have learned they can impersonate sites users are willing to log into, and can sometimes get them to provide credentials willingly. (Google recently accused a Chinese group of doing just this to a number of high profile targets in the US.) Before, the loss of a single SecurID login would only allow a one time access to the phished user’s network. Now the attacker can impersonate the user indefinitely.

2) Keylogging. It does happen that attackers acquire remote access to a user’s machine, and are able to monitor its resources and learn any entered passwords. Before, for the attacker to leverage the user’s access to a trusted internal network, he’d have to wait for the legitimate user to log in, and then run “alongside” him either from the same machine or from a foreign one. This could increase the likelihood of discovery. Now the attacker can choose to run an attack at the time, and from the machine, of his choosing.

3) Physical monitoring. Historically, information security professionals have mostly ignored the class of attacks known as TEMPEST, in which electromagnetic emissions from computing devices leak valuable information. But in the years since TEMPEST attacks were discovered, other classes of attacks against keyboards — as complicated as lasers against the back of a screen, and as simple as telephoto lenses — have proven effective. Before, an attacker who managed to pick off a single login had to use their findings within minutes in order for them to be useful. Now, the attacker can break in at their leisure. Note that there’s also some risk from attackers who merely pick up the serial number from the back of a token, if those attackers can learn or guess usernames and PINs.

There is one attack scenario that is not impacted: Many compromises are linked to shared credentials, i.e. one user giving their password to another willingly, or posting their credentials to a shared server. SecurID does still prevent password sharing of this type, since users aren’t in the class of attacker that has the seeds.

What to do in defense? My personal recommendation would be to implement an orderly replacement of tokens, unless there was some evidence of active attack in which case I’d say to replace tokens aggressively. In the meantime, rotating PINs on an accelerated basis — and looking for attempted logins that try to recycle old PINs — might be a decent short term mitigation.

But is that all that went wrong?


RAMPANT SPECULATION
The RSA “situation” has been going on for a couple months now, with no shortage of rumors swirling about what was lost, and no real guidance from RSA on the risk to their customers (at least none outside of NDA). Two other possible scenarios should be discussed.

The first is that it’s possible RSA lost their source code. Nobody cares. We didn’t stop using Windows or IOS when their source code leaked, and we wouldn’t stop using SecurID now even if they went Open Source tomorrow. The reality is that any truly interesting hardware has long since been dunked in sulfuric acid and analyzed from photographs, and any such software has been torn apart in IDA Pro. The fact that attackers had to break in to get the seed(s) reflects the fact that there wasn’t any magic backdoor to be found in what was already released.

The second scenario, brought up by the brilliant Mike Ossman, is that perhaps RSA employees had some crazy theoretical attack they’d discussed among themselves, but had never gone public with. Maybe this attack only applied to older SecurIDs, which had not been replaced because nobody outside had learned of the internal discovery. Under the unbound possibilities of this attack, who knows, maybe that occurred. It’s viable enough to cite in passing here, but there’s no evidence for it one way or another.

But whose fault is that?


STRANGE BEHAVIOR
It’s impossible to discuss the RSA situation without calling them out for the lack of actionable intelligence provided by the company itself. Effectively, the company went into full panic mode, then delivered guidance that amounted to tying one’s shoelaces.

That was a surprise. Ops hates surprises, especially when they concern whether critical operational components can be trusted or not.

It was doubly bizarre, given that by all signs half the security community guessed the precise impact of the attack within days of the original announcement. With the exception of the need to support recovery operations, most of the above analysis has been discussed ad nauseam.

But confirmations, and hard details, have only been delivered by RSA under NDA.

Why? A moderately cynical bit of speculation might be that RSA could only replace so many tokens at once, and they were afraid they’d simply get swamped if everybody decided their second factor authenticator had been fully broken. Only now, after several months, are they confident they can handle any and all replacement requests. In the wake of the Lockheed Martin incident, they’ve publicly announced they will indeed effect replacements upon request.

Whatever the cause, there are other tokens besides RSA SecurID, and I expect their sales teams to exploit this situation extensively. (A filing to the SEC that there would be no financial impact from this hack seemed…premature…given the lack of precedent either regarding the scale of attack or the behavior towards the existing customer base.)

Realistically, customers sticking with RSA should probably seek contractual terms that prevent a repeat of this scenario. That means two things. First, actual customers should obligate RSA to tell them of their exposure to events that may affect them. This is actually quite normal — it shows up in HIPAA arrangements all the time, for example.

Second, customers should discover the precise nature of their continued exposure to seed loss at RSA HQ, even after their replacement tokens arrive.


ARE THINGS BETTER?
Jeff Moss recently asked the following question: “When I heard RSA had a shiney new half million dollar HSM to store seed files I wondered where had they been stored before?”

HSM’s are rare, so likely on a stock server that got owned.

I think the more interesting question is, what exactly does the HSM do, that is different than what was done before? HSMs generally use asymmetric cryptography to sign or decrypt information. From my understanding, there’s no asymmetric crypto in any SecurID token — it’s all symmetric crypto and hash functions. The best case scenario, then, would be the HSM receiving a serial number, internally combining it with a “master seed”, then returning the token seed to the factory.

That’s cool, but it doesn’t exactly stop a long term attack. Is there rate limiting? Is there alarming on recovery of older serial numbers? Is this in fact a custom HSM backend? If it is, is the new code in fact running inside the physically protected core?

It is also possible the HSM stores a private key that is able to decrypt randomly generated keys, encrypted to a long term public key. In this case, the HSM could be kept offline, only to be brought up in the occasional case of a necessary recovery.
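That flow is easy enough to sketch with standard tooling, say the Python cryptography package, though whether it resembles anything RSA actually built is pure guesswork: seeds are generated randomly at the factory, encrypted to a long-term public key whose private half lives only in the offline HSM, and only the ciphertexts are retained.

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# The private key would live only inside the offline HSM; it is generated here purely to sketch the flow.
escrow_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
escrow_public = escrow_private.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Factory: random seed per token, keep only the encrypted copy.
seed = os.urandom(16)
escrowed_seed = escrow_public.encrypt(seed, oaep)

# Rare recovery event: bring the HSM online, decrypt the one seed a customer needs.
recovered = escrow_private.decrypt(escrowed_seed, oaep)
assert recovered == seed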

We don’t really know, and ultimately we should.

See, here’s the thing. Some people are angry that vulnerabilities in a third party can affect their organization. The reality is that this is a ship that has long since sailed. Even if you ignore all the supply chain threats in the underlying hardware within an organization, there’s an entire world of software that’s constantly being updated, and automatically shipped to high percentages of clients and servers. There is no question that the future of software is, and must be, dynamic updates — if for no other reason than that static defenses are hopelessly vulnerable to dynamic threats. The reality is that more and more of these updates are being left in the hands of the original vendors.

Compromise of these third party update chains is actually much more exploitable than even the RSA hit. At least that required partial credentials from the users to be exploited.

But the fact remains, while these other third parties might get targeted and might have their implicit trust exploited, RSA did. The attackers have not been caught, and realistically will not be. They know RSA’s network. Can they come back and do it again?

It is one thing to choose to accept the third party risk from RSA’s authenticators. It’s another to have no choice. It’s not really clear whether RSA’s “shiny new HSMs” make the new SecurIDs more secure than the old ones. RSA has smart people, and it’s likely there’s a concrete improvement. But we need a little more than likeliness here.

In the presence of the HSM, the question should be asked: What are attackers unable to do today, that they could do yesterday, for the attackers who broke into RSA before, to RSA now?

Categories: Security

LA Times Defends DNS

June 8, 2011 1 comment

The Los Angeles Times has come out in support of our position on recursive DNS filtering!

The main problem with the bill is in its effort to render sites invisible as well as unprofitable. Once a court determines that a site is dedicated to infringing, the measure would require the companies that operate domain-name servers to steer Internet users away from it. This misdirection, however, wouldn’t stop people from going to the site, because it would still be accessible via its underlying numerical address or through overseas domain-name servers.

A group of leading Internet engineers has warned that the bill’s attempt to hide piracy-oriented sites could hurt some legitimate sites because of the way domain names can be shared or have unpredictable mutual dependencies. And by encouraging Web consumers to use foreign or underground servers, the measure could undermine efforts to create a more reliable and fraud-resistant domain-name system. These risks argue for Congress to take a more measured approach to the problem of overseas rogue sites.

Indeed.

Categories: Security

DNS Filtering Threatens the Security and Stability of the Internet

May 26, 2011 2 comments

The DNS works. It creates the shared namespace that allows applications to interoperate across LANs, organizations, and even countries. With the advent of DNSSEC, it’s our best opportunity to finally address the authentication flaws that are implicated in over half of all data breaches.

There are efforts afoot to manipulate the DNS on a remarkably large scale. The American PROTECT IP act contains several reasonable and well targeted remedies to copyright infringement. One of these remedies, however, is to leverage the millions of recursive DNS servers that act as accelerators for Internet traffic, and convert them into censors for domain names in an effort to block content.

Filtering DNS traffic will not work, and in its failure, will harm both the security and stability of the Internet at large.

But don’t take just my word for it. I’ve been fairly quiet about PROTECT IP since back when it was referred to as COICA, working behind the scenes to address the legislation. A common request has been a detailed white paper, deeply elucidating the technical consequences of Recursive DNS Filtering.

Today, I — along with Steve Crocker, author of the first Internet standard ever, Paul Vixie, creator of many of the DNS standards and long time maintainer of the most popular name server in the world, David Dagon, co-founder of the Botnet fighting firm Damballa, and Danny MacPherson, Chief Security Officer of Verisign — release Security and Other Technical Concerns Raised by the DNS Filtering Requirements in the PROTECT IP Bill.

Thanks go to many, but particularly to the staffers who have been genuinely and consistently curious — on both sides of the aisle — about the purely technical impact of the Recursive DNS Filtering provisions. A stable and secure Internet is not a partisan issue, and it’s gratifying to see this firsthand.

Categories: Security

VUPEN vs. Google: They’re Both Right (Mostly)

May 13, 2011 5 comments

TL;DR: Identifying the responsible party for a bug is hard. VUPEN was able to achieve code execution in Google’s Chrome browser, via Flash and through a special but limited sandbox created for the Adobe code. Google ships Flash with Chrome and activates it by default, so there’s no arguing that this is a break in Chrome. However, it would represent an incredibly naive metric to say including Flash in Chrome has made it less secure. An enormous portion of the user population installs Flash, alongside Acrobat and Java, and by taking on additional responsibility for their security (in unique and different ways), Google is unambiguously making users safer. In the long run, sandboxes have some fairly severe issues, among which is the penchant for being very slow to repair once broken, if they can even be fixed at all. But in the short run, it’s hard to deny they’ve had an effect, exacerbating the “Hacker Latency” that quietly permeates computer security. Ultimately though, the dam has broken, and we can pretty much expect Flash-bearing attackers to drill into Chrome at the next PWN2OWN. Finally, I note that the sort of disclosure executed by VUPEN is far, far too open to both fraudulent abuse and simple error, the latter of which appears to have occurred. We have disclosure for a reason. And then I clear my throat.

SECTIONS
Introduction
Plug-In Security
The Trouble With Sandboxes
Hacker Latency
Uncoordinated Disclosure

EARLY DISCUSSION
Dark Reading: Two Zero Day Flaws Used To Bypass Google Chrome Security
Google, VUPEN spar over Chrome Hack


Introduction
[Not citing in this document, as people are not speaking for their companies and I don’t want to cause trouble there. Please mail me if I have permission to edit your cites into this post. I’m being a little overly cautious here. This post may mutate a bit.]

VUPEN vs. Google. What a fantastic example of the danger of bad metrics.

In software development, we have an interesting game: “Whose bug is it, anyway?” Suppose there’s some troublesome interaction between a client and a server. Arguments along the lines of “You shouldn’t send that message, Client!” vs. “You should be able to handle this message, Server!” are legendary. Security didn’t invent this game; it just raised the stakes (and, perhaps, provides guidance above and beyond “what’s cheapest”).

In VUPEN vs. Google, we have one of the more epic “Whose bug is it, anyway?” events in recent memory.

VUPEN recently announced the achievement of code execution in Google Chrome 11. This has been something of an open challenge: code execution isn’t exactly rare in web browsers, but Chrome has managed to survive despite flaws being found in its underlying HTML5 renderer (WebKit). This has been attributed to Chrome’s sandbox: a protected and restricted environment in which malicious code is assumed to be powerless to access sensitive resources. Google has been quite proud of the Chrome sandbox, and indeed, for the last few PWN2OWN contests run by Tipping Point at CanSecWest, Chrome’s come out without a scratch.

Of course, things are not quite so simple. VUPEN didn’t actually break the Chrome sandbox. They bypassed it entirely. Flash, due to complexity issues, runs outside the primary Chrome sandbox. Now, there is a second sandbox, built just for Flash, but it’s much weaker. VUPEN apparently broke this secondary sandbox. Some have argued that this certainly doesn’t represent a break of the Google sandbox, and is really just another Flash vuln.

Is this Adobe’s bug? Is this Google’s bug? Is VUPEN right? Is Google right?

Yes.

VUPEN broke Chrome. You really can’t take that from them. Install Chrome, browse to a VUPEN-controlled page, arbitrary code can execute. Sure, the flaw was in Flash. But Chrome ships Flash, as it ships WebKit, SQLite, WebM, or any of a dozen other third party dependencies. Being responsible for your imported libraries is fairly normal — when I found a handful of code execution flaws in Firefox, deriving from their use of Ogg code from xiph.org, the Firefox developers didn’t blink. They shipped it, they exposed it on a zero-click drive-by surface, they assumed responsibility for the quality of that code. When the flaws were discovered, they shipped fixes.

That’s what responsibility means — responding. It doesn’t mean there will be nothing to respond to.

Google is taking on some interesting responsibilities with its embedding of Flash.

Plug-In Security
Browsers have traditionally not addressed the security of their plugins. When there’s a flaw in Java, we look to Oracle to fix it. When there’s an issue with Acrobat, we look to Adobe to fix it. So it is with Adobe’s Flash. This has come from the fact that:

A) The browsers don’t actually ship the code; rather, users install it after the fact
B) The browsers don’t have the source code to the plugins anyway, so even if they wanted to write a fix, they couldn’t

From the fact that plugins are installed after the fact, one can assign responsibility for security to the user themselves. After all, if somebody’s going to mod their car with an unsafe fuel injection system, it’s not the original manufacturer’s fault if the vehicle goes up in flames. And from the fact that the browser manufacturers are doing nothing but exposing a pluggable interface, one can cleanly assign blame to plugin makers.

That’s all well and good. How’s that working out for us?

For various reasons, not well. Not well enough, anyway. Dino Dai Zovi commented a while back that we needed to be aware of the low hanging fruit that attackers were actually using to break into networks. Java, Acrobat, and Flash remain major attack vectors, being highly complex and incredibly widely deployed. Sure, you could just blame the user some more — but you know what, Jane in HR needs to read that resume. Deal with it. What we’re seeing Google doing is taking steps to assume responsibility over not simply the code they ship, but also the code they expose via their plugin APIs.

That’s something to be commended. They’re — in some cases, quite dramatically — increasing the count of lines of code they’re pushing, not to protect users from code they wrote, but to protect users. Period.

That’s a good thing. That’s bottom line security defeating nice sounding excuses.

The strategies in use here are quite diverse, and frankly unprecedented. For Java, Chrome pivots its user experience around whether your version of Java is fully up to date. If it isn’t, Java stops being zero-click: a dialog drops down, and the user is directed to Oracle to upgrade the plugin. For PDFs, Google just shrugged its shoulders and wrote a new PDF viewer, one constrained enough to operate even within their primary sandbox. It’s incomplete, and will prompt if it happens to see a PDF complex enough to require Acrobat Reader, but look:

1) Users get to view most PDFs without clicking through any dialogs, via Google PDF viewer
2) In those circumstances where Google PDF viewer isn’t enough, users can click through to Acrobat.
3) Zero-click drive-by attacks via Acrobat are eliminated from Chrome

This is how you go from ineffectual complaining about the bulk of Acrobat and the “stupidity” of users installing it, to actually reducing the attack surface and making a difference on defense.

For various reasons, Google couldn’t just rewrite Flash. But they could take it in-house, ship it in Chrome, and at minimum take responsibility for keeping the plugin up to date. (Oracle. Adobe. I know you guys. I like you guys. You both need to file bugs against your plugin update teams entitled ‘You have a user interface’. Seriously. Fix or don’t fix, just be quiet about it.)

Well, that’s what Chrome does. It shuts up and patches. And the bottom line is that we know some huge portion of attacks wouldn’t work if patches were applied. That’s a lot of work. That’s a lot of pain. Chrome is taking responsibility for user behavior, taking these Adobe bugs upon itself as bugs it must cover.

Naive metrics would punish this. Naive observers have wrung their hands, saying Google shouldn’t ship any code they can’t guarantee is good. Guess what, nobody can ship code they guarantee is good, and with the enormous numbers of Flash deployments out there, the choice is not between Flash and No Flash, the choice is between New Flash and Old Flash.

A smart metric has to measure not just how many bugs Google had to fix, but how many bugs the user would have been exposed to had Google not acted. Otherwise you just motivate people to find excuses for why bugs are someone else’s fault, like that great Whose Bug Is It Anyway on IE vs. Firefox Protocol Handlers.

(Sort of funny, actually, there’s been criticism around people only having Flash so they can access YouTube. Even if that’s true, somebody just spent $100M on an open video codec and gave it away for free. Ahem.)

Of course, it’s not only for updates that Chrome embedding Flash is interesting. There are simply certain performance, reliability, and security (as in memory corruption) improvements you can’t build talking to someone over an API. Sometimes you just need access to the raw code, so you can make changes on both sides of an interface (and retain veto over bad changes).

Also, it gets much more realistic to build a sandbox.

I suppose we need to talk about sandboxes.

The Trouble With Sandboxes

It’s somewhat known that I’m not a big fan of sandboxes. Sandboxes are environments designed to prevent an attacker with arbitrary code execution from accessing any interesting resources. I once had a discussion with someone on their reliability:

“How many sandboxes have you tested?”
“8.”
“How many have you not broken?”
“0.”
“Do you think this might represent hard data?”
“You’re being defeatist!”

It’s not that sandboxes fail. Lots of code fails; we fix it and move on with our lives. It’s that sandboxes bake security all the way into the design, which is remarkably effective right up until the moment the design has an issue. Nothing takes longer to fix than design bugs. No other sort of bug threatens to be unfixable. Systrace was broken in 2007, and has never been fixed. (Sorry, Niels.) That just doesn’t happen with integer overflows, or dangling pointers, or use-after-frees, or exception handler issues. But let one design bug turn up in a sandbox, say one where the window between a buffer being checked and that buffer being used can be abused to bypass the filter, and suddenly the response from responsible developers becomes “we do not recommend using sysjail (or any systrace(4) tools, including systrace(1))”.

Not saying systrace is unfixable. Some very smart people are working very hard to try, and maybe they’ll prove me wrong someday. But you sure don’t see developers so publicly and stridently say “We can’t fix this” often. Or, well, ever. Design bugs are such a mess.
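
For the curious, that check/use window is easy to simulate in userspace. What follows is a toy sketch, not systrace or sysjail code: the monitor and the kernel are collapsed into one Python function, the paths are made up, and no file is ever actually opened. As I understand the 2007 results, the real attacks worked on the same principle, rewriting a syscall argument after the wrapper inspected it but before the kernel consumed it.

import threading
import time

# Toy simulation of a check/use race in a syscall-wrapper sandbox.
ALLOWED = b"/tmp/harmless"
TARGET = b"/etc/passwd"
path = bytearray(ALLOWED)       # the shared "syscall argument" buffer

def flipper():
    # The attacker's other thread: constantly rewrites the argument.
    while True:
        path[:] = TARGET
        path[:] = ALLOWED

def monitored_open(buf):
    # The monitor checks the argument...
    if bytes(buf) != ALLOWED:
        raise PermissionError("policy says no")
    time.sleep(0)               # the window between check and use
    # ...and the "kernel" then acts on whatever the buffer holds *now*.
    return bytes(buf)

threading.Thread(target=flipper, daemon=True).start()

wins = 0
for _ in range(100_000):
    try:
        if monitored_open(path) == TARGET:
            wins += 1           # passed the policy check, "opened" the target
    except PermissionError:
        pass
print("races won:", wins)

Run it and the counter typically comes out well above zero; every win is a forbidden path that sailed straight past the policy check.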

VUPEN did not break the primary Chrome sandbox. They did break the secondary one, though. Nobody’s saying Flash is going into the primary sandbox anytime soon, or that the secondary sandbox is going to be upgraded to the point where VUPEN won’t be able to break through. We do expect to see the Flash bug fixed fairly quickly, but that’s about it.

The other problem with sandboxes, structurally, is one of attack complexity. People assume that forcing an attacker to find two bugs makes the attack twice as hard. The problem is that I only need to find one sandbox break, which I can then leverage again and again against different primary attack surfaces. Put in mathematical terms, additional bug hurdles are adders, not multipliers. A further problem is that for all but the most trivial sandboxes (a good example recently pointed out to me: the one for vsftpd), the degrees of freedom available to an attacker are so much greater post-exploitation than pre-exploitation that, for some, it would be like jumping the molehill after scaling the mountain.
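
To spell out the “adders, not multipliers” arithmetic (my notation, nothing formal): let c_r be the cost of a fresh renderer bug and c_s the cost of a sandbox escape. Because the escape is reusable, an attacker mounting N distinct campaigns pays

\[ N c_r + c_s \quad \text{rather than} \quad N\,(c_r + c_s), \]

so the per-campaign cost works out to

\[ c_r + \frac{c_s}{N} \longrightarrow c_r \quad \text{as } N \text{ grows.} \]

The sandbox charges a one-time toll; it does not multiply the price of every subsequent attack.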

There are just so many degrees of freedom to secure, and so many dependencies (not least of which, performance) on those degrees of freedom remaining. Ugh.

Funny thing, though. Sandboxes may fail in the long term. But they do also, well, work. As of this writing, generic bypasses for the primary Chrome sandbox have not been publicly disclosed. Chrome did in fact survive the last few Tipping Point PWN2OWNs. What’s going on?

Hacker Latency

Sure, we (the public research community, or the public community that monitors black hat activity) find bugs. But we don’t find them immediately. Ever looked at how long bugs sit around before they’re finally found? Years are the norm. Decades aren’t unheard of. Of course, the higher profile the target, the quicker people get around to poking at it. Alas, the rub: it usually takes a few years for something to get high profile. We don’t find the bugs when things are originally written, when they’re cheapest to fix. We don’t find them during iterative development, during initial customer release, or as they’re ramped up in the marketplace. No, the way things are structured, it has to get so big, so expensive…then it’s obvious.

Hacker Latency. It’s a problem.

Sandboxes are sort of a double whammy when it comes to high hacker latency: they’ve only recently been widely deployed, and they’re quite different from the defenses that came before. We’re only just now seeing tools to go after them. People are learning.

But eventually the dam breaks, as it has with VUPEN’s attack on Chrome. It’s now understood that there’s a partial code execution environment that lives outside the primary Chrome sandbox, and that attacks against it don’t have to contend with the primary sandbox. We can pretty much expect Flash attacks on Chrome at the next PWN2OWN.

One must also give credit where it’s due: VUPEN chose to bypass the primary sandbox, not to break it. And so, I expect, will the next batch of Chrome attackers. That it’s something one even wants to bypass is notable.

Uncoordinated Disclosure
Alas, there is one final concern. While in this case VUPEN happened to have done legitimate work, random unsubstantiated claims of security finds that are only disclosed to random government entities deserve no credibility whatsoever. The cost of running such a fraud is far too low, and the damage is much too high. And it’s not merely a matter of fraud: I don’t get the sense that VUPEN knew there were two different sandboxes. The reality of black boxing is that you’re never quite sure what’s going on in there, and so while there was some actionable intelligence in the disclosure (“oh heh, the Chrome sandbox has been bypassed”), it was wrong (in truth, “oh heh, a Chrome sandbox has been bypassed”) and incomplete (“Flash exploits are potentially higher priority” went unsaid).

There’s a reason we have disclosure. (AHEM.)


UPDATE 1:

Couple of interesting things. First, as Chris Evans (@scarybeasts on Twitter, and on the sandbox team at Google) pointed out, Tipping Point has deliberately excluded plugins from being “in scope” at PWN2OWN. To some degree, this makes sense: people would just show up with their Java exploit and own all the browsers. However, you know, that just happened in the real world, with one piece of malware not only working across browsers, but across PC and Mac too. Once upon a time, ActiveX (really just COM) was derided for shattering the security model of IE: whatever the security model of the browser did, there was always some random broken ActiveX object to twiddle.

Now, the broken objects are the major plugins. Write once, own everywhere.

Worse, there’s no way to call Flash “just a plugin” anymore, not when it’s shipped as just another component that Chrome takes responsibility for.

But Chris Evans is completely right. You can’t just say Flash is fair game for Chrome (and only Chrome) in the next PWN2OWN. That would deeply penalize Google for keeping Flash up to date, and discouraging that practice would unambiguously increase risk for Chrome users. The naive metric would be a clear net loss. Nor should Flash be fair game for all browsers, because it’s actually useful to know which core browser engines are secure and which aren’t. As bad money chases out good, so does bad code.

So, if Dragos can find a way, we really should have a special plugin league for the three plugins with 90%+ deployment: Flash, Acrobat, and Java. If the data says they indeed fall quite a bit more easily than the browsers that house them, well, that’s good to know.

Another issue is sort of an internal contradiction in the above piece. At one point, I say:

VUPEN chose to bypass the primary sandbox, not to break it. And so, I expect, will the next batch of Chrome attackers. That it’s something one even wants to bypass is notable.

But then I say:

I don’t get the sense that VUPEN knew there were two different sandboxes.

Obviously both can’t be true. It’s impossible to know which is which, without VUPEN cluing us in. But, as Jeremiah Grossman pointed out, VUPEN ain’t talking. In fact, VUPEN hasn’t even confirmed the entire supposition, apparently-but-not-necessarily confirmed by Tavis Ormandy et al, that theirs was a bug in Flash. Just more data pointing to these sorts of games being problematic.

Categories: Security

More Video On DanKam

April 7, 2011 2 comments

In case you’re curious.

Categories: Security