
Spelunking the Triangle: Exploring Aaron Swartz’s Take On Zooko’s Triangle

[Heh! This blog post segues into a debate in the comments!]

Short Version: Zooko’s Triangle describes traits of a desirable naming system — and constrains us to two of secure, decentralized, or human readable. Aaron Swartz proposes a system intended to meet all three. While his distribution model has fundamental issues with distributed locking and cache coherency, and his use of cryptographic hashcash does not cause a significant asymmetric load for attackers vs. defenders, his use of an append-only log does have interesting properties. Effectively, this system reduces to the SSH security model in which “leaps of faith” are taken when new records are seen — but, unlike SSH, anyone can join the cloud to potentially be the first to provide malicious data, and there’s no way to recover from receiving bad (or retaining stale) key material. I find that this proposal doesn’t represent a break of Zooko’s Triangle, but it does show the construct to be more interesting than expected. Specifically, it appears there’s some room for “float” between the sides — a little less decentralization, a little more security.


Zooko’s Triangle is fairly beautiful — it, rather cleanly, describes that a naming system can offer:

  1. Secure, Decentralized, Non-Human-Readable names like Z0z9dTWfhtGbQ4RoZ08e62lfUA5Db6Vk3Po3pP9Z8tM.twitter.com
  2. Secure, Centralized, Human-Readable Names like http://www.twitter.com, as declared by trusted third parties (delegations from the DNS root)
  3. Insecure, Decentralized, Human-Readable Names like http://www.twitter.com, as declared by untrusted third parties (consensus via a P2P cloud)

There’s quite the desire to “square Zooko’s triangle” — to achieve Secure, Decentralized, Human-Readable Names. This is driven by the desire to avoid the vulnerability point that centralization exposes, while neither making obviously impossible demands on human memory nor succumbing to the rampantly manipulatable opinion of a P2P mob.

Aaron Swartz and I have been having a very friendly disagreement about whether this is possible. I promised him that if he wrote up a scheme, I’d evaluate it. So here we are. As always, here’s the source material:

Squaring the Triangle: Secure, Decentralized, Human-Readable Names

The basic idea is that there’s a list of name/key mappings, almost like a hosts file, but it links human readable names to arbitrary length keys instead of IP addresses. This list, referred to as a scroll, can only be appended to, never deleted from. Appending to the list requires a cryptographic challenge to be completed — not only do you add (name,key), you also add a nonce that causes the resulting hash from the beginning to be zero.
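To make those mechanics concrete, here's a minimal sketch in Python. The line format (`name,key,nonce`), the genesis value, and the 16-bit difficulty are all my assumptions — the proposal pins none of them down:

```python
import hashlib
from itertools import count

DIFFICULTY_BITS = 16     # assumed; the proposal leaves the difficulty unspecified
GENESIS = b"\x00" * 32   # assumed starting value for the hash chain

def line_hash(prev: bytes, name: str, key: str, nonce: int) -> bytes:
    # Chaining on prev ties each line to everything before it.
    return hashlib.sha256(prev + f"{name},{key},{nonce}".encode()).digest()

def zero_prefixed(digest: bytes, bits: int) -> bool:
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0

def append_line(scroll: list, name: str, key: str) -> None:
    """Grind nonces until the running hash starts with DIFFICULTY_BITS zeros."""
    prev = scroll[-1][3] if scroll else GENESIS
    for nonce in count():
        h = line_hash(prev, name, key, nonce)
        if zero_prefixed(h, DIFFICULTY_BITS):
            scroll.append((name, key, nonce, h))
            return

def verify_scroll(scroll: list) -> bool:
    """Cheap for anyone: replay one hash per line and check the zeros."""
    prev = GENESIS
    for name, key, nonce, h in scroll:
        if line_hash(prev, name, key, nonce) != h or not zero_prefixed(h, DIFFICULTY_BITS):
            return False
        prev = h
    return True

scroll = []
append_line(scroll, "alice", "pubkey-A")   # expensive: ~2**16 hashes on average
append_line(scroll, "bob", "pubkey-B")
assert verify_scroll(scroll)               # cheap: one hash per line
```

Appending is the expensive step; verifying is one hash per line. That asymmetry is the "cool part" — and the rest of this post is about why it isn't enough.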

Aaron grants that there might be issues with initially trusting the network. But his sense is, as long as either your first node is trustworthy, or the majority of the nodes you find out about are trustworthy, you’re OK.

It’s a little more complicated than that.

There are many perspectives from which to analyze this system, but the most interesting one I think is the one where we watch it fail — not under attack, but under everyday use.

How does this system absorb changes?

While Aaron’s proposal doesn’t allow existing names to alter their keys — a fatal limitation by some standards, but one we’ll ignore for the moment — it does allow name/key pairs to be added. It also requires that additions happen sequentially. Suppose we have a scroll with 131 name/key pairs, and both Alice and Bob try to add a name without knowing about the other’s attempt. Alice will compute and distribute 131,Alice, and Bob will compute and distribute 131,Bob.

What now? Which scroll should people believe? Alice’s? Bob’s? Both?

Should they update both? How will this work over time?

Aaron says the following:

What happens if two people create a new line at the same time? The debate should be resolved by the creation of the next new line — whichever line is previous in its scroll is the one to trust.

This doesn’t mean anything, alas. The predecessor to both Alice and Bob’s new name is 131. The scroll has forked. Either:

  1. Some people see Alice, but not Bob
  2. Some people see Bob, but not Alice
  3. Somehow, Alice and Bob’s changes are merged into one scroll
  4. Somehow, the scroll containing Alice and the scroll containing Bob are both distributed
  5. Both new scrolls are rejected and the network keeps living in a 131 record universe

Those are the choices. We in security did not invent them.

There is a belief in security sometimes that our problems have no analogue in normal computer science. But this right here is a distributed locking problem — many nodes would like to write to the shared dataset, but if multiple nodes touch the same dataset, the entire structure becomes corrupt. The easiest fix of course is to put in a centralized locking agent, one node everyone can “phone home to” to make sure they’re the only node in the world updating the structure’s version.

But then Zooko starts jumping up and down.

(Zooko also starts jumping up and down if the output of the system doesn’t actually work. An insecure system at least needs to be attacked, in order to fail.)

Not that security doesn’t change the game. To paraphrase BASF, “Security didn’t make the fundamental problems in Comp Sci. Security makes the fundamental problems in Comp Sci worse.” Distributed locking was already a mess when we trusted all the nodes. Add a few distrusted nodes, and things immediately become an order of magnitude more difficult. Allow a single node to create an infinite number of false identities — in what is known as a Sybil attack — and any semblance of security is gone.

[Side note: Never heard of a Sybil attack? Here’s one of the better stories of one, from a truly devious game called Eve Online.]

There is nothing in Aaron’s proposal that prevents the generation of an infinite number of identities. Arguably, the whole concept of adding a line to the scroll is equivalent to adding an identity, so a constraint here is probably a breaking constraint in and of itself. So when Aaron writes:

To join a network, you need a list of nodes where at least a majority are actually nodes in the network. This doesn’t seem like an overly strenuous requirement.

Without a constraint on creating new nodes, this is in fact a serious hurdle. And keep in mind, “nodes in the network” is relative — I might be connecting to majority legitimate nodes, but are they?

We shouldn’t expect this to be easy. Managing fully distributed locking is already hard. Doing it when all writes to the system must be ordered is already probably impossible. Throw in attackers that can multiply ad infinitum and can dynamically alter their activities to interfere with attempts to insert specific ordered writes?

Ouch.

Before we discuss the crypto layer, let me give you an example of dynamic alterations. Suppose we organize our nodes into a circle, and have each node pass write access to the next node as listed in their scroll. (New nodes would have to be “pulled into” the cloud.) Would that work?

No, because the first malicious node that got in, could “surround” each real node with an untrusted Sybil. At that point, he’d be consulted after each new name was submitted. Any name he wanted to suppress — well, he’d simply delete that name/key pair, calculate his own, and insert it. By the time the token went around, the name would be long since destroyed.

(Not that a token approach could work, for the simple reality that someone will just lose the token!)

Thus far, we’ve only discussed propagation, and not cryptography. This should seem confusing — after all, the “cool part” of this system is the fact that every new entry added to the scroll needs to pass a computational crypto challenge first. Doesn’t that obviate some of these attacks?

The problem is that cryptography is only useful in cases of significant asymmetric workload between the legitimate user and the malicious attacker. When I have an AES256 symmetric key or an RSA2048 private key, my advantage over an attacker (theoretically) exceeds the computational capacity of the universe.

What about a hash function? No key, just the legitimate user spinning some cycles to come to a hash value of 0. Does the legitimate user have an advantage over the attacker?

Of course not. The attacker can spin the same cycles. Honestly, between GPUs and botnets, the attacker can spin far more cycles, by several orders of magnitude. So, even if you spend an hour burning your CPU to add a name, the bad guy really is in a position to spend but a few seconds cloning your work to hijack the name via the surround attack discussed earlier.

(There are constructions in which an attacker can’t quite distribute the load of cracking the hash — for example, you could encode the number of iterations a hash needed to be run, to reach N zero bits. But then, the one asymmetry that is there — the fact that the zero bits can be quickly verified despite the cost of finding them — would get wiped out, and validation of the scroll would take as long as calculation of the scroll.)
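Here's my sketch of that construction — not anything from the proposal itself. The proof is just an iteration count along a serial hash chain, so the only way to check it is to replay the whole chain:

```python
import hashlib

def sha(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def serial_prove(seed: bytes, zero_bits: int) -> int:
    """Walk the chain h -> H(h) until the digest starts with zero_bits zeros.
    There is no nonce to hand out: the 'proof' is just how many steps it took."""
    h, steps = sha(seed), 1
    while int.from_bytes(h, "big") >> (256 - zero_bits) != 0:
        h, steps = sha(h), steps + 1
    return steps

def serial_verify(seed: bytes, zero_bits: int, steps: int) -> bool:
    """The catch: replaying `steps` hashes is exactly the prover's workload."""
    h = seed
    for _ in range(steps):
        h = sha(h)
    return int.from_bytes(h, "big") >> (256 - zero_bits) == 0
```

The attacker can't parallelize the search — but the verifier pays the same bill the prover did, which is exactly the wipeout of asymmetry described above.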

But what about old names? Are they not at least protected? Certainly not by the crypto. Suppose we have a scroll with 10,000 names, and we want to alter name 5,123. The workload to do this is not even 10,000 * time_per_name, because we (necessarily) can reuse the nonces that got us through to name 5,122.
Aaron thinks this isn’t the case:

First, you need to calculate a new nonce for the line you want to steal and every subsequent line….It requires having some large multiple of the rest of the network’s combined CPU power.

But adding names does not leverage the combined CPU power of the entire network — it leverages the power of a single node. When the single node wanted to create name 5,123, it didn’t need to burn cycles hashcashing through 0-5122. It just did the math for the next name.

Now, yes, names after 5,123 need to be cracked by the attacker, when they didn’t have to be cracked by the defender. But the attackers have way more single nodes than the defenders do. And with those nodes, the attacker can bash through each single name far, far quicker.

There is a constraint — to calculate the fixed nonce/zerohash for name 6001, name 6000 needs to have completed. So we can’t completely parallelize.

But of course, it’s no work at all to strip content from the scroll, since we can always remove content and get back to 0-hash.
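A sketch of that rewrite (helper names mine, with an assumed line format and a low assumed difficulty so it runs quickly): the forger keeps the prefix — and its already-paid-for nonces — verbatim, and only re-grinds from the target line forward.

```python
import hashlib
from itertools import count

BITS = 12                # assumed difficulty, kept low for the sketch
GENESIS = b"\x00" * 32   # assumed starting value for the chain

def grind(prev: bytes, name: str, key: str):
    """Find a nonce giving the running hash BITS leading zero bits."""
    for nonce in count():
        h = hashlib.sha256(prev + f"{name},{key},{nonce}".encode()).digest()
        if int.from_bytes(h, "big") >> (256 - BITS) == 0:
            return nonce, h

def build(pairs):
    scroll, prev = [], GENESIS
    for name, key in pairs:
        nonce, prev = grind(prev, name, key)
        scroll.append((name, key, nonce, prev))
    return scroll

def hijack(scroll, index, attacker_key):
    """Reuse lines 0..index-1 for free; re-grind only index..end."""
    forged = list(scroll[:index])          # zero recomputation for the prefix
    prev = forged[-1][3] if forged else GENESIS
    for i, (name, key, _, _) in enumerate(scroll[index:]):
        if i == 0:
            key = attacker_key             # swap in the attacker's key material
        nonce, prev = grind(prev, name, key)
        forged.append((name, key, nonce, prev))
    return forged
```

With 10,000 names and a target at 5,123, the attacker pays for roughly 4,878 lines, not 10,000 — and pays with a botnet, where the defenders paid one node per line.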

Ah! But there’s a defense against that:

“Each remembers the last scroll it trusted.”

And that’s where things get really interesting.

Complexity is a funny thing — it sneaks up on you. Without this scroll storage requirement, the only difference between nodes in this P2P network is which nodes each happens to stumble into when first connecting. Now, we have to be concerned — which nodes has a node ever connected to? How long has it been since a node has been online? Are any significant names involved between the old scroll and the modern one?

Centralized systems don’t have such complexities, because there’s a ground truth that can be very quickly compared against. You can be a hundred versions out of date, but armed with nothing but a canonical hash you can clear your way past a thousand pretenders to the one node giving out a fresh update.

BitTorrent hangs onto centralization — even in small doses — for a reason. Zooko is a harsh mistress.

Storage creates a very, very slight asymmetry, but one that’s often “all we’ve got”. Essentially, the idea is that the first time a connection is made, it’s a leap of faith. But every time after that, the defender can simply consult his hard drive, while the attacker needs to…what? Match the key material? Corrupt the hard drive? These are in fact much harder actions than the defender’s check of his trusted store.

This is the exact system used by SSH, in lieu of being able to check a distributed trusted store. But while SSH struggles with constant prompting after, inevitably, key material is lost and a name/key mapping must change, Aaron’s proposal simply defines changing a name/key tuple as something that Can’t Happen.
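The SSH model in miniature — a hypothetical `resolve`, with a local dict standing in for known_hosts:

```python
known_names: dict[str, str] = {}   # persisted to disk in practice, like known_hosts

def resolve(name: str, key_seen_on_network: str) -> str:
    """Trust on first use: the first sighting is a leap of faith; every later
    sighting is checked against local storage. Note there is no recovery path
    for a legitimate key change -- the mismatch branch can't tell a hijack
    from a reinstall."""
    if name not in known_names:
        known_names[name] = key_seen_on_network   # leap of faith
    elif known_names[name] != key_seen_on_network:
        raise ValueError(f"key for {name!r} changed: hijack, or a lost key?")
    return known_names[name]
```

The defender's check is a dictionary lookup; the attacker has to have been there for the first sighting, or corrupt the store. That's the slight asymmetry — and, per Aaron's proposal, the mismatch branch simply Can't Happen.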

Is that actually fair? There’s an argument that a system that can’t change name/key mappings in the presence of lost keys fails on feature completeness. But we do see systems that have this precise property: PGP Keyservers. You can probably find the hash of the PGP key I had in college somewhere out there. Worse, while I could probably beg and plead to get my PGP keys taken offline, nobody could ever wipe keys from these scrolls, as it would break all the zero-hashes that depend on them.

I couldn’t even put a new key in, because unless scroll parsers returned only the first name/key pair for a given name, a trivial attack would simply be to write a new mapping and have it added to the set of records returned.

Assume all that. What are the ultimate traits of the system?

The first time a name/key peer is seen, it is retrieved insecurely. From that point forward, no new leap of faith is required or allowed — the trust from before must be leveraged repeatedly. The crypto adds nothing, and the decentralization does little but increase the number of attackers who can submit faulty data for the leap of faith, and fail under the sort of conditions under which distributed locking systems fail. But the storage does prevent infection after the initial trust upgrade, at the cost of basic flexibility.

I could probably get away with saying — a system that reduces to something this undeployable, and frankly unstable, cannot satisfy Zooko’s Triangle. After all, simply citing the initial trust upgrade, as we upgrade scroll versions from a P2P cloud on the false assumption that an attacker couldn’t maliciously synthesize new records unrecoverably, would be enough.

It’s fair, though, to discuss a few annoying points. Most importantly, aren’t we always upgrading trust? Leaps of faith are taken when hardware is acquired, when software is installed, even when the DNSSEC root key is leveraged. By what basis can I say those leaps are acceptable, but a per-name/key mapping leap represents an unacceptable upgrade?

That’s not a rhetorical question. The best answer I can provide is that the former leaps are predictable, and auditable at manufacture, while per-name leaps are “runtime” and thus unpredictable. But audits are never perfect and I’m a member of a community that constantly finds problems after release.

I am left with the fundamental sense that manufacturer-based leaps of faith are at least scalable, while operational leaps of faith collapse quickly and predictably to accepting bad data.

More interestingly, I am left with the inescapable conclusion that the core security model of Swartz’s proposal is roughly isomorphic to SSH. That’s fine, but it means that if I’m saying Swartz’s idea doesn’t satisfy Zooko due to security, then neither does SSH.

Of course, it shouldn’t. We all know that in the real world, the response by an administrator to encountering a bad key is to assume the box was paved for some reason, and thus to edit known_hosts and move on. That’s obviously insecure behavior, but to anyone who’s administered more than a box or two, that’s life. We’re attempting to fix that, of course, by letting the guy who paves the box update the key via DNSSEC SSHFP records, shifting towards security and away from decentralization. You don’t get to change client admin behavior — or remove client key overrides — without a clear and easy path for the server administrator to keep things functional.

So, yeah. I’m saying SSH isn’t as secure as it will someday be. A day without known_hosts will be a good day indeed.

I’ve had code in OpenSSH for the last decade or so; I hope I’m allowed to say that 🙂 And if you write something that says “Kaminsky says SSH is Insecure” I will be very unhappy indeed. SSH is doing what it can with the resources it has.

The most interesting conclusion is then that there must be at least some float in Zooko’s Triangle. After all, OpenSSH is moving from just known_hosts to integrating a homegrown PKI/CA system, shifting away from decentralization and towards security. Inconvenient, but it’s what the data seems to say.

Looking forward to what Zooko and Aaron have to say.

  1. Mats Henricson
    January 13, 2011 at 9:53 am

    Hi!

    This is very interesting, and I haven’t read it through yet, but I will.

    However, have you seen the BitDNS discussion?

    http://www.bitcoin.org/smf/index.php?topic=1790.0;all

    • January 13, 2011 at 9:54 am

      I haven’t, but this is basically a test run for taking on BitCoin, so there’s some convergence here.

  2. January 14, 2011 at 3:47 am

    Let me start by saying I very much appreciate the work that went into this essay. If I count correctly, it has three separate critiques of my argument. Unfortunately, all three of them seem based on misunderstanding. Hopefully I can clarify.

    Let me start by explaining that what I wrote is not intended to be a spec for implementers, but merely an existence proof. I think of it as akin to Merkle’s signature scheme — no one uses it today, but it proves public key cryptography is possible. I don’t expect a system like the one I outline to actually be used in production; I merely want to prove that Zooko’s Triangle isn’t a binding constraint.

    As a result, there are several implausible (or at least annoying) assumptions. One, as Dan notes, is that we assume keys don’t change. Dan seems to be willing to ignore this, if I understand him correctly.

    Now to the first objection. Dan asks about the case where two people calculate a valid nonce for the next line of the scroll within the amount of time it takes a line to propagate through the network. The first thing to note is that this should be quite rare. If N is properly calibrated, it should take a large amount of time to find a valid nonce, an amount of time that dwarfs worst-case latency on the Internet.

    But still, it could happen. Which is why I explained how to deal with it; unfortunately Dan doesn’t seem to have understood my explanation. (To be fair, it was a brief aside in a response to a comment on the post.) Let me elaborate:

    Let M be an upper bound on the propagation delay for the network. If you receive two valid lines within M seconds of each other, store both valid lines but don’t add either to your trusted scroll. Eventually you will receive another line. But the nonce in this line will only work with one of the two possible lines. Whichever line it works with is the one to believe. The dispute is thus resolved: add the line it works with and then the new line to your scroll.

    On to the second objection. Dan seems obsessed with Sybil attacks. I can understand why — Sybil attacks are very cool and very powerful — but, sadly, they are totally irrelevant to my proposal since my proposal contains no notion of identity.

    Can a Sybil attack affect joining the network? Sybil attacks would only be relevant if we were choosing our list of introductory nodes from the list of total nodes. But this is absurd and borders on impossibility. Think about how someone actually joins a network. Presumably they download a piece of specialized software. The software contains a list of starting nodes in it. That list is not vulnerable to a Sybil attack. Or maybe they get the software from a friend and the friend includes the address of his own node. Again: not vulnerable to a Sybil attack. The point is that to join the system you need to start with some information from outside the system. But since you are still outside the system, Sybil attacks against the system are irrelevant.

    Can a Sybil attack affect inserts? No. An insert is done by proof of work. It doesn’t matter who finds it; there is no notion of identity attached to the nonce.

    Can a Sybil attack affect propagation? Well, it depends on your propagation system. As one of my simplifications, I didn’t really discuss an optimized propagation structure — I assumed that each node was connected to every other node. Now surely one can replace this with a propagation structure that’s vulnerable to a Sybil attack, but I don’t see why you would and Dan doesn’t provide any reasons.

Next Dan does make a good suggestion — although he incorrectly dismisses it. It would be great to use non-parallelizable hashcash. I don’t know enough about hash design to design one myself, but I’m not convinced it’s impossible. You can approach one by changing the assignment from “find x such that H(x) starts with N zeroes” to “find x[1]…x[k] such that H(x[1]), H(x[1] || x[2]), H(x[1] || x[2] || x[3]), … H(x[1] || … || x[k]) all start with N zeroes” and you can twiddle N and k to trade off between verification speed and parallelizability. (The double bars mean string concatenation here.) Also, I believe there are many hash functions where you almost necessarily calculate H(A) as part of calculating H(A || B). (For example, I believe SHA1 has this property.) If that’s the case, then there is very little cost in verification speed.

    Dan then seems to suffer from an elementary misreading. I say to attack it “you need to calculate a new nonce for the line you want to steal and every subsequent line”; Dan says this isn’t true, and then goes on to say exactly what I said: “names _after_ [the one you want] need to be cracked by the attacker”. I’m not sure what happened here.

    Dan concludes by arguing “the first time a name/key peer [sic] is seen, it is retrieved insecurely” thus reducing my security to that of SSH’s known_hosts. But since his attacks above failed, this isn’t actually true. The first time a name/key pair is seen, it’s retrieved securely. It’s then stored and thus can be retrieved securely every time thereafter.

    I hope Dan will correct me if I’m missing something, but as best as I can tell nothing remains of his critique.

  3. January 14, 2011 at 7:47 am

    Alright! We’ve got a discussion going on.

    So there’s an interesting meta-thing going on: I think Aaron’s oversimplifying, and Aaron thinks I’m misunderstanding. Hopefully we can figure this out, eh?

    There’s an old quote, that the map is not the territory. In other words, whatever model one uses to analyze a system, cannot and will not capture all the characteristics of the real thing. Aaron’s system as described is incomplete — he admits this, but says that it’s incomplete in the way that Merkle’s writings were incomplete: They proved the possibility of something.

    Aaron wants to avoid discussing propagation realities. The problem is, propagation of truth is an inherent component of both decentralization and security. Since there’s no debate that non-human-readable names are infeasible, skipping propagation issues skips core vulnerability points in the underlying design challenge itself.

    You can’t just assume everyone in the world gets a message simultaneously. That is an implausible assumption. P2P clouds fragment. Nodes reboot. New nodes arrive after the birth of the network. Old nodes disappear and come back after a month. These aren’t theoretical things; they’re empirical events seen in the field.

    Networks converge on truth, and even then only barely, when few if any nodes are actively malicious. They certainly don’t get it for free.

    Aaron believes convergence on truth is easy. He proposes the following algorithm:

    Let M be an upper bound on the propagation delay for the network. If you receive two valid lines within M seconds of each other, store both valid lines but don’t add either to your trusted scroll. Eventually you will receive another line. But the nonce in this line will only work with one of the two possible lines. Whichever line it works with is the one to believe. The dispute is thus resolved: add the line it works with and then the new line to your scroll.

    The dispute is resolved? Globally? For anyone and everyone?

    No. The dispute is resolved for this particular node, using the next name as a tiebreaker. Distributed locking would not be one of the epic problems in computer science if it was that easy to solve! And who says the next name doesn’t come from the attacker? Nobody! So the attacker sees Alice trying to extend off of 131, quickly synthesizes a name *also* off of 131, distributes it actively, and then pushes another name (which can be the original name under suppression, “Alice”) to win the tiebreaker with everyone.

    Now, Alice has lost her name, and can’t even participate in the network anymore unless she accepts that fact. Worse, the attacker can distribute different scrolls to different parties, and now nobody can figure out where to roll back to safely since it’s not their own names that are corrupt.

    You can’t just imagine propagation issues don’t exist. It’s like designing a building assuming concrete can bear infinite load.

    Aaron wonders why I’m interested in Sybil attacks. It’s simple: They’re how every P2P DNS system dies. And why wouldn’t they be? You have a system that’s by definition utilizing the output of untrusted parties. How couldn’t it be vulnerable to generating a million untrusted parties?

    Aaron tries to get around this by saying — well, there’s no notion of identity here. Given that he’s trying to create a system that represents identities securely, in a human readable fashion, that should give you pause. Then he argues that one can use the initial source of the software to acquire safe identities.

    This is an appeal to centralization — there is one node that gives legitimate introductory information, and this node is better than all other nodes. Zooko strikes again.

    But who says the centralized node even knows any better? It has to get its identities from somewhere too, and frankly, it’s generally taking leaps of faith just like anyone else. My game as an attacker is to either:

    a) Have my nodes returned by the central node, or
    b) Have all the nodes returned by the central node, surrounded by my node

    Given I have infinite identities to play with, this is never hard.

    Aaron thinks a Sybil attack can’t affect inserts, because inserts are done by proof of work. But an attacker can actually do proofs of work faster and better than real nodes, and it does matter who finds it because the prize for doing the work and distributing the result more efficiently is getting to own the name — and the key associated with that name — forever.

    That’s quite the effect.

    Things get interesting when he starts talking about the propagation structure. He seems to accept that Sybil attacks could in fact pose a threat, but assumes that one would have to intentionally choose a structure specially vulnerable to one. He then specifies a system he thinks wouldn’t be vulnerable to a massed identity attack: A full m-to-n mesh, with every node linking to every other node.

Let’s ignore for the moment the sheer non-scalability of such a solution. Let’s also ignore the sheer impossibility — what, the mesh can only update if every single node is online? And how exactly does it handle the introduction of new nodes?

It still fails. Suppose an m-to-n mesh where 900 of 1000 nodes are Sybils. One of the 100 legitimate nodes attempts to distribute a legitimate name. Suppose that node spent a full day working on hashcash — that’s 86,400 seconds. Well, an attacker with 86,400 machines in their botnet (this is nothing, absolutely nothing) breaks that in all of a second, and attaches his own key material to someone else’s name. Even with 1000 connections, no legitimate endpoint is going to be feeding them all in less than a second.

    An illegitimate endpoint? Sure.

    Now, there’s an interesting question: Would non-parallelizable hashcash help? The answer is yes, but constructing it appears to be something of a challenge. Aaron takes an interesting crack at it. He actually (and elegantly, and probably inadvertently) flips my optimized attack on the entire system around: While I was willing and able to reuse the intermediate hashes between Name 0 and 5000, thus requiring me to only sequentially break hash 5000 through 10000, he’s flipping the latter sequentialism into a grand defense, arguing to create groups of nonces that all need to be broken in turn.

    It’s an interesting approach. Certainly, it’s fast to verify, especially if we say that each nonce gets its own block in the hash function. With that, as he says, you’re basically making sure that the intermediate hash is constantly starting with zeros, and you’re not having to rewind back to the beginning of the datastream.
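Here's my reading of the construction, sketched (parameter names mine, small values so it actually runs): k nonces, each ground against a prefix that includes all previous answers, so the rounds serialize while verification stays at k single hashes.

```python
import hashlib
from itertools import count

def prove(data: bytes, k: int, zero_bits: int) -> list:
    """Find x[1..k] with H(data||x1), H(data||x1||x2), ... all zero-prefixed.
    The search inside a round parallelizes; the k rounds themselves cannot,
    because round i's input contains round i-1's answer."""
    prefix, nonces = data, []
    for _ in range(k):
        for x in count():
            candidate = prefix + x.to_bytes(8, "big")
            digest = hashlib.sha256(candidate).digest()
            if int.from_bytes(digest, "big") >> (256 - zero_bits) == 0:
                nonces.append(x)
                prefix = candidate
                break
    return nonces

def verify(data: bytes, nonces: list, zero_bits: int) -> bool:
    """k hashes total -- verification stays cheap, unlike serial-count schemes."""
    prefix = data
    for x in nonces:
        prefix += x.to_bytes(8, "big")
        digest = hashlib.sha256(prefix).digest()
        if int.from_bytes(digest, "big") >> (256 - zero_bits) != 0:
            return False
    return True
```

The question that follows — how large does k have to get before the per-round parallelization stops paying off for the attacker — is exactly the messy part.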

Where this gets a little messy is in how large the group nonces have to become. Before, we had only one nonce; now we have to have many. How many?

    The game’s afoot! The goal is to limit the advantage of the attacker, who has orders of magnitude more machines than the defender, by enforcing serial computation. Execution time is easy to compute, as:

    time_per_nonce * number_of_nonces

The former — time_per_nonce — is vulnerable to parallelization. The latter, because each result depends on the previous, is forced to be serialized. So where’s the constraint?

Let’s say we wanted the real defender to do one day’s worth of computation. If we set time_per_nonce to one hour, and required twenty-four nonces, we’d have our day’s computation…except an attacker, with a thousand machines, would turn 3600 seconds into 3.6 seconds, times 24 executions, which would be about a minute and a half.

Suppose we had time_per_nonce set to five minutes, and required 288 nonces. Again, 24 hours. The attacker with a thousand machines would turn each of those 300 second operations into 0.3 second operations, requiring of the attacker, once again…a minute and a half’s worth.

Does that mean Aaron’s wrong? No. This process does not work forever. Suppose instead time_per_nonce was dropped down to one second. That would require 86,400 nonces, but let’s ignore that. The attacker with one thousand nodes would have to parallelize each round at the scale of 0.001 seconds each.

There is a point where parallelization runs out of steam, just because the inter-node communication penalty is so high. That point does change over time, and there are interesting things one can do to invalidate things like GPUs and custom hardware (see: bcrypt). But it does seem possible to bound the parallelization of hashcash breaking, at the expense of size.

    (Note that the ultimate size of the group nonce might be less than expected, just because the size of each integer may be greatly diminished by the reduced time_per_nonce value.)
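The arithmetic above fits in one hypothetical helper — the per-round overhead term is the thing parallelization can’t erase:

```python
def attacker_seconds(defender_seconds: float, time_per_nonce: float,
                     machines: int, round_overhead: float = 0.0) -> float:
    """Back-of-envelope only. Each round parallelizes internally, but rounds
    run in sequence, so any per-round coordination cost is paid once per
    nonce in the group."""
    rounds = defender_seconds / time_per_nonce
    return rounds * (time_per_nonce / machines + round_overhead)

DAY = 86_400
# One-hour rounds, 1,000 machines: the defender's day shrinks to ~86 seconds.
fast = attacker_seconds(DAY, 3600, 1000)
# One-second rounds with an assumed 10 ms of coordination each: the
# overhead term starts to dominate, and the attacker's advantage shrinks.
slow = attacker_seconds(DAY, 1, 1000, round_overhead=0.01)
```

With zero overhead the round size doesn’t matter at all — one-hour rounds and five-minute rounds both collapse to the same minute and a half, exactly as worked out above.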

    I suspect there’s probably a more appropriate cryptographic construct to build off of, though.

So does this save the day for Aaron’s system? Not entirely. One real problem is that, while any one node is in the middle of its totally legitimate computation, no other node can be operating as well. If a legitimate operator must spend 24 hours calculating the path from 131 to “Alice”, the moment its new name is published, every other name is invalidated. Effectively such a system can only publish 365 updates per year.

    Our discussions around whether he’s security-equivalent to SSH seem to pivot on whether my attacks against his original proposal work or not. So making that determination can be an exercise for the reader.

I think I’d like to re-evaluate his system with non-parallelizable hashcash properties, but I think I want to see a more complete proposal first. Otherwise we’re just going to be dancing around each other.

    I’ll write the hashcash code if he’ll hash out (no pun intended) the full end-to-end design. Again, I don’t actually think this can work, but it’s an interesting enough — to me, anyway — discussion to be willing to continue.

  4. January 14, 2011 at 7:57 am

    Worth mentioning, there are a number of ways of making time_per_nonce high, but the two primary ways are:

    1) Increase the bits that need to be 0’d (N, in Aaron’s formulation)
    2) Increase the number of iterations required per hash attempt

    So, 4 bits set to 0 after 100,000 iterations is always an option. Jerking up #2 increases the workload for the legitimate validator, though.

    I’m pretty confident bcrypt is being wildly reinvented here, but bcrypt was most assuredly not trying to be a proof of work system so there’s at least something here.
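Both knobs in one hypothetical grinder (tiny values so the sketch runs), including the catch that knob #2 bills the validator too:

```python
import hashlib
from itertools import count

def iterated_digest(data: bytes, iterations: int) -> bytes:
    """Apply the hash `iterations` times. This multiplies the cost of every
    attempt -- for the verifier as well as the prover, which is the catch."""
    h = data
    for _ in range(iterations):
        h = hashlib.sha256(h).digest()
    return h

def grind(data: bytes, zero_bits: int, iterations: int) -> int:
    """Knob 1: zero_bits raises the expected attempt count (~2**zero_bits).
       Knob 2: iterations raises the price of each individual attempt."""
    for nonce in count():
        d = iterated_digest(data + nonce.to_bytes(8, "big"), iterations)
        if int.from_bytes(d, "big") >> (256 - zero_bits) == 0:
            return nonce
```

So "4 bits after 100,000 iterations" is ~16 attempts, each one expensive — same expected work as many cheap attempts, but a validator now pays 100,000 hashes per line instead of one.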

  5. January 14, 2011 at 10:36 pm

So does this save the day for Aaron’s system? Not entirely. One real problem is that, while any one node is in the middle of its totally legitimate computation, no other node can be operating as well. If a legitimate operator must spend 24 hours calculating the path from 131 to “Alice”, the moment its new name is published, every other name is invalidated. Effectively such a system can only publish 365 updates per year.

    I had the same thought, but Aaron pointed out that this isn’t quite right: if a single node working on a puzzle has expected time of 24 hours to solve it, then, if you’re using a randomized algorithm, k nodes would have an expected time of 24/k hours. So if you have 1M legitimate nodes, they will grow the list faster than a 100K-node botnet. The fact that the puzzle is parallelizable here is a feature, not a bug.

  6. January 14, 2011 at 10:47 pm

    More interestingly, I am left with the unescapable conclusion that the core security model of Swartz’s proposal is roughly isomorphic to SSH. That’s fine, but it means that if I’m saying Swartz’s idea doesn’t satisfy Zooko due to security, then neither does SSH.

    I think the security model here is closer to, say, twitter account names. If I want to have the name “nikita”, I keep trying to solve puzzles so that I can add it to the list. If I succeed, then that’s my name and I get to use it. (Well, I may want to wait a day to make sure that it’s hard to take away.) If someone else registers it first, then it’s just like twitter, I have to think of something else – say, “nikitab” and try for that name. The fact that the “nikita” account on twitter isn’t me is not a security flaw in Twitter’s account naming system.

  7. January 15, 2011 at 12:12 am

    Nikita,

    Puzzles will take an average of 24 hours to solve — it’s actually something of a problem that the actual solution time is nondeterministic. Meanwhile, each solution is dependent on no solution being inserted into the system before it — if I add to the scroll relative to 131, and you’ve already moved onto 132, you can’t absorb my change (without complex forking semantics).

    As for the observation that this is like Twitter account names, I can (and have) called up Twitter to fix things with my account. The loss of centralization means there’s by definition no Twitter to contact when things muck up. We do see this model on PGP keyservers, when people lose their key and just have to suffer it living forever like a zombie. But that’s a serious problem in that realm too.

    • January 15, 2011 at 12:45 am

      No decentralized system will be able to do better than FCFS semantics for names.

      • January 15, 2011 at 1:20 am

        No system without instant propagation can even guarantee FCFS, since there’s no such thing as first.

  8. Gary
    January 17, 2011 at 5:49 am

    This might be a future solution to the convergence issue: http://web.mit.edu/newsoffice/2011/breaking-bottlenecks-0111.html

    • January 18, 2011 at 12:12 pm

      This stuff comes up fairly often. It always struggles with malicious nodes.
