Tarsnap had a CTR nonce collision. It's a bad bug that's fairly common and easy to explain.
CTR mode turns AES into a stream cipher, meaning it can encrypt a byte at a time instead of 16 bytes at a time. It does this by using the block cipher core to encrypt counters, which produces a "keystream" that you can XOR against plaintext to use as a stream cipher.
For this to be secure, as with any stream cipher, it is crucial that the keystream never repeat. If you encrypt two plaintexts under the same keystream, you can XOR them together to cryptanalyze them; even easier, if you know the contents of one of the plaintexts, you can XOR the known plaintext against the ciphertext to recover the keystream!
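Concretely, the attack is nothing more than a couple of XORs. Here's a toy sketch in Common Lisp (the keystream and messages are made up; a real keystream would come out of AES):

(defun xor-bytes (a b)
  "XOR two byte vectors of equal length."
  (map '(vector (unsigned-byte 8)) #'logxor a b))

(let* ((keystream #(19 244 7 88 131 42 9 201)) ; unknown to the attacker
       (p1 (map '(vector (unsigned-byte 8)) #'char-code "SECRET#1"))
       (p2 (map '(vector (unsigned-byte 8)) #'char-code "SECRET#2"))
       ;; Both messages encrypted under the SAME keystream: the bug.
       (c1 (xor-bytes p1 keystream))
       (c2 (xor-bytes p2 keystream))
       ;; Known plaintext p1 plus its ciphertext c1 recovers the keystream...
       (recovered (xor-bytes c1 p1)))
  ;; ...which decrypts the other message outright.
  (map 'string #'code-char (xor-bytes c2 recovered)))
;; => "SECRET#2"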
To avoid repeating keystreams, CTR mode uses a nonce, which is a long cryptographically secure random number concatenated to the counter before encrypting.
To avoid that catastrophic security bug, CTR mode users have to make sure the nonce never repeats (and also that the counter never repeats, e.g. by wrapping). We have found both bugs multiple times in shipping products, and now Colin found it in his product.
And so I come to the moral of my story: Colin is clearly a gifted crypto dev. He can talk lucidly and at length about the best ways to design crypto-secured protocols. He has found crypto flaws in major systems before. He is as expert as you could expect anyone to be on any product.
And Colin didn't get it right; what's more, the manner in which he got it wrong was devastating (in cryptographic terms).
Colin handled this well, largely due to the fact that he's an expert and knows how to handle it.
How likely is it that anyone less capable than Colin could have handled it so well? Moreover, if Colin can make a devastating mistake with his crypto code, how many worse mistakes would a non-expert make?
You should avoid writing crypto code if at all possible. Nate Lawson is fond of saying, "you should budget 10 times as much to verification as you do for construction of cryptosystems"; I would amend that only to add a price floor to it, because you cannot get real validation of a cryptosystem for less than many tens of thousands of dollars --- if your system is simple.
What should you do if you can't avoid writing crypto code?
I found myself in this situation when I tried to find a bcrypt implementation for Common Lisp. There wasn't one. Folks in #lisp suggested I adapt the blowfish implementation in Ironclad, since 'bcrypt is just blowfish anyway'.
1) Both the current C implementations are designed to be integrated into libc. The Openwall implementation does have the code factored out into its own file, but there is no support structure for building a shared library. (Python's bcrypt bundles a modified version of the Openwall C source directly with it, for example.) Common Lisp's FFI is intended for working with installed shared libraries.
2) There appears to be a bias in the Lisp community towards pure-Lisp implementations, for (hopefully obvious) reasons, so an implementation as hacky as what I came up with is unlikely to see much use.
If I do go back to trying to write a webapp in Common Lisp, I think I will find myself having to reimplement bcrypt in Common Lisp. First, I'll have to find a sufficiently portable method of getting cryptographically secure random numbers; as of the writing of that blog post, there wasn't one that I could find anyone recommending. The more difficult part will be to convert the C code into Lisp code without missing any places where operations on the C types don't precisely correspond to the same operations on the Lisp types (due to, say, overflow).
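For the overflow issue, the plan I have in mind (just a sketch, not tested against the real bcrypt code) is to mask every arithmetic result back down to the width of the corresponding C type:

;; Emulating C's uint32_t wrap-around semantics in Lisp, where integers
;; are bignums and never overflow on their own.
(defmacro u32 (form)
  "Truncate FORM's result to its low 32 bits, like unsigned 32-bit C arithmetic."
  `(ldb (byte 32 0) ,form))

;; In C, 0xFFFFFFFF + 1 wraps to 0; in Lisp it quietly becomes 4294967296
;; unless every operation is masked.
(u32 (+ #xFFFFFFFF 1))   ; => 0
(u32 (ash #x80000001 1)) ; => 2, the high bit falls off as it would in C

The danger is exactly the one described above: forget the mask in one place and the Lisp code silently diverges from the C code.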
I'm worried I might get something wrong, but I can't just use the crypto code written by wiser folks than I, because, at least in the Common Lisp community, that code doesn't seem to exist.
First, don't obsess too much about your password digest. I know this is head-explodey considering the source, but all I'm trying to do by ranting about it is to get people to stop using SHA1 (or SHA256 or Whirlpool or whatever) hashes. The risk to your users for doing that bit wrong is not very high.
Second, my advice about how to do crypto security is very simple:
* Use PGP for data at rest.
* Use TLS for data in motion.
Do not trust your own judgement (say, by using OTR because it "feels" like most of what you need, or trusting that you'll use Keyczar safely) on anything else without a formal external review. In practice, you will almost never need anything more than TLS or PGP.
Okay, but in the more general case, where something literally does not exist for a particular platform/language and my choices are "write it myself" or "don't use that platform/language", is there any way to feel confident that a choice to write it myself will not be a hideously wrong decision?
I'm not Thomas Ptacek, but I think his answer would be something to the effect of "to be really confident, get tens of thousands of dollars worth of code review before shipping." You may be prominent enough in the CL/hypothetical crypto-less platform community to get most of that for free, but that's the value of the validation needed.
If you're trying to make something in pure Lisp, your odds of failure are less if you take an existing hashing algorithm (e.g. SHA-256) and just iterate it a bunch of times. Ironclad has SHA-256, so this is really easy:
(defun slow-hash (password salt &key (iterations 10000))
  "Produces a 256-bit hashed value of password and salt, slowly. Uses
a tweakable number of iterations, which should not be less than
1000, and which defaults to 10000."
  (let ((hash (ironclad:make-digest :sha256)))
    ;; First, hash the salt and password
    (ironclad:update-digest hash
                            (ironclad:ascii-string-to-byte-array salt))
    (ironclad:update-digest hash
                            (ironclad:ascii-string-to-byte-array password))
    ;; Repeatedly hash the hash, to slow things down
    (dotimes (x iterations)
      (ironclad:update-digest hash (ironclad:produce-digest hash)))
    (ironclad:produce-digest hash)))
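A hypothetical call would look something like this (hex output via Ironclad's byte-array-to-hex-string; the salt and password here are placeholders):

;; Example usage; returns a 64-character hex string.
(ironclad:byte-array-to-hex-string
 (slow-hash "correct horse battery staple" "per-user-salt"))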
Even after one round, repeatedly hashing the hash knocks your plaintext space down to 256-bit strings (or however long the output of your hash function is). Tom/Colin, is this actually a problem?
That's the case for every hash function. If it's a good hash function, the outputs will be evenly distributed across the entire 256 bit space, which comes out to 2^256 possible outputs. I don't think this is a problem; bcrypt has a much smaller output space.
1. It should be fairly trivial to test that your implementation is giving exactly the same output as the C libs (once you have chosen a particular random number that feeds into the algorithm). It seems like the trickiest part of testing will be ensuring that you are using the same character set everywhere.
2. Why is it important to have a "cryptographically strong" PRNG? Doesn't this just turn into a salt? Does a salt generator really need to be cryptographically strong?
Someone please correct me if I am being naive here.
1. I worry I might have a bug that returns the proper output for some inputs, but improper output for other inputs.
2. Cryptographically strong random numbers aren't strictly required for a bcrypt salt, I guess. But if I'm building something which I plan to share with other people, I'd rather err on the side of too strong (rough sketch of what I'd use below).
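The rough sketch: just read the salt bytes from /dev/urandom, which assumes a Unix-like host and so isn't fully portable:

;; Read N random bytes from /dev/urandom for use as a salt.
;; Assumes a Unix-like system; not portable to every CL platform.
(defun random-salt (n)
  (with-open-file (urandom #P"/dev/urandom" :element-type '(unsigned-byte 8))
    (let ((salt (make-array n :element-type '(unsigned-byte 8))))
      (read-sequence salt urandom)
      salt)))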
In cryptography you should always use a "cryptographically strong" PRNG, even (especially) if in doubt. There have been too many mistakes with lousy random number generators undermining what would otherwise have been a strong security mechanism.
But we're talking about generating a salt here. As I understand it, the reason you use a salt is to make it much harder to brute-force a dictionary of passwords ahead of time. I don't see how the use of a cryptographically strong PRNG is going to provide any additional security here.
Nonces aren't cryptographically secure random numbers. They merely have to be different for each encryption, which is why even a counter suffices. The problem was, the counter was being incorrectly reset to zero.
It's just as secure to concatenate a string that is a function of the time of day with the counter. Another scheme would be to start out with a cryptographically strong random number that is incremented each time.
The problem was, the counter was being incorrectly reset to zero.
The counter was being correctly reset to zero. The nonce was being incorrectly not set to non-zero. (In CTR mode, there is a 64-bit nonce which is different for each message and a 64-bit counter which starts at zero for each message and increments as you move through the message.)
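To illustrate the layout (a toy sketch of the general scheme, not Tarsnap's actual code):

;; A 16-byte AES-CTR input block built from a 64-bit per-message nonce
;; and a 64-bit block counter, both big-endian here for illustration.
(defun ctr-block (nonce counter)
  (let ((blk (make-array 16 :element-type '(unsigned-byte 8))))
    (dotimes (i 8)
      (setf (aref blk i)       (ldb (byte 8 (* 8 (- 7 i))) nonce)
            (aref blk (+ 8 i)) (ldb (byte 8 (* 8 (- 7 i))) counter)))
    blk))
;; The bug class: if the nonce never changes between messages, two messages
;; both use the blocks (nonce, 0), (nonce, 1), ... and hence the same keystream.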
The mistake he made is not specifically a "crypto code" problem, it was merely forgetting to increment the new variable, and that kind of thing happens in a lot of non-crypto contexts. I think that the reason people should use tested and well-known crypto code is because the stakes are high and not necessarily because cryptography is super complex; a mistake like this can have serious consequences since most encrypted data is serious by nature. As such, the maxim "don't write your own crypto" could also be applied to any high stakes programmatic endeavor; if you can't tolerate any downtime on your server, "don't write your own server", etc.
The great thing about widely used open-source utilities is the extensive vetting they receive. I was a bit uncomfortable using tarsnap's custom client and now I'm happy I went with duplicity, which is a Python script combining rsync, tar, and gpg to create encrypted archives of your data and only send the differences.
Also, perhaps Colin could look into writing a test suite for tarsnap that would automatically test for mistakes like this. It doesn't sound like the particular applicable exploit is too hard to automate.
This is a head-explodey subthread. Sure, Colin, O.K., it wasn't a "crypto mistake", it was a "refactoring mistake that happened to basically turn off your crypto". The fact that perfectly innocent and benign seeming refactoring changes can do that to crypto is exactly why generalist developers should not be writing crypto code.
I've seen major security vulnerabilities result from losing a single '=' (turning "if (uid == 0)" into "if (uid = 0)") or adding a single '=' (turning "for (...; i < N; ...)" into "for (...; i <= N; ...)"). That's half the typo size of a missing '++'.
Sure, writing crypto code is dangerous. And writing user-authentication code is dangerous. But are you seriously going to say that writing loops is dangerous and generalist developers shouldn't do it?
If the underhanded C contest taught us anything, it's that perfectly innocent and benign seeming changes can introduce security vulnerabilities anywhere.
If you ask me, "should developers avoid writing web applications in C because it's virtually always unnecessary and practically guarantees memory corruption vulnerabilities", what do you think my answer is going to be?
Are we to understand that "crypto developers" never forget to increment something or never have copy-paste errors? It seems most of the difference between a "generalist developer" and a "crypto developer" is the amount of money his employer puts into QA.
You are right. I use duplicity for remote backups too, I can really recommend it :)
However, in crypto the 'NIH' syndrome is especially prevalent because of the inherent secrecy and paranoia. Especially so as there are still a lot of people on the obscurity side of the security-versus-obscurity debate. A good recent example of this would be Sony...
I was pretty mean to Colin in that post. In the intervening year I've come to respect Colin's practical experience a lot more than I did before, when I thought he was hopelessly academic. I'd like to make it clear that Colin doesn't look incompetent (actually, he seems to be earning accolades from the Twitterverse for how he wrote it up).
But yeah, I hope it's obvious that I see this as a very strong vindication for my argument that generalist devs shouldn't build crypto. At all, ever. Use TLS for data in motion; use PGP for data at rest. Systems much bigger and heavier than yours have gotten away with this.
Colin is wildly competent, but I was always uncomfortable with how much he was willing to do his own work without review. He is probably one of the best people in the world to do this work, but even the best person in the world will occasionally make mistakes, as happened here.
> Expert devs shouldn't build crypto without review.
I was surprised that Colin's solution is to personally re-review his code. Good writers know--don't rely on yourself for proofreading. Usually the mental lapse that caused the problem will manifest itself during your review as well.
In my experience, I'm very good at proofreading my own writing/code, as long as I wait long enough that I've forgotten it. In this case, I was looking at code which I wrote over a year ago.
But please, go ahead and give the code another read. :-)
I sense a great justification for an alcohol budget.
Edit: Totally willing to continue burning karma on this comment if the HN community continues to vote it down. I've tried reviewing my own code in a different state of intoxication than when I wrote it, and I'm not joking that it can help. I'm still trying to pull resources together for a study on the benefit of different mindframes for peer review. We haven't tried alcohol yet, but frankly it wouldn't be a half bad idea if we could get anyone not to laugh too loudly at the proposal.
"they are wont to deliberate when
drinking hard about the most important of their affairs, and whatsoever
conclusion has pleased them in their deliberation, this on the next day,
when they are sober, the master of the house in which they happen to be
when they deliberate lays before them for discussion: and if it pleases them when they are sober also, they adopt it, but if it does not please them, they let it go: and that on which they have had the first deliberation when they are sober, they consider again when they are drinking."
I also agree with you in general, that checking things in different mental states is a good practice. With alcohol, I suspect the benefit is outweighed by the difficulty of spotting bugs when drunk -- but who knows?
Well, I am actually a bit more curious about the difference between coding drunk and checking sober vs. coding sober, checking sober... But I suppose checking drunk would be amusing too.
Cute idea, but I don't drink (for medical reasons).
I suppose I could try reviewing code in both caffeinated and decaffeinated states, but being decaffeinated gives me enough of a headache that I don't think I'd be much use that way.
There's some sort of analogy in here about how the way alcohol 'loosens you up' could be related to getting in the proper state of mind to 'code fearlessly,' but I can't seem to find it.
How do you assess your effectiveness at proofreading your own code?
I know that I find more mistakes in my code when time reveals the code as it is rather than as it was intended. But I can't say this makes me good enough at proofreading myself. What about the code I've conceived and written in ignorance?
> Use TLS for data in motion; use PGP for data at rest.
That's useful advice, if you need and _want_ the guarantees given by TLS or PGP. If you have other needs then a look at, say, off-the-record messaging may be useful.
I think it's a bad idea to recommend the OTR protocol to people looking for a simple encrypted transport (or simple encrypted record storage). How do you judge whether the guarantees TLS offers are "needed" or not?
That judgment is partly outside the more mechanical parts of cryptography. You have to see what your application domain demands.
OTR is just the first example I could think of, that gives different guarantees than most normal cryptosystems. I don't particularly recommend it for anything apart from instant messaging. And I wouldn't recommend implementing your own.
If I speak to you in private (and we know each other), you can be sure you are speaking to me, but you won't be able to prove to any third party anything I said. OTR can give you something like that. PGP can't.
For most applications you will be well served with PGP or TLS. But be aware of what baggage they bring. For some areas, losing deniability via PGP can be worse than plain text.
This is a counterfeit argument. PGP loses "deniability" (and "forward secrecy") if by PGP you mean "the PGP user interface". But if what you mean is simply "the PGP cryptosystem" and "the PGP message format of packets and bulk encryption and signatures", then you can grant your system most any property OTR gives you.
This is a moot point, because most systems would never care enough to intricately position all their features just-so to compose OTR-like features out of PGP primitives. What they need is to be able to encrypt anything without implementing trivially exploitable crypto vulnerabilities that were discovered and solved decades ago.
This is a textbook case of everyone's good being strangled by someone's opinion of the perfect.
I don't disagree with you. And OTR is just one example, and may even be a straw-man by now. Just be aware that there are other valid choices for cryptosystems, while you still don't have to roll your own.
Just curious:
For a similar system to tarsnap where local data is encrypted via PGP and stored on a remote system, what extra benefit is there to using TLS for the data transfer / data in motion?
No one is qualified to implement crypto without review.
cperciva is extremely qualified to implement crypto, but not without review. I think it is wise that he has implemented a bug bounty procedure. He should make sure it applies to unreleased versions too, so maybe someone will put an RSS feed of his SCM checkins into their RSS reader and try to catch bugs as he's making them. :)
It would be nice if he had the money to spring for paying someone else to look at all his changes, but alas... that stuff is expensive!
No professional is going to undertake a review on spec. The demand for software security is too high; most of us have our pick of interesting projects that will pay whether we find something or not. We're not unique in that respect; top iPhone developers won't work for you on spec either, not because spec is evil, but because the economics don't work.
Furthermore, you can pay $1000 for XSS bugs and random memory corruption flaws in browsers because fuzzers can find them, because they're luck-of-the-draw findings, and because people are hammering those things whether you pay them or not. But $1000 doesn't pay for a day of qualified review, and no qualified reviewer would suggest less than two weeks for something like Tarsnap.
However, since Colin presumably doesn't want to raise his prices to pay for actual review, it is encouraging that he is at least going with bug bounties. These, at the very least, give us a good excuse to assign them as fun things for graduate students to do, with some hope that one will want to procrastinate so hard that they actually look at the code.
Also I think any reviewer who wanted to get paid would not start with Colin's code as an easy place to find bugs.
This is one reason why I'm going to be providing bounties for more than just security bugs. Even if people don't expect to find security bugs, they might find a typo in a comment and earn themselves a $1 tarsnap account credit -- and as demonstrated here, simply looking at source code can result in finding bugs even if you weren't originally looking for them.
Clever. Especially since "normal" bugs occasionally have that nasty habit of turning into security relevant bugs.
It will be interesting to see how close you can manage to get something resembling good review on a budget. Hopefully other people who are in similar low margin code businesses will keep an eye on your experiment to see how it works out.
Thanks for being so open about how you're trying to make things work. I hope you'll be publishing all the awarded bounties? (I suppose I should just wait for your follow-up entry.)
I think there's a limited market that would pay for this now: you say you would, other users say they would. But this is not something he can offer to only some users as an extra feature, so all his users would have to be comfortable with it. From a business perspective, I imagine he has done testing to figure out what price is right, and from a security perspective it would probably not be unreasonable if he just waited until he had enough volume to allow this to happen with only very limited price increases, or at his current margins.
In the meantime, the bug bounty + very qualified developer strategy seems like a reasonably sensible option while the service is presumably, still in its growth phase. I guess we'll find out.
This is a perfect example of why we shouldn't have (or shouldn't allow the use of) ++ in languages. If, instead of being buried in a long statement, the increment had been explicit on its own line, there would have been a much smaller chance of this bug happening.
For better or worse I've been working on the same codebase for almost a decade; now whenever I patch the code I've got half an eye on how likely it is that I or someone else could accidentally break this code in the future.
If you've got to write code that's not allowed to fail, you can't afford to set up little traps like this for yourself.
That's fast. I got the email two minutes earlier :)
I meant to say this in an email, but big props to Colin for being transparent about this and responding to the issue the way he did. I'm sure it wasn't an easy weekend.
It was remarkably fast. I put up the blog post, updated the website, sent out the emails, tweeted, updated the /topic in the #tarsnap IRC channel, and then came here to submit only to find that it already had 5 votes.
Colin, one question: let's say I'm not paranoid about my backups' security. So I don't want to re-encrypt anything (even though it's only about 1GB). How does upgrade to 1.0.28 affect existing and new backups (and deduplication among those)?
Tarsnap 1.0.28 is completely compatible with earlier versions (except for key file format changes in 1.0.22). The only nontrivial change was to make new uploaded data get encrypted correctly.
This is an excellent example of why you shouldn't write your own crypto code - Colin is awesome at it and even he makes mistakes.
Edit: To be clear, this isn't aimed at Colin but meant to point out that if he still occasionally gets it wrong there's a pretty good chance that your fancy custom encryption method does too.
This may be a naive question, but are unit tests for encryption code a reasonable idea? E.g., each revision is tested against a known set of data and a known set of common attacks.
It's hard to test the behavior of crypto code (even insecure crypto code can still generate data indistinguishable from random noise), so your unit tests end up tightly coupled to the actual implementation. You can do it, but it's not as clear a win as it is in other settings.
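Known-answer tests against published vectors do catch gross breakage, though. For instance, with Ironclad and the FIPS SHA-256 test vector for "abc" (a sketch; swap in whatever primitive you're actually wrapping):

;; Known-answer test: Ironclad's SHA-256 output for "abc" must match the
;; published FIPS test vector.
(assert (string= (ironclad:byte-array-to-hex-string
                  (ironclad:digest-sequence :sha256
                   (ironclad:ascii-string-to-byte-array "abc")))
                 "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"))

The limitation is the one above: a known-answer test tells you the primitive is wired up correctly, not that the surrounding protocol uses it safely (Tarsnap's AES would have passed every vector while the nonce sat at zero).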
It's good that users were notified to upgrade, but I am surprised he revealed this many details this soon. Does that not further reduce the security of data that end users have stored with the old version?
I did consider that, but the NSA knows perfectly well how to attack CTR nonce collisions, and anyone looking at the diff between version 1.0.27 and version 1.0.28 can see what the change was; so I didn't disclose anything which potential attackers couldn't already figure out easily.
When even someone like Colin introduces a crypto bug like this, it makes you wonder. Are we ever going to get to a place where crypto engineering is something the open source community can take on? How long did it take to push people to stop writing C programs with trivial vulnerabilities? And that's something you can write a static analyzer for. Not so with crypto.
I think that second sentence is somewhat wrongheaded. Crypto bugs aren't like normal bugs. Thousands of eyes aren't likely to surface them. Open source does not have a particularly excellent track record with exposing crypto flaws.
Simultaneously, we routinely find crypto flaws on black-box reviews of commercial products, sometimes even in firmware and hardware settings.
To my eyes, it's not the availability of source code that smokes out flaws like this, it's simply the incentive structure. Colin's project gets the attention of someone like Taylor Campbell, but Colin has made a name for himself and for Tarsnap. Even if your project becomes popular, if you aren't shouting from the mountaintops about your use of cryptography, you may be unlikely to garner the specific kind of attention you need.
I should clarify. I'm not picking on the open source community. I'm differentiating the open source community from the private sector because the incentives are different. There are crypto guys in the private sector that can build secure crypto systems for $600/hour. Now, crypto is devilishly hard to do, so there's no guarantee their system would be secure either. But if you have nation-state levels of funding, you certainly can buy a system that would take serious talent and funding to break. On the other hand, open source communities are motivated by intrinsic incentives. Clearly this is enough to implement state-of-the-art operating systems, but is intrinsic motivation enough to implement secure crypto? It may well be that the bar is too high in this area and I think the next decade will yield some interesting results here. Even if we count OpenSSL as a point for open source (generous), that's one reasonably secure system over the course of a decade.
> I'm differentiating the open source community from the private sector because the incentives are different.
The incentive in the private sector is to maximize profit, which means minimizing costs.
> But if you have nation-state levels of funding, you certainly can buy a system that would take serious talent and funding to break.
You might be able to build such a system, or you can buy a system that just passes all acceptance tests, which is where the incentive is (since this minimizes costs). Given that testing a cryptosystem for correctness is just about impossible, what do you suppose happens?
The best assurance that I get is when I'm told which standard implementation a product uses. If a private entity without a reputation in cryptography told you that they rolled their own, would you trust them? How many cryptographers would you trust? I know whom I would, and I don't even need a full hand to count them.
Colin Percival told you that he uses RSA-2048, AES-256 in CTR mode, and HMAC-SHA256. None of that information helps you with a one-line implementation error that incorrectly handles CTR nonces. That's 'poet's point.
By "standard implementation", I mean something like "OpenSSL 0.9.8o". This helps me more, since I can be fairly certain that >0 experts have reviewed that code. Given that absolute verification is just about impossible, it's a question of reducing the probability of failure wherever possible. With a private, closed implementation, the number of reviewers is almost certain to be lower.
By "standard implementation", I mean something like "OpenSSL 0.9.8o". This helps me more, since I can be fairly certain that >0 experts have reviewed that code.
It's a bit more complicated than that. Yes, >0 experts have reviewed OpenSSL code. But <1 experts have reviewed all of the OpenSSL code. Did the bits which matter to you get reviewed? Who knows...
Are we ever going to get to a place where crypto engineering is something the open source community can take on?
I'm not sure I understand the question - are you suggesting that authors of open source security code are less qualified or more bug prone than those who work on closed source software?
One of the promises of open source code is fewer bugs through exposure to many eyes. That seems to be exactly how this security bug was found, according to the blog post. How long do you suppose this bug would have stayed hidden if the source were not available? Personally, I'd guess a lot longer.
I'm a massive fan of tarsnap, even though I have no need for it and probably don't have the technical competence to use it anyway. Given that using the product isn't why I'm a fan, I can only put that down to Colin - and this post, with its technical openness, easy to understand (for a layman-of-sorts) crypto and code explanation, and humility demonstrates why.
All that, plus explaining how to delete and offering a refund will probably cost only a small number of picodollars, and is worth a lot more to tarsnap's credibility.
I planned on modifying tarsnap to work on local files and upload to different resources (dropbox, local, etc) but as I dug in and looked at the license I noticed this little gem in COPYING: "Redistribution and use in source and binary forms, without modification, is permitted for the sole purpose of using the "tarsnap" backup service". Why even provide source if the license doesn't allow me to do anything with said source? I can't create a patch without risking litigation, and so I won't.
This is silly. Colin clearly isn't going to "litigate" against you. He provides the source precisely so that people can find bugs in it. It's a commercial product, though, not a GNU project. It's a good thing that he provided source code, and it's disingenuous and petty to try to punish him for doing that.
Why not just have a non-compete clause within the license? At least legally this would allow for internal use with local backups. I highly doubt execution of a similar service using tarsnap would even happen, if that's what the worry is.
Presumably because he has better things to do with his time. His commercial competitors universally do not release their source code at all. Stop picking on him for his license.
"No matter how secure Tarsnap's design is, however, you don't run the design on your computer — you run the code. For this reason, all of the source code to the Tarsnap client is available. You don't need to simply trust that Tarsnap does things right (and that it isn't a trojan planted by the US government): You can read the source code and check for yourself."
I planned on modifying tarsnap to work on local files and upload to different resources (dropbox, local, etc) but as I dug in and looked at the license I noticed this little gem in COPYING...
Exactly. That is deliberate. I don't want to end up competing with my own code.
Why even provide source if the license doesn't allow me to do anything with said source?
So that people can audit it if they wish to do so.
"Tarsnap compresses its chunks of data before encrypting them. While the compresion is not perfect (there are, for instance, some predictable header bits), I do not believe that enough information is leaked to make such a ciphertext-only attack feasible."
That part is very important: compress, then encrypt. Here you see a competent crypto application playing it safe, covering for unexpected problems. I say well done Colin! Full disclosure and best practices.
Compression prior to encryption is generally a good practice, but as Colin points out, it doesn't actually do much to mitigate this bug; in a bulk encryption setting, you're going to find known compressed plaintext to back keystream out of.
It's true, and Colin's right to point it out, that it's unlikely that this bug will be exploited (you have to be Colin to do it, and it's a general PITA to deal with), but I wouldn't want anyone to have the impression that CTR mistakes are survivable just because you compress.
Colin, thanks for the explanation. I suggest an additional change to your pre-release process: all code must be peer reviewed. This is by far the most effective quality control measure you can implement, much more so than unit testing or "double-checking". I wouldn't trust any mission-critical production code that hasn't been peer-reviewed, much less crypto code.
Code review is always good, but some code deserves more checking than other code. There are some parts of Tarsnap where the worst that could happen is that you'll get some mangled messages printed to the terminal -- that code is clearly not as deserving of testing as the core cryptographic functionality.
I didn't word what I said as well as I could have, but what I meant was to emphasize the "require review before submission" part, not the "all code" part. Do you have mandatory peer review on security-critical code already?
This seems like it would be a fantastic reason to develop a crypto-linter. Not that I think such a thing would be easy - but it could be immensely simplified by defining things like a "must change" attribute on parameters like that "encr_aes" pointer in library code, to flag potential incorrect use.
Make sure that it can also divide. My favourite Coverity glitch is when it looked at some IPv6 code and announced "assuming i % 8 != 0" and "assuming i == 128" at the same time. (And then claimed that we would access ipv6addr[16].)
I applaud the openness of this submittal. Nobody's perfect and the topic is difficult and the implementation is tricky to get absolutely right.
At AltDrive, we use a nonce generated w/ secure random and that is used for encrypting an entire file in CTR (EAX) mode. The issue with 64k chunks does not apply. The mature and well-respected BouncyCastle AES-256 libraries are used from the low level API. Usage of the API was independently reviewed by the BouncyCastle organization. I can share that on the AltDrive blog if anyone is interested. http://altdrive.com
Sort of but not really. Apart from the fact that Sony broke DSA (a wildly different crypto primitive than AES-CTR), the mistake Sony made was more fundamental to the design. I think you're right to notice that in both cases a nonce wasn't actually a nonce.