Thoughts On Secure Deletion In 2001
To the best of my knowledge, Peter Guttman(sp?) has demonstrated for years
now that there is no form of over-writing which makes any substantial
difference to the ability to recover previously written data from a
Gutmann’s paper, “Secure Deletion of Data from Magnetic and Solid-State
Memory” (available at
http://www.cs.auckland.ac.nz/~pgut001/secure_del.html ) has
become something of a classic, and for good reason: It’s absolutely
fascinating reading, describing in detail what most of us suspected and some
of us never imagined.
The paper, however, is five years old, and quite frankly needs to be
understood in that context.
Now, I’m *not* saying that Gutmann’s points are flawed, just that the
mechanisms used to recover data from 300 Megabyte drives probably don’t
scale to 80 Gigabyte disks using GMR (Giant Magnetoresistance) technology.
The extra surface area and analog bit density used to divine past
generations of data have almost certainly been exploited in the 16x
explosion in data density since Gutmann’s paper was written.
My *guess*, however, is that as drive densities have increased, the
requirements for more and more advanced error correction (to increase yields
on platters with minuscule deformities) have led to greater redundancy and
increased platter space to entirely redundant–and
Furthermore, it’s impossible that drive scanning technology hasn’t advanced
in sync with drive capacity–the bottom line is, somebody needed to design
the sensors to work out the kinks from each generation of disk. Companies
like OnTrack (who, incidentally, worked very well for me) have made rather
successful businesses of proving that what’s gone is not necessarily gone.
So essentially, Gutmann was right, and Gutmann is probably still right. But
the technologies used to recover deleted data have probably advanced just as
much as the technology used to store the data in the first place.
My understanding of current “high security” standards wrt the re-use of
disks which previously contained classified materials is that they only be
re-used in similarly classified systems, or, are destroyed beyond any form
of molecular reconstruction (e.g. melted).
Exactly. It is the job of the medium to store information. It is the job
of the incinerator to delete it. Violation of the barriers between
establishing functionality and enforcing security leads to systems that
allow too much access to an unstable service.
So to suggest that your perceived EFS flaw can be resolved by over-writing
is naive. The only solution is to encrypt in memory or use some removable
partition as the temp space.
Russ, you’re absolutely correct about the need for memory encryption, though
removable media has equivalent risks (with the exception of possibly being
more conveniently incinerated). The correct behavior is for a disk to never
receive anything that gives it plaintext-equivalent access to any of the
actual information contained within the encrypted data. That means no
decryption keys ever get written, no passwords get saved, and most
importantly, *no plaintext data gets stored, not even “temporarily”*. The
moment an “Encrypted File System” writes a plaintext version of the data to
the disk, all is lost–whether or not an apparently laughable delete (really,
“dab white-out on the page number in the index in the back of the book”)
operation is actually carried out.
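To make the white-out analogy concrete, here is a minimal sketch of a toy disk, assuming a hypothetical `ToyDisk` class with a directory (the "index") and raw blocks (the "pages"). It is not a model of NTFS or of EFS itself; the file name and contents are made up for illustration.

```python
BLOCK_SIZE = 16


class ToyDisk:
    """Hypothetical block device: raw blocks plus a directory (the 'index')."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks
        self.directory = {}                  # file name -> list of block numbers
        self.free = list(range(num_blocks))  # block numbers available for reuse

    def write_file(self, name, data):
        used = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block_no = self.free.pop(0)
            chunk = data[offset:offset + BLOCK_SIZE]
            self.blocks[block_no] = chunk.ljust(BLOCK_SIZE, b"\0")
            used.append(block_no)
        self.directory[name] = used

    def delete_file(self, name):
        # Only the directory entry goes away; the block contents are untouched.
        self.free.extend(self.directory.pop(name))

    def raw_scan(self, needle):
        """Read every block directly, ignoring the directory entirely."""
        return needle in b"".join(self.blocks)


disk = ToyDisk(num_blocks=8)
secret = b"launch code 1234"
disk.write_file("scratch.tmp", secret)  # plaintext "temporarily" on disk
disk.delete_file("scratch.tmp")         # the white-out on the index

file_is_listed = "scratch.tmp" in disk.directory
plaintext_recoverable = disk.raw_scan(secret)
print(file_is_listed, plaintext_recoverable)
```

The file is gone as far as the directory is concerned, yet anyone who reads the blocks directly still finds the plaintext.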
Let’s not forget–an encrypted file system exists for *no other reason* but
to resist attack. Encryption does not add speed. It does not add
stability. It does not add anything *but* resistance against an attacker
who lacks the key material. If Rickard’s analysis is correct–something
that should be independently verified–EFS offers attackers a rich array of
simple attacks that do not require discovery of the key material. You can
draw your own conclusions from that.
Addendum to my thoughts on the apparent EFS design flaw, which is actually
less significant than originally announced. Essentially, only files that
are converted FROM plaintext TO ciphertext are temped, meaning the bug only
affects files that were plaintext on the disk in the first place. There’s
still a problem–temp files aren’t overwritten, not even once–but EFS
isn’t *at all* the farce of a FS I thought it was. (Of course, I
didn’t know this as I wrote much of what’s below, so if something reads
Specific kudos to Scott Culp at Microsoft, whose response to Rickard’s post
was well researched and nicely done.
1) As quite a few people noticed, I pasted in the wrong URL for Peter
Gutmann’s Secure Deletion paper. Ironic that, for all that I tend to talk
about the dangers of intermittent failures (as opposed to the clear-cut loss
of service that tech support is generally built to verify and address),
Windows’ occasional tendency to ignore a copy request would hit me.
The *correct* URL is as follows.
This is incredible reading, even now, five years after it was authored.
[Yes, Ben had to make a special post with the above URL, but I’m repeating
it as an exhortation to everyone: Read It!]
2) According to Russ Cooper (editor of NTBugTraq), Gutmann, as of two years
ago, said that the increased disk densities weren’t yet posing a problem for
disk data recovery. This fits well with the presumption that, as densities
go up, redundancies and error correction codes also increase their
effectiveness. A moderately interesting facet of memory design is that
apparently nearly every single DIMM has defects, but each chip on that DIMM
has extra blocks and integrated circuitry to detect bad memory and
transparently reroute into the extra storage areas. This increases yields
by providing tolerance against minor imperfections, at the cost of a
slightly larger die.
More important to us, however, is the reassertion that logical reality has
no required relationship with physical reality. Memory can be logically
sequential and physically random. IP addresses can be logically grouped and
physically diverse. Content on a hard drive can be logically erased but
physically immortal, due perhaps to a freak accident that causes a block to
be marked as bad and transparently duplicated elsewhere. Remove from a hard
drive read head the constraints of size, mass production, writability,
non-destructivity, and even the requirement to survive shipping, and it
becomes very clear how simply because *one* physical apparatus (the
read/write head shipped with the hard drive) cannot logically analyze a set
of faded impressions in the magnetic media, that no other physical apparatus
Interestingly enough, the variation in “physical agility” doesn’t just apply
to readability; the ability to create physical impressions is also something
that varies absolutely unpredictably according to the flow of time, money,
and criticality. Since the sanctity of physical impressions is exactly
what biometric systems attempt to authenticate against, one should realize
that a similar risk factor exists in biometric spoofing as exists in
multi-generation data recovery–more risk, since you probably don’t need a
clean room and expensive sensors to spoof a $25 optical fingerprint scanner;
less risk, since hard drive data recovery doesn’t require acquisition of the
secret (although it’s likely that the last person to use that $25 fingerprint
scanner will leave their prints on the scanner!).
The advantage of cryptography is that, overall, the belief that there’s no
efficient way to factor the product of two large primes is more “secure”
than the belief that there’s no way to physically spoof a given set of
physical properties. Done correctly, it allows us to *ignore* the
possibility that our physical data routes might get compromised. No, we
don’t lose *all* physical security constraints, but we’re able to physically
isolate the value of our information from the bulk of our data. The whole
idea of an EFS is to grant the theoretical attacker full physical access to
a hard drive and *still* maintain security–all the attackers receive is
encrypted bulk data.
The problem, of course, is where to put the decryption key. Having a
plaintext decryption key next to a whole pile of encrypted data is about as
smart as etching the combination to a half ton safe right next to the wheel.
(Given the cash rush into crypto, that hasn’t stopped anyone from deploying
such systems. The system Crypto-Gram linked to at http://www.gianus.com/
comes to mind.)
So this is where crypto, for all its logical manipulations, is forced to
intersect with the physical world. Many systems depend on human memory to
contain some secret, although there are arguments that nobody can remember
more entropy than a computer can brute force through. Other systems use
hardware tokens to store their secrets. As the crypto is isolated to its
own subsystem, the physical requirements of that system can be tuned to only
need to protect that limited amount of data and nothing more.
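The memory-versus-brute-force argument is easy to put rough numbers on. The sketch below is back-of-the-envelope arithmetic only; the guess rate of one billion per second and the 8-letter passphrase are assumptions chosen for illustration, not measurements of any real attacker or user.

```python
import math


def entropy_bits(alphabet_size, length):
    """Entropy of a secret chosen uniformly at random from alphabet_size ** length."""
    return length * math.log2(alphabet_size)


def years_to_exhaust(bits, guesses_per_second):
    """Time to try every possible value of a secret with the given entropy."""
    seconds = 2 ** bits / guesses_per_second
    return seconds / (365 * 24 * 3600)


GUESSES_PER_SECOND = 1e9  # assumed attacker speed, purely illustrative

passphrase_bits = entropy_bits(26, 8)  # 8 random lowercase letters
key_bits = 128                         # a key held by a token instead

passphrase_years = years_to_exhaust(passphrase_bits, GUESSES_PER_SECOND)
key_years = years_to_exhaust(key_bits, GUESSES_PER_SECOND)

print(f"passphrase: {passphrase_bits:.1f} bits, {passphrase_years:.2e} years")
print(f"token key:  {key_bits} bits, {key_years:.2e} years")
```

Under these assumptions the memorable secret carries roughly 37.6 bits and falls in minutes, while the 128-bit key is out of reach by many orders of magnitude. That gap is why the secret tends to end up in a token rather than a head.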
That doesn’t mean they’re foolproof. Human memory is defeatable via
bribery, rubber hose cryptanalysis, and the aforementioned size constraint.
Most tokens are defeatable via side channel attacks. But the odds of a
device built for secure storage surviving assault are quite a bit better
than a system whose primary function is to, well, *function*. A hard
drive’s *job* is to store and allow retrieval of information in unnaturally
dense and arbitrary formations. We should not be surprised when such a
system succeeds, even if we’d rather it fail.
What should surprise us is when a cryptosystem *built* to prevent the
relevance of an unauthorized successful read makes presumptions that the
underlying medium will fail to reveal deleted data. That’s unfortunately
what Microsoft’s EFS implementation is doing–they’re writing plaintext
information, trivially deleting it after encryption, and saying the file has
been protected behind a cryptographic key.
3) Timothy Miller mentioned something moderately important: Placing a
system into hibernation has the effect of dumping all live memory to disk in
plaintext. This obviously compromises whatever happens to be in memory,
including the decryption keys that *need* to be in memory in order for
everyday access to function. A couple of quirks deserve mention:
First, causing a system to drop into hibernation mode is conceivably a poor
man’s forensic toolkit. It’s definitely conceivable that malware might
detect the hibernation process and deploy countermeasures to prevent its
detection, but code can in principle defend against *anything*, and such
defenses are moderately more likely against userland apps that need to be
loaded from remote sources than against something integrated with the
kernel. We’ve
already seen at least one virus that prevents connection to antivirus
websites, for instance.
Second, simply encrypting the memory dump isn’t necessarily going to help:
Where does the system put the key, presuming there’s no hardware token to
query? How does the system retrieve a password to decrypt that key without
loading up the rest of Windows? The MS worldview isn’t exactly moving
towards having a Non-Win32 window pop up asking for a password, but bringing
up a full Win32 environment arguably requires coming out of full
hibernation–meaning the system needs to be able to load up system RAM
without querying the user for a decryption code.
Finally, if arbitrary users can send a machine into temporary
hibernation (issue the command, then have a remote host send a Wake-On-LAN
magic packet), and have a pathway to do raw reads against the file
system (this doesn’t necessarily require physical access!), the EFS won’t
help–once the system comes back up, the attacker just needs to sort through
the hibernation data to retrieve key material. Essentially, the EFS doesn’t
win you anything above straight NTFS file permissions, since the key to
decrypt the NTFS file is available in the same pile where the permissions
you’re bypassing are located.
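For reference, the Wake-On-LAN magic packet mentioned here has a simple, well-known layout: six 0xFF bytes followed by the target's 6-byte MAC address repeated sixteen times, usually sent as a UDP broadcast. A minimal sketch follows; the MAC address, the choice of UDP port 9, and the function names are assumptions for illustration.

```python
import socket


def build_magic_packet(mac):
    """Six 0xFF bytes, then the 6-byte MAC address repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the packet over UDP (port 9 is a common convention)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Build (but do not send) a packet for a made-up MAC address.
packet = build_magic_packet("00:11:22:33:44:55")
print(len(packet))
```

Nothing in the packet is authenticated, which is part of why waking a machine remotely is within reach of "arbitrary users" on the same network segment.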
Mind you, a token doesn’t really help in this circumstance, since most
tokens aren’t used for bulk decryption. Generally, the token will be used
to decrypt some key file into memory, and that key will be used to encrypt
and decrypt files.
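Either way, the bulk key sits in RAM and so lands in the hibernation image. One hedged way to "sort through" such a dump is to look for regions that are unusually random, since key material tends to stand out against text and zero-filled memory. The sketch below is an assumption about how an attacker might proceed, not a description of any particular tool; the window size, threshold, and synthetic dump are all made up for illustration.

```python
import math
import os
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy of a byte string, in bits per byte."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_high_entropy_windows(dump, window=32, threshold=4.0):
    """Return offsets of non-overlapping windows that look random.

    A 32-byte window can score at most 5.0 bits per byte. The windows do not
    overlap, so a key that straddles a boundary could be missed; a real scan
    would slide byte by byte.
    """
    hits = []
    for offset in range(0, len(dump) - window + 1, window):
        if shannon_entropy(dump[offset:offset + window]) >= threshold:
            hits.append(offset)
    return hits


# Synthetic "hibernation image": filler, then a random 32-byte key, then text.
key = os.urandom(32)
dump = b"A" * 64 + key + b"plain old text, nothing to see " * 4
candidates = find_high_entropy_windows(dump)
print(candidates)
```

In this toy dump only the window holding the key crosses the threshold. Real memory images are far noisier (compressed data looks random too), so this narrows the search rather than ending it.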
Actually, I’m moderately curious how EFS does key selection–on a per file
basis? Per block? Is there salting? File system crypto is moderately
difficult, due to issues like crash resistance, appending data to arbitrary
points within a file, etc. This buglet happened due to an allowance made
for crash resistance–it’d be interesting to see whether anything else was
exposed due to specific allowances made for this functional domain.
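As a point of comparison for the per-file question, one common pattern is to derive a distinct key for each file from a master key plus a random per-file salt stored alongside the ciphertext. The sketch below shows that generic pattern using HMAC-SHA256 from the Python standard library; it is an assumption for illustration and says nothing about how EFS actually selects keys. The function names and the "file-key" label are hypothetical.

```python
import hashlib
import hmac
import os


def new_file_salt():
    """Random per-file salt; it can be stored in the clear next to the file."""
    return os.urandom(16)


def derive_file_key(master_key, salt):
    """Derive a 32-byte per-file key from the master key and the file's salt."""
    return hmac.new(master_key, b"file-key" + salt, hashlib.sha256).digest()


master_key = os.urandom(32)  # would live in a token or in memory, never on disk
salt_a, salt_b = new_file_salt(), new_file_salt()
key_a = derive_file_key(master_key, salt_a)
key_b = derive_file_key(master_key, salt_b)

print(key_a != key_b)
```

With this arrangement, identical plaintext in two files does not produce identical ciphertext, and exposing one file's key does not directly expose the others. The master key remains the single thing that must never touch the disk.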