Urandom fun fact

It turns out that if you’re wiping an external 1 TB hard disk with pseudorandom garbage, the process is CPU-bound and not I/O-bound:

$ sudo dd if=/dev/urandom of=/dev/sdb bs=4K
dd: writing `/dev/sdb': No space left on device
244190647+0 records in
244190646+0 records out
1000204886016 bytes (1.0 TB) copied, 136230 s, 7.3 MB/s

For those of you who have trouble dividing by 3600 in your head, 136,230 seconds works out to about 37.8 hours, with the CPU pegged at 100%. (Well, 50% since it’s a dual-core system, but whatever.)
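If you'd rather let the shell do the dividing, here is a minimal sketch that cross-checks the figures in the dd output. The numbers are copied from the transcript above. The variable names are my own, and the script assumes a shell with 64-bit integer arithmetic (bash or dash on a typical 64-bit Linux box).

```shell
# Figures copied from the dd summary above.
records=244190646      # full 4 KiB blocks written
bytes=1000204886016    # total bytes copied
seconds=136230         # elapsed time

# Each record is one bs=4K block, i.e. 4096 bytes.
echo "bytes implied by record count: $(( records * 4096 ))"

# Integer throughput in bytes per second (the fractional part is dropped).
rate=$(( bytes / seconds ))
echo "throughput: $rate bytes/s"

# Elapsed time as whole hours plus leftover whole minutes.
hours=$(( seconds / 3600 ))
minutes=$(( (seconds % 3600) / 60 ))
echo "elapsed: $hours h $minutes min"
```

The record count times 4096 comes out to exactly the byte total dd reported, the rate is a little over 7.3 million bytes per second, and the elapsed time is 37 hours and 50 minutes.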

My guess is that the process of actually encrypting the disk (once I initialize a file system on it) will take even longer, assuming that AES-256 encryption is slower than whatever PRNG algorithm Linux uses to drive /dev/urandom.

Edit: Actually, that’s extremely incorrect. Encrypting the partition doesn’t actually write anything to it other than some kind of header identifying it as an encrypted partition. Yes, that means that almost all sectors are initially garbage if you try to decrypt them, but with a brand-new partition all sectors are initially garbage anyway. Creating the file system itself doesn’t try to read anything, either; it just writes to the sectors that will make up its index. And once you’ve mounted the file system, you’ll never try to read uninitialized sectors anyway, since there aren’t any files there.

In other words, the only O(n) step when creating an encrypted disk is wiping its previous contents; everything else is O(1) or O(log n) at worst. So why wipe with pseudorandom garbage instead of all zeros, which would be much faster? It’s (hopefully) computationally infeasible to distinguish uninitialized sectors (which look like random garbage because they are random garbage) from encrypted sectors (which look like random garbage because they’re encrypted with a strong algorithm). Not being able to tell where the data even is on the disk makes an attacker’s job more difficult.
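Both halves of that trade-off can be poked at on a small sample without going anywhere near a real disk. The sketch below is my own illustration, not something from the original wipe: it reads 16 MiB (an arbitrary size) from /dev/zero and /dev/urandom into /dev/null to compare generation speed, then uses gzip compressibility as a crude stand-in for "can you tell this apart from random data". A real attacker has far better tools than gzip, so this only shows that zeros stand out, not that anything is cryptographically sound.

```shell
# Sample size: 4096 blocks of 4096 bytes = 16 MiB (arbitrary, keeps the run short).
blocks=4096
sample_bytes=$(( blocks * 4096 ))

# 1. Speed: write to /dev/null so no disk is involved. dd prints its
#    throughput summary on stderr; tail keeps only that last line.
for src in /dev/zero /dev/urandom; do
    printf '%s: ' "$src"
    dd if="$src" of=/dev/null bs=4096 count="$blocks" 2>&1 | tail -n 1
done

# 2. Distinguishability: compress the same amount of data from each
#    source and count how many bytes come out the other end.
zero_size=$(head -c "$sample_bytes" /dev/zero | gzip -c | wc -c)
rand_size=$(head -c "$sample_bytes" /dev/urandom | gzip -c | wc -c)
echo "$sample_bytes bytes of zeros compress to $zero_size bytes"
echo "$sample_bytes bytes of urandom output compress to $rand_size bytes"
```

On the machines I'd expect this to run on, the zeros shrink to a tiny fraction of their size while the urandom output doesn't shrink at all. That is the point: a region of zeros announces "nothing was ever written here", whereas random filler gives no such hint.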

Thanks to strong encryption, an attacker now has to either throw a lot of CPU power at the problem or use alternative means for recovering the data.