Securely wipe disk

From ArchWiki
Revision as of 16:19, 3 February 2015 (Tue) by Kusakata (page created)

Wiping a disk means writing new data over every single bit.

Note: In this article, "disk" also covers loopback devices.

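Common use cases

Wipe all data left on the device

The most common reason to wipe a device completely and irrecoverably is that it is about to be disposed of or sold. (Unencrypted) data may still be present on the device, and it is better to deal with it before someone extracts it: with file recovery software, for example, retrieving such data is trivial.

If you want to wipe everything on the disk quickly, a simple pattern such as the output of /dev/zero gives the best performance. Random data can be advantageous in some cases, which are covered in Data remanence.

Once every bit has been overwritten, the data can no longer be recovered through normal system functions (such as standard ATA/SCSI commands) or hardware interfaces. File recovery software like that mentioned above would have to rely on proprietary storage-hardware features to recover anything.

In the case of an HDD, data cannot be recovered without at least resorting to undocumented drive commands, or tampering with the device's controller or firmware to read out, for example, reallocated sectors (bad blocks that S.M.A.R.T. has retired from use).

Different physical storage technologies come with different wiping issues. Flash memory devices and older magnetic storage (old HDDs, floppy disks, tape) need particular attention.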

Preparations for block device encryption

If you want to set up disk encryption on the wiped area afterwards, you should fill it with random data from a cryptographically strong random number generator (referred to as RNG from here on).

See also Wikipedia:Random number generation.

Warning: If block device encryption is mapped onto a partition that contains anything other than random or encrypted data, usage patterns on the encrypted drive can be disclosed, which weakens the encryption to a level comparable with filesystem-level encryption. If you are serious about it, never use /dev/zero, simple patterns (badblocks, for example) or other non-random data before setting up block device encryption!

Select a target

Note: Older versions of fdisk (before util-linux 2.23) do not support GPT-formatted devices. In that case, use gdisk (gptfdisk) instead.

Use fdisk to locate all read/write devices the user has read access to.

Check the output for lines that start with devices such as /dev/sdX.

This is an example for an HDD formatted to boot a Linux system:

# fdisk -l
Disk /dev/sda: 250.1 GB, 250059350016 bytes, 488397168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00ff784a

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      206847      102400   83  Linux
/dev/sda2          206848   488397167   244095160   83  Linux

Or the Arch Install Medium written to a 4GB USB thumb drive:

# fdisk -l
Disk /dev/sdb: 4075 MB, 4075290624 bytes, 7959552 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x526e236e

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           0      802815      401408   17  Hidden HPFS/NTFS

If you are worried about accidentally damaging important data on your primary computer, consider working in an isolated environment: either a virtual machine (VirtualBox, VMware, QEMU, etc.) with the disk drives attached directly to it, or a separate computer that holds only the disk(s) to be wiped and is booted from live media (USB, CD, PXE, etc.).
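
As an additional safeguard against picking the wrong target, a script can refuse to continue when the chosen device is mounted. The following is a minimal sketch, not an established tool: device_in_use is a hypothetical helper name, and it only looks at mount sources, so it would not notice a device used as swap or as a backing device for LVM or dm-crypt.

```shell
# Hypothetical helper: succeeds if DEVICE, or any partition on it, is listed
# as a mount source in a mounts table (defaults to /proc/mounts).
# usage: device_in_use DEVICE [MOUNTS_FILE]
device_in_use() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Example guard before a wipe (the device name is a placeholder):
# device_in_use /dev/sdX && echo "refusing: /dev/sdX is mounted" >&2
```

The match is a simple prefix comparison, so /dev/sdX also matches /dev/sdX1; that is intentional here, since a mounted partition makes the whole disk unsafe to wipe.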

Select a block size

See also Wikipedia:Dd (Unix)#Block size, blocksize io-limits.

If you have an Advanced Format hard drive, it is recommended that you specify a block size larger than the default 512 bytes. To speed up the overwriting process, choose a block size matching your drive's physical geometry by appending the block size option to the dd command (e.g. bs=4096 for 4 KiB).

fdisk prints physical and logical sector size for every disk.

Alternatively, sysfs exposes this information:

/sys/block/sdX/size
/sys/block/sdX/queue/physical_block_size
/sys/block/sdX/queue/logical_block_size
/sys/block/sdX/sdXY/alignment_offset
/sys/block/sdX/sdXY/start
/sys/block/sdX/sdXY/size
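
The sysfs attributes listed above can be read from a script. The sketch below assumes the layout shown; the function names and the optional SYSFS_ROOT argument are illustrative additions (the latter only makes the helpers easy to try against a copy of the tree).

```shell
# usage: physical_block_size NAME [SYSFS_ROOT]   (NAME is e.g. sdX, without /dev/)
physical_block_size() {
    cat "${2:-/sys}/block/$1/queue/physical_block_size"
}

# usage: device_size_bytes NAME [SYSFS_ROOT]
# The sysfs "size" attribute counts 512-byte units regardless of the sector size.
device_size_bytes() {
    echo $(( $(cat "${2:-/sys}/block/$1/size") * 512 ))
}

# Possible use with dd (placeholder device name):
# dd if=/dev/zero of=/dev/sdX bs="$(physical_block_size sdX)"
```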

Calculate blocks to wipe manually

The following example shows how to determine the data area to wipe.

A block storage device is divided into sectors of a fixed size. The total size of the device in bytes is the number of sectors multiplied by the sector size.

As an example, we use the following parameters with the dd command to wipe a partition:

# dd if=data_source of=/dev/sdX bs=sector_size count=sector_number seek=partitions_start_sector

The following is an excerpt of the output of fdisk -l /dev/sdX, run as root, showing the example partition information:

Device     Boot      Start        End         Sectors     Size  Id Type
/dev/sdXA            2048         3839711231  3839709184  1,8T  83 Linux
/dev/sdXB            3839711232   3907029167  67317936    32,1G  5 Extended

The first line of the fdisk output shows the disk size in bytes and logical sectors:

Disk /dev/sdX: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors

To calculate the size of a single logical sector, use echo $((2000398934016 / 3907029168)) or read it from the second line of the fdisk output:

Units: sectors of 1 * 512 = 512 bytes

The physical sector size, which allows faster writes when used as the block size, is shown on the third line:

Sector size (logical/physical): 512 bytes / 4096 bytes

To get the disk size in physical sectors, divide the disk size in bytes by the physical sector size: echo $((2000398934016 / 4096)). The size in bytes of a storage device, or of a partition on it, can also be obtained with the blockdev --getsize64 /dev/sdX(Y) command.
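
The arithmetic above can be collected in one place. This sketch only uses the numbers from the example fdisk output; the variable names are illustrative.

```shell
disk_bytes=2000398934016
disk_sectors=3907029168
physical=4096

logical=$((disk_bytes / disk_sectors))      # logical sector size in bytes
ratio=$((physical / logical))               # logical sectors per physical sector
disk_physical=$((disk_bytes / physical))    # disk size in physical sectors

# Partition /dev/sdXA from the table above: Start 2048, End 3839711231
part_sectors=$((3839711231 - 2048 + 1))     # matches the "Sectors" column
part_bytes=$((part_sectors * logical))

echo "logical=$logical ratio=$ratio disk_physical=$disk_physical"
echo "part_sectors=$part_sectors part_bytes=$part_bytes"
```

Both the start sector and the sector count of the partition are multiples of the ratio, so the same area could also be addressed in 4096-byte blocks by dividing the values for seek= and count= by that ratio.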

Note: The examples below use the logical sector size.

To wipe partition /dev/sdXA, the example parameters with logical sectors would be used as follows. The output is the whole-disk device, because seek= counts from the start of the output file:

# dd if=data_source of=/dev/sdX bs=512 count=3839709184 seek=2048

Or, to wipe the whole disk (count= is optional):

# dd if=data_source of=/dev/sdX bs=512 count=3907029168 seek=0

Warning: If count= is omitted, or set to a value that reaches beyond the end of the device, dd stops with an error once it reaches the end of the device, because no further writes are possible.
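
Before running dd against a real device, the effect of seek= and count= can be rehearsed on a scratch file. The sketch below is only an illustration: a 16-sector file filled with 0xFF stands in for a disk, and a "partition" starting at sector 4 and spanning 8 sectors is zeroed. conv=notrunc is needed here because the target is a regular file, which dd would otherwise truncate; it makes no difference for block devices.

```shell
img=$(mktemp)

# 16 "sectors" of 512 bytes, filled with 0xFF, stand in for a disk.
dd if=/dev/zero bs=512 count=16 2>/dev/null | LC_ALL=C tr '\000' '\377' > "$img"

# Wipe only the area that starts at sector 4 and is 8 sectors long.
dd if=/dev/zero of="$img" bs=512 seek=4 count=8 conv=notrunc 2>/dev/null

# Inspect the result: runs of identical lines are collapsed by od.
od -A d -t x1 "$img"
```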

Select a data source

As mentioned above, if you only want to wipe sensitive data, you can use any data source that matches your needs.

If you want to set up block device encryption afterwards, you should always wipe at least with an encryption cipher as source, or with pseudorandom data.

For data that is not truly random, your disk's write speed should be the only limiting factor. If you need random data, the system performance required to generate it depends heavily on the source of entropy you choose.

Non-random data

Overwriting with /dev/zero or simple patterns is considered secure in most resources. For current HDDs it should be sufficient for fast disk wipes.

Warning: A drive that is abnormally fast at writing patterns or zeros could be doing transparent compression. In that case, it must be presumed that not all blocks get wiped. Some flash memory devices do have this "feature" (see Flash memory).

Pattern write test

badblocks can write simple patterns to every block of a device and then read them back and check them, searching for damaged areas (just like memtest86* does with memory).

As the pattern is written to every accessible block, this effectively wipes the device.
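
The write-then-verify idea can be illustrated with standard tools on a scratch file. This is a simplified stand-in, not what badblocks does internally: pattern_pass is a hypothetical helper that fills the first BYTES bytes of a target with one byte value and reads them back. The four values in the loop are the patterns the badblocks manual lists for its write mode.

```shell
# usage: pattern_pass TARGET BYTES OCTAL   (e.g. OCTAL=252 for 0xAA)
pattern_pass() {
    head -c "$2" /dev/zero | LC_ALL=C tr '\000' "\\$3" |
        dd of="$1" bs=4096 conv=notrunc 2>/dev/null || return 1
    # Read back: any byte left after deleting the pattern byte is a mismatch.
    [ $(head -c "$2" "$1" | LC_ALL=C tr -d "\\$3" | wc -c) -eq 0 ]
}

scratch=$(mktemp)
for octal in 252 125 377 000; do    # 0xAA, 0x55, 0xFF, 0x00
    pattern_pass "$scratch" 65536 "$octal" || echo "mismatch for pattern $octal"
done
```

Reading back through the page cache, as this sketch does, says little about the physical medium; on a real device the read would have to bypass caches for the check to be meaningful.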

Random data

For differences between random and pseudorandom data as source, please see Random number generation.

Note: Data that is hard to compress (random data) will be written more slowly if the drive logic mentioned in the warning under Non-random data tries to compress it. This should not lead to data remanence, though. Since the maximum write speed is not the performance bottleneck when wiping with random data, this effect can be neglected.

Encrypted data

When preparing a drive for full-disk encryption, sourcing high quality entropy is usually not necessary. The alternative is to use an encrypted datastream. For example, if you will use AES for your encrypted partition, you would wipe it with an equivalent encryption cipher prior to creating the filesystem to make the empty space not distinguishable from the used space.

Overwrite the target

Overwrite the selected drive with one of the following utilities. Choose whichever suits your needs.

dd

See also Core utilities#dd.

Warning: dd does not ask for confirmation or check the sanity of the command, so double-check that the correct drive or partition has been targeted. Make certain that the of=... option points to the target drive and not to a system disk.

Zero-fill the disk by writing a zero byte to every addressable location on the disk using the /dev/zero stream. The iflag and oflag options below try to disable buffering, which is pointless for a constant stream.

# dd if=/dev/zero of=/dev/sdX iflag=nocache oflag=direct bs=4096

Or the /dev/urandom stream:

# dd if=/dev/urandom of=/dev/sdX bs=4096

The process is finished when dd reports "No space left on device":

dd: writing to ‘/dev/sdb’: No space left on device
7959553+0 records in
7959552+0 records out
4075290624 bytes (4.1 GB) copied, 1247.7 s, 3.3 MB/s
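
After a zero-fill has finished, the result can be checked by reading the target back and comparing it with /dev/zero. The helper below is a sketch with a hypothetical name (is_zeroed); it relies on the -n option of cmp, which GNU and BSD cmp provide but POSIX does not require.

```shell
# usage: is_zeroed TARGET BYTES
# Succeeds only if the first BYTES bytes of TARGET exist and are all zero.
is_zeroed() {
    cmp -s -n "$2" "$1" /dev/zero
}

# On a real device the size could be taken from blockdev (placeholder name):
# is_zeroed /dev/sdX "$(blockdev --getsize64 /dev/sdX)" && echo "all zeros"
```

Such a check reads the entire device once, so it takes roughly as long as a read-only pass over the disk. It does not apply to a wipe with random data, where there is no fixed pattern to compare against.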

Advanced example

Get the number of sectors (NUM_BLOCKS), the sector size (LOGIC_BLOCK_SIZE) and (optionally) total disk size in bytes (DISK_SIZE), of the device to be wiped:

# fdisk -l /dev/sdX 
Disk /dev/sdX: 21,5 GiB, 23045603328 bytes, 45010944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

and use them in the following command to randomize the drive/partition using a randomly-seeded AES cipher from OpenSSL (displaying the optional progress meter with pv):

# openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt </dev/zero \
    | pv -bartpes <DISK_SIZE> | dd bs=<LOGIC_BLOCK_SIZE> count=<NUM_BLOCKS> of=/dev/sdX

The command above derives the encryption key from a passphrase made of 128 random bytes read from /dev/urandom (base64-encoded). AES-256 in CTR mode then encrypts the output of /dev/zero with that key. Using the cipher instead of a pseudorandom source results in very high write speeds, and the result is a device filled with AES ciphertext.

See also Dm-crypt/Drive preparation#dm-crypt wipe before installation for a similar approach.
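
The three placeholders used in the command above can also be filled in by a script instead of by hand. The sketch below parses the first line of the example fdisk output shown earlier; the sed patterns assume exactly that line format, which may differ between fdisk versions and locales.

```shell
# First line of the example fdisk output, copied verbatim.
header='Disk /dev/sdX: 21,5 GiB, 23045603328 bytes, 45010944 sectors'

# Take the number directly in front of "bytes" and of "sectors".
DISK_SIZE=$(printf '%s\n' "$header" | sed -n 's/.* \([0-9][0-9]*\) bytes.*/\1/p')
NUM_BLOCKS=$(printf '%s\n' "$header" | sed -n 's/.* \([0-9][0-9]*\) sectors.*/\1/p')
LOGIC_BLOCK_SIZE=$((DISK_SIZE / NUM_BLOCKS))

echo "DISK_SIZE=$DISK_SIZE NUM_BLOCKS=$NUM_BLOCKS LOGIC_BLOCK_SIZE=$LOGIC_BLOCK_SIZE"
```

On a live system, blockdev --getsize64 /dev/sdX is a less fragile way to obtain the size in bytes than parsing human-readable output.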

shred

shred is a Unix command that can be used to securely delete individual files or full devices so that they can be recovered only with great difficulty with specialised hardware, if at all. shred uses three passes, writing pseudo-random data to the device during each pass. This can be reduced or increased.

The following command invokes shred with its default settings and displays the progress.

# shred -v /dev/sdX

Alternatively, shred can be instructed to do only one pass, with entropy from e.g. /dev/urandom.

# shred --verbose --random-source=/dev/urandom -n1 /dev/sdX
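
The options of shred can be tried out safely on a scratch file before pointing it at a device. The following sketch assumes GNU shred: -n 1 reduces the run to a single random pass, -z adds a final pass of zeros, and -x keeps the file size from being rounded up to the next full block (which shred otherwise does for regular files).

```shell
scratch=$(mktemp)
head -c 8192 /dev/urandom > "$scratch"

# One random pass followed by a zero pass, without changing the file size.
shred -n 1 -z -x "$scratch"

# The file now consists of zero bytes only.
od -A d -t x1 "$scratch"
```

The final zero pass makes the outcome easy to inspect; whether it is desirable on a real device depends on the use case, for example it would be counterproductive when preparing a drive for block device encryption.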

Badblocks

To make badblocks perform a disk wipe, a destructive read-write test has to be done:

# badblocks -c <NUMBER_BLOCKS> -wsv /dev/<drive>

hdparm

hdparm supports ATA Secure Erase, which is functionally equivalent to zero-filling a disk. It is, however, handled by the hard drive firmware itself and includes "hidden data areas". As such, it can be seen as a modern-day "low-level format" command. SSDs reportedly return to factory performance after this command is issued, but may not be sufficiently wiped (see Flash memory).

Some drives support Enhanced Secure Erase, which uses distinct patterns defined by the manufacturer. If the output of hdparm -I for the device indicates that the Enhanced erase takes only a fraction of the time of the normal one, the device probably has a hardware encryption feature and the wipe will only be applied to the encryption keys.

For detailed instructions on using ATA Secure Erase, see the Linux ATA wiki.

secure-delete

The secure-delete package from the AUR provides several utilities for secure erasure, including sfill, which overwrites only the free space of a specified mount point. For example:

# sfill -v /

See the tools list for more info.

Data remanence

See also Wikipedia:Data remanence.

The residual representation of data may remain even after attempts have been made to remove or erase the data.

Residual data may be wiped by writing (random) data to the disk in one or more passes. However, more than one pass may not significantly decrease the chance of reconstructing data on hard disk drives. See Residual magnetism.

Operating system, programs and filesystem

The operating system, running programs or journaling file systems may copy your unencrypted data throughout the block device. When writing to plain disks, this should only be relevant in conjunction with one of the above.

If the data can be located exactly on the disk and was never copied anywhere else, wiping with random data can be thorough and impressively quick, as long as there is enough entropy in the pool.

A good example is cryptsetup using /dev/urandom for wiping the LUKS keyslots.

Hardware-specific issues

Flash memory

Write amplification and other characteristics make flash memory a stubborn target for reliable wiping. Because there is a lot of transparent abstraction between the data as seen by the device's controller chip and as seen by the operating system, data is never overwritten in place, and wiping particular blocks or files is not reliable.

Other "features" like transparent compression (all SandForce SSDs) can compress your /dev/zero or pattern stream, so if wiping is fast beyond belief, this might be the case.

Disassembling flash memory devices, unsoldering the chips and analyzing the data content without the controller in between is feasible without difficulty using simple hardware. Data recovery companies do it cheaply.

For more information see: Reliably Erasing Data From Flash-Based Solid State Drives.

Marked bad sectors

If a hard drive marks a sector as bad, it cordons it off, and the sector becomes impossible to write to via software. Thus a full overwrite would not reach it. However, because of the small block size, such sectors would only amount to a few theoretically recoverable KB.

Residual magnetism

A single, full overwrite with zeros or random data does not lead to any recoverable data on a modern high-density storage device.[1] Indications otherwise refer to single residual bits; reconstruction of byte patterns is generally not feasible.[2] See also [3], [4] and [5].

Overwriting old magnetic storage devices (e.g. floppy disks, magnetic tape, early-generation hard drives) only once may, by contrast, allow the wiped data to be reconstructed by analyzing the measured residual magnetism, because of their much lower storage density. Such devices can be disassembled in a cleanroom and then analyzed with equipment like a magnetic force microscope. This method of data recovery, however, requires substantial financial resources. For this reason, it is advisable to overwrite old storage devices multiple times. Degaussing is another countermeasure in use, and to ensure that data has been completely erased, most resources even advise physical destruction.