Difference between revisions of "Dd"
(→Patching binary files: removed) |
(→Disk-related and other use cases: added) |
||
Line 146: | Line 146: | ||
# dd if=/dev/zero of=/dev/sd''X'' bs=440 count=1 |
# dd if=/dev/zero of=/dev/sd''X'' bs=440 count=1 |
||
+ | |||
+ | == Disk-related and other use cases == |
||
+ | |||
+ | As some readers may have already noticed, the {{man|1|dd}} core utility has a [https://unix.stackexchange.com/a/12538#12538 quite different] command-line syntax from other utilities. Moreover, while it supports [https://unix.stackexchange.com/a/12538#12538 some unique features not found in other commodity utilities], several of its default behaviours (and sometimes its limitations) are either [https://www.pixelbeat.org/docs/coreutils-gotchas.html#dd less than ideal] or [[dd#Partial_read:_copied_data_is_smaller_than_requested|potentially error-prone]] in specific scenarios. For that reason, users may want to use alternatives that are better in some respects in lieu of the ''dd'' core utility. |
||
+ | |||
+ | That said, it is still worth noting that ''dd'' is a [[coreutils|core utility]], installed by default on Arch and many other systems, so it is preferable to alternatives or more specialized utilities when installing a new package on your system is inconvenient. |
||
+ | |||
+ | To cover the two aspects addressed above, this section summarises the features of the {{man|1|dd}} core utility that are rarely found in other commodity utilities. Its form resembles the [[pacman/Rosetta]] article, but the examples are kept as few and as simple as possible, just enough to examine these features of ''dd'' (as denoted by an ''i.e.'' or ''To''-clause in the "Tip:" box under each subsection), either in practice or in ''pseudocode''. |
||
+ | |||
+ | {{note|To keep this section at a reasonable length, the comparison with alternative utilities only includes packages found in the official repositories that have an obvious advantage over the others mentioned, described in more detail where necessary.<br>For more alternatives, see [[coreutils#dd alternatives]].}} |
||
+ | |||
+ | === Patching a binary file, block-by-block in-place === |
||
+ | It is not uncommon to use ''dd'' as a feature-limited binary file patcher in automated shell scripts, since it can {{ic|seek}} the output file to a given offset before writing, and can patch the output file in place, block by block (or byte by byte if {{ic|1=bs=1}}), when the {{ic|1=conv=notrunc}} option is added. |
||
+ | |||
+ | For example, to modify the timestamp of the first member in a {{man|5|cpio|Portable ASCII Format}} archive, which starts at the 49th byte of the file (or at offset {{ic|0x30}} if you prefer hex notation): |
||
+ | |||
+ | $ touch a-randomly-chosen-file |
||
+ | $ bsdtar -cf example-modify-ts.cpio --format odc -- a-randomly-chosen-file |
||
+ | |||
+ | $ printf '%011o' "$(date -d "2019-12-21 00:00:00" +%s)" | dd conv=notrunc of=example-modify-ts.cpio seek='''48''' oflag=seek_bytes |
||
+ | |||
+ | {{note|The currently undocumented [https://github.com/coreutils/coreutils/commit/140eca15c4a3d3213629a048cc307fde0d094738 {{ic|seek_bytes}} output flag] is specified above so that the output offset is counted in bytes instead of blocks before ''dd'' starts to {{man|2|write}} to the output.}} |
||
+ | |||
+ | {{tip|To print a byte stream from hex notation given on the command line, use {{man|1|basenc|base16}} and/or {{man|1|printf}}.}} |
||
+ | |||
+ | {{tip|For this feature grid (''i.e.'' {{man|2|write}} ''with offset, with'' no truncation), instead of using ''dd'', one may consider using a shell that supports the {{man|2|lseek}} operation on a shell-opened file descriptor when: |
||
+ | |||
+ | * the input file of ''dd'' is a pipe connected to a program that uses the {{man|2|splice}} system call, and the user wants to avoid the unnecessary userspace I/O of ''dd'' for better performance |
||
+ | * or, frequent {{man|2|fork}} calls in a shell script loop should be avoided to reduce the performance penalty |
||
+ | |||
+ | In that case, let the shell open the file descriptor first, then seek on it, and assign it as the output of the utility that uses the {{man|2|splice}} system call (or of a shell builtin command that does not fork, as in the following example using {{man|1|zshmodules|sysseek}}):{{hc|$ zsh|<nowiki>$ local +xr openToWriteFD |
||
+ | $ zmodload zsh/system |
||
+ | $ sysopen -wo cloexec -u openToWriteFD example-modify-ts.cpio |
||
+ | $ sysseek -u $openToWriteFD 48 |
||
+ | $ printf '%011o' "$(date -d "2019-12-21 00:00:00" +%s)" >&${openToWriteFD}</nowiki>}} |
||
+ | {{warning|Avoid this approach if you cannot ensure that the program whose output needs an offset really uses {{man|2|splice}} (which implies that the program does not seek or truncate its output in any way). Some programs may seek or truncate their input/output file descriptors by themselves even when no command-line flag requests it, which invalidates your shell's {{man|2|lseek}} call or unexpectedly truncates the opened file descriptor.}}}} |
||
+ | |||
+ | === Printing volume label of a VFAT filesystem image === |
||
+ | {{tip|For this specific scenario, a more practical choice is {{pkg|file}}.}} |
||
+ | |||
+ | To read the filesystem [https://wiki.osdev.org/FAT#FAT_32_2 volume label of a VFAT] image file, which is 11 bytes long in total, padded with ASCII spaces, and located at offset {{ic|0x047}}: |
||
+ | |||
+ | $ truncate -s ''33M'' empty-hole.img |
||
+ | $ fatlabel empty-hole.img LabelMe |
||
+ | |||
+ | $ dd iflag=skip_bytes,count_bytes count=11 skip=$((0x047)) if=empty-hole.img | sed -e 's% *$%%' |
||
+ | {{note|[https://github.com/coreutils/coreutils/commit/140eca15c4a3d3213629a048cc307fde0d094738 Both input flags are currently undocumented]: |
||
+ | * The former, {{ic|skip_bytes}}, instructs ''dd'' to seek (or ''skip'' if the input is not seekable) on the input file by an offset counted in bytes instead of blocks before starting to {{man|2|read}} from it. |
||
+ | * The latter, {{ic|count_bytes}}, lets the user specify the total amount of data to copy from the input file ''in bytes'' instead of as a number of blocks. This can be confusing, since the copy may still be subject to partial {{man|2|read}}s; think of it as a fractional value of the [[dd#Partial_read:_copied_data_is_smaller_than_requested|input block {{ic|count}}]] to better understand this behaviour.}} |
||
+ | |||
+ | {{tip|''To transfer data from input (with an offset) to output within a given length'' in shell scripting, one may also consider {{man|1|curl|r,}} as a commodity alternative that uses range notation instead.{{note|''curl'' does not support seeking/skipping when the input file is a device or pipe. Another alternative, {{man|1|socat}}, does support these operations on the input file (incl. block devices, ''excl. pipes and character devices'') but is less of a commodity than ''curl'':{{bc|<nowiki>$ socat -u -,seek=$((0x047)),readbytes=11 - < empty-hole.img | sed -e 's% *$%%'</nowiki>}}}}}} |
||
+ | |||
+ | === Sponging between piped commands === |
||
+ | {{tip|As the title suggests, a practical choice for this specific scenario is {{man|1|sponge}}, which supports atomic writes by writing to {{ic|<nowiki>${TMPDIR:-/tmp}</nowiki>}} first.}} |
||
+ | |||
+ | In the following example, to avoid keeping a TCP connection open unnecessarily long on the input end when the output end blocks longer than expected, one may put ''dd'' between the two commands with an output block size that is certainly larger than the input while still reasonably smaller than the available memory: |
||
+ | |||
+ | $ curl -qgsfL <nowiki>http://example.org/mirrors/ftp.archlinux.org/mirrored.md5deep</nowiki> | dd ibs=128k obs=200M | ''poor-mirroring-script-that-perform-mirroring-on-input-paths-line-by-line-wo-buffer-entire-list-first'' |
||
+ | |||
+ | {{warning|It should never be considered as a generic alternative to {{man|1|sponge}} as ''dd'' truncates the output file before starting the entire copy operation.}} |
||
+ | |||
+ | === Transferring data with a size limit === |
||
+ | It is common practice to use ''dd'' in a data-streaming shell script to limit the total length of data that a piped command may consume. For example, to inspect a ustar header block ({{man|5|tar|POSIX ustar Archives}}) using a shell function in a streaming manner: |
||
+ | {{note|The {{ic|B}} suffix in the argument to the {{ic|count}} option is a [https://github.com/coreutils/coreutils/commit/97e9778296ead515e77a64942b84f88dcf36a176 recently introduced] feature as of GNU coreutils v9.1 that has the same effect as the [[Dd#Printing_volume_label_of_a_VFAT_filesystem_image_(i.e._read_in_given_length,_with_offset)|{{ic|count_bytes}} input flag]]. It is easily confused with forms like {{ic|1=count=256k}}, which tell ''dd'' to copy 262144 input blocks instead of bytes.}} |
||
+ | {{bc|<nowiki>hexdump-field() { |
||
+ | set -o pipefail |
||
+ | printf '%s[%d]:\n' $1 $2 |
||
+ | dd count=${2}B status=none | hexdump -e $2'/1 "%3.2x"' -e '" | " '$2'/1 "%_p" "\n"' |
||
+ | } |
||
+ | |||
+ | inspect-tar-header-block() { |
||
+ | local -a hdrstack=( |
||
+ | name 100 |
||
+ | mode 8 |
||
+ | uid 8 |
||
+ | gid 8 |
||
+ | size 12 |
||
+ | mtime 12 |
||
+ | checksum 8 |
||
+ | typeflag 1 |
||
+ | linkname 100 |
||
+ | magic 6 |
||
+ | version 2 |
||
+ | uname 32 |
||
+ | gname 32 |
||
+ | devmajor 8 |
||
+ | devminor 8 |
||
+ | prefix 155 |
||
+ | pad 12 |
||
+ | ) |
||
+ | set - ${hdrstack[@]} |
||
+ | while test $# -gt 0; do |
||
+ | hexdump-field $1 $2 || return |
||
+ | shift 2 |
||
+ | done |
||
+ | }</nowiki>}} |
||
+ | $ bsdtar -cf - /dev/tty /dev/null 2>&- | dd count=1 skip=1 status=none | inspect-tar-header-block |
||
+ | {{tip|''To stream data from input to output within a given length'', an alternative is {{man|1|pv|S,}}, which supports the {{man|2|splice}} system call. |
||
+ | {{note|Another candidate is {{man|1|head|c}}, though [[coreutils#Alternatives|implementations other than GNU coreutils]] and glibc [https://unix.stackexchange.com/a/12538#12538 may consume more data than requested], causing data misalignment in a streaming shell script.}}}} |
||
+ | {{Tip|In addition to the above feature grid, if the input file must be {{man|2|lseek}}'d to a specific offset before streaming, and the output end of ''dd'' is a pipe connected to a program that uses {{man|2|splice}}, then one may consider using instead: |
||
+ | * a shell with builtin seeking capability (''as already mentioned [[#Patching a binary file, block-by-block in-place|as alternative in a previous subsection]].'') |
||
+ | * or, a Bourne-like shell (e.g. [[bash]]), with the help of {{man|1|xxd|s}} for a one-off {{man|2|lseek}} on a shell-opened file descriptor, |
||
+ | and {{man|1|pv|S,}} (''mentioned above''), as in the following [[bash]] example (first check with {{ic|ls -l /proc/self/fd}} in bash that the file descriptor is not already allocated in the shell):{{hc|$ bash|<nowiki>$ exec 9<dummy-but-rather-large.img |
||
+ | $ xxd -g 0 -l 0 -s $((0x47ffff)) <&9 |
||
+ | $ pv -qSs 104857601200 <&9 |</nowiki> ''program-that-process-load-of-data-but-does-not-limit-read-length-as-desired-nor-support-offset-read'' |
||
+ | $ exec 9<&-}}{{note|Though incompatible with POSIX and some non-GNU implementations, it is feasible to replace {{man|1|xxd|s}} in the above example with ''dd'' in conjunction with the {{ic|1=count=0}} and {{ic|skip}} options, as in [https://github.com/coreutils/coreutils/blob/4fd708810ce0e0d967c4c14e1ff2ff7b43440b58/tests/dd/skip-seek-past-file.sh#L74 an example in the coreutils test suite].}}}} |
||
+ | |||
+ | === Writing a bootable disk image to a block device, optionally displaying progress information === |
||
+ | |||
+ | See [[USB flash installation medium#Using_basic_command_line_utilities]] for examples of commodity utilities, including ''dd'', which is potentially the least suited to that case. |
||
+ | {{tip|''To write the content of a file to a block device (with progress indicator)'', a suggested alternative is {{man|1|dd_rescue|W}}. It can avoid unnecessary writes when overwriting an older version of an image on the device with a newer version.}} |
||
== Troubleshooting == |
== Troubleshooting == |
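Revision as of 17:07, 7 October 2022 (Fri)
dd is a core utility whose primary purpose is to convert and copy files.
Like cp, dd makes a bit-for-bit copy of a file by default, but it also provides low-level I/O flow control features.
For more information, see dd(1) or the full documentation.
Installation
dd is part of GNU coreutils. For other utilities in the package, see Core utilities.
Disk cloning and restore
The dd command is a simple, yet versatile and powerful tool. It can copy from source to destination, block by block, regardless of filesystem type or operating system. A convenient method is to use dd from a live environment, such as a live CD.
Cloning a partition
From physical disk /dev/sda, partition 1, to physical disk /dev/sdb, partition 1: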
# dd if=/dev/sda1 of=/dev/sdb1 bs=64K conv=noerror,sync status=progress
Cloning an entire hard disk
From physical disk /dev/sda to physical disk /dev/sdb:
# dd if=/dev/sda of=/dev/sdb bs=64K conv=noerror,sync status=progress
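This clones the entire drive, including the MBR (and therefore the boot loader), all partitions, UUIDs, and data.
- bs= sets the block size. The default is 512 bytes, the "classic" block size for hard drives since the early 1980s, but it is not the most convenient. Use a larger value such as 64K or 128K. Also, please read the warning below, because there is more to this than just the "block size": it also influences how read errors propagate. See [1] and [2] for details and to find the best bs value for your use case.
- noerror instructs dd to continue operation, ignoring all read errors. The default behaviour of dd is to halt at any error.
- sync fills input blocks with zeroes if there were any read errors, so that data offsets stay in sync.
- status=progress shows transfer statistics, which can be used to estimate when the operation may complete.
The dd utility technically has an "input block size" (IBS) and an "output block size" (OBS). When you set bs, you effectively set both IBS and OBS. Normally, if the block size is, say, 1 MiB, dd reads 1024×1024 bytes and writes the same number of bytes. But if a read error occurs, things go wrong. Many people seem to think that dd will "fill up read errors with zeroes" when the noerror,sync options are used, but this is not what happens. According to its documentation, dd pads the block to the IBS size after completing its read, which means adding zeroes at the end of the block. For a disk, this means that effectively the whole 1 MiB becomes messed up because of a single 512-byte read error at the beginning of the read: 12ERROR89 becomes 128900000 instead of 120000089.
If you are sure that the disk does not contain any errors, you can proceed with a larger block size, which increases the copying speed several-fold. For example, changing bs from 512 to 64K changed the copying speed from 35 MB/s to 120 MB/s on a Celeron 2.7 GHz system. But keep in mind that read errors on the source disk will end up as block errors on the destination disk.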
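The padding behaviour described above can be observed safely on a pipe instead of a failing disk. The following is a minimal sketch, assuming GNU dd and od are available; a short read merely stands in for the part of a block that could not be read, and the variable name padded is only illustrative.

```shell
# A 2-byte input is read as one short input block; conv=sync pads that
# block up to the 8-byte block size with trailing NUL bytes.
padded=$(printf 'ab' | dd bs=8 conv=sync status=none | od -An -tx1 | tr -d ' \n')

# Hex dump of the 8 output bytes: the data comes first, the zeroes last.
echo "$padded"
```

The two data bytes stay at the start of the block and the zeroes are appended after them, which is why a partially read block no longer lines up with its original offsets.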
Backing up the partition table
See fdisk#Backup and restore partition table or gdisk#Backup and restore partition table.
Create disk image
Boot from a live medium and make sure no partitions are mounted from the source hard drive.
Then mount the external hard drive and backup the drive:
# dd if=/dev/sda conv=sync,noerror bs=64K | gzip -c > /path/to/backup.img.gz
If necessary (e.g. when the resulting files will be stored on a FAT32 file system) split the disk image into multiple parts (see also split(1)):
# dd if=/dev/sda conv=sync,noerror bs=64K | gzip -c | split -a3 -b2G - /path/to/backup.img.gz
If there is not enough disk space locally, you may send the image through ssh:
# dd if=/dev/sda conv=sync,noerror bs=64K | gzip -c | ssh user@local dd of=backup.img.gz
Finally, save extra information about the drive geometry, which is necessary to interpret the partition table stored within the image. The most important piece of this information is the cylinder size.
# fdisk -l /dev/sda > /path/to/list_fdisk.info
Restore system
To restore your system:
# gunzip -c /path/to/backup.img.gz | dd of=/dev/sda
When the image has been split, use the following instead:
# cat /path/to/backup.img.gz* | gunzip -c | dd of=/dev/sda
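Backup and restore MBR
Before making changes to a disk, you may want to back up the partition table and partition scheme of the drive. You can also use a backup to copy the same partition layout to numerous drives.
The MBR is stored in the first 512 bytes of the disk. It consists of four parts:
- The first 440 bytes contain the bootstrap code (boot loader).
- The next 6 bytes contain the disk signature.
- The next 64 bytes contain the partition table (4 entries of 16 bytes each, one entry for each primary partition).
- The last 2 bytes contain a boot signature.
To save the MBR as mbr_file.img: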
# dd if=/dev/sdX of=/path/to/mbr_file.img bs=512 count=1
You can also extract the MBR from a full dd disk image:
# dd if=/path/to/disk.img of=/path/to/mbr_file.img bs=512 count=1
To restore the MBR from the backup (be careful: this command destroys the existing partition table, and with it access to all data on the disk):
# dd if=/path/to/mbr_file.img of=/dev/sdX bs=512 count=1
If you only want to restore the boot loader and are not interested in the primary partition table entries, just restore the first 440 bytes of the MBR:
# dd if=/path/to/mbr_file.img of=/dev/sdX bs=440 count=1
To restore only the partition table, the offset must be applied to both the input (skip) and the output (seek):
# dd if=/path/to/mbr_file.img of=/dev/sdX bs=1 skip=446 seek=446 count=64
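The byte ranges listed above can be tried out on scratch files before touching a real disk. The following is a minimal sketch, assuming GNU coreutils; the file names and the 0xff filler pattern are made up for illustration, with disk.img standing in for /dev/sdX.

```shell
# Work on scratch files only, never on a real disk.
workdir=$(mktemp -d)
src="$workdir/mbr_file.img"   # stands in for the saved MBR
dst="$workdir/disk.img"       # stands in for /dev/sdX

# Fill the fake MBR backup with 0xff bytes and the fake disk with zeroes.
head -c 512 /dev/zero | tr '\0' '\377' > "$src"
head -c 512 /dev/zero > "$dst"

# Copy only the 64-byte partition table: skip= offsets the input,
# seek= offsets the output, conv=notrunc keeps the rest of a regular file.
dd if="$src" of="$dst" bs=1 skip=446 seek=446 count=64 conv=notrunc status=none

# Inspect the result: total size, bytes changed, bytes changed before offset 446.
size=$(( $(wc -c < "$dst") ))
changed=$(( $(tr -d '\0' < "$dst" | wc -c) ))
changed_before=$(( $(head -c 446 "$dst" | tr -d '\0' | wc -c) ))
echo "size=$size changed=$changed changed_before=$changed_before"

rm -rf "$workdir"
```

conv=notrunc only matters here because the target is a regular file; a block device is not truncated by dd.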
Remove boot loader
To erase the MBR bootstrap code (which may be useful if you have to do a full reinstall of another operating system), only the first 440 bytes need to be zeroed:
# dd if=/dev/zero of=/dev/sdX bs=440 count=1
Disk-related and other use cases
As some readers may have already noticed, the dd(1) core utility has a quite different command-line syntax from other utilities. Moreover, while it supports some unique features not found in other commodity utilities, several of its default behaviours (and sometimes its limitations) are either less than ideal or potentially error-prone in specific scenarios. For that reason, users may want to use alternatives that are better in some respects in lieu of the dd core utility.
That said, it is still worth noting that dd is a core utility, installed by default on Arch and many other systems, so it is preferable to alternatives or more specialized utilities when installing a new package on your system is inconvenient.
To cover the two aspects addressed above, this section summarises the features of the dd(1) core utility that are rarely found in other commodity utilities. Its form resembles the pacman/Rosetta article, but the examples are kept as few and as simple as possible, just enough to examine these features of dd (as denoted by an i.e. or To-clause in the "Tip:" box under each subsection), either in practice or in pseudocode.
Patching a binary file, block-by-block in-place
It is not uncommon to use dd as a feature-limited binary file patcher in automated shell scripts, since it can seek the output file to a given offset before writing, and can patch the output file in place, block by block (or byte by byte if bs=1), when the conv=notrunc option is added.
For example, to modify the timestamp of the first member in a cpio(5) § Portable ASCII Format archive, which starts at the 49th byte of the file (or at offset 0x30 if you prefer hex notation):
$ touch a-randomly-chosen-file
$ bsdtar -cf example-modify-ts.cpio --format odc -- a-randomly-chosen-file
$ printf '%011o' "$(date -d "2019-12-21 00:00:00" +%s)" | dd conv=notrunc of=example-modify-ts.cpio seek=48 oflag=seek_bytes
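The same write-at-offset pattern can be checked on a small text file, where the result is easy to read. The following is a minimal sketch, assuming a dd that supports status=none; it uses the portable bs=1 with seek instead of oflag=seek_bytes, and the file contents are made up for illustration.

```shell
# Scratch file holding ten placeholder bytes.
patchfile=$(mktemp)
printf 'AAAAAAAAAA' > "$patchfile"

# Overwrite three bytes starting at byte offset 4. With bs=1, seek= counts
# bytes; conv=notrunc keeps the bytes after the patched range.
printf 'xyz' | dd of="$patchfile" bs=1 seek=4 conv=notrunc status=none

patched=$(cat "$patchfile")
echo "$patched"
rm -f "$patchfile"
```

Without conv=notrunc, dd would truncate the file at the seek offset before writing, so everything after the patch would be lost.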
Printing volume label of a VFAT filesystem image
To read the filesystem volume label of a VFAT image file, which is 11 bytes long in total, padded with ASCII spaces, and located at offset 0x047:
$ truncate -s 33M empty-hole.img
$ fatlabel empty-hole.img LabelMe
$ dd iflag=skip_bytes,count_bytes count=11 skip=$((0x047)) if=empty-hole.img | sed -e 's% *$%%'
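The offset-and-length read can also be tried without a filesystem image. The following is a minimal sketch, assuming GNU dd (for the skip_bytes and count_bytes input flags) and sed; the 4-byte header and the trailing data are invented for illustration and do not reflect the real FAT layout.

```shell
# Scratch file: 4-byte fake header, 11-byte space-padded label, trailing data.
imgfile=$(mktemp)
printf 'HDR:LabelMe    TAIL' > "$imgfile"

# Skip 4 bytes, read exactly 11 bytes, then strip the padding spaces.
label=$(dd if="$imgfile" iflag=skip_bytes,count_bytes skip=4 count=11 status=none | sed -e 's/ *$//')

echo "$label"
rm -f "$imgfile"
```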
Sponging between piped commands
In the following example, to avoid keeping a TCP connection open unnecessarily long on the input end when the output end blocks longer than expected, one may put dd between the two commands with an output block size that is certainly larger than the input while still reasonably smaller than the available memory:
$ curl -qgsfL http://example.org/mirrors/ftp.archlinux.org/mirrored.md5deep | dd ibs=128k obs=200M | poor-mirroring-script-that-perform-mirroring-on-input-paths-line-by-line-wo-buffer-entire-list-first
Transferring data with a size limit
It is common practice to use dd in a data-streaming shell script to limit the total length of data that a piped command may consume. For example, to inspect a ustar header block (tar(5) § POSIX ustar Archives) using a shell function in a streaming manner:
# Print one header field: $1 is the field name, $2 its length in bytes.
hexdump-field() {
    set -o pipefail
    printf '%s[%d]:\n' $1 $2
    # Consume exactly $2 bytes from standard input and dump them.
    dd count=${2}B status=none | hexdump -e $2'/1 "%3.2x"' -e '" | " '$2'/1 "%_p" "\n"'
}

# Walk through the fields of one 512-byte ustar header read from standard input.
inspect-tar-header-block() {
    local -a hdrstack=(
        name 100
        mode 8
        uid 8
        gid 8
        size 12
        mtime 12
        checksum 8
        typeflag 1
        linkname 100
        magic 6
        version 2
        uname 32
        gname 32
        devmajor 8
        devminor 8
        prefix 155
        pad 12
    )
    set - ${hdrstack[@]}
    while test $# -gt 0; do
        hexdump-field $1 $2 || return
        shift 2
    done
}
$ bsdtar -cf - /dev/tty /dev/null 2>&- | dd count=1 skip=1 status=none | inspect-tar-header-block
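The key property used above is that dd consumes no more than the requested amount, so the next reader on the same pipe continues exactly where dd stopped. The following is a minimal sketch of that behaviour, using bs=1 so that it does not depend on the B suffix of newer GNU coreutils; the input string is made up for illustration.

```shell
# dd takes the first 4 bytes from the pipe; cat then receives the remainder.
rest_demo=$(printf 'abcdefghij' | { dd bs=1 count=4 status=none; echo; cat; })

# First line: what dd consumed. Second line: what was left for cat.
echo "$rest_demo"
```

Reading one byte at a time is slow for large amounts of data, which is where the byte-counting forms shown above become preferable.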
Writing a bootable disk image to a block device, optionally displaying progress information
See USB flash installation medium#Using_basic_command_line_utilities for examples of commodity utilities, including dd, which is potentially the least suited to that case.
Troubleshooting
Partial read: copied data is smaller than requested
Files created with dd can end up smaller than requested if a full input block is not available for various reasons (i.e. the underlying read(2) system call returns early). This can happen when reading from a pipe(7), or when reading a device file like /dev/urandom and /dev/random (e.g. due to a hardcoded limitation of the underlying kernel device driver or insufficient entropy), in conjunction with the count=n option, where n is the number of input blocks to copy to the output.
It is possible, but not guaranteed, that dd will warn you about this kind of issue:
dd: warning: partial read (X bytes); suggest iflag=fullblock
The solution is to do as the warning says: add the iflag=fullblock option to the dd command, in addition to the input file option. For example, to create a new file filled with random data, 40 megabytes in total length:
$ dd if=/dev/urandom of=new-file-filled-by-urandom.bin bs=40M count=1 iflag=fullblock
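A partial read can be provoked on purpose with a writer that pauses in the middle of a block. The following is a minimal sketch, assuming GNU dd; slow_writer is a hypothetical helper, and the byte count without fullblock depends on timing, so it is printed rather than assumed.

```shell
# Hypothetical helper: writes 3 bytes, pauses, then writes 3 more.
slow_writer() { printf 'aaa'; sleep 1; printf 'bbb'; }

# Without fullblock, dd may count the first short read as its one block.
partial=$(( $(slow_writer 2>/dev/null | dd bs=6 count=1 status=none 2>/dev/null | wc -c) ))

# With fullblock, dd keeps reading until the 6-byte block is complete.
full=$(( $(slow_writer | dd bs=6 count=1 iflag=fullblock status=none | wc -c) ))

echo "without fullblock: $partial bytes, with fullblock: $full bytes"
```

On a typical system the first command copies only the 3 bytes that were available when read(2) returned, while the second always copies all 6.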
When reading from a pipe, an alternative to iflag=fullblock is to limit bs to the PIPE_BUF constant value, as defined in linux/limits.h, to make the pipe(7) I/O atomic. For example, to prepare a text file filled with a random alphanumeric string, 5 megabytes in total length:
$ LC_ALL=C tr -dc '[:alnum:]' </dev/urandom | dd of=passtext-5m.txt bs=4k count=1280
Since the output file is not a pipe, one may prefer to use the ibs and obs options to set the block size separately for the (input) pipe and the (output) on-disk file. For example, to set a more efficient block size for the output file:
$ LC_ALL=C tr -dc '[:alnum:]' </dev/urandom | dd of=passtext-5m.txt ibs=4k obs=64k count=1280