GNU parallel
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
Usage examples / tutorial
- Install GNU parallel from the CLI (or simply use your distribution's package repository):
$ wget pi.dk/3 -qO - | bash -x
- Basic usage:
$ find . -name "*.foo" | parallel grep bar
The above is the parallel equivalent to:
$ find . -name "*.foo" -exec grep bar {} +
This searches all files in the current directory and its subdirectories whose names end in .foo for occurrences of the string bar. The parallel command will work as expected unless a file name contains a newline. To avoid this limitation, one may use:
$ find . -name "*.foo" -print0 | parallel -0 grep bar
The above command uses the null character to delimit file names.
$ find . -name "*.foo" | parallel -X mv {} /tmp/trash
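The above command uses {} as a placeholder that parallel replaces with the argument list. With -X, parallel inserts as many arguments as the command line length permits, so mv is run once per batch of files rather than once per file.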
$ find . -maxdepth 1 -type f -name "*.ogg" | parallel -X -r cp -v -p {} /home/media
The command above does the same as:
$ cp -v -p *.ogg /home/media
However, the former command, which uses find/parallel/cp, is more resource efficient and will not halt with an error if the expansion of *.ogg is too large for the shell.
- Example of using "pipes" and "records" to separate STDIN/STDOUT:
$ cat foo.fasta
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
$ cat foo.fasta | parallel --pipe --recstart '>' -N1 cat';' echo =====
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
=====
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
=====
$ printf '=%.0s' {1..79}
$ printf %79s | tr " " "="
$ seq -s= 79 | tr -d '[:digit:]'
$ perl -E 'say "=" x 79'
$ head -c 79 < /dev/zero | tr '\0' '='
$ cat /usr/share/dict/words | parallel --pipe --blocksize 500k wc
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm
$ rm -f /tmp/*.par
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm {} ";"rm {}
External links
- Official website
- GNU parallel tutorial (same as running: `man parallel_tutorial`)
- GNU Parallel videos on YouTube