GNU parallel
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
Usage examples / tutorial
- Install GNU parallel from the command line (or simply use your distribution's package repository):
$ wget pi.dk/3 -qO - | bash -x
- Basic usage:
$ find . -name "*.foo" | parallel grep bar
The above is the parallel equivalent to:
$ find . -name "*.foo" -exec grep bar {} +
This searches all files in the current directory and its subdirectories whose names end in .foo for occurrences of the string bar. The parallel command works as expected unless a file name contains a newline. To avoid this limitation, use:
$ find . -name "*.foo" -print0 | parallel -0 grep bar
The above command uses the null character to delimit file names.
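To see why the newline matters, the following sketch (which does not need parallel) creates a scratch directory holding one ordinary file and one file whose name contains a newline, then counts the entries both ways. The directory and file names are made up for illustration.

```shell
#!/bin/sh
# Scratch directory; all file names below are hypothetical.
demo_dir=$(mktemp -d)
nl='
'
printf 'bar\n' > "$demo_dir/plain.foo"
printf 'bar\n' > "$demo_dir/odd${nl}name.foo"

# Newline-delimited: the odd name is split in two, so a
# line-based reader sees 3 "files" instead of 2.
by_line=$(find "$demo_dir" -name '*.foo' | wc -l | tr -d ' ')

# Null-delimited: exactly one terminator per file, so the count is 2.
by_null=$(find "$demo_dir" -name '*.foo' -print0 | tr -cd '\000' | wc -c | tr -d ' ')

echo "newline-delimited: $by_line, null-delimited: $by_null"
rm -rf "$demo_dir"
```

In the pipeline above, parallel -0 consumes the same null-terminated stream, so each file reaches grep as a single argument.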
$ find . -name "*.foo" | parallel -X mv {} /tmp/trash
In the above command, {} marks the place where parallel inserts the arguments. With -X, as many arguments as fit on one command line are inserted at once.
$ find . -maxdepth 1 -type f -name "*.ogg" | parallel -X -r cp -v -p {} /home/media
The command above does the same as:
$ cp -v -p *.ogg /home/media
However, the former command, which uses find/parallel/cp, is more resource efficient and will not halt with an "Argument list too long" error if the expansion of *.ogg exceeds the system's limit on argument length.
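The limit in question caps the combined size of the arguments and environment passed to a new process, and getconf ARG_MAX reports it. The sketch below prints that value and then uses xargs -n as a simple stand-in for the batching that parallel -X performs. The batch size of 4 is arbitrary, and parallel itself is not used here.

```shell
#!/bin/sh
# Upper bound (in bytes) on arguments plus environment for a new process.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"

# Ten arguments, at most four per invocation: echo runs three times,
# so no single command line grows without bound.
batches=$(seq 1 10 | xargs -n 4 echo)
echo "$batches"
```

A plain shell glob has no such batching: the whole expansion goes into a single cp invocation, which is where the error comes from.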
- Multiple commands as arguments:
$ cat a.txt | xargs -I % sh -c 'command1 %; command2 %; ...'
# ~OR~
$ cat a.txt | parallel 'command1 {}; command2 {}; ...; '
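With the xargs form, one way to hand each input line to the script safely is as a positional parameter, so that unusual characters in the input are never parsed as shell code. A minimal sketch, in which echo stands in for command1 and command2 (placeholders, not real commands), printf stands in for a.txt, and run_both is a hypothetical helper name:

```shell
#!/bin/sh
# run_both is a hypothetical helper; echo stands in for command1/command2.
run_both() {
    # "_" fills $0 of the inner shell, so each input line arrives as "$1".
    xargs -I % sh -c 'echo "first: $1"; echo "second: $1"' _ %
}

printf 'a\nb\n' | run_both
```

Each input line runs both commands in order before the next line is processed.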
- Example of using "pipes" and "records" to separate STDIN/STDOUT:
$ cat foo.fasta
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
$ cat foo.fasta | parallel --pipe --recstart '>' -N1 cat';' echo =====
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
=====
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
=====
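For comparison, the same framing can be reproduced serially with awk, which may help when checking what --recstart '>' -N1 is expected to emit. This is not how parallel works internally; it is only a sketch that produces the same kind of text. The function name frame_records and the shortened sequences are made up for illustration.

```shell
#!/bin/sh
# frame_records: print each '>'-headed record followed by a ===== line.
frame_records() {
    awk '
        /^>/ && NR > 1 { print "=====" }   # close the previous record
        { print }
        END { if (NR > 0) print "=====" }  # close the last record
    '
}

printf '>RECORD1\nATGG\nGATT\n>RECORD2\nGCGA\n' | frame_records
```

Unlike this serial version, parallel runs one job per record, so the records may be processed at the same time.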
- Several ways to print a separator line of 79 "=" characters:
$ printf '=%.0s' {1..79}
$ printf %79s | tr " " "="
$ seq -s= 80 | tr -d '[:digit:]'
$ perl -E 'say "=" x 79'
$ head -c 79 < /dev/zero | tr '\0' '='
- Split standard input into blocks of roughly 500 kilobytes and run wc on each block:
$ cat /usr/share/dict/words | parallel --pipe --blocksize 500k wc
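- Sort the input in chunks of three records, then merge the sorted chunks. With --files, each job writes its output to a temporary file and prints the file name, which the second parallel passes to sort -nm:
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm
The temporary files are left behind and can be removed afterwards:
$ rm -f /tmp/*.par
Alternatively, remove them as part of the merge job:
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm {} ";"rm {}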
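What the two-stage pipeline computes can be sketched serially with split and sort: cut the input into chunks of three lines, sort each chunk into its own file, then merge the sorted files with sort -m. The sketch below does not use parallel. The fixed scrambled input stands in for seq 1 10 | shuf so that the run is repeatable, and the chunk. file prefix is arbitrary.

```shell
#!/bin/sh
work=$(mktemp -d)

# Stage 1: chunks of 3 lines, each sorted numerically in place.
printf '%s\n' 7 2 9 4 1 10 3 8 6 5 | split -l 3 - "$work/chunk."
for f in "$work"/chunk.*; do
    sort -n -o "$f" "$f"
done

# Stage 2: merge the already-sorted chunks (-m merges without re-sorting).
merged=$(sort -n -m "$work"/chunk.*)
echo "$merged"

rm -rf "$work"
```

The merge step only interleaves inputs that are already sorted, which is why the chunks must be sorted first.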
External links
- Official website
- GNU parallel tutorial (same as running: `man parallel_tutorial`)
- GNU Parallel videos on YouTube