GNU parallel
From Christoph's Personal Wiki
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
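The two modes described above can be sketched as follows. This is a minimal illustration, assuming GNU parallel (not the moreutils tool of the same name) is on the PATH; the function names `greet_each` and `count_chunks` and the sample input are made up for the example.

```shell
# Assumption: GNU parallel is installed and on PATH.

# One job per input line: {} is replaced by the line read from stdin.
# -k (--keep-order) prints results in input order, not completion order.
greet_each() {
    parallel -k echo "hello {}"
}

# Pipe mode: stdin is split into chunks of 25 lines (-N 25) and each chunk
# is fed to the stdin of its own `wc -l` job, so every output line is the
# line count of one chunk.
count_chunks() {
    parallel --pipe -N 25 -k wc -l
}

printf '%s\n' alice bob carol | greet_each
seq 1 100 | count_chunks
```

In the first function the input lines become arguments of the command. In the second they become the command's standard input, which is what `--pipe` switches on.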
Usage examples / tutorial
- Install GNU parallel from the CLI (or just use your distro's repository):
$ wget pi.dk/3 -qO - | bash -x
$ cat foo.fasta
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
$ cat foo.fasta | parallel --pipe --recstart '>' -N1 cat';' echo =====
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
=====
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
=====
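Once each job receives exactly one record, the job can do real work on it instead of just echoing it. The sketch below prints the sequence length of every record. It assumes GNU parallel is on the PATH; the function name `fasta_lengths` and the tiny two-record input are hypothetical and not taken from foo.fasta.

```shell
# Assumption: GNU parallel is on PATH; the two-record input below is made up.

# Print the sequence length of every FASTA record read from stdin.
# --recstart '>' -N1 hands each job exactly one record; the job drops the
# header line, joins the sequence lines, and counts the remaining characters.
# -k keeps the output in record order.
fasta_lengths() {
    parallel --pipe --recstart '>' -N1 -k \
        'grep -v "^>" | tr -d "\n" | wc -c'
}

printf '>r1\nACGT\nAC\n>r2\nGG\n' | fasta_lengths
```

The job is quoted as a single string so that the inner pipeline runs once per record rather than being interpreted by the calling shell.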
$ printf '=%.0s' {1..79}
$ printf %79s | tr " " "="
$ seq -s= 80 | tr -d '[:digit:]'
$ perl -E 'say "=" x 79'
$ head -c 79 < /dev/zero | tr '\0' '='
$ cat /usr/share/dict/words | parallel --pipe --blocksize 500k wc
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm
$ rm -f /tmp/*.par
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm {} ";"rm {}
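The sort-and-merge pipeline above leaves its intermediate `.par` files in /tmp unless they are removed afterwards. One possible variant, sketched here under the assumption that GNU parallel and `mktemp -d` are available, keeps those files in a private directory via `--tmpdir` so that cleanup cannot touch anything else. The function name `sorted_merge` is made up for the example.

```shell
# Assumption: GNU parallel is on PATH and mktemp supports -d.

# Sort numeric lines from stdin in chunks, then merge the sorted chunks.
sorted_merge() {
    workdir=$(mktemp -d) || return 1
    # Stage 1: sort chunks of 3 lines each; --files stores every job's output
    # in a temp file under $workdir and prints the file name instead.
    # Stage 2: -m puts all file names on one command line and -j1 runs a
    # single job, so `sort -nm` merges the already-sorted chunks.
    parallel --pipe --files --tmpdir "$workdir" -N 3 sort -n |
        parallel -mj1 sort -nm
    rm -rf "$workdir"
}

seq 1 10 | shuf | sorted_merge
```

Removing the directory after the merge has finished avoids both the broad `rm -f /tmp/*.par` and the per-job `rm {}`.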
External links
- Official website
- GNU parallel tutorial (same as running `man parallel_tutorial`)
- GNU Parallel videos on YouTube