GNU parallel

From Christoph's Personal Wiki
Latest revision as of 02:36, 20 March 2015

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.

Usage examples / tutorial

  • Install GNU parallel from the CLI (or, just use your distro's repo):
$ wget pi.dk/3 -qO - | bash -x
  • Basic usage:
$ find . -name "*.foo" | parallel grep bar

The above is the parallel equivalent to:

$ find . -name "*.foo" -exec grep bar {} +

This searches all files in the current directory and its subdirectories whose names end in .foo for occurrences of the string bar. The parallel command works as expected unless a file name contains a newline. To avoid this limitation, use:

$ find . -name "*.foo" -print0 | parallel -0 grep bar

The above command uses the null character to delimit file names.
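As a minimal sketch of why the delimiter matters, the following counts how many records a reader would see for the same two files under each delimiter. The function name count_records and the file names are hypothetical, chosen only for illustration; it uses only find, tr, and wc, so it does not need parallel installed.

```shell
# Count the records a reader sees for the same two files, first
# newline-delimited and then null-delimited.
count_records() (
    dir=$(mktemp -d) || exit 1
    nl='
'
    # Hypothetical file names: one ordinary, one containing a newline.
    touch "$dir/plain.foo" "$dir/with${nl}newline.foo"
    by_newline=$(find "$dir" -name "*.foo" -print | wc -l)
    by_null=$(find "$dir" -name "*.foo" -print0 | tr -cd '\000' | wc -c)
    rm -rf "$dir"
    echo "newline-delimited: $((by_newline)) null-delimited: $((by_null))"
)

count_records
```

With newline delimiting, the two files appear as three records, so a line-based reader would receive two file names that do not exist. With null delimiting, there are exactly two records.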

$ find . -name "*.foo" | parallel -X mv {} /tmp/trash

In the above command, {} is the replacement string: parallel substitutes the input arguments for it, and with -X it inserts as many arguments as fit on one command line.

$ find . -maxdepth 1 -type f -name "*.ogg" | parallel -X -r cp -v -p {} /home/media

The command above does the same as:

$ cp -v -p *.ogg /home/media

However, the former command, which uses find/parallel/cp, is more resource efficient and will not halt with an error if the expansion of *.ogg exceeds the system's limit on command-line length.
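The limit in question can be inspected with getconf, and the idea of batching can be illustrated with a small sketch. Note that the batching demo uses xargs -n as a stand-in with a fixed batch size of two; GNU parallel with -X chooses its own batch sizes, so this only shows the general principle. The function names arg_limit and batch_demo are hypothetical.

```shell
# Print the system's limit, in bytes, on the combined size of the
# arguments and environment passed to a new process.
arg_limit() {
    getconf ARG_MAX
}

# Show batching with xargs as a stand-in: five arguments, at most two
# per invocation, so echo runs three times.
batch_demo() {
    printf '%s\n' a b c d e | xargs -n 2 echo
}

arg_limit
batch_demo
```

Each output line of batch_demo corresponds to one invocation of the command, which is why a batching tool never builds a command line longer than the limit allows.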

  • Multiple commands as arguments:
$ cat a.txt | xargs -I % sh -c 'command1 %; command2 %; ...'
# ~OR~
$ cat a.txt | parallel 'command1 {}; command2 {}; ...; '
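With xargs -I, the input line can also be handed to sh as a positional parameter, which avoids quoting problems when a line contains spaces or shell metacharacters. The following is a sketch under that assumption; run_both is a hypothetical name, and the two echo commands stand in for command1 and command2.

```shell
# Run two commands per input line, passing the line to sh as $1
# rather than splicing it into the script text.
run_both() {
    xargs -I % sh -c 'echo "start $1"; echo "end $1"' _ %
}

printf '%s\n' alpha beta | run_both
```

The underscore fills $0 of the inner shell, so the input line arrives as $1.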
  • Example of using "pipes" and "records" to separate STDIN/STDOUT:
$ cat foo.fasta
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
$ cat foo.fasta | parallel --pipe --recstart '>' -N1 cat';' echo =====
>RECORD1
ATGGCTGTCTTCTTGCTTGCCACTTCCACCATAATGTTCCCAACGAAGATAGAAGCAGCA
GATTGTAATGGTGCATGTTCACCTTTCGAGGTGCCACCGTGCCGCTCAAGTGATTGTCGT
TGTGTCCCTATAGGACTATTTGTTGGTTTCTGCATACATCCAACTGGACTTTCATCTGTT
=====
>RECORD2
GCGAAGATGGTCGACGAACATCCCAACTTATGTCAATCTGATGATGAATGCATGAAGAAA
GGAAGTGGCAATTTTTGCGCTCGTTACCCTAATAATTATATCGATTATGGATGGTGTTTT
GACTCTGATTCTGAAGCACTGAAAGGCTTCTTGGCCATGCCTAGGGCAACCACCAAGTAA
=====
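For checking the output of the pipeline above, the same record splitting can be reproduced serially. This sketch uses awk as a stand-in (it is not how parallel works internally) and treats every line starting with ">" as the start of a new record; separate_records and the sample input are hypothetical.

```shell
# Serial stand-in for the pipeline above: print a ===== line after each
# record, where a record starts at a line beginning with ">".
separate_records() {
    awk '/^>/ && seen { print "=====" }
         { print; seen = 1 }
         END { if (seen) print "=====" }'
}

printf '>R1\nAAA\nCCC\n>R2\nGGG\n' | separate_records
```

Unlike the parallel version, this always emits records in input order.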
  • Several ways to print a row of 79 "=" characters:
$ printf '=%.0s' {1..79}
$ printf %79s | tr " " "="
$ seq -s= 80 | tr -d '[:digit:]'
$ perl -E 'say "=" x 79'
$ head -c 79 < /dev/zero | tr '\0' '='
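The printf/tr variant generalizes to a small helper. This is a sketch only; the function name rule and its two optional parameters (width and character) are hypothetical.

```shell
# Print a rule of N copies of a single character (default: 79 "=").
rule() (
    n=${1:-79}
    char=${2:-=}
    # Pad an empty string to width n with spaces, then swap the spaces.
    printf "%${n}s\n" '' | tr ' ' "$char"
)

rule
rule 5 '#'
```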
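  • Split STDIN into blocks of roughly 500 kB and run wc on each block:
$ cat /usr/share/dict/words | parallel --pipe --blocksize 500k wc
  • Sort chunks of three lines in parallel, writing each sorted chunk to a temporary file (--files), then merge the files:
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm
  • The temporary files are left behind; remove them afterwards:
$ rm -f /tmp/*.par
  • Or merge and remove the temporary files in one step:
$ seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm {} ";"rm {}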
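The chunk-sort-merge idea can be sketched without parallel, which may help when reasoning about what the two-stage pipeline does. This version is serial and uses split and sort -m as stand-ins; chunked_sort is a hypothetical name, and the sketch assumes non-empty input.

```shell
# Sort numbers by sorting fixed-size chunks separately and then merging
# the sorted chunks, removing the temporary files afterwards.
chunked_sort() (
    tmp=$(mktemp -d) || exit 1
    split -l 3 - "$tmp/chunk."      # 3 lines per chunk, like -N 3
    for f in "$tmp"/chunk.*; do
        sort -n -o "$f" "$f"        # sort each chunk in place
    done
    sort -n -m "$tmp"/chunk.*       # merge the pre-sorted chunks
    rm -rf "$tmp"
)

printf '%s\n' 7 2 9 4 1 10 3 8 6 5 | chunked_sort
```

The merge step only interleaves already sorted inputs, which is why it can run as a single job after the chunks have been sorted independently.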
