Glob()

From Christoph's Personal Wiki
Revision as of 01:43, 16 June 2012 by Christoph (Talk | contribs) (See also)

The correct title of this article is glob(). The initial letter is capitalized due to technical restrictions.

glob() is a Unix library function that expands file names using a pattern-matching notation reminiscent of regular expression syntax, but without the expressive power of true regular expressions. The word "glob" is also used as a noun when discussing a particular pattern, e.g. "use the glob *.log to match all those log files".

The term glob is now used to refer more generally to limited pattern matching facilities of this kind in other contexts. Larry Wall's Programming Perl discusses glob in the context of the Perl language. Similarly, Tcl contains both true regular expression matching facilities and a more limited kind of pattern matching often described as globbing.

Glob metacharacters

*   # match zero or more characters
?   # match exactly one character
[]  # match any one of the characters listed between the brackets
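
The three metacharacters can be tried out safely in a scratch directory. The following is a minimal sketch, assuming bash; the file names and variable names are made up for illustration, and LC_ALL=C is set only so that the sort order of the results is predictable.

```shell
#!/usr/bin/env bash
# Sketch: the three basic glob metacharacters (hypothetical file names).
export LC_ALL=C                 # fixed collation, so results sort predictably

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.log ab.log abc.log b.txt

star=( *.log )       # zero or more characters before ".log"
question=( ?.log )   # exactly one character before ".log"
bracket=( [ab].* )   # one character from the set {a, b}, then a dot, then anything

printf '*.log   -> %s\n' "${star[*]}"       # a.log ab.log abc.log
printf '?.log   -> %s\n' "${question[*]}"   # a.log
printf '[ab].*  -> %s\n' "${bracket[*]}"    # a.log b.txt

cd / && rm -rf -- "$demo_dir"   # clean up the scratch directory
```

Storing each expansion in an array keeps names containing spaces intact, which a plain $(echo ...) would not.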

shopt options

Note: You can set each of the options below with shopt -s <option> and unset it again with shopt -u <option>.

cdspell
Corrects minor spelling errors in the directory argument of a cd command (transposed, missing, or extra characters), so the command does not need to be retyped. It only applies to interactive shells.
cmdhist
This is very much a matter of taste. Enabling it saves all lines of a multi-line command as a single entry in your bash history, which makes the command easy to recall and edit.
dotglob
Allows files beginning with a dot ('.') to be returned in the results of path-name expansion.
extglob
Enables ksh-88 egrep-style extended pattern matching or, in other words, turbo-charged pattern matching within bash. In the operators below, a pattern-list is one or more patterns separated by |. The available operators are:
    ?(pattern-list)
    Matches zero or one occurrence of the given patterns
    *(pattern-list)
    Matches zero or more occurrences of the given patterns
    +(pattern-list)
    Matches one or more occurrences of the given patterns
    @(pattern-list)
    Matches exactly one of the given patterns
    !(pattern-list)
    Matches anything except one of the given patterns
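
The five extglob operators can be compared side by side on a handful of files. This is only a sketch under stated assumptions: bash, a scratch directory, and invented file names (log, log1, log11, log1a, notes). Note that shopt -s extglob has to run on its own line before any line that uses the operators, because bash parses a line before executing it.

```shell
#!/usr/bin/env bash
# Sketch: each extglob operator against the same (hypothetical) files.
export LC_ALL=C        # fixed collation, so results sort predictably
shopt -s extglob       # must be enabled before the patterns below are parsed

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch log log1 log11 log1a notes

zero_or_one=( log?([0-9]) )    # "log" followed by at most one digit
zero_or_more=( log*([0-9]) )   # "log" followed by any number of digits
one_or_more=( log+([0-9]) )    # "log" followed by at least one digit
exactly_one=( @(log|notes) )   # exactly "log" or exactly "notes"
anything_but=( !(log*) )       # every name that does not match log*

printf '?(...) -> %s\n' "${zero_or_one[*]}"    # log log1
printf '*(...) -> %s\n' "${zero_or_more[*]}"   # log log1 log11
printf '+(...) -> %s\n' "${one_or_more[*]}"    # log1 log11
printf '@(...) -> %s\n' "${exactly_one[*]}"    # log notes
printf '!(...) -> %s\n' "${anything_but[*]}"   # notes

cd / && rm -rf -- "$demo_dir"
```

log1a is matched by none of the digit patterns, because a glob has to account for the whole file name.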

Examples

Note: Globs must match the entire filename, not just part of it (unlike regular expressions). By default in bash, dot-files (e.g. ~/.bashrc) are not returned.

  • Match anything
*
  • Match anything with a full stop in it
*.*
  • Match anything exactly three characters long
???
  • Match anything that starts with 'bar' and ends in .txt
bar*.txt
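  • Match any single capital letter (depending on the locale and shell settings, the range A-Z may also match lowercase letters)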
[A-Z]
  • Match any single character that is between the brackets (i.e. b, a, or r):
[bar]
  • Match anything that starts with A, B, or C, in either case
[ABCabc]*
  • Match any single character that is not between the brackets ([!bar] is the portable POSIX spelling; bash accepts both)
[^bar]
  • Copy all 'dsc0001.jpg', 'dsc0002.jpg', etc. files to tmp/ dir:
cp dsc????.jpg tmp/
  • List all files in /usr/bin starting with 'k' or 'x':
ls /usr/bin/[kx]*
  • List all files matching any one of several patterns (the braces are brace expansion, which bash performs before globbing):
ls -l {a*,b*,*etc*}
  • List everything in the current directory except PDF and PostScript files (requires extglob):
ls -la !(*.p@(df|s))
  • Install all RPMs in a given directory, except those built for the "noarch" architecture (requires extglob):
rpm -Uvh /usr/src/RPMS/!(*noarch*)
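
Several of the patterns above can be checked in a scratch directory, including the rule that a glob must match the entire file name. A minimal sketch, assuming bash and a set of file names invented for the purpose:

```shell
#!/usr/bin/env bash
# Sketch: a few of the example patterns against hypothetical files.
export LC_ALL=C        # fixed collation; also makes [A-Z] mean capitals only

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch abc bar.txt bar1.txt foobar.txt x Kate .hidden

three=( ??? )          # names exactly three characters long
bar_txt=( bar*.txt )   # must START with "bar": foobar.txt is not matched
not_bar=( [^bar] )     # one-character names other than b, a, or r
upper=( [A-Z]* )       # names starting with a capital letter
everything=( * )       # all names except dot-files

printf '???       -> %s\n' "${three[*]}"        # abc
printf 'bar*.txt  -> %s\n' "${bar_txt[*]}"      # bar.txt bar1.txt
printf '[^bar]    -> %s\n' "${not_bar[*]}"      # x
printf '[A-Z]*    -> %s\n' "${upper[*]}"        # Kate
printf '*         -> %s\n' "${everything[*]}"   # six names, .hidden not among them

cd / && rm -rf -- "$demo_dir"
```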

Newer shells usually support extended glob capabilities. If extglob is enabled (shopt -s extglob), bash allows constructs like

rm !(*.c|*.h)

to delete all files except those whose names end in .c or .h.
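
Because a negated pattern can match more than expected, it may be worth previewing the expansion before handing it to rm. The sketch below assumes bash with extglob and uses invented file names; the array name doomed is likewise just for illustration.

```shell
#!/usr/bin/env bash
# Sketch: preview what !(*.c|*.h) expands to, then delete (hypothetical files).
export LC_ALL=C
shopt -s extglob       # must come before the line that uses !( ... )

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch main.c util.c util.h main.o a.out

doomed=( !(*.c|*.h) )                        # everything except .c and .h files
printf 'would remove: %s\n' "${doomed[@]}"   # a.out, then main.o

rm -- "${doomed[@]}"                         # "--" guards against names starting with "-"
remaining=( * )
printf 'kept: %s\n' "${remaining[*]}"        # main.c util.c util.h

cd / && rm -rf -- "$demo_dir"
```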

To match dot-files in Bash, enable the dotglob option.
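
The effect of dotglob is easy to see by expanding * with the option off and then on. A small sketch, assuming bash and two made-up file names:

```shell
#!/usr/bin/env bash
# Sketch: the same glob with dotglob disabled and enabled (hypothetical files).
export LC_ALL=C

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .bashrc_demo visible

shopt -u dotglob
without=( * )          # dot-files are skipped

shopt -s dotglob
with=( * )             # dot-files are included; "." and ".." never are

printf 'dotglob off: %s\n' "${without[*]}"   # visible
printf 'dotglob on:  %s\n' "${with[*]}"      # .bashrc_demo visible

shopt -u dotglob
cd / && rm -rf -- "$demo_dir"
```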

Using glob with rename

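Given the files foo1, ..., foo9, foo10, ..., foo278, the following two commands (written for the util-linux rename, whose arguments are the text to replace, its replacement, and the files; the Perl-based rename shipped by some distributions uses a different syntax)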

rename foo foo0 foo?
rename foo foo0 foo??

will turn them into foo001, ..., foo009, foo010, ..., foo278.
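
Where that flavour of rename is not installed, the same two passes can be approximated with globs and mv. This is a sketch rather than a drop-in replacement, assuming bash and a small sample of the files described above:

```shell
#!/usr/bin/env bash
# Sketch: zero-pad foo1..foo278 in two passes using globs and mv.
export LC_ALL=C

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch foo1 foo9 foo10 foo99 foo278   # a sample of the full series

# Pass 1: one-character suffixes (foo1 -> foo01).
for f in foo?; do
    mv -- "$f" "foo0${f#foo}"
done

# Pass 2: two-character suffixes, including those created by pass 1
# (foo01 -> foo001, foo10 -> foo010).
for f in foo??; do
    mv -- "$f" "foo0${f#foo}"
done

result=( foo* )
printf '%s\n' "${result[@]}"   # foo001 foo009 foo010 foo099 foo278

cd / && rm -rf -- "$demo_dir"
```

Each glob is expanded once, before its loop starts, so files renamed during a pass are not picked up again by the same pass.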
