Difference between revisions of "Perl"

From Christoph's Personal Wiki
(My favourites)
 
(20 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
'''Perl''' is a dynamic programming language.
 
 +
see: [[Perl/Scripts|scripts]] for examples
  
 
==Regex==
 
Line 16: Line 17:
  
 
===My favourites===
 
 +
*[[Perl/Modules/Lingua]]
 
*[http://dbi.perl.org/ DBI]
 
 +
*[http://search.cpan.org/~abigail/Regexp-Common-2.120/lib/Regexp/Common.pm Regexp::Common]
 +
*[http://search.cpan.org/~abw/Math-Bezier-0.01/Bezier.pm Math::Bezier]
 
*[http://search.cpan.org/~petdance/WWW-Mechanize-1.20/lib/WWW/Mechanize.pm WWW::Mechanize] (see: [http://www.perl.com/pub/a/2003/01/22/mechanize.html])
 
 
*[http://search.cpan.org/~gwilliams/WWW-Search-PubMed-1.002/lib/WWW/Search/PubMed.pm WWW::Search::PubMed] (see: [http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html])
 
 
*[http://search.cpan.org/~muenalan/WWW-Search-NCBI-PubMed-0.01/lib/WWW/Search/NCBI/PubMed.pm WWW::Search::NCBI::PubMed]
 
 +
*[http://search.cpan.org/~gaas/libwww-perl-5.808/lib/LWP/Simple.pm LWP::Simple] — simple procedural interface to LWP
 +
*[http://search.cpan.org/~jgamble/Math-Polynomial-Solve-2.11/lib/Math/Polynomial/Solve.pm Math::Polynomial::Solve]
 +
*[http://search.cpan.org/dist/Lingua-EN-Keywords-2.0/Keywords.pm Lingua::EN::Keywords]
 
*[http://search.cpan.org/~spectrum/MediaWiki-1.08/lib/MediaWiki.pm MediaWiki]
 
 +
*[http://petdance.com/ack/ ack] — a grep-like tool
 
*[[mvs]]
 
  
==BioPerl==
See: http://www.bioperl.org/wiki/Main_Page
 
==Perl and [[MySQL]]==
This section will just list a bunch of random examples. They are useful to give you an idea of what you can do with Perl, MySQL, and the [[:Category:Linux Command Line Tools|CLI]].
 
''Note: These examples were taken from my course in Comparative Microbial Genomics at CBS (in Denmark).''
 +
===Upgrade CPAN===
 +
% perl -MCPAN -e shell
 +
cpan>install Bundle::CPAN
 +
cpan>q
 +
*Force CPAN to produce a list of all the modules that have updates and update them:
 +
/usr/bin/perl -MCPAN -e 'CPAN::Shell->install(CPAN::Shell->r)'
 +
==Perlrun==
 +
Here is an excerpt from '<code>man perlrun</code>' about the important command line switches used when doing perl one-liners.
 +
    -a  turns on autosplit mode when used with a -n or -p.  An implicit
 +
          split command to the @F array is done as the first thing inside the
 +
          implicit while loop produced by the -n or -p.
 +
              perl -ane 'print pop(@F), "\n";'
 +
          is equivalent to
 +
              while (<>) {
 +
                  @F = split(' ');
 +
                  print pop(@F), "\n";
 +
              }
 +
          An alternate delimiter may be specified using -F.
 +
 
 +
    -e commandline
 +
          may be used to enter one line of script.  If -e is given, Perl will
 +
          not look for a script filename in the argument list.  Multiple -e
 +
          commands may be given to build up a multi-line script.  Make sure to
 +
          use semicolons where you would in a normal program.
 +
 
 +
    -n  causes Perl to assume the following loop around your script, which
 +
          makes it iterate over filename arguments somewhat like sed -n or
 +
          awk:
 +
              while (<>) {
 +
                  ...            # your script goes here
 +
              }
 +
          Note that the lines are not printed by default.  See -p to have
 +
          lines printed.  If a file named by an argument cannot be opened for
 +
          some reason, Perl warns you about it, and moves on to the next file.
 +
 
 +
    -p  causes Perl to assume the following loop around your script, which
 +
          makes it iterate over filename arguments somewhat like sed:
 +
              while (<>) {
 +
                  ...            # your script goes here
 +
              } continue {
 +
                  print or die "-p destination: $!\n";
 +
              }
 +
          If a file named by an argument cannot be opened for some reason,
 +
          Perl warns you about it, and moves on to the next file.  Note that
 +
          the lines are printed automatically.  An error occurring during
 +
          printing is treated as fatal.  To suppress printing use the -n
 +
          switch.  A -p overrides a -n switch.
 +
 
 +
==BioPerl==
 +
See: http://www.bioperl.org/wiki/Main_Page
  
 +
==See also==
 +
*[http://perldoc.perl.org/index-functions.html Perl functions A-Z]
 +
*[http://perldoc.perl.org/perlop.html perlop] &mdash; Perl operators
 +
*[http://search.cpan.org/dist/CPANPLUS/ CPANPLUS] (aka CPAN++) &mdash; a more modern version of CPAN.pm
 
<pre>
mysql -B -e "update cmp_genomics.features set note = '' where user = USER() and note not like 'tcs%' or note is null"
foreach accession (AE017042 AE016879 AL111168 AL645882 AP008232 AP009048 BA000021 CP000034)
  foreach type (ecf s54 s70)
    cat source/$accession.proteins.$type.sigmas.hmmsearch | \
    perl -ne 'next unless /^CDS_(\d+)\-(\d+)_DIR([\-\+]+)\s+([0-9\-\.e]+)\s+([0-9\-\.e]+)\s+/;\
              my ($start,$stop,$dir,$score,$evalue) = ($1,$2,$3,$4,$5);\
              next unless $score > 0;\
              print "update cmp_genomics.features set note = \"Sigma Factor '$type'\" \
                      where start=$start and stop = $stop and user = user() and accession = \"'$accession'\";\n";'\
    | mysql
  end
end
</pre>
<pre>
mysql -B -e "update cmp_genomics.features set note = '' where user = USER() and note not like 'sigma%' or note is null"
foreach accession (AE017042 AE016879 AL111168 AL645882 AP008232 AP009048 BA000021 CP000034)
  foreach type (RRreciever HisKA_1 HisKA_2 HisKA_3 HWE_HK)
    cat source/$accession.$type.TCS.hmmsearch | \
    perl -ne 'next unless /^CDS_(\d+)\-(\d+)_DIR([\-\+]+)\s+([0-9\-\.e]+)\s+([0-9\-\.e]+)\s+/;\
              my ($start,$stop,$dir,$score,$evalue) = ($1,$2,$3,$4,$5);\
              next unless $score > 0; \
              print "update cmp_genomics.features set note = \"TCS '$type'\" \
                      where start=$start and stop = $stop and user = user() and accession = \"'$accession'\";\n";'\
    | mysql -B
  end
end
</pre>
 
==External links==
 
 
*[http://perldoc.perl.org/ Perl version 5.8.8 documentation]
 
 
*[http://perldoc.perl.org/perlre.html Perl regular expressions]
 
 +
*[http://perldoc.perl.org/perlrequick.html perlrequick] &mdash; Perl regular expressions quick start
 +
*[http://perldoc.perl.org/perlretut.html perlretut] &mdash; Perl regular expressions tutorial
 
*"''[http://www.perl.org/books/beginning-perl/ Beginning Perl]''" &mdash; full book online (as PDFs).
 
 +
*[http://wiki.python.org/moin/PerlPhrasebook Perl/Python Phrasebook]
 +
*[http://www.rosettacode.org/wiki/Main_Page Rosetta Code]
 +
*[http://www.perlmonks.org/?node_id=632023 Yet Another Rosetta Code Problem (Perl, Ruby, Python, Haskell, ...)]
 +
*[http://tldp.org/LDP/GNU-Linux-Tools-Summary/html/text-manipulation-tools.html Text manipulation tools] &mdash; from GNU/Linux Command-Line Tools Summary
 +
*[http://wiki.mandriva.com/en/Policies/Perl Mandriva Perl library packaging policy] (wiki)
 
*[[wikipedia:Perl]]
 
 +
*[[wikipedia:Plain Old Documentation]] (aka POD)
 +
===Resources/Books===
 +
*'''''Minimal Perl: For UNIX and Linux People''''' by Tim Maher. ISBN 1-9323-9450-8.
  
 
{{stub}}
 
 
[[Category:Scripting languages]]
 

Latest revision as of 04:15, 31 March 2008

Perl is a dynamic programming language.

see: scripts for examples

Regex

see: Regular expression

Search and replace all occurrences of "foo" (matched case-insensitively, because of the i flag) with "bar" in filename, editing the file in place:

perl -i -pe 's/foo/bar/gi' filename

Modules

Search and download: http://search.cpan.org/

Installing

perl -MCPAN -e shell
#Or,
perl -MCPAN -e "install Example::Module"

My favourites

Upgrade CPAN

% perl -MCPAN -e shell
cpan>install Bundle::CPAN
cpan>q
  • Force CPAN to produce a list of all the modules that have updates and update them:
/usr/bin/perl -MCPAN -e 'CPAN::Shell->install(CPAN::Shell->r)'

Perlrun

Here is an excerpt from 'man perlrun' about the important command line switches used when doing perl one-liners.

    -a   turns on autosplit mode when used with a -n or -p.  An implicit
         split command to the @F array is done as the first thing inside the
         implicit while loop produced by the -n or -p.
              perl -ane 'print pop(@F), "\n";'
         is equivalent to
             while (<>) {
                 @F = split(' ');
                 print pop(@F), "\n";
             }
         An alternate delimiter may be specified using -F.
    -e commandline
         may be used to enter one line of script.  If -e is given, Perl will
         not look for a script filename in the argument list.  Multiple -e
         commands may be given to build up a multi-line script.  Make sure to
         use semicolons where you would in a normal program.
    -n   causes Perl to assume the following loop around your script, which
         makes it iterate over filename arguments somewhat like sed -n or
         awk:
             while (<>) {
                 ...             # your script goes here
             }
         Note that the lines are not printed by default.  See -p to have
         lines printed.  If a file named by an argument cannot be opened for
         some reason, Perl warns you about it, and moves on to the next file.
    -p   causes Perl to assume the following loop around your script, which
         makes it iterate over filename arguments somewhat like sed:
             while (<>) {
                 ...             # your script goes here
             } continue {
                 print or die "-p destination: $!\n";
             }
         If a file named by an argument cannot be opened for some reason,
         Perl warns you about it, and moves on to the next file.  Note that
         the lines are printed automatically.  An error occurring during
         printing is treated as fatal.  To suppress printing use the -n
         switch.  A -p overrides a -n switch.

BioPerl

See: http://www.bioperl.org/wiki/Main_Page

See also

External links

Resources/Books

This article is currently a "stub". This means it is an incomplete article needing further elaboration.
