Perl Language

Perl is often expanded as “Practical Extraction and Report Language”, although the name has no official acronym. Larry Wall created Perl in 1987 while working on a bug reporting system, when AWK, the programming language he was using for the task, proved insufficient. He is still the chief architect and developer of Perl. In one sentence: Perl is a high-level, interpreted, dynamic programming language.

Perl was originally designed for text processing. It is now widely used for a variety of purposes, including Linux system administration, network programming, and web development. In scientific computing, Perl is used particularly in bioinformatics.

Advances in biology have generated an ocean of data that needs careful computational analysis to extract useful information. Bioinformatics uses computation and mathematics to interpret biological data, such as DNA sequence, protein sequence, or three-dimensional structures, and depends on the ability to integrate data of various types. The Perl language is well suited to this purpose.

An important practical skill in bioinformatics is the ability to quickly develop scripts (short programs) for scanning or transforming large amounts of data. Perl is an excellent language for scripting, because of its compact syntax, broad array of functions, and data orientation. The following are a few of the attributes of Perl that make it an attractive choice.

  • Perl provides powerful ways to match and manipulate strings through the use of regular expressions. Changing file formats from one to another is a matter of contorting strings as required.

  • Perl is modular, which makes it easy to write programs as reusable libraries (called modules).

  • Perl system calls and pipes can be used to incorporate external programs.

  • Perl's dynamic loading makes it possible to extend Perl with code written in C and to build compiled libraries that the Perl interpreter can load.

  • Perl is a good prototyping language and is easy to code. New algorithms can be tested quickly in Perl before being rewritten in a stricter language.

Using the CPAN module

The CPAN module automates or at least simplifies the make and install of perl modules and extensions. It includes some primitive searching capabilities and knows how to use LWP, HTTP::Tiny, Net::FTP and certain external download clients to fetch distributions from the net.

These are fetched from one or more mirrored CPAN (Comprehensive Perl Archive Network) sites and unpacked in a dedicated directory.

The CPAN module also supports named and versioned bundles of modules. Bundles simplify handling of sets of related modules. See Bundles below.

The package contains a session manager and a cache manager. The session manager keeps track of what has been fetched, built, and installed in the current session. The cache manager keeps track of the disk space occupied by the make processes and deletes excess space using a simple FIFO mechanism.

All methods provided are accessible in a programmer style and in an interactive shell style.
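The "programmer style" can also be used from the command line without opening the interactive shell. The following is a minimal sketch, assuming the CPAN module is available to your Perl and has already been configured. The wrapper name cpan_install is hypothetical, while CPAN::Shell->install is the method documented by CPAN.pm.

```shell
# Hypothetical wrapper: install one or more modules without opening the
# interactive CPAN shell. Nothing is installed until the function is called.
cpan_install() {
  # CPAN::Shell->install accepts a list of module names; @ARGV holds the
  # arguments passed after the -e program.
  perl -MCPAN -e 'CPAN::Shell->install(@ARGV)' "$@"
}

# Example call (commented out because it needs network access):
# cpan_install Math::EllipticCurve::Prime
```

The cpan command accepts module names as arguments in the same way, for example cpan Math::EllipticCurve::Prime.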

Users can install their own modules from the head node. The modules are installed under $HOME and can then be used on any compute node.

Preparing the CPAN Shell

Enter interactive mode by running:

perl -MCPAN -e shell

or directly by:

cpan

If this is the first time you are configuring the shell, you will be greeted with a message like:

CPAN.pm requires configuration, but most of it can be done automatically.
If you answer 'no' below, you will enter an interactive dialog for each
configuration option instead.

Would you like to configure as much as possible automatically? [yes]

You can continue with the automatic configuration, suitable for most users. Next you will see this warning and message:

Warning: You do not have write permission for Perl library directories.

To install modules, you need to configure a local Perl library directory or
escalate your privileges.  CPAN can help you by bootstrapping the local::lib
module or by configuring itself to use 'sudo' (if available).  You may also
resolve this problem manually if you need to customize your setup.

What approach do you want?  (Choose 'local::lib', 'sudo' or 'manual')
 [local::lib]

Normal users are not allowed to add, delete, or change files in system folders, which is what the warning is telling you. On a cluster, the best choice for a normal user is local::lib, because it writes all files to folders under your $HOME: $HOME/.cpan and $HOME/perl5.

The next message is:

Autoconfigured everything but 'urllist'.

Now you need to choose your CPAN mirror sites.  You can let me
pick mirrors for you, you can select them from a list or you
can enter them by hand.

Would you like me to automatically choose some CPAN mirror
sites for you? (This means connecting to the Internet) [yes]

You can answer yes to this question and you will be directed to one of the CPAN mirrors. WVU hosts its own mirror that can be used instead. To use it, answer no to the question and enter the mirror URL as shown below:

Would you like me to automatically choose some CPAN mirror
sites for you? (This means connecting to the Internet) [yes] no

Would you like to pick from the CPAN mirror list? [yes] no
Now you can enter your own CPAN URLs by hand. A local CPAN mirror can be
listed using a 'file:' URL like 'file:///path/to/cpan/'

CPAN.pm needs at least one URL where it can fetch CPAN files from.

Please enter your CPAN site: [] http://fireball-public.phys.wvu.edu/mirror/CPAN/
Enter another URL or ENTER to quit: []
New urllist
  http://fireball-public.phys.wvu.edu/mirror/CPAN/

The next message asks you to add some lines to your .bashrc:

local::lib is installed. You must now add the following environment variables
to your shell configuration files (or registry, if you are on Windows) and
then restart your command line shell and CPAN before installing modules:

PATH="/users/username/perl5/bin${PATH:+:${PATH}}"; export PATH;
PERL5LIB="/users/username/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/users/username/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/users/username/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/users/username/perl5"; export PERL_MM_OPT;

Would you like me to append that to /users/username/.bashrc now? [yes]

Answering yes appends those lines to your .bashrc, which you can later edit directly.
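The generated lines rely on the shell expansion ${VAR:+:${VAR}}, which adds a colon and the old value only when the variable is already set and non-empty. This avoids a stray ":" at the end of PATH or PERL5LIB. A minimal sketch of the idiom follows. The function name prepend_path is hypothetical and is not part of local::lib.

```shell
# Hypothetical helper mirroring the pattern in the generated lines:
# prepend a directory to a colon-separated list without leaving a stray ':'.
prepend_path() {
  # $1 = directory to add, $2 = current value of the list (may be empty)
  printf '%s\n' "$1${2:+:$2}"
}

prepend_path /users/username/perl5/bin ""             # prints /users/username/perl5/bin
prepend_path /users/username/perl5/bin /usr/bin:/bin  # prints /users/username/perl5/bin:/usr/bin:/bin
```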

Finally, you get into the shell:

commit: wrote '/users/gufranco/.cpan/CPAN/MyConfig.pm'

You can re-run configuration any time with 'o conf init' in the CPAN shell
Terminal does not support AddHistory.

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

cpan[1]>

Installing Perl modules

The Comprehensive Perl Archive Network (CPAN) currently has more than 180,000 Perl modules, and you can install them for your own use without requesting a global installation on the cluster.

For example, let's assume that you want to install Math::EllipticCurve::Prime, a package for elliptic curves over a prime field. Log into the CPAN shell and execute:

cpan[1]> install Math::EllipticCurve::Prime

After installation you will see a message and your prompt is returned:

Appending installation info to /users/username/perl5/lib/perl5/x86_64-linux-thread-multi/perllocal.pod
BRIANC/Math-EllipticCurve-Prime-0.003.tar.gz
/usr/bin/make install  -- OK

cpan[2]>
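After leaving the CPAN shell, one way to confirm that the module can be loaded is to ask Perl to import it. This is a minimal sketch, assuming the environment variables added to .bashrc are active in the current session. The helper name have_perl_module is hypothetical.

```shell
# Hypothetical helper: succeed if Perl can load the named module.
have_perl_module() {
  # -M imports the module before running the trivial program "1";
  # perl exits with a non-zero status if the module cannot be found.
  perl -M"$1" -e 1 >/dev/null 2>&1
}

if have_perl_module Math::EllipticCurve::Prime; then
  echo "Math::EllipticCurve::Prime is available"
else
  echo "Math::EllipticCurve::Prime not found; check PERL5LIB"
fi
```

The same check can be run at the start of a job script on a compute node to catch a missing PERL5LIB setting early.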