A Guide to Reading Perl

Jan 9

This tutorial is intended as a reference for experienced programmers. It gives short descriptions of some of the main aspects of perl one is likely to find in a script they just downloaded. It is not a tutorial on writing perl scripts.

I. Operators

# comments out anything *after* the pound sign on that line.

die "now\n"; # this line kills the program with the message "now".

#$x = 2; this is not evaluated.

Operators that perl shares with C have the *same* precedence as in C. The full table, from highest precedence to lowest:

 
    left	terms and list operators (leftward)
    left	->
    nonassoc	++ --
    right	**
    right	! ~ \ and unary + and -
    left	=~ !~
    left	* / % x
    left	+ - .
    left	<< >>
    nonassoc	named unary operators
    nonassoc	< > <= >= lt gt le ge
    nonassoc	== != <=> eq ne cmp
    left	&
    left	| ^
    left	&&
    left	||
    nonassoc	..  ...
    right	?:
    right	= += -= *= etc.
    left	, =>
    nonassoc	list operators (rightward)
    right	not
    left	and
    left	or xor

All should be self-explanatory.
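A few rows of the table are easier to see in action. The following is a minimal sketch, not taken from any particular script; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# ** is right-associative: 2 ** 3 ** 2 means 2 ** (3 ** 2).
my $power = 2 ** 3 ** 2;

# * binds tighter than +, exactly as in C.
my $sum = 2 + 3 * 4;

# || binds tighter than =, so the fallback is part of the assigned value.
my $input = 0;
my $copy  = $input || 10;        # parsed as $copy = ($input || 10)

# "or" binds looser than =, so the assignment happens first and the
# right-hand side only runs because the assigned value is false.
my @log;
my $kept = $input or push @log, 'right side of "or" ran';

print "power=$power sum=$sum copy=$copy kept=$kept log=", scalar(@log), "\n";
```

The last two statements are the pair most likely to surprise a reader: the symbolic and the word forms of "or" do the same job but sit at opposite ends of the table.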

II. Variables

Perl is not like many other languages when it comes to variables. Where most require you to declare a variable, along with a type, perl does not. The 3 types of variables are as follows:

Scalar: This covers every single-valued variable you might encounter. A scalar is denoted by the dollar sign, such as: $myVar. Scalar variables hold any single piece of information: a string (single or double quoted), a char, an integer, a floating point number, and so on.

Array: These are just what you would expect to find in any other language, with one difference: perl arrays hold only scalars, so there are no native multidimensional arrays. (If you put one array inside another, its elements are flattened into the outer list; nested arrays are built with references instead.) Arrays are denoted by the “at” symbol, such as: @myArray. A single element is accessed in this fashion: $myArray[0]. Note the use of $, not @. Elements start at zero, and the index may be another variable, as seen here: $myArray[$x]

Associative Array: Also known as a “hash”, this is an extremely useful data type. Hashes are denoted by the percent sign, such as: %myHash. The most common hash you will encounter is %ENV. Each “element” in a hash is actually a key, each key is unique, and each key has a value. This opens a wonderful world for perl programmers. To get the value of a key, write $myHash{$myKey}. Notice the use of $ again, not %, and the use of so-called “curly brackets”. This makes code easier to read and to write. Suppose you want to add a new user to a database. Instead of creating an array with each element holding part of their information, and trying to remember which element stores what, you can create a hash with my %newUser and then simply add key/value pairs like this:

$newUser{'user_id'} = '100001';
$newUser{'user_rank'} = 'lamer';

Now instead of having to remember which element of an array is the user_id, we simply write $newUser{'user_id'} and we have it.

Example:

if ((!$ENV{'HTTP_REFERER'}) || (!$ENV{'CONTENT_LENGTH'}) || (!$ENV{'HTTP_ACCEPT'}) || (!$ENV{'HTTP_USER_AGENT'})) {
    if ($log_warnings == 1) {
        &log_msg("WARNING : Missing HTTP header(s) from $ENV{'REMOTE_ADDR'} = $form{'name'} ($form{'magic'})\n");
    }
    &error('bad_headers');
}

Here we see the common %ENV hash used in CGI programming. $ENV{'HTTP_REFERER'} is easy to read: we know right away what we’re looking for.
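To tie the three variable types together, here is a small sketch. The names @inner, @flat and @grid are made up for illustration; %newUser reuses the keys from the example above.

```perl
use strict;
use warnings;

my $count = 3;                      # a scalar holding a number
my $label = "goats: $count";        # double quotes interpolate variables

my @inner = (2, 3);
my @flat  = (1, @inner, 4);         # @inner is flattened into the list
my $third = $flat[2];               # one element: $ sigil, zero-based index

my @grid  = ([1, 2], [3, 4]);       # array references give nested arrays
my $cell  = $grid[1][0];            # second row, first column

my %newUser = (user_id => '100001', user_rank => 'lamer');
my @keys = sort keys %newUser;      # hash keys come back in no fixed order

print "$label, flat has ", scalar(@flat), " elements, cell = $cell\n";
print "$_ = $newUser{$_}\n" for @keys;
```

Seeing [ ... ] inside a list is the usual sign that a script is building a nested structure with references rather than a flat array.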

III. Special Variables

As with most languages, perl supports special variables that are built into the interpreter. There are many, but the ones you’ll most commonly encounter are probably:

@_ the list, available inside a sub, that contains all the arguments passed to it.
$_ the default input, output, and pattern-searching space; many functions and operators use it when no variable is given.
$| if set to non-zero (usually 1 or 666 [don’t ask]), forces a flush after every write or print on the currently selected output handle.
$$ the pid of the running process (also available as $PROCESS_ID and $PID with “use English”). A child created with fork sees its own pid here.
$1 .. $n the text matched by the nth set of capturing parentheses in a regex. I.e., after matching ‘bbgoat’ against m/(b{2})(goat)/; $1, the first group, would equal ‘bb’, and $2, the second, would equal ‘goat’.
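The variables listed above can be seen together in one short sketch. The sub name count_args and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

$| = 1;                 # flush STDOUT after every print
my $pid = $$;           # pid of this process

# @_ holds the arguments passed to a sub.
sub count_args {
    return scalar(@_);
}
my $argc = count_args('goat', 'cheese', 'milk');

# $_ is the default variable for loops and pattern matches.
my @hits;
for ('bbgoat', 'bgoat', 'xbbgoatx') {
    push @hits, $_ if /bbgoat/;     # the match is tested against $_
}

# $1 and $2 hold what the capturing parentheses matched.
my ($first, $second) = ('', '');
if ('bbgoat' =~ m/(b{2})(goat)/) {
    ($first, $second) = ($1, $2);
}

print "pid $pid, $argc args, ", scalar(@hits), " hits, first=$first second=$second\n";
```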

IV. A Crash Course in Regex (Regular Expressions)

Operators:
=~ Match
!~ Does not match

Regex patterns are treated like double-quoted strings, so here are the escape chars and what they do:

 
    \t		tab                   (HT, TAB)
    \n		newline               (LF, NL)
    \r		return                (CR)
    \f		form feed             (FF)
    \a		alarm (bell)          (BEL)
    \e		escape (think troff)  (ESC)
    \033	octal char (think of a PDP-11)
    \x1B	hex char
    \x{263a}	wide hex char         (Unicode SMILEY)
    \c[		control char
    \N{name}	named char
    \l		lowercase next char (think vi)
    \u		uppercase next char (think vi)
    \L		lowercase till \E (think vi)
    \U		uppercase till \E (think vi)
    \E		end case modification (think vi)
    \Q		quote (disable) pattern metacharacters till \E

And now the special chars of regex:

 
    \	   Quote the next metacharacter
    ^	   Match the beginning of the line
    .	   Match any character (except newline)
    $	   Match the end of the line (or before newline at the end)
    |	   Alternation
    ()	   Grouping
    []	   Character class  
    *	   Match 0 or more times
    +	   Match 1 or more times
    ?	   Match 1 or 0 times
    {n}    Match exactly n times
    {n,}   Match at least n times
    {n,m}  Match at least n but not more than m times

Three pattern matching expressions:

Match: m/regex/; the m is optional when using //, but is required if you want a different delimiter, such as m!regex!
Substitute: s/find/replace/; again you can change the delimiters: s!find!replace!
Translation: tr/abc/ABC/; replaces each character in the first list with the corresponding character in the second. Simple rot-13 in perl: $line =~ tr/a-zA-Z/n-za-mN-ZA-M/;
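The three expressions above, plus the =~ and !~ operators, might look like this in a script. This is only a sketch; the sample strings and variable names are made up.

```perl
use strict;
use warnings;

my $line = "Hello, World";

# Match: true if the pattern is found anywhere in $line.
my $has_world = ($line =~ m/World/) ? 1 : 0;

# !~ is true when the pattern does NOT match.
my $no_digits = ($line !~ /\d/) ? 1 : 0;

# A different delimiter is handy when the pattern contains slashes.
my $is_bin = ('/usr/bin/perl' =~ m!^/usr/bin/!) ? 1 : 0;

# Substitute: copy $line, then replace the first match in the copy.
(my $greeting = $line) =~ s/World/Perl/;

# Translate: rot-13 shifts every letter 13 places along the alphabet.
(my $rot13 = $line) =~ tr/a-zA-Z/n-za-mN-ZA-M/;

print "$has_world $no_digits $is_bin\n$greeting\n$rot13\n";
```

Note that s/// and tr/// change the variable on the left of =~ in place, which is why the sketch copies $line first.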

V. File I/O

Open: this is probably the most common way to open a file in perl:

open(FILEID, "path/to/file") || die "Could not open file - $!\n";

The FILEID is what you will use to address the file from here on out. The part in quotes is the file to open. You can open for write with “>path/to/file”, append with “>>path/to/file”, and read with “path/to/file” (a leading “<” is *not* needed). The || (logical or) tells the program to die if the open fails. The parentheses around open’s arguments matter here: without them, || binds to the filename string, which is always true, and the die never runs. The variable $! holds the last error returned by a system call. Now that we have the file open, what can we do? We could throw it into a loop and read data from it, or whatever. I like to dump it into an array like this: @myData = <FILEID>;

Now @myData contains all the data from the file; each element is a line (breaks at \n).

A while loop that reads the same data one line at a time:

while ($line = <FILEID>) {
    chomp($line); # strip the trailing newline
    # do something with $line
}
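Putting the open modes together, here is a sketch that writes a scratch file, appends to it, and reads it back both ways. It assumes the core File::Temp module is available and uses it only to get a throwaway path; the handle names OUT and IN are arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);    # core module, used only for a scratch file

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# ">" opens for writing and truncates the file.
open(OUT, ">$path") || die "Could not open $path for writing - $!\n";
print OUT "first\nsecond\n";
close(OUT);

# ">>" opens for appending.
open(OUT, ">>$path") || die "Could not open $path for append - $!\n";
print OUT "third\n";
close(OUT);

# No prefix opens for reading; slurp every line into an array.
open(IN, $path) || die "Could not open $path - $!\n";
my @myData = <IN>;
close(IN);

# The same data, one line at a time, with newlines removed.
open(IN, $path) || die "Could not open $path - $!\n";
my @lines;
while (my $line = <IN>) {
    chomp($line);
    push @lines, $line;
}
close(IN);

print scalar(@myData), " lines: @lines\n";
```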

VI. Subroutines

The format for a sub, as subroutines are commonly called, is:

sub my_sub {
    my ($var1, $var2) = @_;
    # do something
}

This is how you create a function within perl. Since perl 5.0 you can call a sub just like any built-in function: my_sub($goat, $cheese); Older versions of perl require you to use the ampersand, like this: &my_sub($goat, $cheese). With no arguments you can write &my_sub() or my_sub(); a bare &my_sub also works, but it passes along the caller’s current @_.
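The calling styles can be compared side by side. This is a minimal sketch; the subs describe, count_args and wrapper are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

sub describe {
    my ($animal, $food) = @_;       # unpack the arguments from @_
    return "$animal eats $food";
}

my $plain = describe('goat', 'cheese');     # modern call
my $amp   = &describe('goat', 'cheese');    # old-style call, same result

sub count_args {
    return scalar(@_);
}

sub wrapper {
    my $with_parens = &count_args();    # explicit empty argument list
    my $bare        = &count_args;      # no parens: reuses wrapper's @_
    return ($with_parens, $bare);
}

my ($with_parens, $bare) = wrapper('a', 'b');

print "$plain / $amp / $with_parens / $bare\n";
```

The difference between &count_args() and a bare &count_args is worth knowing when reading older scripts, since the bare form silently forwards whatever arguments the calling sub received.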

If you want to include functions from another file you have two choices. First, if it is a module, you load it with “use” like this:

use CGI;

If it’s simply another perl file, then you use “require” like this:

require '/path/to/perl/file.pl';
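Both ways of loading code can be tried in one script. The sketch below assumes the core modules List::Util and File::Temp are available; the library file and its greet sub are made up on the spot, purely for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);         # "use" loads a module at compile time
use File::Temp qw(tempfile);    # core module, used only for a scratch file

my $biggest = max(3, 9, 4);

# Write a throwaway library file. A required file must end by
# returning a true value, hence the trailing "1;".
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print $fh 'sub greet { return "hello, $_[0]"; } 1;', "\n";
close($fh) || die "Could not close $path - $!\n";

require $path;                  # "require" loads the file at run time

my $greeting = greet('goat');   # the sub now exists in this program
print "$biggest $greeting\n";
```

In a downloaded script the required path is usually a fixed string rather than a variable, but the effect is the same: the subs defined in that file become callable from the main program.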

I hope this helps you understand perl. I’m sure you’ll grow to love it.

Eric Jackowski