- About This Guide
- perl - Practical Extraction and Report Language
- Getting Perl
- Running a Perl Program
- Perl Statements & Expressions
- Perl Data Types and Operators
- Perl Input and Output
- Quoting Text Strings in Perl
- Program Control - Selection (Conditional, Branching)
- Program Control - Iteration (Repetition, Loops)
- Text Processing
- Where to Next?
- Example Source Code
- Copyright Notice
This is a very brief introduction to the Perl programming language. The target audience is community college students taking their first-year "computer concepts" classes, most of whom have not taken an operating systems class and do not have UNIX accounts. My goal is to provide enough introductory background and information so that Perl can be used as a vehicle for demonstrating broad conceptual topics such as languages in general; differences between low-level and high-level languages; compilers & interpreters; algorithms and basic control structures; and fundamental database concepts. This guide is not intended to be a complete Perl tutorial nor a programming tutorial. It is intended, however, to provide enough basic information for someone new to programming or Perl and unfamiliar with UNIX to get started finding and using a Perl binary distribution for the Windows platform. It is essentially entry-level material that I would have liked to have had available in one document when I first tried to figure out Perl.
"Perl is designed to make the easy jobs easy and the hard jobs possible." - Larry Wall
Perl is a "glue" language (some would say it has become a "duct tape" language) that combines features and syntax from several common languages, most notably C, UNIX command shells, Awk, Sed, and even a bit of Basic. Perl is written in C and was developed on UNIX systems, which most noticeably define Perl's roots and personality. However, Perl is a lot like the English language in that it freely borrows key words, operators, and syntax that it finds useful from other languages. The current version of Perl is quite large and complex compared to earlier versions; however, like a word processor or spreadsheet program, it is relatively easy to use the basic features to accomplish common tasks without getting bogged down by complex features.
Perl is an interpreted language - sort of. Like many other contemporary interpreted languages, Perl compiles its source code (an ASCII text file) into an intermediate "byte code" language after scanning for syntax errors, then interprets the compiled byte code. Unlike traditional compiled languages, however, the intermediate binary code is not written and saved as a machine readable object file. Each time a Perl program is run the intermediate compile step is repeated. One of the commonly cited disadvantages of interpreted languages is the translation time penalty each time the program is run. While this is generally true, Perl is quite fast and efficient on contemporary operating systems. On the other hand, a commonly cited advantage of interpreters is their use as quick and (relatively) easy prototyping tools, an area where Perl excels.
Perl programs (a.k.a. "scripts") are ASCII text files. All you need to write a Perl program is a text editor. However to run a Perl program you need Perl installed on your system. Unfortunately the expression "Perl is designed to make the easy jobs easy..." does not necessarily extend to installing Perl, which generally means downloading the source code and compiling it on your system. This is not a problem on UNIX systems where a C compiler is essentially part of the operating system. However, Windows does not ship with a compiler and most users do not install one. Therefore numerous Perl binary distributions (pre-built Perl packages compiled for specific operating systems and hardware) are made available. See the CPAN (Comprehensive Perl Archive Network): Perl Ports (Binary Distributions) page at http://www.perl.com/CPAN/ports/
Perl version numbers follow the pattern "major version", "minor version", "patch level." Therefore Perl 5.6.1 is Perl version 5, minor version 6, patch level 1. Like virtually all computer software Perl only gets bigger and more complex with each version. The following major Perl releases illustrate the result of "feeping creaturism," a phrase coined by Larry Wall, Perl's creator and chief architect, referring to his urge to add "just one more feature."
Perl 4.036, the last stable Perl 4 release, was 1 megabyte
including documentation when compiled for 32-bit MS-DOS. The bare
essentials could be pared down to about 650 Kb, which would fit on a
DOS boot disk and still have room for a decent editor and some
utilities.
"Big Perl," Perl 4 for 32-bit DOS final release: OCT 1994
Perl 5.005_03, the last of the "5.005" series, is
roughly 7.5 megabytes without the HTML formatted documentation when
compiled for Win32 and no added modules or libraries. The HTML docs
add around 9 Mb.
Last release: MAR 1999
Perl 5.6.1.
The ActiveState distribution is about 39 megabytes (8.6 Mb download).
Compiled for Win32 with basic defaults from CPAN sources Perl 5.6.1 is about
20 / 13 Mb (with / without HTML documentation)
Last release: APR 2001
Perl 5.8.6, the current "stable" release (December 2004). The ActiveState distribution is about 61 megabytes installed (12.5 Mb download).
ActivePerl is a good, actively maintained and supported Perl binary distribution for Windows. It is available for free download at http://www.activestate.com/ActivePerl/ You will need to register with your name, e-mail address, and company. There are 2 downloads for Windows: the "MSI package" for NT, 2K, XP or the "AS package" for Win9x & ME. The download from ActiveState will have an additional number that identifies the current build number such as "... 5.8.0.804 ...".
Robert's Perl Tutorial - Installation at http://www.webhart.net/hemptreats/perl.html has some good tips for installing the ActiveState distribution. I personally choose not to associate Perl source files in Windows Explorer so that by default they are opened in my editor rather than executed. And I highly recommend selecting an installation directory structure without spaces in the path name which will prevent much frustration.
If you need or want earlier versions of Perl for any reason, ActiveState distributions can be downloaded directly. See Resources & Links: Perl for ActiveState links.
Perl is a Command Line application. In Windows you need to:
Windows Command Prompt <Setting the PATH> <The Perl Command Line>
Use either of the following methods to open a Windows Command Prompt session:
[Screenshots not reproduced: Command Prompt Shortcut; RUN Command]
<Windows Command Prompt> Setting the PATH <The Perl Command Line>
When you enter a command on the command line in a Command Prompt window the command interpreter (COMMAND.COM or CMD.EXE) searches for an executable file with the name you typed (1) first in the current working directory, then (2) from a special list of directories on your system called the PATH. You can see the list that is used for the search by entering the command PATH on the command line.
If you type a Perl command and get one of the following messages then the main Perl program, perl.exe, is not in the PATH.
command not found

'perl' is not recognized as an internal or external command,
operable program or batch file.
If you are not sure exactly where your Perl is installed use the DIR /S command to find perl.exe and set your PATH accordingly:
D:\>dir /s perl.exe
 Volume in drive D is [ variable depending on your system ]
 Volume Serial Number is [ variable depending on your system ]

 Directory of D:\ActivePerl\bin

06/17/2002  21:33            20,480 perl.exe

D:\>path %PATH%;D:\ActivePerl\bin
In the example above D:\> is the command
prompt; it is not part of the command you type.
See Command Line User Interface at
http://spot.pcc.edu/~mhuffman/notes/dos/dos.htm
for additional notes regarding Command Prompt sessions.
You can also set your path permanently in Windows XP/2000 for all subsequent sessions in System Properties as follows:
[Figure: Editing a local user's PATH Environment variable from Windows XP/2000 System Properties]
<Windows Command Prompt> <Setting the PATH> The Perl Command Line
In its simplest form, the Perl command line is just the command perl followed by either the name (including extension) of your Perl program source file or the option -e followed by a quoted string (using double quotes) consisting of the statement(s) you want Perl to immediately execute.
perl [options] filename
perl -e " perl statement; [perl statement; ...] "
where filename is the name of your Perl program or perl statement is one or more Perl statements.
Example:
# hello.pl
print "Hello, World!\n";    # ubiquitous first program

A:\>perl hello.pl
Hello, World!

A:\>perl -e "print \"Hello, World!\n\";"
Hello, World!
Notice that in the Windows operating system we only have the double quote character available for quoting strings or commands on the command line, so for Perl "one liners" we need to "escape" the double quote character with a backslash ("\") wherever it is required in Perl statements between the opening and closing double quote characters.
If, when you try to execute a Perl program, you get a message similar to this:
Can't open perl script "hello.pl": No such file or directory
This means the source file (hello.pl in the example) is not in the directory where you entered the command. Either copy your program source file to your current working directory, or Change Directory (CD command) to the location where your source files are. If you also have to change disk drives, enter the drive letter and a colon (e.g. A:), then make your Change Directory command(s).
C:\>a:
A:\>cd scripts
A:\scripts>perl hello.pl
Hello, World!
An expression is a chunk of Perl syntax that evaluates to some value; it can be used any place where a variable is used, pretty much just like in mathematics.
A statement is a command telling the computer what to do, not unlike a statement in English: "Turn left on Main Street," "Close the door," etc. In Perl, statements must be terminated with a semicolon (";"). Simple statements are expressions that are evaluated. Compound statements are a block of simple statements enclosed in the "curly brace" characters "{" and "}" that get evaluated as if they were one single statement. The closing curly brace of a block is not followed by a semicolon.
The "#" character is the comment delimiter. Anything following this character to the end of the line is ignored by Perl. There is no "multi-line" comment delimiter in Perl.
$cost = 12.5; # simple statement evaluates to 12.5
$price = $cost * 0.2; # simple statement evaluates to value of
# the expression $cost * 0.2
if ($cost > 100) { # statements get evaluated as a block only
$price = $cost * 0.15; # if $cost is greater than 100
$markup = 10.00;
}
Data Types <Operators> <Comparison Operators>
Unlike languages that define types and sizes of numbers (integers, floating point, etc.), characters, strings, etc. as distinct data types, Perl treats data as individual "things" called scalars and groups (or more precisely lists) of things called arrays and hashes.
A scalar is a single "thing." However, in Perl a single thing can be just about anything we want it to be: a number (integer or floating point), a character, a "string" of characters in one word, or several words. In fact this entire HTML document could be one long Perl scalar datum. Scalar variables in Perl always start with the "$" character. Think of it as a stylized "s" for scalar. The text that follows the $ is the variable name. Variable names are case sensitive, must begin with a letter or the underscore character ( "_" ), and may consist of any combination of letters, digits, and underscores.
An array is a list of scalars referenced by their numerical index starting at 0. An array of 5 test scores can be viewed as a horizontal list going from score[0] at the left through score[4] at the right, or as a vertical list starting at score[0] and working "up" to score[4]. Array variables always start with the "@" character. Think of it as a stylized "a" for array.
# initialize array of 5 test scores from a list of literal numeric values
@score = (93, 88, 100, 65, 79);

$score[0]  $score[1]  $score[2]  $score[3]  $score[4]
[   93]    [   88]    [  100]    [   65]    [   79]

$score[0]  [   93]
$score[1]  [   88]
$score[2]  [  100]
$score[3]  [   65]
$score[4]  [   79]
Perl has functions specially designed to move data in and out of arrays. The shift() function shifts the array left and returns the element shifted out of the array and the unshift() function inserts an element into the array from the left.
@score = (93, 88, 100, 65, 79);
$test_score = shift(@score);    # $test_score is 93
                                # @score is now (88, 100, 65, 79)
unshift(@score, 99);            # @score is now (99, 88, 100, 65, 79)
The pop() function pops and returns the element off the right (or from the highest index) and the push() function adds elements to the right (the next higher index, which is really the "top" of the list).
@score = (93, 88, 100, 65, 79);
$test_score = pop(@score);      # $test_score is 79
                                # @score is now (93, 88, 100, 65)
push(@score, 99);               # @score is now (93, 88, 100, 65, 99)
A hash (or "associative array") is a list of "pairs of scalars" where the left side of the pair is the key and the right side is the value, commonly called "key-value pairs." Hash variables always start with the "%" character. Think of % as symbolic for 2 scalars side-by-side.
$count = 10; # a scalar (integer, stored internally as a "double")
$name = "Rocky"; # also a scalar (string)
# ... and his friends, an array (of strings)
@friends = ("Bullwinkle", "Boris", "Natasha");
# $friends[0] = "Bullwinkle"; $friends[1] = "Boris"; ...
# a hash, or associative array
%score = ('Mickey', 93,
'Minnie', 88,
'Huey', 100,
'Louie', 65,
'Dewy', 79);
# $score{'Mickey'} = 93
# $score{'Huey'} = 100
# ...
Notice that the first element in the @friends array is $friends[ 0 ], the second element is $friends[ 1 ], etc.
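Once a hash is built, a value is looked up by its key in curly braces, much as an array element is looked up by its index in square brackets. The listing below is only a minimal sketch and not one of the original course examples: the file name hash1.pl and the key 'Donald' are made up for illustration, and it uses the standard built-in functions keys(), sort(), and exists().

```perl
#!perl -w
#
# hash1.pl (illustrative sketch, hypothetical file name)

%score = ('Mickey', 93,
          'Minnie', 88,
          'Huey',  100);

$score{'Louie'}  = 65;           # add a new key-value pair
$score{'Minnie'} = 90;           # replace the value stored under an existing key

$huey = $score{'Huey'};          # look up one value by its key
print "Huey scored $huey\n";

@names = sort(keys(%score));     # keys() returns the keys in no particular
                                 # order, so sort them for a stable listing
foreach $name (@names) {         # (foreach is discussed later)
    print "$name: $score{$name}\n";
}

if (exists($score{'Donald'})) {  # exists() tests for a key without adding it
    print "Donald has a score\n";
}
else {
    print "No score recorded for Donald\n";
}
```

Note that a single hash element is a scalar, so it is written with "$" ( $score{'Huey'} ) even though the whole hash is written with "%".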
<Data Types> Operators <Comparison Operators>
Perl uses many familiar or intuitive operators, such as +, -, *, and / for addition, subtraction, multiplication, and division. The exponentiation operator is **. Perl also has many other operators taken from various languages. A common programming task is adding (or subtracting, or multiplying, ...) a value to a variable and then storing the new result back to the same location in memory. In Perl many operators can be combined with the assignment operator, = , to form "binary assignment" operators that perform the indicated operation and make the assignment. $x = $x + 4 can be written $x += 4. Another common operation is incrementing or decrementing a variable (such as a loop counter or array index) by one and storing the result. While you could write $i += 1, a more succinct way to do the same thing in Perl is $i++.
Strings are implemented in a variety of ways depending on the language, but common among all languages is the need to combine strings (as well as break them apart) to make new strings. Perl uses the "." ("dot") operator to concatenate (combine) strings. The following examples show different ways to perform these operations in Perl.
$total = $total + $subTotal;
$total += $subTotal; # same thing (also *=, /=, etc.)
$count = $count - 1;
$count -= 1; # equivalent
$count--; # also equivalent ("post-decrement")
--$count; # or this "pre-decrement"
$sum = $num1 + 43.56; # as expected, add 2 numbers, store in $sum
$name = "Dudley"."Do-Right"; # concatenate 2 strings
# $name = "DudleyDo-Right"
The difference between $i++ and ++$i is subtle. In the first case, called "post-increment," $i is evaluated first, then incremented. In the second case, "pre-increment," the increment takes place immediately, then $i is evaluated based on the new value.
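The difference only shows when the value of the increment expression is itself used, for example in an assignment. The following is a small sketch (the file name incr.pl and the variable names are invented for illustration) that starts two variables at the same value and captures the result of each form.

```perl
#!perl -w
#
# incr.pl (illustrative sketch, hypothetical file name)

$i = 5;
$post = $i++;     # $post gets the old value 5, then $i becomes 6

$j = 5;
$pre = ++$j;      # $j becomes 6 first, then $pre gets the new value 6

print "post-increment: result $post, variable is now $i\n";
print "pre-increment:  result $pre, variable is now $j\n";
```

When the statement is nothing but the increment, as in a loop counter, either form has the same effect.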
An operator supported in virtually all programming languages that may be new to you is the modulus division operator. Modulus division returns the remainder rather than the quotient. In Perl the modulus operator is %, as it is in all the "C Family" of languages.
$div_result = 19 / 4;    # 4.75
$mod_result = 19 % 4;    # 3; (19 / 4) = 4, remainder 3
Modulus division is useful for many algorithms; recall the procedure for converting a decimal number to binary or hexadecimal. We started with the highest power of the base that could be divided into our decimal number, then kept working down by dividing the remainder by each successive lower power of the base.
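As a rough illustration of modulus division at work, here is a sketch of a decimal-to-binary conversion. Note that it uses a different but related procedure from the one recalled above: rather than starting from the highest power of the base, it repeatedly divides by the base and collects the remainders, which produces the digits from right to left. The file name dec2bin.pl and the variable names are invented for this example, and as written it only handles bases up to 10 because it does not map remainders to the letters A through F.

```perl
#!perl -w
#
# dec2bin.pl (illustrative sketch, hypothetical file name)

$number = 19;          # decimal value to convert (arbitrary example value)
$base   = 2;           # 2 gives binary
$digits = "";          # the digits are collected here as a string

$n = $number;
while ($n > 0) {                     # (loops are discussed later)
    $remainder = $n % $base;         # the remainder is the next digit,
                                     # working from right to left
    $digits = $remainder . $digits;  # put the new digit in front of the others
    $n = int($n / $base);            # keep only the whole-number quotient
}
if ($digits eq "") {                 # special case: zero is the single digit 0
    $digits = "0";
}

print "$number in base $base is $digits\n";
```

The built-in int() function is needed because, as shown earlier, Perl's / operator returns 4.75 for 19 / 4 rather than the whole-number quotient 4.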
<Data Types> <Operators> Comparison Operators
Numeric comparisons are tested with the familiar < and > symbols used in mathematics for "less than" and "greater than". "Less than or equal to" and "greater than or equal to" use the <= and >= operators respectively. Equality is tested with the == operator and inequality ( "not equal to" ) is tested with the != operator. Again this models the "C Family" of languages. Basic and the languages derived from Basic (Visual Basic, Visual Basic for Applications, VBScript) use = for equality and <> for inequality. Since Perl does not distinguish numeric data from text (string) data in scalars, the operators lt, gt, le, ge, eq, and ne are used to compare strings. Perl also has the "comparison operator" <=> for numbers and cmp for strings.
Comparison             Numeric  String  Returns
Less than              <        lt      True if $a is less than $b
Greater than           >        gt      True if $a is greater than $b
Less than or equal     <=       le      True if $a is not greater than $b
Greater than or equal  >=       ge      True if $a is not less than $b
Equal                  ==       eq      True if $a is equal to $b
Not equal              !=       ne      True if $a is not equal to $b
Comparison             <=>      cmp     0 if equal, 1 if $a is greater,
                                        -1 if $b is greater
The comparison operator, sometimes called the "space ship operator," is typically implemented in other languages as a function such as strcmp() in C.
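Because the same scalar can be treated as a number or as text, choosing the numeric or the string operator can change the answer. The sketch below (hypothetical file name compare.pl, arbitrary values) compares the same two scalars both ways.

```perl
#!perl -w
#
# compare.pl (illustrative sketch, hypothetical file name)

$x = 10;
$y = 9;

if ($x > $y) {             # numeric test: 10 is greater than 9
    print "$x > $y is true\n";
}

if ($x gt $y) {            # string test: "10" comes before "9" because
    print "$x gt $y is true\n";      # the character "1" sorts before "9"
}
else {
    print "$x gt $y is false\n";
}

$num_order = ($x <=> $y);  #  1, numerically $x is greater
$str_order = ($x cmp $y);  # -1, as text $x comes first
print "<=> returns $num_order, cmp returns $str_order\n";
```

A reasonable rule of thumb is to use the symbol operators when the values are meant as numbers and the letter operators when they are meant as text.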
The most common form of input is reading "lines" of text entered from the keyboard or from files that have been opened and assigned a file handle. One of the great advances in operating system design has been to view input and output devices as files. The keyboard is just a special file referred to by a file handle called STDIN and the console screen is STDOUT. There is a third file handle called STDERR which is also usually the console screen. Perl reads from the keyboard by putting the name of the file handle (STDIN) in angle brackets and assigning the returned value to a scalar variable.
One common and standard way to display output in Perl is the print function. Technically print takes as its first argument the file handle of the opened file where you want the output to go. However, because writing to the standard output device is so common, STDOUT is the default file handle to print and is not required, although there is no harm in using it.
Here is a standard dialog between Perl and a user:
# input1.pl
print "Enter your name: ";                      # prompt user for input
# print STDOUT "Enter your name: ";             # also valid
$name = <STDIN>;                                # get their name
print "Hello ", $name, "welcome to CIS 121!";   # display greeting

A:\>perl input1.pl
Enter your name: Fred
Hello Fred
welcome to CIS 121!
A:\>
Notice that the second part of our greeting, "welcome to CIS 121!" was displayed on the next line even though there is nothing in our program to indicate we wanted it displayed that way. When we read input from the keyboard the user pressed the ENTER key to signal end of input. When Perl read the user's name typed on the keyboard it also read the ENTER key press, called a newline character. Usually we don't want the newline character, and Perl has a convenient function called chomp() that takes a single argument, a scalar variable, and chomps the newline from the end. If the variable does not have a newline chomp() leaves it unchanged. There is a similar function called chop() that chops off the last character in a string regardless what it is.
$our_hero = "Rocky\n"; # "Rocky" and a newline
chomp $our_hero; # $our_hero = "Rocky"
chomp $our_hero; # no change, $our_hero = "Rocky"
chop $our_hero; # oops, just turned our hero into a statue!
# $our_hero = "Rock"
Note that chomp() is not available in Perl 4. Most languages would require at least a couple lines of code and one or two function calls to twiddle with the string, but "Perl makes easy jobs easy . . ."
#!perl -w
#
# input2.pl
print "Enter your name: ";                       # prompt user for input
$name = <STDIN>;                                 # get their name
chomp $name;                                     # discard newline
print "Hello ", $name, ", welcome to CIS 121!"   # display greeting

A:\>perl input2.pl
Enter your name: Barney
Hello Barney, welcome to CIS 121!
A:\>
Function calls "nest" such that the return value from one function can be used as an argument to an "outer" function call provided that the inner function's return type is the same as the type expected by the outer function. You will generally see 2 lines from the above listing combined like this:
chomp($name = <STDIN>); # get name and chomp the newline
Notice, however, that now we need to enclose the argument to chomp() in parentheses since it is an expression rather than a simple scalar variable. Where there is no ambiguity, such as most of the calls to print() used in this guide, Perl lets you skip using parentheses for function calls. In such cases the function acts more like an operator. This is just one feature (of many) that some people cite as an annoying inconsistency in Perl. On the other hand, some people also see this as an example of a higher level "high-level" language that lets you write your program closer to the way you think about the problem.
This example also demonstrates a good use of the "shebang" line:
#!perl -w
"#" is commonly pronounced "sharp" and "!" pronounced "bang," thus "shebang." The Windows operating system does not recognize the line like UNIX does, but if the line is present Perl will scan it for any options that you would normally include on the command line. The -w option enables warnings, usually a good idea, even more so when first learning the language. You will often see the shebang with a path: #!/usr/bin/perl which is the path to the Perl interpreter on most UNIX systems. In Windows and DOS you can leave the path in the shebang and add options to the end.
Double Quote - " Interpolation " <Single Quote> <Operators qq & q> <Back Quote>
In the previous examples a comma-separated list was provided as the argument to the print() function with literal strings enclosed in double-quote characters ( " . . . " ). If we want to include characters in a string such as a TAB or a line break (NEWLINE) that do not have a printable representation we need to use an "escape character" followed by a code for the special character. In Perl the escape character is the backslash ( \ ). When Perl sees the backslash in a string it skips the backslash and treats the next character as a special case (called a "meta character"). To include a TAB in a string use "\t", and to add a newline use "\n". There are other meta characters, and you can create any special character using the backslash followed by the ASCII character code in octal or hexadecimal (prefixed with "x"). You also use the backslash to "escape" characters that Perl uses for special purposes, like the double-quote character and the backslash itself. In these cases Perl sees the backslash, says "hmm, special character coming...," ignores the backslash, reads the next character and says "well, it's not one of the meta characters I recognize, so I will use the next character literally and not apply any special meaning." Some examples should help:
#!perl -w
#
# quote1.pl
$cost = 10;
print "Name\tAddress\n"; # "Name", 1 TAB stop, "Address", NEWLINE
print "Name\011Address\012"; # same thing using octal ASCII code
print "Name\x09Address\x0A"; # same thing using hexadecimal ASCII code
print "The backslash (\"\\\") \"escapes\" the next character\n";
# The backslash ("\") "escapes" the next character
print "Please pay \$$cost\n" # escape first "$", interpolate $cost
A:\>perl quote1.pl
Name    Address
Name    Address
Name    Address
The backslash ("\") "escapes" the next character
Please pay $10
A:\>
So far this is pretty straight-forward and similar to many other high-level languages. However, Perl also "interpolates" scalar and array variables in double-quoted strings. The following 2 lines produce the same output:
print "Hello ", $name, ", welcome to CIS 121!";   # display greeting
print "Hello $name, welcome to CIS 121!";         # interpolate $name
$foo = "**FOO**";
$bar = "**BAR**";
$foobar = "--FOOBAR--";
print "$foobar"; # --FOOBAR--
print "${foo}bar"; # **FOO**bar
print "$foo$bar"; # **FOO****BAR**
print "$foobarbar"; # prints nothing because $foobarbar is not defined
# (-w will generate warning)
<Double Quote> Single Quote - ' No Interpolation ' <Operators qq & q> <Back Quote>
If you do not want meta characters and variables to be interpolated then quote strings with the single-quote character ( ' . . . ' ). In single-quoted strings only the backslash and the single-quote can be escaped; all other characters are treated literally. Since the "$" and "@" are quoted literally, any text following them is not recognized as a variable name, and therefore scalar and array variables are not interpolated.
#!perl -w
#
# quote2.pl
$pi = 3.14159;
$e = 2.71828;
$pie = "apple";
@numbers = (0, 1, 2, 3);
@words = ("xip", "foo", "bar", "bas");
print "\$pie = $pie\n"; # $pie = apple
print '$', "{pi}e = ${pi}e\n"; # ${pi}e = 3.14159e
print '\$pi\$e = ', "$pi$e\n"; # \$pi\$e = 3.141592.71828
print "\n";
print 'Single-quoted scalar:\t $pie\n'; # Single-quoted scalar:\t $pie\n
print "\n";
print "My numbers are:\t @numbers\n"; # array variables interpolated
print "My words are:\t @words\n"; # with spaces between elements
A:\>perl quote2.pl
$pie = apple
${pi}e = 3.14159e
\$pi\$e = 3.141592.71828

Single-quoted scalar:\t $pie\n
My numbers are:  0 1 2 3
My words are:    xip foo bar bas
A:\>
<Double Quote> <Single Quote> Quote Operators qq & q <Back Quote>
You can define your own quoting characters with the qq and q operators if you need to use several single-quote or double-quote characters in your string and don't want to suffer from chronic "backslash-itis." Be sure, however, to choose a character that you know you will not need in your string. Of course you can always escape your new quoting character, but that sort of defeats the purpose and probably does not add anything to readability.
print "The backslash (\"\\\") \"escapes\" the next character\n";
print qq#The backslash ("\\") "escapes" the next character\n#;
print 'Binary place holders are 1\'s, 2\'s, 4\'s, 8\'s, etc.', "\n";
print q#Binary place holders are 1's, 2's, 4's, 8's, etc.#, "\n";
print qq!"Be careful\!" he shouted.!;
You have to decide for yourself what makes the string more readable, but Perl gives you the choice. One of the Perl mottos is "There's more than one way to do it." (TMTOWTDI, sometimes pronounced "tim-toady").
<Double Quote> <Single Quote> <Operators qq & q> Back Quote - ` OS Command `
Finally, there is yet a third way to quote a string using the "back-quote" character ( ` . . . ` ), also called the grave accent. Perl passes back-quoted text to the underlying operating system command processor as a command and returns as a string whatever output the operating system produced. The utility and significance of this feature on Windows and DOS systems cannot be overemphasized. In the past it was common to write various small utility or "helper" programs in Assembly, C, or BASIC to get bits and pieces of operating system information into and out of batch files. Perl glues that mish-mash together in one language. You can even enclose Perl code in a "batch file wrapper" and run the Perl script as an executable program like on UNIX systems.
Here are some simple Windows/DOS commands whose output stored as program variables might prove useful when making decisions about subsequent commands and file processing:
$version = `ver`; # OS version
$volume = `vol`; # Volume information for drive
$cwd = `cd`; # Current Working Directory
@src_files = `dir /b *.htm*`; # array of HTML files
foreach (@src_files) {chomp} # remove trailing newline
# (foreach is discussed later)
@src_files = `dir /b *.htm*`; # array of HTML files
chomp @src_files; # more than one way to do it...
chomp(@src_files = `dir /b *.htm*`); # ...yet another way to do it
The missing semicolon following the chomp expression in the line foreach (@src_files) {chomp} is not a mistake and does not cause an error. Technically the final statement in a block of statements does not require a terminating semicolon, although it does no harm to include it. My personal style is to always use the semicolon except when writing a single-statement block on one line as in the example.
The Perl control statement that allows a program to branch to different places in the program and perform actions based on some condition is if with the optional addition of elsif and/or else:
if (a is true) {
# do this
}
elsif (b is true) {
# otherwise do this
}
else {
# if neither a nor b are true then do this
}
The elsif and/or else are not required. Note in particular the spelling of elsif. If you have any experience with other languages you might be tempted to type "elseif", "else if", or "elif". Note also that, unlike many syntactically similar languages, the "curly braces" are required, even for single-line conditional statements.
if <if else> <if elsif> <unless>
#!perl -w
#
# if.pl
print "What is your total order? ";
chomp($order = <STDIN>); # how much did customer spend
if ($order < 100) { # check for first condition
$discount = $order * .1;
} # curly braces required
if ($order >= 100) { # check for second condition
$discount = $order * .15;
}
print "Your discount is \$$discount\n"; # display results
A:\>perl if.pl
What is your total order? 50
Your discount is $5
A:\>perl if.pl
What is your total order? 120
Your discount is $18
A:\>
Look closely at the final print statement. Instead of calling print with a comma separated list of arguments we use one double-quoted argument where the $discount scalar variable is embedded in the quoted string. This is called "variable interpolation" in Perl. If you specifically do not want variable interpolation to occur use single quotes.
Notice also the use of several "\" characters. The back-slash is used to convey special meaning to the next immediate character. Sometimes it is used to refer to a character that cannot be represented as a printable ASCII character, such as the newline ("\n") in previous examples. If, on the other hand, we want to use one of Perl's "special meaning characters" (like the back-slash itself or the dollar sign) as a literal character we use the back-slash to say "take whatever the next character is and treat it literally." In this example Perl displays a "$" followed by the value that is stored in the $discount variable. To work with path names in Windows and DOS it is important to remember to "escape" the "\":
print "C:\\perl\\bin";   # output: C:\perl\bin
print "C:\perl\bin";     # output: C:perin
In the example above "\b" means "backspace." In the second line Perl treats the first backslash followed by "p" as the literal character "p", and the second backslash followed by "b" as a backspace, which erases the "l" on the screen before the rest of the string is printed.
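Single quotes offer another way to write Windows path names, since a backslash inside a single-quoted string is kept literally unless it comes before another backslash or a single quote. The following is a small sketch only; the file name path.pl is invented and the directory is just an example.

```perl
#!perl -w
#
# path.pl (illustrative sketch, hypothetical file name)

$double = "C:\\perl\\bin";   # double quotes: every backslash must be doubled
$single = 'C:\perl\bin';     # single quotes: \p and \b are left alone

print "$double\n";           # C:\perl\bin
print "$single\n";           # C:\perl\bin

if ($double eq $single) {    # string comparison, so use eq rather than ==
    print "Both strings are identical\n";
}
```

One caution: a path that ends in a backslash still needs that last backslash doubled in single quotes ( 'C:\perl\bin\\' ), otherwise it would escape the closing quote.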
<if> if else <if elsif> <unless>
Rather than use two if statements the if . . . else construct is more descriptive of what we really want to do: if the customer spent less than a given amount give them one discount, otherwise give them a bigger discount:
#!perl -w
#
# ifelse.pl
print "What is your total order? "; # how much did customer spend
chomp($order = <STDIN>);
if ($order < 100) {
$discount = $order * .1;
}
else { # better, clearer that we are
$discount = $order * .15; # doing an "either - or" test
}
print "Your discount is \$$discount\n"; # display results
if elsif
You can also add additional tests with one or more elsif statements following if. A final else following elsif is not required, although it is generally good programming practice to have a "default" condition.
#!perl -w
#
# elsif.pl
print "What is your total order? ";
chomp($order = <STDIN>); # how much did customer spend
if ($order < 100) { # check for first condition
$discount = $order * .1;
}
elsif ($order < 500) { # check for next condition
$discount = $order * .15;
}
else { # easy to add additional test
$discount = $order * .2; # big spender gets big discount
}
print "Your discount is \$$discount\n"; # display results
A:\>perl elsif.pl
What is your total order? 80
Your discount is $8
A:\>perl elsif.pl
What is your total order? 200
Your discount is $30
A:\>perl elsif.pl
What is your total order? 600
Your discount is $120
A:\>
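The tests in an if / elsif chain are tried in order and the first one that is true wins, so it is worth checking what happens right at the boundaries. The sketch below wraps the same chain in a helper subroutine; the name discount_rate and the sample order amounts are my own, chosen only for illustration:

```perl
#!perl -w
use strict;

# Hypothetical helper: same if/elsif/else chain as above,
# but returning the discount rate instead of the discount
sub discount_rate {
    my ($order) = @_;
    if    ($order < 100) { return .1;  }   # first condition
    elsif ($order < 500) { return .15; }   # tried only if the first test failed
    else                 { return .2;  }   # default: everything else
}

# 100 and 500 sit exactly on the boundaries
foreach my $order (80, 100, 499.99, 500) {
    print "Order $order gets rate ", discount_rate($order), "\n";
}
```

An order of exactly 100 fails the first test ($order < 100) and so falls through to the second branch; an order of exactly 500 fails both tests and gets the default.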
unless
Perl frequently has one or more alternate constructs to accomplish the same task. Remember, "There's more than one way to do it." If you want to reverse the "sense" of a conditional expression use unless ( condition ) , which is another way of saying if ( not condition ) . It all depends upon how you understand the problem and what form best expresses the logic.
#!perl -w
#
# unless.pl
print "What is your total order? "; # how much did customer spend
chomp($order = <STDIN>);
unless ($order >= 100) { # same as "if ($order < 100) { ..."
$discount = $order * .1; # give them the lower discount
}
else {
$discount = $order * .15; # give them the better discount
}
print "Your discount is \$$discount\n"; # display results
while, for, until
The fundamental construct to iterate, or repeat a block of code more than once, is the while ( condition ) loop. Technically all loops can be written using while. However, when the number of iterations is known in advance a for loop is usually better. To reverse the sense of while use the until (condition ) loop. Recall TMTOWTDI.
The following (rather unexciting) examples count from 0 to 9 inclusive:
#!perl -w
#
# loop.pl
# Demonstrate while, for, and until loops by counting
# from 0 to 9 inclusive
print "\nwhile loop:\n\t"; # start new line, tab in for output
$i = 0; # loop index counter
$max_count = 10; # number of times to repeat
while ($i < $max_count) {
print "$i "; # print $i followed by space
# same as: print $i, " ";
++$i; # same as: $i = $i + 1
# IMPORTANT! increment the index so
} # we don't have an infinite loop
print "\nfor loop:\n\t"; # start new line, tab in for output
for ($i = 0; $i < 10; ++$i) { # when the number of iterations is
print "$i "; # known in advance this form is
} # "self-documenting"
print "\nuntil loop:\n\t"; # start new line, tab in for output
$i = 0; # loop index counter
$max_count = 10; # number of times to repeat
until ($i == $max_count) { # same as: while ($i != $max_count)
print "$i "; # print $i followed by space
++$i; # increment the index!
}
A:\>perl loop.pl

while loop:
	0 1 2 3 4 5 6 7 8 9
for loop:
	0 1 2 3 4 5 6 7 8 9
until loop:
	0 1 2 3 4 5 6 7 8 9
A:\>
foreach
Perl has an additional loop construct for iterating through lists, the foreach ( list ) loop. It is well suited to processing a list of items when you don't know (or don't want to take the time and trouble to figure out) how many items you have in the list.
#!perl -w
#
# foreach.pl
use strict; # apply restrictions
my $jay_char; # declare variable before use
# ...or declare and initialize at same time
my @names = ("Rocky", "Bullwinkle", "Peabody", "Sherman", "Boris", "Natasha");
print "Some Jay Ward characters:\n";
foreach $jay_char (@names) {
print "\t$jay_char\n"; # indent 1 tab stop, print name
}
A:\>perl foreach.pl
Some Jay Ward characters:
	Rocky
	Bullwinkle
	Peabody
	Sherman
	Boris
	Natasha
A:\>
Typing all those double quote characters and commas in the above example is a lot of extra work (and error-prone if you are inclined to make typing mistakes). People using Perl tend to do a lot of quoting and tend to do a lot of work with lists. Perl is "lazy programmer friendly" and provides numerous alternate ways to quote things like strings and lists of strings. The list of names from the previous example could also be written like this:
@names = qw(Rocky Bullwinkle Peabody Sherman Boris Natasha);
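To convince yourself that the two forms build the same list, a quick sketch like the following can help. The shortened list and the "Frostbite Falls" example are my own, used only for illustration:

```perl
#!perl -w
use strict;

my @quoted = ("Rocky", "Bullwinkle", "Peabody");   # quotes and commas
my @words  = qw(Rocky Bullwinkle Peabody);         # qw() form

# join each list into one string and compare the results
my $same = join(",", @quoted) eq join(",", @words);
print $same ? "Lists are identical\n" : "Lists differ\n";

# caution: an item that contains a space becomes two items
my @split_up = qw(Frostbite Falls);
print scalar(@split_up), " items\n";               # 2 items
```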
"qw( list )" acts more like an operator, and says "treat this list as quoted words delimited by white space."
Discussion of the foreach construct is probably as good a place as any to mention the "implied $_ scalar" since you will see it in virtually any example Perl code outside of this guide. Perl gives you numerous "built-in" variables in addition to the variables that you explicitly declare and use. The most common and frequently used variable is $_ and many Perl operations that expect a scalar variable will use $_ if you do not specify one (however any operation that uses $_ can always use a normal scalar). The previous foreach() loop could have been written like this:
foreach (@names) {
print "\t$_\n"; # implied $_ scalar in foreach()
}
The print() function will also use $_ in the absence of an explicit scalar:
#!perl -w
#
# foreach2.pl
@word = ("foo\n", "bar\n", "bas\n");
print "Using implied \"\$_\" scalar:\n";
foreach (@word) { # implied foreach $_ ( ... )
print; # implied print $_;
}
A:\>perl foreach2.pl
Using implied "$_" scalar:
foo
bar
bas
A:\>
do { } while/until
The next loop is a bit more involved; it prompts the user for a number and saves it in a variable called $sum, then enters a loop where it prompts for another number to add to $sum. Notice that after we get the first number we are ready to enter the loop, but we want to make at least one pass through the loop so that we always get at least the second number before the user has a chance to quit. The do statement followed by a block of code to be executed while a condition is true or until a condition is true is perfect for this logic. At the end of each loop the user is asked if she wants to continue.
Notice the condition in the until statement. =~ is the "match" operator (!~ is "does not match"). $answer is checked for a match using a regular expression between the "/ /" characters. "^" says match starting at the beginning of the line, "n" says look for an "n", and ".*" says match 0 or more of any character following the "n". Finally, the trailing "i" says to do a case insensitive search. So to quit the program our user must enter, at the very least, an "n" or "N". Anything else and she will be prompted for another number. Some other responses that will exit are "No," "NOOOOOO," "nay," and "nada." Answers like "I don't think so" and "maybe" will cause the loop to continue.
#!perl -w
#
# sumloop.pl
# add numbers until user says no
# demonstrate do {} until
# uses the binding operator =~ to match pattern entered by user
# with regular expression
print "Enter first number: "; # get first number to get started
chomp($sum = <STDIN>);
do {
print "Enter next number: ";
chomp($n = <STDIN>); # get next number
$sum += $n; # same as: $sum = $sum + $n
print "Sum = $sum\n"; # display the sum
print "Continue? (y/n)";
chomp($answer = <STDIN>); # add another number?
} until ($answer =~ /^n.*/i); # loop until 1st char of $answer
# matches 'n' or 'N'
A:\>perl sumloop.pl
Enter first number: 3
Enter next number: 5.5
Sum = 8.5
Continue? (y/n)y
Enter next number: 1.2
Sum = 9.7
Continue? (y/n)x
Enter next number: 10
Sum = 19.7
Continue? (y/n)NO
A:\>
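If you want to see how the pattern in the until condition treats different answers without running the whole program, a short sketch like this one runs it against a list of sample responses. The list itself is just the examples mentioned in the text plus a plain "y":

```perl
#!perl -w
use strict;

my @answers = ("No", "NOOOOOO", "nay", "nada", "I don't think so", "maybe", "y");
my @quitters;                              # answers that would end the loop

foreach my $answer (@answers) {
    if ($answer =~ /^n.*/i) {              # same pattern as in sumloop.pl
        push @quitters, $answer;
        print "\"$answer\" exits the loop\n";
    }
    else {
        print "\"$answer\" continues\n";
    }
}
```

"I don't think so" contains an "n", but not as the first character, so the "^" keeps it from matching. Because ".*" is allowed to match nothing at all, the shorter pattern /^n/i should accept and reject exactly the same answers.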
Perl excels at processing lists of text, typically stored in text files. In fact, Perl has also been called a "Pathologically Eclectic Rubbish Lister."
#!perl -w
#
# names.pl
@name_list = (
"Rocky, Squirrel, 1 Hero Way, Frostbite Falls, MN",
"Bullwinkle, Moose, 33 Knowitall Ave, Frostbite Falls, MN",
"Boris, Badenov, 987 Nogoodnik St, Pottsylvania, MN",
"Natasha, Fatal, 123 NW Main, Pottsylvania, MN",
"Jay, Ward, 77 Sunset Blvd, Hollywood, CA",
);
foreach $name (@name_list) {
# split each record into list of scalars with meaningful names
($fname, $lname, $street, $city, $state) = split /, */, $name;
# build new list of records with the items we want
push @new_list, ($lname.",".$fname.",".$city.",".$state."\n");
}
# sort the new list and display it
@sorted_list = sort @new_list;
print @sorted_list;
A:\>perl names.pl
Name "main::street" used only once: possible typo at names.pl line 14.
Badenov,Boris,Pottsylvania,MN
Fatal,Natasha,Pottsylvania,MN
Moose,Bullwinkle,Frostbite Falls,MN
Squirrel,Rocky,Frostbite Falls,MN
Ward,Jay,Hollywood,CA
A:\>
There are several points of interest in this example. First, notice that use of the -w option triggered a warning. We assigned all the words from the split() function to scalar variables, but then never use $street. This could indicate a programming error, in which case we would appreciate the warning. In this particular case we did not intend to use $street, so we can ignore the warning.
The split() function's first argument is a pattern in the form of a regular expression that specifies how to delimit the fields and the second argument is a string to be parsed into array elements. The pattern is enclosed within the "/" characters and defines a field separator beginning with a comma followed by 0 or more spaces. If we used the pattern "/,/" (i.e. only commas) our array elements would include the extra white space between the commas, which would affect the output when we rearranged the order of the fields.
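The difference between the two patterns is easy to see if you print one field with brackets around it. A minimal sketch using the first record from names.pl (the array names @loose and @tight are my own):

```perl
#!perl -w
use strict;

my $record = "Rocky, Squirrel, 1 Hero Way, Frostbite Falls, MN";

my @loose = split /,/,   $record;   # commas only: leading spaces stay in the fields
my @tight = split /, */, $record;   # comma plus any spaces that follow it

print "[$loose[1]]\n";              # [ Squirrel]
print "[$tight[1]]\n";              # [Squirrel]
```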
The push() function is a convenient way to add elements to an array, even though in this case we don't really care about using the array as a stack data structure.
The sort() function defaults to sorting strings in ascending order. If we want different behavior we have to use an alternate form and supply a subroutine that defines how to sort the list.
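As a sketch of that alternate form, the comparison can be supplied as a small block of code in which Perl's special variables $a and $b hold the two items being compared. The list of numbers is made up for illustration, and shows why the default string sort is not always what you want:

```perl
#!perl -w
use strict;

my @numbers = (10, 9, 100, 1);

my @as_strings = sort @numbers;                 # default: compared as strings
my @as_numbers = sort { $a <=> $b } @numbers;   # numeric, ascending
my @descending = sort { $b <=> $a } @numbers;   # numeric, descending

print "@as_strings\n";    # 1 10 100 9
print "@as_numbers\n";    # 1 9 10 100
print "@descending\n";    # 100 10 9 1
```

The <=> operator compares numbers; its string counterpart is cmp, so sort { $b cmp $a } @names would sort a list of names in descending order.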
Perl Tutorials
There are numerous tutorials on line; here are 4, one of which is specifically oriented toward Perl on Windows systems:
Robert's Perl Tutorial, Version 4.1.1
http://www.webhart.net/hemptreats/perl.html
http://www.ctp.bilkent.edu.tr/~cayfer/perl/PerlTutor.html
http://library.psyon.org/programming/perl/win32perltut.html
Good basic Perl tutorial with focus on Windows operating systems and
ActiveState's ActivePerl distribution. Assumes no prior programming
experience. Based on Perl 5.005, a gentle but very thorough
introduction. The original site seems to have gone missing so I have
provided links to known mirrors.
Perl Tutorial
http://www.comp.leeds.ac.uk/Perl/start.html
Written by Nik Silver, School of Computer Studies, University of
Leeds, UK
This tutorial is more UNIX oriented (see TTWOF below), but has
good explanations of the basics with good examples.
Picking Up Perl (PUP)
http://www.ebb.org/PickingUpPerl/
Excellent, easy to read introduction to Perl. Appears to be a "work
in progress" with a crying need for a good editor, however
explanations are clear for entry-level readers. Assumes some
knowledge of computer concepts and architecture, but not necessarily
programming. Distributed under the GNU Free Documentation License;
PDF version available for printing (66 pages).
Note: Based on Perl 5.6.0. If you use Perl 5.004 or
5.005 you will need to comment out or remove the "use
warnings;" lines in the example programs and add "-w"
to the "shebang": #!/usr/bin/perl -w
Perl Lessons
http://www.cs.tut.fi/~jkorpela/perl/course.html
Excellent, well written tutorial from Jukka Korpela, Department of Information Technology, Tampere University of Technology, Finland. Assumes some familiarity with programming; many examples are non-trivial. Good links to additional resources. Other pages on author's site extremely interesting.
Books
On line tutorials are useful (and free), but I get a stiff neck when I read for prolonged periods at the computer, and I like books. Here are what I consider to be the best "first" and "second" Perl books.
Elements of Programming with Perl
Andrew L. Johnson
Manning Publications, 2000
ISBN: 1884777805
This is one of the few Perl books that does not assume the reader is already a programmer and familiar with the Unix operating system. The author describes general programming designs and practices as well as Perl specifics.
Beginning Perl
Simon Cozens, with Peter Wainwright
Wrox Press, 2000
ISBN: 1861003145
More detail than Elements of Programming with Perl and covers more advanced topics such as object oriented programming, CGI, and databases. However, still written with the beginning programmer in mind.
Perl Documentation
If Perl is installed on your system you should also have, at a minimum, the Perl POD (Plain Old Documentation). Start with the command
perldoc perldoc
to get an overview of the Perl documentation system. If that doesn't work try
perl -S perldoc perldoc
Use the first command with ActivePerl and with the Perl distribution I make available. The second command should work with older distributions compiled for DOS. After getting a feeling for the documentation system try
perldoc perl
perldoc perlfaq
perldoc perlop
perldoc perlfunc
The complete Perl documentation can be found at www.perldoc.com. Look for the "Perl Manpage" link, then look for links that interest you. perlfaq and perlfunc are good places to start.
TTWOF - Things to Watch Out For
Be aware that most tree-based and digital Perl tutorials, guides, and documentation will be "UNIX-centric." Many examples may need anywhere from minor to major modification, or may not be applicable to Windows because of operating system function calls that do not exist in Windows.
In particular you will see the "shebang" line at the beginning of nearly every example (so called because the "#" character is commonly pronounced "sharp" and the "!" character is commonly pronounced "bang"):
#!/usr/local/bin/perl
It has a special use on UNIX, but is generally ignored under Windows and DOS. The line is, however, read by Perl, and you can use it to pass options to Perl that you would normally include in the Perl command. An excellent use of the shebang in Windows is for passing the -w (use warnings) option to Perl:
#!/usr/local/bin/perl -w
or more simply
#!perl -w
You may also see examples of command lines using single quotes or "back quotes" (the "grave accent," ASCII 096). You can use these characters to quote strings, variables, and commands inside your Perl programs, but not on the command line in a Windows Command Prompt session. These are additional ways of quoting variables and commands on UNIX systems. Perl knows how to interpret them within Perl, but COMMAND.COM and CMD.EXE do not.
These commands work from a Windows Command Prompt:
C:\>perl -e "print 'Hey there';"
Hey there
C:\>perl -e "print `ver`;"
Microsoft Windows 2000 [Version 5.00.2195]
Even though the above commands are "on the command line" the single quotes and back quotes ("back ticks") are actually part of the Perl statements and not seen by the command processor.
This command does not work from a Windows Command Prompt:
C:\>perl -e 'print "Hey there";'
Can't find string terminator "'" anywhere before EOF at -e line 1.
Final Thoughts
Copy the trivial examples on this page (copy and paste directly from your browser or download the source code) into a text editor and get them to run on your own system. Better yet, type them in from scratch; you will be surprised how much more you learn by typing in source code as opposed to cutting and pasting. Sort of like rewriting your class notes... Then make some changes: modify the user prompts, test for different conditions, modify the loops (count down, count by 2's, 3's, etc.)
Design and write a 4-function calculator. Write an algorithm to prompt the user for 2 numbers and whether she wants to add, subtract, divide, or multiply. Write the program and test it. Add detection for an attempt to divide by zero. When that is working see if you can wrap the whole thing in a loop that lets the user do additional calculations until she wants to quit.
Take time to learn more about regular expressions. A lot of Perl's power comes from its pattern matching and text processing using regular expressions. The Windows XP and 2000 findstr command line program supports regular expressions, so you can read a little about them in Windows Help and practice with some of your text files. The last major upgrade to Java added a regular expression type, and Microsoft .NET languages will also support regular expressions. Someone at, or writing about, Microsoft said "Regular expressions are to strings what math is to numbers."
The only way to learn a programming language is to jump in and write code. Look at examples in books, articles, and online tutorials. Type them in a text editor and run them. Then look up what you don't understand in the Perl online reference manual and try to figure out what is going on. Pretty soon you will be looking up less and understanding more. You can do it! Think of it as a different kind of crossword puzzle, or a new game.
All example Perl programs in this guide begin with a comment identifying the name of a file available for download. All files are in the archive http://www.mhuffman.com/notes/language/src_perl/perl_examples.zip.
Additional files in the archive, but not illustrated in the guide, include a program that accesses and prints the 3 main data types discussed (datatype.pl), an example demonstrating file access (names2.pl), and 2 additional quoting examples (quote3.pl & quote4.pl).
Introduction to Perl Copyright © 2002, 2005 Michael B. Huffman
The author gives general permission to copy and distribute this document in any medium provided that all copies contain an acknowledgement of authorship and the URL of the original document: http://www.mhuffman.com/notes/language/perl_intro.html
The permission granted above does not imply permission to distribute this
document in a modified form or as a translation.
Comments, corrections, and suggestions for improvement always
appreciated.
Mike Huffman: mike [ at ] mhuffman.com
Revised: 08 MAR 2005 07:37