Sl-Unit 3
Origin of scripting
The use of the word ‘script’ in a computing context dates back to the early
1970s, when the originators of the UNIX operating system coined the term ‘shell
script’ for a sequence of commands that were to be read from a file and executed
in sequence as if they had been typed in at the keyboard. The term was later
extended to other languages, e.g. an ‘AWK script’, a ‘Perl script’ etc., the
name ‘script’ being used for a text file that is intended to be executed
directly rather than being compiled to a different form of file prior to
execution.
Other early occurrences of the term ‘script’ can be found. For example, in
a DOS-based system, use of a dial-up connection to a remote system required a
communication package that used a proprietary language to write scripts to
automate the sequence of operations required to establish a connection to a
remote system. Note that if we regard a script as a sequence of commands to
control an application or a device, a configuration file such as a UNIX
‘makefile’ could be regarded as a script.
However, scripts only become interesting when they have the added value
that comes from using programming concepts such as loops and branches.
Scripting today
AppleScript                   m4       Scheme
ColdFusion                    Perl     Tcl
DCL                           PHP      Unix Shell scripts (ksh, csh, bash, sh and others)
Embeddable Common Lisp (ecl)  Pure     VBScript
Erlang                        Python   Work Flow Language
JCL                           Rebol    Windows PowerShell
JScript and JavaScript        Rexx     XSLT
Lua                           Ruby
interface: this may be an API, though more commonly the application is
constructed from a collection of objects whose properties and methods are
exposed to the scripting language. Example: the use of Visual Basic for
Applications to control the applications in the Microsoft Office suite.
3. Using a scripting language, with its rich functionality and ease of use, as
an alternative to a conventional language for general programming tasks,
particularly system programming and administration. Examples: UNIX system
administrators have for a long time used scripting languages for system
maintenance tasks, and administrators of Windows NT systems are adopting a
scripting language, Perl, for their work.
Users for Scripting Languages
2. Microsoft's Visual Basic and Excel were the first applications to use the
concept of scriptable objects. The concept was developed to support all of
Microsoft's applications.
3. Web scripting: web scripting is classified into three forms: processing
forms, dynamic web pages, and dynamically generating HTML.
1. system administration,
2. experimental programming,
3. controlling applications.
Application areas:
Four main usage areas for scripting languages:
Web scripting
The Web is one of the most fertile areas for the application of scripting
languages. Web scripting divides into three areas:
a. processing forms
b. creating pages with enhanced visual effects and user interaction
c. generating pages ‘on the fly’ from material held in a database.
Another form of dynamic Web page is one in which some or all of the
HTML is generated by scripts executed on the server. A common application of
the technique is to construct pages whose content is retrieved from a
database. For example, Microsoft’s IIS web server implements Active Server
Pages (ASP), which incorporate scripts in JScript or VBScript.
Perl has been used to implement an enterprise-wide document management
system for a leading aerospace company.
Names
name=value ;
Boolean values
String constants
Assignment
$b = 4 + ( $a = 3) ;
$a = "Java";
Scalar Expressions
Scalar data items are combined into expressions using operators. Perl
has a lot of operators, which are ranked in 22 precedence levels. These are
carefully chosen so that the ‘obvious’ meaning is what you get, but the old
advice still applies: if in doubt, use brackets to force the order of
evaluation. In the following sections we describe the available operators in
their natural groupings: arithmetic, strings, logical etc.
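The advice about brackets can be illustrated with a minimal sketch. The variable names are chosen only for this illustration; the two cases shown (multiplication before addition, and exponentiation before unary minus) are standard Perl precedence rules.

```perl
use strict;
use warnings;

# Multiplication binds tighter than addition.
my $x = 2 + 3 * 4;        # evaluated as 2 + (3 * 4)
my $y = (2 + 3) * 4;      # brackets force the addition first

# Exponentiation binds tighter than unary minus.
my $p = -2 ** 2;          # evaluated as -(2 ** 2)
my $q = (-2) ** 2;        # brackets apply the minus first

print "$x $y $p $q\n";    # prints "14 20 -4 4"
```

The second pair is the kind of case where the ‘obvious’ reading may differ between readers, so the bracketed form is the safer one to write.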
Arithmetic operators
$c = 17; $d = $c++;
$a += 3;
$a = $a + 3;
String Operators
$a = "Hello" x 3;
Sets $a to "HelloHelloHello".
$foo .= " ";
So far, things have been boringly conventional for the most part.
However, we begin to get a taste of the real flavor of Perl when we see how it
adds a little magic when some operators, normally used in an arithmetic
context, are used in a string context. Two examples illustrate this.
1. Auto-increment
If a variable has only ever been used in a string context, the
auto-increment operator can be applied to it. If the value consists of a
sequence of letters, or a sequence of letters followed by a sequence of digits,
the auto-increment takes place in string mode starting with the rightmost
character, with ‘carry’ along the string. For example, given
$a = 'a0'; $b = 'Az9';
incrementing $a and $b yields 'a1' and 'Ba0' respectively.
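The string-mode increment can be tried with a short sketch. The variable names and the third value ('zz') are assumptions added for illustration; the behaviour shown is Perl's documented ‘magic’ increment, which applies to ++ only and not to --.

```perl
use strict;
use warnings;

my $first  = 'a0';
my $second = 'Az9';
my $third  = 'zz';

$first++;    # rightmost digit 0 becomes 1
$second++;   # 9 wraps to 0, z wraps to a, A becomes B
$third++;    # both letters wrap, so the string grows by one character

print "$first $second $third\n";   # prints "a1 Ba0 aaa"
```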
2. Unary minus
Comparison operators
Logical operators
Bitwise operators
Conditional expressions
Control structures
The control structures for conditional execution and repetition are all
similar to those of C.
BLOCKS
Conditions
CONDITIONAL EXECUTION
If-then-else statements
if ($total>0)
{
print "$total\n";
}
if ($total>0)
{
print "$total\n";
}
else
{
print "bad total!\n";
}
Eg:
if ($total>70)
{
$grade="A";
}
elsif ($total>50)
{
$grade="B";
}
elsif ($total>40)
{
$grade="C";
} else {
$grade="F";
$total=0;
}
Alternatives to if-then-else
if ($a<0)
{$b=0}
else
{$b=1}
can be written as
$b = ($a<0) ? 0 : 1;
Another alternative is to use the ‘or’ operator between statements. Eg:
open(IN, $ARGV[0]) or die "Can't open $ARGV[0]\n";
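As a rough sketch, the alternatives to a full if-then-else can be put side by side. The variable names and the %config hash are hypothetical; the four forms shown are a block if/else, a conditional expression, a statement modifier, and ‘or’ between statements.

```perl
use strict;
use warnings;

my $n = -5;   # hypothetical input value

# 1. Full if/else block.
my $b1;
if ($n < 0) { $b1 = 0 } else { $b1 = 1 }

# 2. Conditional expression.
my $b2 = ($n < 0) ? 0 : 1;

# 3. Statement modifier: the assignment runs only if the condition holds.
my $b3 = 1;
$b3 = 0 if $n < 0;

# 4. 'or' between statements: the right side runs only if the left is false.
my %config = (colour => 'red');   # hypothetical data
my $status = 'found';
exists $config{size} or $status = 'missing';

print "$b1 $b2 $b3 $status\n";    # prints "0 0 0 missing"
```

The ‘or’ form works because ‘or’ has very low precedence, so each side is evaluated as a complete statement.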
Statement qualifiers
REPETITION:
1. Testing Loops
2. Counting Loops
TESTING LOOPS
As with the if statement, the expression that forms the condition must be
enclosed in brackets. In addition, while can be replaced by until to give the
same effect as negating the condition. A single statement can use while and
until as statement modifiers to improve readability.
do
{
……….
} while $a != $b;
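The testing loops described above can be compared in a small sketch. The counters are hypothetical and each loop simply counts to three, so that the four forms (while, until, statement modifier, do ... while) can be checked against each other.

```perl
use strict;
use warnings;

# while: repeat as long as the condition is true.
my $i = 0;
while ($i < 3) {
    $i++;
}

# until: repeat as long as the condition is false.
my $j = 0;
until ($j >= 3) {
    $j++;
}

# Statement modifier form for a single statement.
my $k = 0;
$k++ while $k < 3;

# do ... while tests after the body, so the body runs at least once.
my $runs = 0;
do {
    $runs++;
} while ($runs < 0);

print "$i $j $k $runs\n";   # prints "3 3 3 1"
```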
Counting Loops
In C,
In Perl,
foreach $i (1..10) {
$i_square=$i*$i;
$i_cube=$i**3;
print "$i\t$i_square\t$i_cube\n";
}
LISTS
These are collections of scalar data items that have an assigned
storage space in memory and can therefore be accessed using a variable
name.
ARRAYS
HASHES
Array Creation
Array variables are prefixed with the @ sign and are populated using
either parentheses or the qw operator.
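A minimal sketch of both ways of populating an array follows. The array names and contents are made up for illustration; the range operator and negative index are standard Perl features added here for completeness.

```perl
use strict;
use warnings;

my @colours = ('red', 'green', 'blue');   # parenthesised list
my @fruits  = qw(banana apple grapes);    # qw splits on whitespace
my @range   = (1 .. 5);                   # range operator builds 1,2,3,4,5

# A single element is a scalar, so it is written with a $ prefix.
my $first = $colours[0];
my $last  = $fruits[-1];                  # negative index counts from the end
my $count = scalar @range;                # number of elements

print "$first $last $count\n";            # prints "red grapes 5"
```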
Manipulating Lists
Perl provides several built-in functions for list manipulation. Three useful
ones are:
shift ARRAY : Returns the first item of ARRAY, and moves the remaining
items down, reducing the size of ARRAY by 1.
unshift ARRAY, LIST : The opposite of shift. Puts the items in LIST at
the beginning of ARRAY, moving the original contents up by the
required amount.
push ARRAY, LIST : Similar to unshift, but adds the values in LIST to
the end of ARRAY.
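The three functions above can be seen working on one array in the following sketch. The array @queue is hypothetical, and pop (the counterpart of push, not listed above) is included only to complete the picture.

```perl
use strict;
use warnings;

my @queue = (2, 3, 4);

my $head = shift @queue;      # removes 2; @queue is now (3, 4)
unshift @queue, 0, 1;         # @queue is now (0, 1, 3, 4)
push @queue, 5, 6;            # @queue is now (0, 1, 3, 4, 5, 6)
my $tail = pop @queue;        # removes 6 from the end

print "$head $tail @queue\n"; # prints "2 6 0 1 3 4 5"
```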
foreach: The foreach loop performs a simple iteration over all the elements of a
list. The block is executed repeatedly with the loop variable (for example
$item) taking each value from the list in turn. The variable can be omitted, in
which case $_ will be used. The natural Perl idiom for manipulating all items in
an array is:
foreach (@array)
{
……..#process $_
}
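Both forms of foreach, with a named loop variable and with the implicit $_, are sketched below. The array contents are hypothetical. Note that the loop variable is an alias for the array element, so assigning to it modifies the array.

```perl
use strict;
use warnings;

my @words = qw(alpha beta gamma);

# Named loop variable.
my @lengths;
foreach my $item (@words) {
    push @lengths, length $item;
}

# Implicit $_ ; the loop variable is an alias, so assigning to it
# changes the array element itself.
foreach (@words) {
    $_ = uc $_;
}

print "@lengths\n";   # prints "5 4 5"
print "@words\n";     # prints "ALPHA BETA GAMMA"
```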
To refer to a single element of a hash, use the hash variable name
preceded by a ‘$’ sign and followed by the key associated with the value,
in curly brackets.
Creating Hashes
(banana => 'yellow', apple => 'red', grapes => 'green', ............);
Manipulating Hashes
keys %HASH returns a list of the keys of the elements in the hash, and
values %HASH returns a list of the values of the elements in the hash.
Eg: %foo = (banana => 'yellow', apple => 'red', grapes => 'green', ............);
keys %foo returns banana, apple, grapes and values %foo returns yellow, red,
green (the order in which a hash returns its elements is not guaranteed).
Other useful operators for manipulating hashes are delete and exists.
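The hash operations mentioned above (element access, keys, delete and exists) can be combined in one short sketch. The hash name and the extra key 'plum' are assumptions for illustration; the keys are sorted before printing because hash order is not fixed.

```perl
use strict;
use warnings;

my %colour_of = (banana => 'yellow', apple => 'red', grapes => 'green');

my $apple = $colour_of{apple};            # single element: $ prefix, key in braces
$colour_of{plum} = 'purple';              # add a new key/value pair

my @names   = sort keys %colour_of;       # sorted, because hash order is not fixed
my $has_fig = exists($colour_of{fig}) ? 'yes' : 'no';

delete $colour_of{banana};                # remove one pair
my $remaining = scalar keys %colour_of;   # number of pairs left

print "$apple | @names | $has_fig | $remaining\n";
# prints "red | apple banana grapes plum | no | 3"
```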
Among the most powerful features of Perl is its vast collection of string
manipulation operators and functions. Perl would not be as popular as it is
today in bioinformatics applications if it did not have such flexible and
powerful string manipulation capabilities.
String concatenation
$a . $b;
$c = $a . $b;
$a = $a . $b;
$a .= $b;
The first expression concatenates $a and $b, but the result is immediately
lost unless it is saved to a third string $c, as in the second case. If $b is
meant to be appended to the end of $a, the .= operator is more convenient. As
with other assignments in Perl, if you see an assignment written as
$a = $a op expr, where op stands for a binary operator and expr stands for the
rest of the statement, you can make a shorter version by moving the op in front
of the assignment sign, i.e., $a op= expr.
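The op= shorthand is not limited to concatenation. The following sketch, with hypothetical variable names, applies the same pattern to three different operators.

```perl
use strict;
use warnings;

my $text = 'ab';
$text .= 'cd';        # same as $text = $text . 'cd'

my $total = 5;
$total *= 3;          # same as $total = $total * 3

my $rule = '-';
$rule x= 4;           # same as $rule = $rule x 4

print "$text $total $rule\n";   # prints "abcd 15 ----"
```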
Substring extraction
The first argument to the substr function is the source string, the
second argument is the start position of the substring in the source string
(counted from 0), and the third argument is the length of the substring to
extract. The second argument can be negative, in which case the start position
is counted from the back of the source string. The third argument can also be
omitted, in which case the substring runs to the end of the source string. A
particularly interesting feature of Perl is that the result of substr can be
assigned to as well, meaning that in addition to string extraction, it can be
used for string replacement:
substr($a, -1) = 'abc'; # replace the last character with 'abc' (i.e., the
# string also grows by two characters)
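The argument forms of substr described above can be sketched as follows. The sample strings are hypothetical.

```perl
use strict;
use warnings;

my $s = 'bioinformatics';

my $mid  = substr($s, 3, 4);    # 4 characters starting at offset 3
my $tail = substr($s, -3);      # negative start counts from the end
my $rest = substr($s, 9);       # length omitted: runs to the end of the string

my $t = 'xyz';
substr($t, -1) = 'abc';         # assign into substr: replaces the last character

print "$mid $tail $rest $t\n";  # prints "info ics atics xyabc"
```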
Substring search
The index function takes two arguments: the source string to search,
and the substring to be located inside the source string. It can optionally take
a third argument giving the start position of the search. It returns the
position of the first occurrence found, or -1 if there is no (further)
occurrence of the substring in the source string.
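A short sketch of index, including the optional third argument, is given below. The sample string is hypothetical. The loop shows one common way to collect every position by restarting the search until -1 is returned.

```perl
use strict;
use warnings;

my $dna = 'ACGTACGT';

my $first  = index($dna, 'GT');               # first occurrence
my $second = index($dna, 'GT', $first + 1);   # resume the search after the first hit
my $none   = index($dna, 'TT');               # not present

# Collect every position by restarting the search until -1 is returned.
my @positions;
my $pos = index($dna, 'A');
while ($pos != -1) {
    push @positions, $pos;
    $pos = index($dna, 'A', $pos + 1);
}

print "$first $second $none @positions\n";    # prints "2 6 -1 0 4"
```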
Regular expression
o Any character except the following special ones stands for itself. Thus
abc matches 'abc', and xyz matches 'xyz'.
o The character . matches any single character. To match it only with the .
character itself, put an escape \ in front of it, so \. will match only '.', but
. will match anything. To match the escape character itself, type two of
them \\ to escape itself.
o All the above so far just match single characters. The power of regular
expression lies in its ability to match multiple characters with some
meta symbols. The * will match 0 or more of the previous symbol, the +
will match 1 or more of the previous symbol, and ? will match 0 or 1 of
the previous symbol.
o For example, a* will match 'aaaa...' for any number of a's including none
(''), a+ will match 1 or more a's, and a? will match zero or one a. A more
complicated example is matching numbers: integers can be written as
[0-9]+. To match real numbers, you need to write [0-9]+\.?[0-9]*.
Note that the decimal point and the fractional digits can be omitted, thus
we use ? and * instead of +.
o If you want to combine two regular expressions together, just write them
consecutively. If you want to use either one of the two regular
expressions, use the | meta symbol. Thus, a|b will match a or b, which
is equivalent to [ab], and a+|b+ will match any string of a's or b's. The
second case cannot be expressed using character subset because [ab]+
does not mean the same thing as a+|b+.
The rules above are simple, but it takes some experience to apply them
successfully on the actual substrings you wish to match. There are no better
ways to learn this than simply to write some regular expressions and see if
they match the substrings you have in mind.
• [A-Z][a-z]* will match all words whose first character is capitalized
• [A-Za-z_][A-Za-z0-9_]* will match all legal Perl variable names
• [+-]?[0-9]+\.?[0-9]*([eE][+-]?[0-9]+)? will match scientific numbers
• [acgtACGT]+ will match all DNA strings
• ^> will match the > symbol only at the beginning of a string
• a$ will match the letter a only at the end of a string
[+-]?\d+\.?\d*([eE][+-]?\d+)?
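One way to follow the advice about experimenting is to try the example patterns against a few strings. In the sketch below the test strings are hypothetical, and each pattern is anchored with ^ and $ (an addition made here) so that the whole string has to match rather than just a part of it.

```perl
use strict;
use warnings;

# Patterns from the examples above, anchored so the whole string must match.
my %pattern = (
    capitalised => qr/^[A-Z][a-z]*$/,
    identifier  => qr/^[A-Za-z_][A-Za-z0-9_]*$/,
    number      => qr/^[+-]?\d+\.?\d*([eE][+-]?\d+)?$/,
    dna         => qr/^[acgtACGT]+$/,
);

# Hypothetical test strings; 1 means matched, 0 means not matched.
my @results;
push @results, ('Perl'    =~ $pattern{capitalised}) ? 1 : 0;
push @results, ('perl'    =~ $pattern{capitalised}) ? 1 : 0;
push @results, ('_count2' =~ $pattern{identifier})  ? 1 : 0;
push @results, ('-1.5e10' =~ $pattern{number})      ? 1 : 0;
push @results, ('1.5.2'   =~ $pattern{number})      ? 1 : 0;
push @results, ('GATTACA' =~ $pattern{dna})         ? 1 : 0;
push @results, ('GATTAXA' =~ $pattern{dna})         ? 1 : 0;

print "@results\n";   # prints "1 0 1 1 0 1 0"
```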
Pattern matching
Regular expressions are used in a few Perl statements, and their most
common use is in pattern matching. To match a regular expression pattern
inside a $string, use the binding operator =~ combined with the pattern
matching operator / /. The pattern matching operator / / does not alter the
source $string. Instead, it just returns a true or false value indicating
whether the pattern is found in $string:
if ($string =~ /\d+/)
{
print "there are numbers in $string\n";
}
Sometimes you want to know not only whether the pattern exists in a string,
but also what it actually matched. In that case, use parentheses to mark the
parts of the match you want to capture; they will be assigned to the
special variables $1, $2, ... if the match is successful:
if ($string =~ /(\d+)\s+(\d+)\s+(\d+)/)
{
print "first three matched numbers are $1, $2, $3 in $string\n";
}
Note that all three numbers above must be found for the whole pattern to
match successfully, thus $1, $2 and $3 are all defined when the if
statement is true. Inside the regular expression itself, the same captured
substrings are referred to as \1, \2, \3, etc. So, to check if the same digit
occurs twice in $string (with at least one character in between), you can do
this:
if ($string =~ /(\d).+\1/) {
print "$1 happened at least twice in $string\n";
}
Pattern substitution
$string =~ s/\d+/0/; # replace a number with zero
For pattern matching, you can also use any separator by writing it
with the m operator, e.g., m:/: will match the forward slash symbol. Naturally,
the substitution string may (and often does) contain the \1, \2 special
memory substrings to refer to the substrings just matched (in the replacement
part, $1, $2 are the preferred modern spelling). For example, the
following will add parentheses around the matched number in the source
$string:
$string =~ s/(\d+)/(\1)/;
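The /g modifier tells Perl to match all occurrences of the pattern, which
makes it possible, for example, to print all the numbers in $string. The /m
modifier lets ^ and $ match at line boundaries inside the string:
"a\na\na" =~ /a$/m will match the a before the first newline, whereas if /m
is not given, only the last a in the string matches.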
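The modifiers can be tried out with a minimal sketch. The sample strings and variable names are hypothetical. The (my $copy = $string) =~ s/.../.../ idiom is used so that the original string is left untouched.

```perl
use strict;
use warnings;

my $string = 'order 12 shipped 7 boxes on day 30';

# /g in list context returns every match.
my @numbers = $string =~ /\d+/g;

# s///g replaces every match, not just the first one.
(my $masked = $string) =~ s/\d+/N/g;

# $1 refers to the text captured by the first pair of parentheses.
(my $wrapped = $string) =~ s/(\d+)/($1)/;

# /m lets $ match before each newline as well as at the end of the string.
my @line_ends = "a1\nb2\nc3" =~ /(\d)$/mg;

print "@numbers\n";     # prints "12 7 30"
print "$masked\n";      # prints "order N shipped N boxes on day N"
print "$wrapped\n";     # prints "order (12) shipped 7 boxes on day 30"
print "@line_ends\n";   # prints "1 2 3"
```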
Perl Subroutines
Define and Call a Subroutine
sub subroutine_name
{
body of the subroutine
}
#!/usr/bin/perl
# Function definition
sub Hello {
    print "Hello, World!\n";
}
# Function call
Hello();
You can pass arrays and hashes as arguments like any scalar but
passing more than one array or hash normally causes them to lose their
separate identities.
#!/usr/bin/perl
# Function definition
sub PrintList {
    my @list = @_;
    print "Given list is @list\n";
}
$a = 10;
@b = (1, 2, 3, 4);
# Function call with list parameter
PrintList($a, @b);
When the above program is executed, it produces the following result:
Given list is 10 1 2 3 4
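The loss of separate identities when more than one array is passed can be demonstrated with a small sketch. The subroutine CountBoth and the arrays are hypothetical. All arguments arrive as one flat list in @_, so the first array in the my list absorbs everything.

```perl
use strict;
use warnings;

# Hypothetical subroutine: tries to receive two separate arrays.
sub CountBoth {
    my (@first, @second) = @_;   # @first swallows everything; @second stays empty
    return (scalar @first, scalar @second);
}

my @x = (1, 2, 3);
my @y = (4, 5);

my ($n_first, $n_second) = CountBoth(@x, @y);
print "$n_first $n_second\n";   # prints "5 0"
```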
#!/usr/bin/perl
# Function definition
sub PrintHash {
    my (%hash) = @_;
    foreach my $key (keys %hash) {
        my $value = $hash{$key};
        print "$key : $value\n";
    }
}
%hash = ('name' => 'Tom', 'age' => 19);
# Function call with hash parameter
PrintHash(%hash);
When the above program is executed, it produces the following result (the
order of the keys may vary):
name : Tom
age : 19
You can return a value from a subroutine as you do in any other
programming language. If you do not explicitly return a value, the value of
the last expression evaluated in the subroutine is automatically the return
value. You can return arrays and hashes from a subroutine like any scalar,
but returning more than one array or hash normally causes them to lose their
separate identities. So we will use references, explained in the next chapter,
to return any array or hash from a function. Let's try the following example,
which takes a list of numbers and then returns their average:
#!/usr/bin/perl
# Function definition
sub Average {
    # get total number of arguments passed.
    $n = scalar(@_);
    $sum = 0;
    foreach $item (@_)
    {
        $sum += $item;
    }
    $average = $sum / $n;
    return $average;
}
# Function call
$num = Average(10, 20, 30);
print "Average for the given numbers : $num\n";
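The rule that the last evaluated expression becomes the return value can be checked with a pair of hypothetical helper subroutines, one with an explicit return and one without.

```perl
use strict;
use warnings;

# Hypothetical helpers: both return the larger of two numbers.
sub MaxExplicit {
    my ($p, $q) = @_;
    return $p > $q ? $p : $q;
}

sub MaxImplicit {
    my ($p, $q) = @_;
    $p > $q ? $p : $q;    # value of the last expression is returned
}

print MaxExplicit(3, 8), " ", MaxImplicit(3, 8), "\n";   # prints "8 8"
```

An explicit return is usually easier to read, so the implicit form is best kept for very short subroutines.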