PAP Module 3

This document discusses lists in Python. It defines lists as ordered sequences that can contain elements of different data types. Lists are created using square brackets and their elements can be accessed via integer indexes. Some key list methods covered include append(), sort(), reverse(), count(), and pop(). Operations like slicing, concatenation, repetition are also discussed.

Application development using python, By: MANZOOR KHAN, CS&E, GEC


 LISTS

 DICTIONARIES

 TUPLES

 REGULAR EXPRESSIONS

 A list is an ordered sequence of values.
 It is a data structure in Python. The values inside a list can be of any type (integers,
floats, strings, lists, tuples, dictionaries, etc.) and are called elements or
items.
 The elements of lists are enclosed within square brackets.
 For example,
ls1=[10,-4, 25, 13]

ls2=["Tiger", "Lion", "Cheetah"]


 Here, ls1 is a list containing four integers, and ls2 is a list containing three strings.

 A list need not contain data of same type.
 We can have mixed type of elements in list.
 For example,

ls3=[3.5, 'Tiger', 10, [3,4]]


 Here, ls3 contains a float, a string, an integer and a list.
 This illustrates that a list can be nested as well.
 An empty list can be created using either empty square brackets or the list() function –
>>> ls =[]
>>> type(ls)
<class 'list'>
or
>>> ls =list()
>>> type(ls)
<class 'list'>

 A new list can also be created by passing a sequence as an argument to the list()
function, as shown below. The result is the same as writing the elements directly in
square brackets –
>>> ls2=list([3,4,1])
>>> print(ls2)
[3, 4, 1]

>>> ls2=[3,4,1]
>>> print(ls2)
[3, 4, 1]

 The elements of a list can be accessed using a numeric index within square brackets.
 This is similar to extracting characters from a string.

>>> ls=[34, 'hi', [2,3],-5]

>>> print(ls[1])
hi

>>> print(ls[2])

[2, 3]

>>> ls=[34, 'hi', [2,3],-5]
>>> print(ls[2][0])
2
>>> print(ls[2][1])
3

 Lists are mutable: an existing element can be replaced by assigning to its index.

>>> ls=[34, 'hi', [2,3],-5]


>>> ls[2]='Hello'
>>> print(ls)
[34, 'hi', 'Hello', -5]

>>> ls=[34, 'hi', [2,3],-5]
>>> print(ls[2*1])
[2, 3]

 An attempt to access a non-existing index raises an IndexError.


>>> ls=[34, 'hi', [2,3],-5]
>>> print(ls[4])
IndexError: list index out of range

 A negative index counts backward from the end of the list.


>>> ls=[34, 'hi', [2,3],-5]
>>> print(ls[-1])
-5
>>> print(ls[-3])
hi

 The in operator applied on a list results in a Boolean value.
>>> ls=[34, 'hi', [2,3],-5]
>>> 34 in ls
True
>>> -2 in ls
False

 A list can be traversed using a for loop.
 If we need to use each element in the list, we can use the for loop and the in operator
as below
>>> ls=[34, 'hi', [2,3],-5]
>>> for item in ls:
...     print(item)
...
34
hi
[2, 3]
-5

 List elements can be accessed with the combination of range() and len() functions
as well –
ls=[1,2,3,4]
for i in range(len(ls)):
    ls[i]=ls[i]**2
print(ls)

#output is
[1, 4, 9, 16]

 Python allows the operators + and * to be used on lists.
 The operator + takes two list objects and returns the concatenation of those two lists.
 The operator * takes one list object and one integer value, say n, and returns a
new list in which the original list is repeated n times.
>>> ls1=[1,2,3]
>>> ls2=[5,6,7]
>>> print(ls1+ls2) # concatenation using +
[1, 2, 3, 5, 6, 7]

>>> ls1=[1,2,3]
>>> print(ls1*3) #repetition using *
[1, 2, 3, 1, 2, 3, 1, 2, 3]
>>> [0]*4 #repetition using *
[0, 0, 0, 0]

 Similar to strings, slicing can be applied on lists as well. Consider the list t given
below, and the series of examples based on it that follow.

t=['a','b','c','d','e']
 Extracting full list without using any index, but only a slicing operator –

>>> print(t[:])
['a', 'b', 'c', 'd', 'e']
 Extracting elements from 2nd position –

>>> print(t[1:])
['b', 'c', 'd', 'e']

 Extracting first three elements –

>>> print(t[:3])
['a', 'b', 'c']
 Selecting some middle elements –

>>> print(t[2:4])
['c', 'd']
 Using negative indexing –
>>> print(t[:-2])

['a', 'b', 'c']


 Reversing a list using negative value for stride –

>>> print(t[::-1])
['e', 'd', 'c', 'b', 'a']
 append(): This method is used to add a new element at the end of a list.

>>> ls=[1,2,3]
>>> ls.append("hi")
>>> ls.append(10)
>>> print(ls)
[1, 2, 3, 'hi', 10]

 extend(): This method takes a list (or any other iterable) as an argument and adds all
of its elements to the end of the invoking list.
>>> ls1=[1,2,3]
>>> ls2=[5,6]
>>> ls2.extend(ls1)
>>> print(ls2)
[5, 6, 1, 2, 3]

Now, in the above example, the list ls1 is unaltered.

 sort(): This method is used to sort the contents of the list. By default, the function
will sort the items in ascending order.

>>> ls=[3,10,5, 16,-2]


>>> ls.sort()

>>> print(ls)
[-2, 3, 5, 10, 16]

When we want a list to be sorted in descending order, we need to set the argument
reverse=True as shown

>>> ls.sort(reverse=True)
>>> print(ls)

[16, 10, 5, 3, -2]
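
 A point worth noting about sort(): it rearranges the list in place and returns None, so its result should not be assigned back to the list. The built-in sorted() function returns a new sorted list and leaves the original untouched. The sketch below uses made-up variable names (marks, original, ordered) purely for illustration.

```python
# Illustrative data; the variable names are not from the slides.
marks = [3, 10, 5, 16, -2]

result = marks.sort()        # sorts marks in place; the return value is None
print(result)                # None
print(marks)                 # [-2, 3, 5, 10, 16]

original = [3, 10, 5, 16, -2]
ordered = sorted(original)   # builds a new sorted list
print(original)              # [3, 10, 5, 16, -2]  (unchanged)
print(ordered)               # [-2, 3, 5, 10, 16]
```

Use sort() when the original order is no longer needed, and sorted() when both versions are required.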

 reverse(): This method can be used to reverse the given list.
>>> ls=[4,3,1,6]

>>> ls.reverse()
>>> print(ls)

[6, 1, 3, 4]
 count(): This method is used to count number of occurrences of a particular value
within list.
>>> ls=[1,2,5,2,1,3,2,10]

>>> ls.count(2)

3 #The item 2 has appeared 3 times in ls

 clear(): This method removes all the elements in the list and makes the list empty.

>>> ls=[1,2,3]
>>> ls.clear()

>>> print(ls)

[]
 insert(): Used to insert a value before a specified index of the list.

>>> ls=[3,5,10]
>>> ls.insert(1,"hi")

>>> print(ls)
[3, 'hi', 5, 10]

 index(): This method is used to get the index position of a particular value in the
list.
>>> ls=[4, 2, 10, 5, 3, 2, 6]
>>> ls.index(2)
1

The same function can be used with two more arguments start and end to specify a
range within which the search should take place.
>>> ls=[15, 4, 2, 10, 5, 3, 2, 6]
>>> ls.index(2)
2
>>> ls.index(2,3,7)
6

 pop(): This method deletes the last element in the list, by default.
>>> ls=[3,6,-2,8,10]
>>> x=ls.pop() #10 is removed from list and stored in x
>>> print(ls)
[3, 6, -2, 8]
>>> print(x)
10
When an element at a particular index position has to be deleted, then we can give
that position as argument to pop() function.
>>> t = ['a', 'b', 'c']
>>> x = t.pop(1) #item at index 1 is popped
>>> print(t)
['a', 'c']
>>> print(x)
b
 remove(): When we don't know the index, but know the value to be removed, then
this method can be used.

>>> ls=[5,8, -12,34,2]


>>> ls.remove(34)
>>> print(ls)
[5, 8, -12, 2]

Note that, this function will remove only the first occurrence of the specified value,
but not all occurrences.

>>> ls=[5,8, -12, 34, 2, 6, 34]


>>> ls.remove(34)
>>> print(ls)
[5, 8, -12, 2, 6, 34]

Unlike the pop() function, the remove() function will not return the value that has been
deleted.
 del: This is a statement (not a method) that deletes the item at a given index; with a
slice, it can delete more than one item at a time. Here also, the deleted items are not
returned.

>>> ls=[3,6,-2,8,1]
>>> del ls[2] #item at index 2 is deleted
>>> print(ls)
[3, 6, 8, 1]
>>> ls=[3,6,-2,8,1]
>>> del ls[1:4] #deleting all elements from index 1 to 3
>>> print(ls)
[3, 1]

 Deleting all odd indexed elements of a list –

>>> t=['a', 'b', 'c', 'd', 'e']


>>> del t[1::2]

>>> print(t)
['a', 'c', 'e']

 The utility functions like max(), min(), sum(), len() etc. can be used on lists.
 Hence most of the operations will be easy without the usage of loops.

>>> ls=[3,12,5,26, 32,1,4]


>>> max(ls) # prints 32

>>> min(ls) # prints 1


>>> sum(ls) # prints 83

>>> len(ls) # prints 7


>>> avg=sum(ls)/len(ls)

>>> print(avg)
11.857142857142858
 When we need to read the data from the user and to compute sum and average of
those numbers, we can write the code as below –

ls = list()
while True:
    x = input('Enter a number: ')
    if x == 'done':
        break
    x = float(x)
    ls.append(x)

if ls:                                  # avoid dividing by zero when no number was entered
    average = sum(ls) / len(ls)
    print('Average:', average)

 Though both lists and strings are sequences, they are not the same.
 In fact, a list of characters is not the same as a string.
 To convert a string into a list of characters, we use the list() function as below –
>>> s="hello"

>>> ls=list(s)
>>> print(ls)

['h', 'e', 'l', 'l', 'o']

 If we want a list of words from a sentence, we can use the following code –

>>> s="Hello how are you?"


>>> ls=s.split()

>>> print(ls)
['Hello', 'how', 'are', 'you?']
 Note that, when no argument is provided, the split() function takes the delimiter as
white space.
 If we need a specific delimiter for splitting the lines, we can use as shown in
following example –
>>> dt="20/03/2018"

>>> ls=dt.split('/')
>>> print(ls)
['20', '03', '2018']
 The string method join() does the opposite of split().
 It is called on the delimiter string, takes a list of strings as argument, and joins
all the strings into a single string separated by that delimiter.
>>> ls=["Hello", "how", "are", "you"]

>>> d=' '


>>> d.join(ls)

'Hello how are you'


 Here, we have taken delimiter d as white space. Apart from space, anything can be
taken as delimiter. When we don’t need any delimiter, use empty string as
delimiter.
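
 The effect of different delimiters can be sketched as below. The sample values (words, nums) are assumptions chosen for illustration. Note that join() accepts only strings, so other types must be converted first.

```python
words = ["Hello", "how", "are", "you"]

dashed = "-".join(words)     # any string can act as the delimiter
print(dashed)                # Hello-how-are-you

glued = "".join(words)       # empty string: no delimiter at all
print(glued)                 # Hellohowareyou

# join() works only on strings, so numbers are converted with str() first
nums = [20, 3, 2018]
date = "/".join(str(n) for n in nums)
print(date)                  # 20/3/2018

# split() with the same delimiter undoes the join
parts = date.split("/")
print(parts)                 # ['20', '3', '2018']
```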

 In many situations, we would like to read a file and extract only the lines containing
required pattern. This is known as parsing.
 As an illustration, let us assume that there is a log file containing details of email
communication between employees of an organization.
 For all received mails, the file contains lines as –
From stephen.marquard@uct.ac.za Fri Jan 5 09:14:16 2018

From georgek@uct.ac.za Sat Jan 6 06:12:51 2018

………………
 Apart from such lines, the log file also contains mail-contents, to-whom the mail
has been sent etc.
 Now, if we are interested in extracting only the days of incoming mails, then we can
go for parsing.
 That is, we are interested in knowing on which of the days, the mails have been
received. The code would be –
fhand = open("logFile.txt")
for line in fhand:
    line = line.rstrip()
    if not line.startswith('From '):
        continue
    words = line.split()
    print(words[2])        # third word of the line is the day, e.g. Fri

 Whenever we assign the same value to two variables, the question arises whether
both variables refer to the same object or to different objects.
 This is an important aspect to know, because in Python everything is an object.
 There are no primitive (elementary) data types.
 Consider a situation –
a= "hi"
b= "hi"
 Now, the question is whether both a and b refer to the same string.

 There are two possible states –

 In the first situation, a and b are two different objects that contain the same value.
A modification of one object has no effect on the other.
 Whereas, in the second case, both a and b refer to the same object.
 That is, a is an alias name for b and vice versa. In other words, the two names
refer to the same memory location.

 To check whether two variables are referring to same object or not, we can use is
operator.

>>> a= "hi"
>>> b= "hi"
>>> a is b     #result is True (CPython reuses one object for such short string literals)
>>> a==b       #result is True

 When two variables refer to the same object, they are called identical objects.
 When two variables refer to different objects that contain the same value, they
are known as equivalent objects.

 For example.

>>> s1=input("Enter a string:")   #assume you entered hello
>>> s2=input("Enter a string:")   #assume you entered hello
>>> s1 is s2                      #check s1 and s2 are identical
False
>>> s1 == s2                      #check s1 and s2 are equivalent
True

Here s1 and s2 are equivalent, but not identical.


 If two objects are identical, they are also equivalent, but if they are equivalent, they
are not necessarily identical.
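
 The same distinction can be seen with lists, where it matters more because lists are mutable. In this minimal sketch (the names a, b and c are illustrative), two separately written list literals are equivalent but not identical, while an assigned name is identical.

```python
a = [1, 2, 3]
b = [1, 2, 3]      # a separate list object with the same contents
c = a              # c is another name for the object a refers to

print(a == b)      # True  -> equivalent
print(a is b)      # False -> not identical
print(a is c)      # True  -> identical
print(a == c)      # True  -> identical objects are always equivalent
```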

 When one variable is assigned to another using the assignment operator, both of them
refer to the same object in memory.
 The association of a variable with an object is called a reference.

>>> ls1=[1,2,3]

>>> ls2= ls1


>>> ls1 is ls2 #output is True
 Now, ls2 is said to be reference of ls1. In other words, there are two references to
the same object in the memory.
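
 A practical consequence is that a change made through one reference is visible through the other. When an independent list is wanted, a copy has to be made, for example with the slice t[:] covered earlier. The sketch below assumes a third name, ls3, introduced only for illustration.

```python
ls1 = [1, 2, 3]
ls2 = ls1          # second reference to the same list
ls3 = ls1[:]       # shallow copy: a new list with the same elements

ls2.append(4)      # modify the list through the alias

print(ls1)         # [1, 2, 3, 4]  -> change is visible through ls1
print(ls3)         # [1, 2, 3]     -> the copy is unaffected
print(ls1 is ls3)  # False
```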

 When a list is passed to a function as an argument, then function receives reference
to this list.
 Hence, if the list is modified within a function, the caller will get the modified
version.

def del_front(t):
    del t[0]             # modifies the caller's list, not a copy

ls = ['a', 'b', 'c']
del_front(ls)
print(ls)                # output is ['b', 'c']

 A dictionary is like a list.
 In a list, the index positions have to be integers;
 In a dictionary, the indices can be (almost) any type.
 We can think of a dictionary as a mapping between a set of indices (which are called
keys) and a set of values. Each key maps to a value. The association of a key and a
value is called a key-value pair or sometimes an item.

 An empty dictionary can be created using two ways –
d= {}

OR
>>> eng2sp = dict()
>>> print(eng2sp)
{}
 To add items to dictionary, we can use square brackets as –

>>> eng2sp['one'] = 'uno'


 This line creates an item that maps from the key 'one' to the value 'uno'. If we print
the dictionary again, we see a key-value pair with a colon between the key and
value:

>>> print(eng2sp)

{'one': 'uno'}
 To add items to dictionary, we can use square brackets as –
>>> d={}
>>> d["Mango"]="Fruit"
>>> d["Cucumber"]="Veg"
>>> print(d)
{'Mango': 'Fruit', 'Cucumber': 'Veg'}

 To initialize a dictionary at the time of creation itself, one can use the code like –
>>> tel_dir={'Tom': 3491, 'Jerry':8135}
>>> print(tel_dir)
{'Tom': 3491, 'Jerry': 8135}

>>> tel_dir['Donald']=4793
>>> print(tel_dir)
{'Tom': 3491, 'Jerry': 8135, 'Donald': 4793}

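NOTE that items in a dictionary are looked up by key, not by position. Since Python 3.7 a
dictionary remembers the order in which its keys were inserted (in older versions the order
was unpredictable), but there is still no positional indexing. That is, in the above
example, tel_dir[0] does not give 'Tom': 3491 as the first item.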

 Using a key, we can extract its associated value as shown below –
>>> print(tel_dir['Jerry'])

8135
 Here, the key 'Jerry' maps to the value 8135, hence it doesn't matter where
exactly it is inside the dictionary.
 If a particular key is not there in the dictionary and if we try to access such key,
then the KeyError is generated.
>>> print(tel_dir['Mickey'])

KeyError: 'Mickey'

 The len() function on dictionary object gives the number of key-value pairs in that
object.
>>> print(tel_dir)
{'Tom': 3491, 'Jerry': 8135, 'Donald': 4793}
>>> len(tel_dir)
3

 The in operator can be used to check whether any key (not value) appears in the
dictionary object.
>>> 'Mickey' in tel_dir #output is False
>>> 'Jerry' in tel_dir #output is True
>>> 3491 in tel_dir #output is False

 The dictionary object has a method values() which returns a view of all the values
associated with keys within a dictionary (use list() on it if an actual list is needed).
 If we would like to check whether a particular value exists in a dictionary, we can
make use of it as shown below –

>>> 3491 in tel_dir.values() #output is True

 Assume that we need to count the frequency of alphabets in a given string. There are
different methods to do it –

1. Create 26 variables to represent each alphabet. Traverse the given string and increment
the corresponding counter when an alphabet is found.

2. Create a list with 26 elements (all are zero in the beginning) representing alphabets.
Traverse the given string and increment corresponding indexed position in the list when
an alphabet is found.

3. Create a dictionary with characters as keys and counters as values. When we find a
character for the first time, we add the item to dictionary. Next time onwards, we
increment the value of existing item.
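
 Before turning to the dictionary version, the second method (a list of 26 counters) can be sketched as below. This is only an illustration under stated assumptions: the input is fixed to the sample string "Hello World" instead of being read from the user, letters are counted case-insensitively, and the name counts is made up.

```python
s = "Hello World"                 # sample text (assumed; normally read with input())
counts = [0] * 26                 # one counter per letter, all zero initially

for ch in s.lower():              # treat upper and lower case alike
    if 'a' <= ch <= 'z':          # skip spaces, digits and punctuation
        counts[ord(ch) - ord('a')] += 1   # 'a' -> index 0, 'b' -> index 1, ...

# print only the letters that actually occurred
for i, n in enumerate(counts):
    if n > 0:
        print(chr(ord('a') + i), n)
```

The list approach needs the index arithmetic with ord(), and it handles only the 26 letters; the dictionary approach avoids both limitations.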

 Each of the above methods will perform same task, but the logic of implementation
will be different. Here, we will see the implementation using dictionary.
s=input("Enter a string:")    #read a string
d=dict()                      #create empty dictionary
for ch in s:                  #traverse through string
    if ch not in d:           #if new character found
        d[ch]=1               #initialize counter to 1
    else:                     #otherwise, increment counter
        d[ch]+=1

print(d)                      #display the dictionary

The sample output would be –


Enter a string:
Hello World
{'H': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'W': 1, 'r': 1, 'd': 1}
 Dictionary in Python has a method called as get(), which takes key and a default
value as two arguments. If key is found in the dictionary, then the get() function
returns corresponding value, otherwise it returns default value.
 For example,
>>> tel_dir={'Tom': 3491, 'Jerry':8135, 'Mickey':1253}
>>> print(tel_dir.get('Jerry',0))
8135
>>> print(tel_dir.get('Donald',0))
0
 In the above example, when the get() function is taking 'Jerry' as argument, it
returned corresponding value, as 'Jerry' is found in tel_dir .
 Whereas, when get() is used with 'Donald' as key, the default value 0 (which is
provided by us) is returned.

 The function get() can be used effectively for calculating frequency of alphabets in a
string.
 Here is the modified version of the program –
s=input("Enter a string:")
d=dict()
for ch in s:
    d[ch]=d.get(ch,0)+1

print(d)

 In the above program, for every character ch in a given string, we try to retrieve
a value. When ch is found in d, its value is retrieved, 1 is added to it, and the
result is stored back in d.
 If ch is not found, 0 is taken as default and then 1 is added to it.

 When a for-loop is applied on dictionaries, it will iterate over the keys of dictionary.
 If we want to print key and values separately, we need to use the statements as
shown
tel_dir={'Tom': 3491, 'Jerry':8135, 'Mickey':1253}
for k in tel_dir:
    print(k, tel_dir[k])

Output would be –
Tom 3491
Jerry 8135
Mickey 1253

 Note that, while accessing items from a dictionary, the keys appear in the order of
insertion (in versions before Python 3.7, in an arbitrary order), which need not be
alphabetical. If we want to print the keys in alphabetical order, then we need to make a
list of the keys, and then sort that list.
 We can do so using the keys() method of the dictionary and the sort() method of lists.
 Consider the following code –
tel_dir={'Tom': 3491, 'Jerry':8135, 'Mickey':1253}
ls=list(tel_dir.keys())
print("The list of keys:",ls)
ls.sort()
print("Dictionary elements in alphabetical order:")
for k in ls:
    print(k, tel_dir[k])

The output would be –


The list of keys: ['Tom', 'Jerry', 'Mickey']
Dictionary elements in alphabetical order:
Jerry 8135
Mickey 1253
Tom 3491
 The key-value pair from dictionary can be together accessed with the help of a
method items() as shown

>>> d={'Tom':3412, 'Jerry':6781, 'Mickey':1294}


>>> for k,v in d.items(): print(k,v)

Output:
Tom 3412
Jerry 6781
Mickey 1294

The usage of comma-separated list k,v here is internally a tuple (another data
structure in Python, which will be discussed later).

 A dictionary can be used to count the frequency of words in a file.
 Consider a file myfile.txt consisting of following text:
hello, how are you?
I am doing fine.
How about you?
 Now, we need to count the frequency of each of the word in this file. So, we need to
take an outer loop for iterating over entire file, and an inner loop for traversing each
line in a file.
 Then in every line, we count the occurrence of a word, as we did before for a
character.
 The program is given as below –
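fname=input("Enter file name:")
try:
    fhand=open(fname)
except OSError:                       #file missing or not readable
    print("File cannot be opened")
    exit()

d=dict()
for line in fhand:                    #outer loop: every line of the file
    for word in line.split():         #inner loop: every word of the line
        d[word]=d.get(word,0)+1
print(d)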

 The output of this program when the input file is myfile.txt would be –

Enter file name: myfile.txt


{'hello,': 1, 'how': 1, 'are': 1, 'you?': 2, 'I': 1, 'am': 1, 'doing': 1, 'fine.': 1, 'How': 1,
'about': 1}

 As discussed in the previous section, during text parsing, our aim is to eliminate
punctuation marks as a part of word.
 The string module of Python provides a list of all punctuation marks as shown:
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
 The str class has a method maketrans() which returns a translation table usable for
another method translate().
 Consider the following syntax to understand it more clearly:
 line.translate(str.maketrans(fromstr, tostr, deletestr))
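
 As a rough sketch of how this syntax could be applied to the word-count problem: passing two empty strings and string.punctuation as the third argument builds a table that only deletes characters. The three lines of myfile.txt are placed in a list here so that the example runs without a file, and converting the text to lower case is an extra step assumed for illustration.

```python
import string

# delete-only table: nothing is replaced, every punctuation mark is removed
table = str.maketrans("", "", string.punctuation)

line = "hello, how are you?"
cleaned = line.translate(table)
print(cleaned)                            # hello how are you

# the lines of myfile.txt, held in a list so the sketch is self-contained
lines = ["hello, how are you?", "I am doing fine.", "How about you?"]

d = dict()
for text in lines:
    text = text.translate(table).lower()  # strip punctuation, ignore case (assumed step)
    for word in text.split():
        d[word] = d.get(word, 0) + 1
print(d)
```

With punctuation removed and case ignored, 'how' and 'you' are each counted twice instead of being split across 'how'/'How' and hidden inside 'you?'.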

 A tuple is a sequence of items, similar to lists.
 The values stored in the tuple can be of any type and they are indexed using
integers.
 Unlike lists, tuples are immutable. That is, values within tuples cannot be
modified/reassigned. Tuples are comparable and hashable objects.
 Hence, they can be made as keys in dictionaries.
 Tuple values can be of mixed types.
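
 The use of a tuple as a dictionary key can be sketched as below. The directory contents are made-up sample data. A list cannot be used in the same way, because lists are mutable and therefore not hashable.

```python
# hypothetical directory keyed by (last name, first name)
directory = {('Mouse', 'Mickey'): 1253, ('Duck', 'Donald'): 4793}
directory[('Cat', 'Tom')] = 3491          # add one more item with a tuple key

print(directory[('Duck', 'Donald')])      # 4793

# a list as key is rejected with a TypeError
list_key_failed = False
try:
    bad = {['Duck', 'Donald']: 4793}
except TypeError as err:
    list_key_failed = True
    print('TypeError:', err)
```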

 A tuple can be created in Python as a comma separated list of items – may or may not be
enclosed within parentheses.
>>> t='Mango', 'Banana', 'Apple' #without parentheses
>>> print(t)
('Mango', 'Banana', 'Apple')
>>> t1=('Tom', 341, 'Jerry') #with parentheses
>>> print(t1)
('Tom', 341, 'Jerry')
 If we would like to create a tuple with single value, then just a parenthesis will not
suffice. For example,
>>> x=(3) #trying to have a tuple with single item
>>> print(x)
3 #observe, no parenthesis found
>>> type(x)
<class 'int'> #not a tuple, it is integer!!

 To have a tuple with single item, we must include a comma after the item.
>>> t=3, #or use the statement t=(3,)
>>> type(t) #now this is a tuple
<class 'tuple'>

 An empty tuple can be created either using a pair of parenthesis or using a function
tuple() as below
>>> t1=()
>>> type(t1)
<class 'tuple'>

>>> t2=tuple()
>>> type(t2)
<class 'tuple'>

 If we provide an argument of a sequence type (a list, a string or a tuple) to the function
tuple(), then a tuple with the elements of the given sequence will be created:
 Create tuple using string:

>>> t=tuple('Hello')
>>> print(t)

('H', 'e', 'l', 'l', 'o')


 Create tuple using list:

>>> t=tuple([3,[12,5],'Hi'])
>>> print(t)

(3, [12, 5], 'Hi')

 Create tuple using another tuple:
>>> t=('Mango', 34, 'hi')
>>> t1=tuple(t)
>>> print(t1)
('Mango', 34, 'hi')
>>> t is t1
True
 Note that, in the above example, both t and t1 objects are referring to same
memory location. That is, t1 is a reference to t.
 Elements in the tuple can be extracted using square-brackets with the help of
indices.
>>> t=('Mango', 'Banana', 'Apple')
>>> print(t[1])
Banana

 Similarly, slicing also can be applied to extract required number of items from
tuple.
>>> print(t[1:])
('Banana', 'Apple')
>>> print(t[-1])
Apple
 Modifying the value in a tuple generates error, because tuples are immutable –
>>> t[0]='Kiwi'
TypeError: 'tuple' object does not support item assignment
 We wanted to replace 'Mango' with 'Kiwi', which did not work using assignment.
 But, a tuple can be replaced with another tuple involving required modifications –
>>> t=('Kiwi',)+t[1:]
>>> print(t)
('Kiwi', 'Banana', 'Apple')

 Tuples can be compared using operators like >, <, >=, == etc.
 The comparison happens lexicographically.
 For example, when we need to check equality among two tuple objects, the first
item in first tuple is compared with first item in second tuple.
 If they are same, 2nd items are compared.
 The check continues till either a mismatch is found or items get over.
 Consider few examples –
>>> (1,2,3)==(1,2,5)
False
>>> (3,4)==(3,4)
True

 The meaning of < and > in tuples is not exactly less than and greater than, instead,
it means comes before and comes after.
 Hence in such cases, we will get results different from checking equality (==).
>>> (1,2,3)<(1,2,5)
True
>>> (3,4)<(5,2)
True
 When we use relational operator on tuples containing non-comparable types, then
TypeError will be thrown.
>>> (1,'hi')<('hello','world')
TypeError: '<' not supported between instances of 'int' and 'str'

 The sort() method works in a similar way on a list of tuples – it sorts primarily by the
first element; in case of a tie, it sorts on the second element, and so on.
 Consider a program that sorts the words of a sentence from longest to shortest, which
illustrates the DSU (decorate, sort, undecorate) pattern.
txt = 'Ram and Seeta went to forest with Lakshman'
words = txt.split()
t = list()
for word in words:
    t.append((len(word), word))     #decorate: pair each word with its length

print("The list is:",t)

t.sort(reverse=True)                #sort: longest first, ties decided by the word
res = list()
for length, word in t:
    res.append(word)                #undecorate: keep only the word
print("The sorted list:",res)
 The output would be –
The list is: [(3, 'Ram'), (3, 'and'), (5, 'Seeta'), (4, 'went'), (2, 'to'), (6, 'forest'), (4, 'with'), (8, 'Lakshman')]
The sorted list: ['Lakshman', 'forest', 'Seeta', 'with', 'went', 'and', 'Ram', 'to']
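
 For comparison, the same longest-to-shortest ordering can be obtained without building tuples, by giving the built-in sorted() function a key. This is an alternative technique, not the DSU program above. One difference: sorted() is stable, so words of equal length keep their original order, whereas the tuple-based sort decides ties by comparing the words themselves.

```python
txt = 'Ram and Seeta went to forest with Lakshman'
words = txt.split()

# key=len sorts by word length; reverse=True puts the longest first
res = sorted(words, key=len, reverse=True)
print(res)
# ['Lakshman', 'forest', 'Seeta', 'went', 'with', 'Ram', 'and', 'to']
```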

 A tuple of variables can appear on the left-hand side (LHS) of the assignment operator.
 This allows us to assign values to multiple variables at a time.
>>> x,y=10,20
>>> print(x) #prints 10
>>> print(y) #prints 20
 When we have list of items, they can be extracted and stored into multiple variables
as below –
>>> ls=["hello", "world"]
>>> x,y=ls
>>> print(x)
hello
>>> print(y)
world
 This code internally means that –
x= ls[0]
y= ls[1]
 The best known example of assignment of tuples is swapping two values as below –
>>> a=10
>>> b=20
>>> a, b = b, a
>>> print(a, b)
20 10
 In the above example, the statement a, b = b, a is treated by Python as – LHS is a
set of variables, and RHS is set of expressions.
 The expressions in RHS are evaluated and assigned to respective variables at LHS.
 Giving more values than variables generates ValueError –
>>> a, b=10,20,5
ValueError: too many values to unpack (expected 2)

 Dictionaries have a method called items() that returns a view of the key-value pairs.
Converting it with list() gives a list of tuples, where each tuple is a key-value pair, as
shown below –

>>> d = {'a':10, 'b':1, 'c':22}
>>> t = list(d.items())
>>> print(t)
[('a', 10), ('b', 1), ('c', 22)]

 We can combine the method items(), tuple assignment and a for-loop to get a
pattern for traversing dictionary:
d={'Tom': 1292, 'Jerry': 3501, 'Donald': 8913}
for key, val in list(d.items()):
    print(val,key)
The output would be –
1292 Tom
3501 Jerry
8913 Donald

 Searching for required patterns and extracting only the lines/words matching the
pattern is a very common task in solving problems programmatically.
 We have done such tasks earlier using string slicing and string methods like split(),
find() etc.
 As the task of searching and extracting is very common, Python provides a powerful
library called regular expressions to handle these tasks elegantly.
 To use them in our program, the library/module re must be imported.
 There is a search() function in this module, which is used to find particular
substring within a string.

 Consider the following example –

import re

s="how are you"            #named s, so that the built-in str is not overwritten
line = s.rstrip()
if re.search('how', line):
    print(line)
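 When the same search is applied to every line of the file myfile.txt discussed in the
previous chapters (with its third line written in lowercase as "how about you?", since
the search is case-sensitive), the output would be

hello, how are you?
how about you?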

 In the following example, we make use of the caret (^) symbol, which matches the beginning of the line.

import re
hand = open('myfile.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^how', line):
        print(line)

The output would be –


how about you?
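 The effect of the caret can also be tried without a file. In the sketch below, the list lines stands in for myfile.txt; its content is assumed for illustration only.

```python
import re

# Sample lines standing in for myfile.txt (assumed content, for illustration only).
lines = ['hello, how are you?', 'I am doing fine.', 'how about you?']

anywhere = [ln for ln in lines if re.search('how', ln)]    # 'how' anywhere in the line
at_start = [ln for ln in lines if re.search('^how', ln)]   # 'how' only at the beginning

print(anywhere)
print(at_start)
```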

 Regular expressions provide a set of metacharacters for describing search patterns.
 The table below shows a few important metacharacters.

 Some examples for a quick and easy understanding of regular expressions are given in the next table.
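 As a minimal sketch, a few standard metacharacters of the re module can be tried on a sample string. The string s is made up for illustration.

```python
import re

s = 'From: user42@example.com on Sat 5'

starts = bool(re.search('^From', s))    # ^ anchors the match at the start
ends = bool(re.search('[0-9]$', s))     # $ anchors at the end; [0-9] is any digit
numbers = re.findall('[0-9]+', s)       # + means one or more of the preceding item
emails = re.findall(r'\S+@\S+', s)      # \S is any non-whitespace character
word = re.search('F.om', s).group()     # . matches any single character

print(starts, ends, numbers, emails, word)
```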

 The most commonly used metacharacter is the dot, which matches any single character (except a newline).
 Consider the following example, where the regular expression searches for lines that start with I, followed by any two characters (each dot stands for one character) and then the character m.
import re
fhand = open('myfile.txt')
for line in fhand:
    line = line.rstrip()
    if re.search('^I..m', line):
        print(line)

The output would be –


I am doing fine.

 When we don't know the exact number of characters between two characters (or strings), we can use the dot and + symbols together: .+ matches one or more occurrences of any character.
 Consider the program given below –

import re
fhand = open('myfile.txt')
for line in fhand:
    line = line.rstrip()
    if re.search('^h.+u', line):
        print(line)

The output would be –


hello, how are you

how about you
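 To see what "one or more" means in practice, the sketch below compares .+ with .* (zero or more) on a few made-up strings. The variable names are chosen only for illustration.

```python
import re

results = {}
for text in ['hu', 'hxu', 'hello, how are you']:
    plus = bool(re.search('^h.+u', text))   # needs at least one character between h and u
    star = bool(re.search('^h.*u', text))   # allows zero characters between h and u
    results[text] = (plus, star)
    print(text, plus, star)
```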

 To understand the behavior of a few basic metacharacters, we will see some examples.
 The file used for these examples is mbox-short.txt, which can be downloaded from –
 https://www.py4e.com/code3/mbox-short.txt
 Use this file as input and try the following examples –
 Pattern to extract lines starting with the word From (or from) and ending with edu:

import re
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    pattern = '^[Ff]rom.*edu$'
    if re.search(pattern, line):
        print(line)
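 Pattern to extract lines ending with any digit:

pattern = '[0-9]$'

 Using negation (a ^ inside square brackets means "not") to match lines that start with one or more characters other than lowercase letters and digits:

pattern = '^[^a-z0-9]+'

 Pattern to extract lines that start with an uppercase letter and end with a digit:

pattern = '^[A-Z].*[0-9]$'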
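 The three patterns can be checked quickly on a handful of strings. The list samples below is made up for illustration and is not taken from mbox-short.txt.

```python
import re

samples = ['From a@b.edu', 'Total: 42', 'x-ray 7', '  indented']

# Lines ending with a digit
ends_with_digit = [s for s in samples if re.search('[0-9]$', s)]
# Lines whose first character is not a lowercase letter or a digit
not_lower_start = [s for s in samples if re.search('^[^a-z0-9]+', s)]
# Lines starting with an uppercase letter and ending with a digit
upper_to_digit = [s for s in samples if re.search('^[A-Z].*[0-9]$', s)]

print(ends_with_digit)
print(not_lower_start)
print(upper_to_digit)
```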

 The re module provides a function findall() to extract all of the substrings that match a regular expression.
 This function returns a list of all non-overlapping matches in the string.
 If no match is found, the function returns an empty list.
 Consider an example of extracting anything that looks like an email address from a line.
import re
s = 'A message from csev@umich.edu to cwen@iupui.edu about meeting @2PM'
lst = re.findall(r'\S+@\S+', s)
print(lst)

The output would be – ['csev@umich.edu', 'cwen@iupui.edu']


 Here, the pattern matches at least one non-whitespace character (\S) before @ and at least one non-whitespace character after @.
 Hence, it will not match @2PM, because there is a whitespace before @.
 We can write a complete program to extract all email IDs from the file.
import re
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    x = re.findall(r'\S+@\S+', line)
    if len(x) > 0:
        print(x)
 Here, the condition len(x) > 0 is checked because we want to print a result only for lines that contain an email ID. If a line has no match for the given pattern, the findall() function returns an empty list.
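 The same idea can be wrapped in a function and tried without the file. In this sketch the function name extract_emails and the list sample are hypothetical, chosen only for illustration.

```python
import re

def extract_emails(lines):
    """Return every substring that looks like an email address."""
    found = []
    for line in lines:
        # An empty list from findall() simply adds nothing to found.
        found.extend(re.findall(r'\S+@\S+', line.rstrip()))
    return found

sample = ['From: csev@umich.edu', 'No address here', 'Cc: cwen@iupui.edu zqian@umich.edu']
emails = extract_emails(sample)
print(emails)
```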
 Assume that we need to extract data that follows a particular syntax.
 For example, we need to extract lines of the following format –
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
 The pattern for the regular expression would be – ^X-\S*: [0-9.]+
 The complete program is –
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search(r'^X-\S*: [0-9.]+', line):
        print(line)

 Assume that we want only the numbers (representing confidence, probability etc.) from the above output.
 Parentheses added to a regular expression do not change what the expression matches. But when we use findall(), parentheses indicate that, while the whole expression must match, we are interested in extracting only the portion of the matching substring that lies inside the parentheses.

import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^X-\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)
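 Since findall() returns the captured text as strings, a natural next step is to convert it to numbers. The sketch below does this on a few in-memory lines. The first two lines are the ones shown earlier, and the third is made up to show a non-matching line.

```python
import re

lines = ['X-DSPAM-Confidence: 0.8475', 'X-DSPAM-Probability: 0.0000', 'Subject: hello']

values = []
for line in lines:
    x = re.findall(r'^X-\S*: ([0-9.]+)', line)
    if len(x) > 0:
        values.append(float(x[0]))   # only the part inside the parentheses is returned

print(values)
```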

 As we have discussed till now, characters like dot, plus, question mark, asterisk, dollar etc. are metacharacters in regular expressions.
 Sometimes, we need these characters themselves as part of the string to be matched.
 In that case, we need to escape them using a backslash. For example,
import re
x = 'We just received $10.00 for cookies.'
y = re.findall(r'\$[0-9.]+', x)
print(y)

Output: ['$10.00']

 Here, we want to extract only the price $10.00. As the $ symbol is a metacharacter, we need to put \ before it.
 Now $ is treated as a part of the string to be matched, not as a metacharacter.
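 When a whole literal string has to be matched, the standard library function re.escape() can add the backslashes for us. The sketch below, which is only one possible way of doing this, compares an unescaped and an escaped search for the text $10.00.

```python
import re

x = 'We just received $10.00 for cookies.'

# Without escaping, $ means "end of string", so nothing can match after it.
unescaped = re.findall('$10.00', x)

# re.escape() puts a backslash before each metacharacter in the literal text.
escaped_pattern = re.escape('$10.00')
escaped = re.findall(escaped_pattern, x)

print(unescaped)
print(escaped_pattern)
print(escaped)
```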
