Intermediate Python (Ike-Nwosu Obi)
Obi Ike-Nwosu
The Python interpreter and the extensive standard library that come with the
interpreter are available for free in source or binary form for all major
platforms from the Python Web site. This site also contains distributions of
and pointers to many free third party Python modules, programs and tools,
and additional documentation.
The Python interpreter can easily be extended with new functions and data
types implemented in C, C++ or any other language that is callable from C.
Python is also suitable as an extension language for customisable
applications. One of the most notable features of Python is its easy,
white-space-aware syntax.
The book covers only a handful of topics but tries to provide a holistic and
in-depth coverage of these topics. It starts with a short tutorial introduction
to get the reader up to speed with the basics of Python; experienced
programmers from other object oriented languages such as Java may find
that this is all the introduction to Python that they need. This is followed by
a discussion of the Python object model then it moves on to discussing
object oriented programming in Python. With a firm understanding of the
Python object model, it goes ahead to discuss functions and functional
programming. This is followed by a discussion of meta-programming
techniques and their applications. The remaining chapters cover generators,
a complex but very interesting topic in Python, modules and packaging, and
Python runtime services. In between, intermezzos discuss topics that are
worth knowing because of the added understanding they provide.
I hope the book achieves the purpose for which it was written. I welcome
all feedback readers may have and actively encourage readers to provide
such feedback.
By version 1.4, Python had acquired several new features including the
Modula-3 inspired keyword arguments and built-in support for complex
numbers. It also included a basic form of data hiding by name mangling.
Python 1.5 was released on December 31, 1997, while Python 1.6 followed
on September 5, 2000.
Python 2.0 was released on October 16, 2000 and it introduced list
comprehensions, a feature borrowed from the functional programming
languages SETL and Haskell as well as a garbage collection system capable
of collecting reference cycles.
Python 2.2 was the first major update to the Python type system. This
update saw the unification of Python’s in-built types and user defined
classes written in Python into one hierarchy. This single unification made
Python’s object model purely and consistently object oriented. This update
to the class system of Python added a number of features that improved the
programming experience. These included:
The full details on changes from Python 2 to Python 3 can be viewed on the
Python website. The rest of the book assumes the use of Python 3.4.
A user can type Python statements at the interpreter prompt and get
instant feedback. For example, we can evaluate expressions at the REPL and
get values for such expressions as in the following example.
Python 2.7.6 (default, Sep 9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> var_1 = 3
>>> var_2 = 3
>>> var_1*var_2
9
>>>
Typing Ctrl-D at the primary prompt causes the interpreter to exit the
session.
Multiple physical lines can be explicitly joined into a single logical line by
use of the line continuation character, \, as shown below:
>>> name = "Obi Ike-Nwosu"
>>> cleaned_name = name.replace("-", " "). \
... replace(" ", "")
>>> cleaned_name
'ObiIkeNwosu'
>>>
Lines are joined implicitly, thus eliminating the need for line continuation
characters, when expressions in triple-quoted strings, or enclosed in
parentheses (…), brackets […] or braces {…}, span multiple lines.
From discussions above, it can be inferred that there are two types of
statements in python:
The suite may be a set of one or more statements that follow the header’s
colon with each statement separated from the previous by a semi-colon as
shown in the following example.
```python
>>> x = 1
>>> y = 2
>>> z = 3
>>> if x < y < z: print(x); print(y); print(z)
...
1
2
3
```
2.3 Strings
Strings are represented in Python using double "..." or single '...'
quotes. Special characters can be used within a string by escaping them
with \ as shown in the following example:
# the quote is used as an apostrophe so we escape it for Python to
# treat it as an apostrophe rather than the closing quote for a string
>>> name = 'men\'s'
>>> name
"men's"
>>>
String literals that span multiple lines can be created with the triple quotes
but newlines are automatically added at the end of a line as shown in the
following snippet.
>>> para = """hello world I am putting together a
... book for beginners to get to the next level in python"""
# notice the new line character
>>> para
'hello world I am putting together a \nbook for beginners to get to the
next level in python'
# printing this will cause the string to go on multiple lines
>>> print(para)
hello world I am putting together a
book for beginners to get to the next level in python
>>>
The while and for statements constitute the main looping constructs
provided by python.
The for statement in python is used to iterate over sequence types (lists,
sets, tuples etc.). More generally, the for loop is used to iterate over any
object that implements the python iterator protocol. This will be discussed
further in chapters that follow. Example usage of the for loop is shown by
the following snippet:
>>> names = ["Joe", "Obi", "Chris", "Nkem"]
>>> for name in names:
... print(name)
...
Joe
Obi
Chris
Nkem
>>>
For iterating over a progression of numbers, Python provides the simpler
range() function that is used to generate an arithmetic progression of
integers. For example:
>>> for i in range(10, 20):
...     print(i)
...
10
11
12
13
14
15
16
17
18
19
>>>
The range function has the signature range(start, stop, step). The stop
value is never part of the progression that is returned.
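For instance, a minimal sketch supplying a step of 3:

>>> list(range(10, 20, 3))
[10, 13, 16, 19]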
while statement
The while statement executes the statements in its suite as long as the
condition expression in the while statement evaluates to true.
>>> counter = 10
>>> while counter > 0: # the conditional expression is 'counter>0'
... print(counter)
... counter = counter - 1
...
10
9
8
7
6
5
4
3
2
1
The break keyword is used to escape from an enclosing loop. Whenever the
break keyword is encountered during the execution of a loop, the loop is
abruptly exited and no other statement within the loop is executed.
>>> for i in range(10):
... if i == 5:
... break
... else:
... print(i)
...
0
1
2
3
4
The continue keyword is used to force the start of the next iteration of a
loop. When used the interpreter ignores all statements that come after the
continue statement and continues with the next iteration of the loop.
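For example (a minimal sketch mirroring the break example above):

>>> for i in range(10):
...     if i == 5:
...         continue
...     print(i)
...
0
1
2
3
4
6
7
8
9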
In the example above, it can be observed that the number 5 is not printed
due to the use of continue when the value is 5; however, all subsequent
values are printed.
else clause with looping constructs
Python has a quirky feature in which the else keyword can be used with
looping constructs. When an else keyword is used with a looping construct
such as while or for, the statements within the suite of the else statement
are executed as long as the looping construct was not ended by a break
statement.
# loop exits normally
>>> for i in range(10):
... print(i)
... else:
... print("I am in quirky else loop")
...
0
1
2
3
4
5
6
7
8
9
I am in quirky else loop
>>>
If the loop was exited by a break statement, the execution of the suite of the
else statement is skipped as shown in the following example:
Enumerate
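Printing the position of each name alongside the name is a common task. A
manual approach (a sketch of the index-tracking solution assumed by the
text below) keeps an explicit counter:

>>> names = ["Joe", "Obi", "Chris", "Jamie"]
>>> index = 0
>>> for name in names:
...     print("{}. {}".format(index, name))
...     index += 1
...
0. Joe
1. Obi
2. Chris
3. Jamie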
The above solution is how one would go about it in most languages but
Python has a better alternative in the form of the built-in enumerate
function. The above solution can be reworked beautifully in Python as
shown in the following snippet:
>>> for index, name in enumerate(names):
... print("{}. {}".format(index, name))
...
0. Joe
1. Obi
2. Chris
3. Jamie
>>>
2.5 Functions
Named functions are defined with the def keyword which must be followed
by the function name and the parenthesized list of formal parameters. The
return keyword is used to return a value from a function definition. A
Python function definition is shown in the example below:
def full_name(first_name, last_name):
    return " ".join((first_name, last_name))
Python functions can be defined without the return keyword. In that case
the default returned value is None as shown in the following snippet:
>>> def print_name(first_name, last_name):
... print(" ".join((first_name, last_name)))
...
>>> print_name("Obi", "Ike-Nwosu")
Obi Ike-Nwosu
>>> x = print_name("Obi", "Ike-Nwosu")
Obi Ike-Nwosu
>>> x
>>> type(x)
<class 'NoneType'>
>>>
The return keyword does not even have to return a value in python as
shown in the following example.
>>> def dont_return_value():
... print("How to use return keyword without a value")
... return
...
>>> dont_return_value()
How to use return keyword without a value
Python also supports anonymous functions defined with the lambda
keyword. Python’s lambda support is rather limited, crippled, a few people
may say, because it supports only a single expression in the body of the
lambda expression. Lambda expressions are another form of syntactic sugar
and are equivalent to conventional named function definitions. An example
of a lambda expression is the following:
>>> square_of_number = lambda x: x**2
>>> square_of_number
<function <lambda> at 0x101a07158>
>>> square_of_number(2)
4
>>>
Elements can also be added to other parts of a list, not just the end, using
the `insert` method.

>>> names = ["obi", "ike", "nwosu"]
>>> names.insert(1, "nkem")
>>> names
['obi', 'nkem', 'ike', 'nwosu']
Two or more lists can be concatenated together with the `+` operator.
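For instance, using the names from above:

>>> ["obi", "nkem"] + ["ike", "nwosu"]
['obi', 'nkem', 'ike', 'nwosu']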
To get a full listing of all methods of the list, run the help command with
list as argument.
When defining a non-empty tuple the parentheses are optional, but when the
tuple is part of a larger expression, the parentheses are required. The
parentheses come in handy when defining an empty tuple, for instance:
>>> companies = ()
>>> type(companies)
<class 'tuple'>
>>>
Tuples have a quirky syntax that some people may find surprising. When
defining a single-element tuple, the comma must be included after the single
element regardless of whether or not parentheses are included. If the comma
is left out then the result of the expression is not a tuple. For instance:
>>> company = "Google",
>>> type(company)
<class 'tuple'>
>>>
>>> company = ("Google",)
>>> type(company)
<class 'tuple'>
# absence of the comma returns the value contained within the parenthesis
>>> company = ("Google")
>>> company
'Google'
>>> type(company)
<class 'str'>
>>>
Tuples are integer indexed just like lists but are immutable; once created the
contents cannot be changed by any means such as by assignment. For
instance:
>>> companies = ("Google", "Microsoft", "Palantir")
>>> companies[0]
'Google'
>>> companies[0] = "Boeing"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>>
Python’s data structures are not limited to just those listed in this section.
For example, the collections module provides additional data structures
such as queues and deques; however, the data structures listed in this section
form the workhorse for most Python applications. To get better insight into
the capabilities of a data structure, the help() function is used with the
name of the data structure as argument, for example, help(list).
2.7 Classes
The class statement is used to define new types in python as shown in the
following example:
import json

class Account:
    # class variable that is common to all instances of a class
    num_accounts = 0

    def __init__(self, name, balance):
        # assumed initializer; the text below notes that instance
        # variables are initialized in __init__
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

    @classmethod
    def from_dict(cls, params):
        params_dict = json.loads(params)
        return cls(params_dict.get("name"), params_dict.get("balance"))
Classes in python just like classes in other languages have class variables,
instance variables, class methods, static methods and instance methods.
When defining classes, the base classes are included in the parenthesis that
follows the class name. For those that are familiar with Java, the __init__
method is something similar to a constructor; it is in this method that
instance variables are initialized. The above defined class can be initialized
by calling the defined class with required arguments to __init__ in
parenthesis ignoring the self argument as shown in the following example.
>>> acct = Account("obie", 10000000)
Methods in a class that are defined with self as first argument are instance
methods. The self argument is similar to this in Java and refers to the
object instance. Methods are called in Python using the dot notation syntax
as shown below:

>>> acct = Account("obie", 10000000)
>>> acct.inquiry()
'Name=obie, balance=10000000'
Python comes with a built-in function, dir, for introspection of objects. The
dir function can be called with an object as argument and it returns a list of
all attributes, methods and variables of the object.
2.8 Modules
Functions and classes provide a means for structuring your Python code but
as the code grows in size and complexity, there is a need for such code to be
split into multiple files with each source file containing related definitions.
The source files can then be imported as needed in order to access
definitions in any of them. In Python, we refer to source files as
modules and modules have the .py extension.
For example, the Account class definition from the previous section can be
saved to a module called Account.py. To use this module elsewhere, the
import statement is used to import the module as shown in the following
example:
>>> import Account
>>> acct = Account.Account("obie", 10000000)
Note that the import statement takes the name of the module without the
.py extension. Using the import statement creates a name-space, in this case
the Account name-space and all definitions in the module are available in
such name-space. The dot notation (.) is used to access the definitions as
required. An alias for an imported module can also be created using the as
keyword so the example from above can be reformulated as shown in the
following snippet:
>>> import Account as acct
>>> account = acct.Account("obie", 10000000)
It is also possible to import only the definitions that are needed from the
module resulting in the following:
>>> from Account import Account
>>> account = Account("obie", 10000000)
All the definitions in a module can also be imported by using the wild card
symbol as shown below:
>>> from Account import *
2.9 Exceptions
Python has support for exceptions and exception handling. For example,
when an attempt is made to divide by zero, a ZeroDivisionError is raised
by the Python interpreter as shown in the following example.

>>> 2/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>>
The open method returns a file object or throws an exception if the file does
not exist. The file object supports a number of methods such as read that
reads the whole content of the file into a string or readline that reads the
contents of the file one line at a time. Python supports the following
syntactic sugar for iterating through the lines of a file.
for line in open("afile.txt"):
    print(line)
Python also has support for reading from standard input and writing to
standard output. This can be done using sys.stdin.readline() or
sys.stdout.write() from the sys module.
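A minimal sketch at the REPL; note that sys.stdout.write returns the
number of characters written, which the REPL then echoes:

>>> import sys
>>> sys.stdout.write("hello\n")
hello
6
>>> line = sys.stdin.readline()  # blocks until the user types a line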
In an assignment such as x = 5, x is a name that references the object, 5.
The process of assigning a reference to 5 to x is called binding. A binding
causes a name to be associated with an object in the innermost scope of the
currently executing program. Bindings may occur during a number of
instances such as during variable assignment or function or method calls
when the supplied parameter is bound to the argument. It is important to
note that names are just symbols and they have no type associated with
them; names are just references to objects that actually have types.
3.3 Name-spaces
A name-space, as the name implies, is a context in which a given set of
names is bound to objects. Name-spaces in Python are currently
implemented as dictionary mappings. The built-in name-space is an
example of a name-space that contains all the built-in functions and this can
be accessed by entering __builtins__.__dict__ at the terminal (the result
is of a considerable amount). The interpreter has access to multiple name-
spaces including the global name-space, the built-in name-space and the
local name-space. Name-spaces are created at different times and have
different lifetimes. For example, a new local name-space is created at the
start of a function execution and this name-space is discarded when the
function exits or returns. The global name-space refers to the module-wide
name-space and all names defined in this name-space are available module-
wide. The local name-space is created by function definitions while the
built-in name-space contains all the built-in names. These three name-
spaces are the main name-spaces available to the interpreter.
3.4 Scopes
A scope is an area of a program in which a set of name bindings (name-
spaces) is visible and directly accessible. Direct access is an important
characteristic of a scope as will be explained when classes are discussed.
This simply means that a name, name, can be used as is, without the need
for dot notation such as SomeClassOrModule.name to access it. At runtime,
the following scopes may be available.
In order to modify the object from the global scope, the global statement is
used as shown in the following snippet.
>>> a = 1
>>> def inc_a():
... global a
... a += 1
...
>>> inc_a()
>>> a
2
Python also has the nonlocal keyword that is used when there is a need to
modify a variable bound in an outer non-global scope from an inner scope.
This proves very handy when working with nested functions (also referred
to as closures). A very trivial illustration of the nonlocal keyword in action
is shown in the following snippet that defines a simple counter object that
counts in ascending order.
>>> def make_counter():
...     count = 0
...     def counter():
...         nonlocal count  # nonlocal captures the count binding from the enclosing scope, not the global scope
...         count += 1
...         return count
...     return counter
...
>>> counter_1 = make_counter()
>>> counter_2 = make_counter()
>>> counter_1()
1
>>> counter_1()
2
>>> counter_2()
1
>>> counter_2()
2
3.5 eval()
eval is a Python built-in function for dynamically executing Python
expressions in a string (the content of the string must be a valid Python
expression) or code objects. The function has the following signature:
eval(expression, globals=None, locals=None). If supplied, the
globals argument to the eval function must be a dictionary while the
locals argument can be any mapping. The evaluation of the supplied
expression is done using the globals and locals dictionaries as the global
and local name-spaces. If __builtins__ is absent from the globals
dictionary, the current globals are copied into globals before the expression
is parsed. This means that the expression will have either full or restricted
access to the standard built-ins depending on the execution environment;
this way the execution environment of eval can be restricted or sandboxed.
eval when called returns the result of executing the expression or code
object for example:
```python
>>> eval("2 + 1") # note the expression is in a string
3
```
Since eval can take arbitrary code objects as argument and return the value
of executing such expressions, it, along with exec, is used in executing
arbitrary Python code that has been compiled into code objects using the
compile built-in function. Online Python interpreters are able to execute
Python code supplied by their users using both eval and exec among other
methods.
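For instance, a minimal sketch using compile:

```python
>>> code = compile("2 + 1", "<string>", "eval")  # compile the expression into a code object
>>> eval(code)
3
```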
3.6 exec()
exec is the counterpart to eval. It executes a string interpreted as a suite
of Python statements or a code object. The code supplied is expected to be
valid as file input in both cases. exec has the following signature:
exec(object[, globals[, locals]]). The following is an example of
exec using a string and the current name-spaces.
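A minimal sketch:

```python
>>> exec("x = 5")  # the assignment executes in the current global name-space
>>> x
5
```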
The type() function returns an object’s type; the type of an object is also an
object itself. An object’s type is also normally unchangeable. An object’s
type determines the operations that the object supports and also defines the
possible values for objects of that type. Python is a dynamic language
because types are not associated with variables so a variable, x, may refer to
a string and later refer to an integer as shown in the following example.
x = 1
x = "Nkem"
This is unlike JavaScript, where such an operation succeeds because the
interpreter implicitly converts the integer to a string and then concatenates
the two.
Python objects are either one of the following:
1. Mutable objects: These refer to objects whose value can change. For
example a list is a mutable data structure as we can grow or shrink the
list at will.
>>> x = [1, 2, 4]
>>> y = [5, 6, 7]
>>> x = x + y
>>> x
[1, 2, 4, 5, 6, 7]
>>>
Programmers new to Python from other languages may find some behavior
of mutable objects puzzling; Python is a pass-by-object-reference language
which means that the values of object references are the values passed to
function or method calls and names bound to variables refer to these
reference values. For example consider the snippets shown in the following
example.
>>> x = [1, 2, 3]
>>> x
[1, 2, 3]
# now x and y refer to the same list
>>> y = x
# a change to x will also be reflected in y
>>> x.extend([4, 5, 6])
>>> y
[1, 2, 3, 4, 5, 6]
The weakref.ref function returns an object that, when called, returns the
weakly referenced object. The weakref module also provides the
weakref.proxy alternative to the weakref.ref function for creating weak
references. This method creates a proxy object that can be used just like the
original object without the need for a call as shown in the following snippet.
>>> d = weakref.proxy(a)
>>> d
<weakproxy at 0x10138ba98 to Foo at 0x1012d6828>
>>> d.__dict__
{}
When all the strong references to an object have been deleted then the weak
reference loses its reference to the original object and the object is ready for
garbage collection. This is shown in the following example.
>>> del a
>>> del b
>>> d
<weakproxy at 0x10138ba98 to NoneType at 0x1002040d0>
>>> d.__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ReferenceError: weakly-referenced object no longer exists
>>> c()
>>> c().__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute '__dict__'
None Type
The None type is a singleton object that has a single value and this value is
accessed through the built-in name None. It is used to signify the absence of
a value in many situations, e.g., it is returned by functions that don’t
explicitly return a value as illustrated below:
```python
>>> def print_name(name):
... print(name)
...
>>> name = print_name("nkem")
nkem
>>> name
>>> type(name)
<class 'NoneType'>
>>>
```
NotImplemented Type
The NotImplemented type is another singleton object that has a single
value. The value of this object is accessed through the built-in name
NotImplemented. This object should be returned when we want to delegate
the search for the implementation of a method to the interpreter rather than
throwing a runtime NotImplementedError exception. For example,
consider the two types, Foo and Bar below:
class Foo:
    def __init__(self, value):
        self.value = value

class Bar:
    def __init__(self, value):
        self.value = value
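The classes as shown do not yet demonstrate the delegation; a sketch of how
they might (the __eq__ methods here are illustrative additions, not from the
original text):

class Foo:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Foo):
            return self.value == other.value
        return NotImplemented  # delegate: the interpreter then tries the reflected method

class Bar:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, (Foo, Bar)):
            return self.value == other.value
        return NotImplemented

>>> Foo(1) == Bar(1)  # Foo.__eq__ returns NotImplemented; Bar.__eq__ answers
True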
Ellipsis Type
This is another singleton object type that has a single value. The value of
this object is accessed through the literal ... or the built-in name Ellipsis.
The truth value for the Ellipsis object is true. The Ellipsis object is
mainly used in numeric python for indexing and slicing matrices. The
numpy documentation provides more insight into how the Ellipsis object
is used.
Numeric Type
Sequence Type
Sequence types are finite ordered collections of objects that can be indexed
by integers; using negative indices in python is legal. Sequences fall into
two categories - mutable and immutable sequences.
```python
>>> b = b'abc'
>>> b
b'abc'
>>> type(b)
<class 'bytes'>
>>> b = bytes('abc', 'utf-16')  # encode a string to bytes using UTF-16 encoding
>>> b
b'\xff\xfea\x00b\x00c\x00'
>>> b.decode("utf-8")  # decoding fails as encoding has been done with utf-16
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
>>> b.decode("utf-16") # decoding to string passes
'abc'
>>> type(b.decode("utf-16"))
<class 'str'>
```
```python
>>> names = "Obi", # tuple of 1
>>> names
('Obi',)
>>> type(names)
<class 'tuple'>
```
Sequence types have some operations that are common to all sequence
types. These are described in the following table; x is an object, s and t are
sequences and n, i, j, k are integers.
| Operation | Result |
| --- | --- |
| x in s | True if an item of s is equal to x, else False |
| x not in s | False if an item of s is equal to x, else True |
| s + t | the concatenation of s and t |
| s * n or n * s | n shallow copies of s concatenated |
| s[i] | ith item of s, origin 0 |
| s[i:j] | slice of s from i to j |
| s[i:j:k] | slice of s from i to j with step k |
| len(s) | length of s |
| min(s) | smallest item of s |
| max(s) | largest item of s |
| s.index(x[, i[, j]]) | index of the first occurrence of x in s (at or after index i and before index j) |
| s.count(x) | total number of occurrences of x in s |
Note
1. Values of n that are less than 0 are treated as 0 and this yields an empty
sequence of the same type as s, as shown below:
>>> x = "obi"
>>> x*-2
''
2. Copies made from using the * operation are shallow copies; any nested
structures are not copied. This can result in some confusion when
trying to create copies of a structure such as a nested list.
>>> lists = [[]] * 3 # shallow copy
>>> lists
[[], [], []] # all three copies reference the same list
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
To avoid shallow copies when dealing with nested lists, the following
method can be adopted:
```python
>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
```
Python defines the interfaces (that’s the closest word that can be used)
Sequence and MutableSequence in the collections library and these
define all the methods a type must implement to be considered an
immutable or mutable sequence respectively; when abstract base classes are
discussed, this concept will become much clearer.
Set
These are unordered, finite collections of unique Python objects. Sets are
unordered so they cannot be indexed by integers. The members of a set
must be hashable so only immutable objects can be members of a set. This
is so because sets in Python are implemented using a hash table; a hash
table uses some kind of hash function to compute an index into a slot. If a
mutable value is used then the index calculated will change when this
object changes, thus mutable values are not allowed in sets. Sets provide
efficient solutions for membership testing, de-duplication, and the computing
of intersections, unions and differences. Sets can be iterated over, and the
built-in function len() returns the number of items in a set. There are
currently two intrinsic set types: the mutable set type and the immutable
frozenset type. Both have a number of common methods that are shown in
the following table.
| Method | Description |
| --- | --- |
| len(s) | Return the cardinality of the set, s. |
| x in s | Test x for membership in s. |
| x not in s | Test x for non-membership in s. |
| isdisjoint(other) | Return True if the set has no elements in common with other. Sets are disjoint if and only if their intersection is the empty set. |
| issubset(other), set <= other | Test whether every element in the set is in other. |
| set < other | Test whether the set is a proper subset of other, that is, set <= other and set != other. |
| issuperset(other), set >= other | Test whether every element in other is in the set. |
| set > other | Test whether the set is a proper superset of other, that is, set >= other and set != other. |
| union(other, …), set \| other \| … | Return a new set with elements from the set and all others. |
| intersection(other, …), set & other & … | Return a new set with elements common to the set and all others. |
| difference(other, …), set - other - … | Return a new set with elements in the set that are not in the others. |
| symmetric_difference(other), set ^ other | Return a new set with elements in either the set or other but not both. |
| copy() | Return a new set with a shallow copy of s. |
Mapping
A Python mapping is a finite set of objects (values) indexed by a set of
immutable Python objects (keys). The keys in a mapping must be
hashable for the same reason given previously in describing set members,
thus eliminating mutable types like lists, sets and other mappings. The
expression a[k] selects the item indexed by the key, k, from the mapping a
and can be used in assignments or del statements. The dictionary,
mostly called dict for convenience, is the only intrinsic mapping type built
into Python.
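For instance:

>>> a = {"name": "obi", "balance": 10}
>>> a["name"]            # selection by key
'obi'
>>> a["balance"] = 20    # assignment
>>> del a["balance"]     # deletion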
Callable Types
These are types that support the function call operation. The function call
operation is the use of () after the type name. In the example below, the
function is print_name and the function call is when the () is appended to
the function name as such print_name().
def print_name(name):
    print(name)
Functions are not the only callable types in Python; any object type that
implements the __call__ special method is a callable type. The built-in
function callable() is used to check whether a given object is callable. The
following are built-in callable Python types:
1. User-defined functions: these are functions that a user defines with the
def statement such as the print_name function from the previous
section.
2. Methods: these are functions defined within a class and accessible
within the scope of the class or a class instance. These methods could
either be instance methods, static or class methods.
3. Built-in functions: These are functions available within the interpreter
core such as the len function.
4. Classes: Classes are also callable types. The process of creating a class
instance involves calling the class such as Foo().
Custom Type
Custom types are created using the class statement. Custom class objects
have a type of type. These are types created by user defined programs and
they are discussed in the chapter on object oriented programming.
Module Type
A module is one of the organizational units of Python code just like
functions or classes. A module is also an object just like every other value
in the python. The module type is created by the import system as invoked
either by the import statement, or by calling functions such as
importlib.import_module() and built-in __import__().
File/IO Types
A file object represents an open file. Files are created using the open built-
in function that opens and returns a file object on the local file system; the
file object can be opened in either binary or text mode. Other methods for
creating file objects include:

1. os.fdopen, which takes a file descriptor and creates a file object from it.
The os.open method, not to be confused with the open built-in function,
is used to create a file descriptor that can then be passed to the
os.fdopen method to create a file object as shown in the following
example.
>>> import os
>>> fd = os.open("test.txt", os.O_RDWR|os.O_CREAT)
>>> type(fd)
<class 'int'>
>>> fd
3
>>> fo = os.fdopen(fd, "w")
>>> fo
<_io.TextIOWrapper name=3 mode='w' encoding='UTF-8'>
>>> type(fo)
<class '_io.TextIOWrapper'>
The built-in objects, sys.stdin, sys.stdout and sys.stderr, are also file
objects corresponding to the python interpreter’s standard input, output and
error streams.
Built-in Types
These are objects used internally by the Python interpreter but accessible by
a user program. They include traceback objects, code objects, frame objects
and slice objects.
Code Objects
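Consider a simple function (a stand-in reconstruction of the example
assumed by the discussion below):

def return_author_name():
    return "Obi Ike-Nwosu"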
The code object for the above function can be obtained from the function
object by assessing its __code__ attribute as shown below:
>>> return_author_name.__code__
<code object return_author_name at 0x102279270, file "<stdin>", line 1>
We can go further and inspect the code object using the dir function to see
the attributes of the code object.
>>> dir(return_author_name.__code__)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_kwonlyargcount', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
| Attribute | Description |
| --- | --- |
| co_argcount | number of arguments (not including * or ** args) |
| co_code | string of raw compiled bytecode |
| co_consts | tuple of constants used in the bytecode |
| co_filename | name of file in which this code object was created |
| co_firstlineno | number of first line in Python source code |
| co_flags | bitmap: 1=optimized \| 2=newlocals \| 4=*arg \| 8=**arg |
| co_lnotab | encoded mapping of line numbers to bytecode indices |
| co_name | name with which this code object was defined |
| co_names | tuple of names of local variables |
| co_nlocals | number of local variables |
| co_stacksize | virtual machine stack space required |
| co_varnames | tuple of names of arguments and local variables |
We can view the bytecode string for the function using the co_code attribute
of the code object as shown below.
>>> return_author_name.__code__.co_code
b'd\x01\x00S'
Frame objects represent execution frames. Python code blocks are executed
in execution frames. The call stack of the interpreter stores information
about currently executing subroutines and the call stack is made up of stack
frame objects. Frame objects on the stack have a one-to-one mapping with
subroutine calls by the program executing or the interpreter. The frame
object contains code objects and all necessary information, including
references to the local and global name-spaces, necessary for the runtime
execution environment. The frame objects are linked together to form the
call stack. To simplify how this all fits together a bit, the call stack can be
thought of as a stack data structure (it actually is), every time a subroutine is
called, a frame object is created and inserted into the stack and then the
code object contained within the frame is executed. Some special read-only
attributes of frame objects include:
1. f_back points to the previous stack frame towards the caller, or None if
this is the bottom stack frame.
2. f_code is the code object being executed in this frame.
3. f_locals is the dictionary used to look up local variables.
4. f_globals is used for global variables.
5. f_builtins is used for built-in names.
6. f_lasti gives the precise instruction - it is an index into the bytecode
string of the code object.
Traceback Objects
Slice Objects
| Attribute | Description |
| --- | --- |
| start | the lower bound |
| stop | the optional upper bound |
| step | the optional step value |
Each of the optional attributes is None if omitted. Slices can take a number
of forms in addition to the standard slice(start, stop [,step]). Other
forms include
a[start:end]  # items start through end-1; equivalent to a[slice(start, end)]
a[start:]     # items start through the rest of the sequence; equivalent to a[slice(start, None)]
a[:end]       # items from the beginning through end-1; equivalent to a[slice(None, end)]
a[:]          # a shallow copy of the whole sequence; equivalent to a[slice(None, None)]
The start or end values may also be negative in which case we count from
the end of the array as shown below:
a[-1]   # last item in the sequence
a[-2:]  # last two items; equivalent to a[slice(-2, None)]
a[:-2]  # everything except the last two items; equivalent to a[slice(None, -2)]
Generator Objects
With a strong understanding of the built-in type hierarchy, the stage is now
set for examining object oriented programming and how users can create
their own type hierarchy and even make such types behave like built-in
types.
5. Object Oriented Programming
Classes are the basis of object oriented programming in python and are one
of the basic organizational units in a python program.
def del_account(self):
    Account.num_accounts -= 1

def inquiry(self):
    return self.balance
Class Objects
The execution of a class statement creates a class object. At the start of the
execution of a class statement, a new name-space is created and this serves
as the name-space into which all class attributes go; unlike languages like
Java, this name-space does not create a new local scope that can be used by
class methods hence the need for fully qualified names when accessing
attributes. The Account class from the previous section illustrates this; a
method trying to access the num_accounts variable must use the fully
qualified name, Account.num_accounts else an error results such as when
the fully qualified name is not used in the __init__ method as shown
below:
class Account(object):
    num_accounts = 0

    def del_account(self):
        Account.num_accounts -= 1

    def inquiry(self):
        return self.balance
A little diversion here. One may ask: if the class created is an object, then
what is the class of the class object? In accordance with the Python
philosophy that every value is an object, the class object does indeed have
a class from which it is created; this is the type class.
>>> type(Account)
<class 'type'>
So just to confuse you a bit, the type of a type, the Account type, is type. To
get a better understanding of the fact that a class is indeed an object with its
own class we go behind the scenes to explain what really goes on during the
execution of a class statement using the Account example from above.
>>>class_name = "Account"
>>>class_parents = (object,)
>>>class_body = """
num_accounts = 0
def del_account(self):
Account.num_accounts -= 1
def inquiry(self):
return self.balance
"""
# a new dict is used as local name-space
>>>class_dict = {}
#the body of the class is executed using dict from above as local
# name-space
>>>exec(class_body, globals(), class_dict)
# viewing the class dict reveals the name bindings from class body
>>> class_dict
{'del_account': <function del_account at 0x106be60c8>, 'num_accounts':
0, 'inquiry': <function i\
nquiry at 0x106beac80>, 'deposit': <function deposit at 0x106be66e0>,
'withdraw': <function withdraw\
at 0x106be6de8>, '__init__': <function __init__ at 0x106be2c08>}
During the execution of class statement, the interpreter carries out the
following steps behind the scene:
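In outline (a sketch of the standard mechanism): the class body is executed
in a fresh dictionary, and the class object is then created by calling type
with the class name, the tuple of base classes and that dictionary, as the
following continuation of the example shows:

>>> Account = type(class_name, class_parents, class_dict)
>>> Account
<class '__main__.Account'>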
Object instantiation is carried out by calling the class object like a normal
function with required parameters for the __init__ method of the class as
shown in the following example:
>>> Account("obi", 0)
An instance object that has been initialized with the supplied arguments is
returned from the instantiation of a class object. In the case of the Account
class, the account name and account balance are set, and the number of
instances is incremented by 1 in the __init__ method.
Instance Objects
If class objects are the cookie cutters then instance objects are the cookies
that are the result of instantiating class objects. Instance objects are returned
after the correct initialization of a class just as shown in the previous
section. Attribute references are the only operations that are valid on
instance objects. Instance attributes are either data attributes, better known
as instance variables in languages like Java, or method attributes.
Method Objects
If x is an instance of the Account class, x.deposit is an example of a
method object. Method objects are similar to functions however during a
method definition, an extra argument is included in the arguments list, the
self argument. This self argument refers to an instance of the class but
why do we have to pass an instance as an argument to a method? This is
best illustrated by a method call such as the following.
>>> x = Account('obi', 10)
>>> x.inquiry()
10
def del_account(obj):
    Account.num_accounts -= 1

def inquiry(obj):
    return obj.balance
>>> Account.num_accounts
0
>>> x = Account('obi', 0)
>>> x.deposit(10)
>>> Account.inquiry(x)
10
User defined classes can also implement these special methods; a corollary
of this is that built-in operators such as + or [] can be adapted for use by
user defined classes. This is one of the cornerstones of polymorphism in
Python. In this book, special methods are grouped according to the
functions they serve. These groups include:
Special methods for instance creation
The __new__ and __init__ special methods are the two methods that are
integral to instance creation. New class instances are created in a two step
process; first the static method __new__ is called to create and return a new
class instance, then the __init__ method is called to initialize the newly
created object with the supplied arguments. A very important instance in
which there is a need to override the __new__ method is when sub-classing
built-in immutable types. Any initialization that is done in the sub-class
must be done in __new__, before object creation. This is because once an
immutable object is created, its value cannot be changed, so it makes no
sense trying to carry out any function that modifies the created object in an
__init__ method. An example of sub-classing is shown in the following
snippet in which whatever value is supplied is rounded up to the next
integer.
>>> import math
>>> class NextInteger(int):
... def __new__(cls, val):
... return int.__new__(cls, math.ceil(val))
...
>>> NextInteger(2.2)
3
>>>
Users are already familiar with defining the __init__ method; the
__init__ method is overridden to perform attribute initialization for
instances of mutable types.
Special methods for attribute access
def del_account(self):
    Account.num_accounts -= 1

def inquiry(self):
    return "Name={}, balance={}".format(self.name, self.balance)

def __getattr__(self, name):
    # invoked only when normal attribute lookup fails; this definition is
    # assumed, reconstructed from the output below
    return "Hey I dont see any attribute called {}".format(name)

>>> x = Account('obi', 0)
>>> x.balaance
'Hey I dont see any attribute called balaance'
def del_account(self):
    Account.num_accounts -= 1

def inquiry(self):
    return "Name={}, balance={}".format(self.name, self.balance)

>>> x = Account('obi', 0)
>>> x.balaance  # this will result in a RuntimeError: maximum recursion depth exceeded while calling a Python object exception
The following table shows some of the basic operators and the special
methods invoked when these operators are encountered.
Python has the concept of reflected operations; this was covered in the
section on NotImplemented in the previous chapter. The idea behind this
concept is that if the left operand of a binary arithmetic operation does not
support a required operation and returns NotImplemented, then an attempt is
made to call the corresponding reflected operation on the right operand
provided the types of both operands differ. An example of this rarely used
functionality is shown in the following trivial example for emphasis.
class MyNumber(object):
    def __init__(self, x):
        self.x = x

    def __str__(self):
        return str(self.x)

>>> 10 - MyNumber(9)  # the int type, 10, does not know how to subtract a MyNumber and MyNumber does not know how to handle the operation either
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'int' and 'MyNumber'
In the next snippet the class implements the reflected special method and
this reflected method is called by the interpreter.
class MyFixedNumber(MyNumber):
    def __rsub__(self, other):  # reflected operation implemented
        return MyNumber(other - self.x)
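Now the same expression succeeds because the interpreter falls back to the
reflected method:

>>> print(10 - MyFixedNumber(9))  # int returns NotImplemented, then MyFixedNumber.__rsub__ runs
1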
Sequence and mapping are often referred to as container types because they
can hold references to other objects. User-defined classes can emulate
container types to the extent that this makes sense if such classes implement
the special methods listed in the following table.
| Special Method | Description |
| --- | --- |
| __len__(obj) | returns length of obj. This is invoked to implement the built-in function len(). An object that doesn’t define a __bool__() method and whose __len__() method returns zero is considered to be false in a Boolean context. |
| __getitem__(obj, key) | fetches item, obj[key]. For sequence types, the keys should be integers or slice objects. If key is of an inappropriate type, TypeError may be raised; if the key has a value outside the set of indices for the sequence, IndexError should be raised. For mapping types, if key is absent from the container, KeyError should be raised. |
| __setitem__(obj, key, value) | sets obj[key] = value |
| __delitem__(obj, key) | deletes obj[key]. Invoked by del obj[key] |
| __contains__(obj, key) | returns True if key is contained in obj and False otherwise. Invoked by a call to key in obj |
| __iter__(self) | called when an iterator is required for a container. This method should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container. Iterator objects also need to implement this method; they are required to return themselves. This is also used by the for..in construct. |
Sequence types such as lists support the addition (for concatenating lists)
and multiplication operators (for creating copies), + and * respectively, by
defining the methods __add__(), __radd__(), __iadd__(), __mul__(),
__rmul__() and __imul__(). Sequence types also implement the
__reversed__ method that implements the reversed() method that is used
for reverse iteration over a sequence. User defined classes can implement
these special methods to get the required functionality.
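A minimal sketch (the class and attribute names here are illustrative) of a
user-defined class emulating a sequence:

class Portfolio:
    # a thin sequence-like wrapper around a list of holdings
    def __init__(self, holdings):
        self._holdings = list(holdings)

    def __len__(self):
        return len(self._holdings)     # enables len(p)

    def __getitem__(self, index):
        return self._holdings[index]   # enables p[i], slicing and for..in

    def __contains__(self, item):
        return item in self._holdings  # enables 'x in p'

>>> p = Portfolio(["GOOG", "MSFT", "AAPL"])
>>> len(p)
3
>>> p[1]
'MSFT'
>>> "AAPL" in p
True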
Emulating Callable Types
Callable types support the function call syntax, (args). Classes that
implement the __call__(self[, args...]) method are callable. User
defined classes for which this functionality makes sense can implement this
method to make class instances callable. The following example shows a
class implementing the __call__(self[, args...]) method and how
instances of this class can be called using the function call syntax.
class Account(object):
    num_accounts = 0

    def del_account(self):
        Account.num_accounts -= 1

    def inquiry(self):
        return self.balance

    def __call__(self):
        # assumed implementation: calling an instance delegates to inquiry
        return self.inquiry()
In Python, the fact that x==y is True does not imply that x!=y is False, so
__eq__() should be defined along with __ne__() so that the operators are
well behaved. __lt__() and __gt__(), and __le__() and __ge__() are each
other’s reflection while __eq__() and __ne__() are their own reflection;
this means that if a call to the implementation of any of these methods on
the left argument returns NotImplemented, the reflected operator is used.
def del_account(self):
    Account.num_accounts -= 1

def inquiry(self):
    return "Name={}, balance={}".format(self.name, self.balance)
A few things that are worth noting about __slots__ include the following:
User defined classes have a default hash value that is derived from their
id() value. Any __hash__() implementation must return an integer, and
objects that are equal by comparison must have the same hash value, so for
two objects, a and b, (a==b and hash(a)==hash(b)) must be true. A few
rules for implementing a __hash__() method include the following:

1. A class should only define the __hash__() method if it also defines the
__eq__() method.
def __neg__(self):
    """
    Returns the negation of a vector.

    >>> u = Vec({1,3,5,7},{1:1,3:2,5:3,7:4})
    >>> -u
    Vec({1, 3, 5, 7},{1: -1, 3: -2, 5: -3, 7: -4})
    >>> u == Vec({1,3,5,7},{1:1,3:2,5:3,7:4})
    True
    >>> -Vec({'a','b','c'}, {'a':1}) == Vec({'a','b','c'}, {'a':-1})
    True
    """
    return Vec(self.D, {key: -self[key] for key in self.D})

def __mul__(self, other):
    # If other is a vector, returns the dot product of self and other
    if isinstance(other, Vec):
        return dot(self, other)
    else:
        return NotImplemented  # Will cause other.__rmul__(self) to be invoked

# Make sure to add together values for all keys from u.f and v.f even if
# some keys in u.f do not exist in v.f (or vice versa)

def is_almost_zero(self):
    s = 0
    for x in self.f.values():
        if isinstance(x, int) or isinstance(x, float):
            s += x*x
        elif isinstance(x, complex):
            y = abs(x)
            s += y*y
        else:
            return False
    return s < 1e-20

def __str__(v):
    "pretty-printing"
    D_list = sorted(v.D, key=repr)
    numdec = 3
    wd = dict([(k, (1 + max(len(str(k)), len('{0:.{1}G}'.format(v[k], numdec))))) if isinstance(v[k], int) or isinstance(v[k], float) else (k, (1 + max(len(str(k)), len(str(v[k]))))) for k in D_list])
    s1 = ''.join(['{0:>{1}}'.format(str(k), wd[k]) for k in D_list])
    s2 = ''.join(['{0:>{1}.{2}G}'.format(v[k], wd[k], numdec) if isinstance(v[k], int) or isinstance(v[k], float) else '{0:>{1}}'.format(v[k], wd[k]) for k in D_list])
    return "\n" + s1 + "\n" + '-'*sum(wd.values()) + "\n" + s2

def __hash__(self):
    "Here we pretend Vecs are immutable so we can form sets of them"
    h = hash(frozenset(self.D))
    for k, v in sorted(self.f.items(), key=lambda x: repr(x[0])):
        if v != 0:
            h = hash((h, hash(v)))
    return h

def __repr__(self):
    return "Vec(" + str(self.D) + "," + str(self.f) + ")"

def copy(self):
    "Don't make a new copy of the domain D"
    return Vec(self.D, self.f.copy())

def __iter__(self):
    raise TypeError('%r object is not iterable' % self.__class__.__name__)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
5.4 Inheritance
Inheritance is one of the basic tenets of object oriented programming and
python supports multiple inheritance just like C++. Inheritance provides a
mechanism for creating new classes that specialise or modify a base class
thereby introducing new functionality. We call the base class the parent
class or the super class. An example of a class inheriting from a base class
in python is given in the following example.
class Account:
    """base class for representing user accounts"""
    num_accounts = 0

    def del_account(self):
        Account.num_accounts -= 1

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

class SavingsAccount(Account):
    def __repr__(self):
        return "SavingsAccount({}, {}, {})".format(self.name, self.balance, self.rate)
Multiple Inheritance
In multiple inheritance, a class can have multiple parent classes. This type
of hierarchy is strongly discouraged. One of the issues with this kind of
inheritance is the complexity involved in properly resolving methods when
called. Imagine a class, D, that inherits from two classes, B and C, and there
is a need to call a method from the parent classes; however, both parent
classes implement the same method. How is the order in which classes are
searched for the method determined? A Method Resolution Order
algorithm determines how a method is found in a class or any of the class’
base classes. In Python, the resolution order is calculated at class definition
time and stored in the class __dict__ as the __mro__ attribute. To illustrate
this, imagine a class hierarchy with multiple inheritance such as that
showed in the following example.
>>> class A:
... def meth(self): return "A"
...
>>> class B(A):
... def meth(self): return "B"
...
>>> class C(A):
... def meth(self): return "C"
...
>>> class D(B, C):
... def meth(self): return "X"
...
>>>
>>> D.__mro__
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>)
>>>
This section will show the power of the super keyword in a multiple
inheritance hierarchy. The class hierarchy from the previous section is used.
This example is from the excellent write-up by Guido van Rossum on Type
Unification. Imagine that A defines a method that is overridden by B, C and
D. Suppose that there is a requirement that all the methods are called; the
method may be a save method that saves an attribute for each type it is
defined for, so missing any call will result in some unsaved data in the
hierarchy. A combination of super and __mro__ provide the ammunition for
solving this problem. This solution is referred to as the call-next method by
Guido van Rossum and is shown in the following snippet:
class A(object):
    def meth(self):
        "save A's data"
        print("saving A's data")

class B(A):
    def meth(self):
        "save B's data"
        super(B, self).meth()
        print("saving B's data")

class C(A):
    def meth(self):
        "save C's data"
        super(C, self).meth()
        print("saving C's data")
Static Methods
Static methods are normal functions that exist in the name-space of a class.
Referencing a static method from a class shows that rather than an unbound
method type, a function type is returned as shown below:
class Account(object):
    num_accounts = 0

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

    @staticmethod
    def static_test_method():
        return "Current Account"
>>> Account.static_test_method
<function Account.static_test_method at 0x101b846a8>
Class Methods
Class methods as the name implies operate on classes themselves rather
than instances. Class methods are created using the @classmethod decorator
with the class rather than instance passed as the first argument to the
method.
import json

class Account(object):
    num_accounts = 0

    def __init__(self, name, balance):
        # assumed initializer, consistent with the Account class used throughout
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

    @classmethod
    def from_json(cls, params_json):
        params = json.loads(params_json)
        return cls(params.get("name"), params.get("balance"))

    @staticmethod
    def type():
        return "Current Account"
The above method may be feasible for enforcing such type checking
for one or two data attributes, but as the attributes increase in number it
gets cumbersome. Alternatively, a type_check(type, val) function
could be defined and called in the __init__ method before
assignment; but this cannot be elegantly applied when the attribute
value is set after initialization. A quick solution that comes to mind is
the getters and setters present in Java but that is un-pythonic and
cumbersome.
All the above mentioned issues are linked together by the fact that they all
relate to attribute references: it is attribute access that we are trying to
customize.
Enter Python Descriptors
Descriptors provide elegant, simple, robust and re-usable solutions to the
above listed issues. Simply put, a descriptor is an object that represents the
value of an attribute. This means that if an account object has an attribute
name, a descriptor is another object that can be used to represent the value
held by that attribute, name. Such an object implements the __get__,
__set__ or __delete__ special methods of the descriptor protocol. The
signature for each of these methods is shown below:
descr.__get__(self, obj, type=None) --> value
descr.__set__(self, obj, value) --> None
descr.__delete__(self, obj) --> None
def __set__(self, instance, value):
    if not isinstance(value, self.type):
        raise TypeError("Must be a %s" % self.type)
    setattr(instance, self.name, value)

def __delete__(self, instance):
    raise AttributeError("Can't delete attribute")

class Account:
    name = TypedAttribute("name", str)
    balance = TypedAttribute("balance", int, 42)
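The snippet above omits the __init__ and __get__ methods of
TypedAttribute; a sketch of how they might look (the underscore-prefixed
storage name is an assumption made to avoid __set__ calling itself
recursively):

class TypedAttribute:
    def __init__(self, name, type, default=None):
        self.name = "_" + name  # shadow name on the instance avoids recursion
        self.type = type
        self.default = default if default else type()

    def __get__(self, instance, cls):
        return getattr(instance, self.name, self.default)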
Class Properties
Defining descriptor classes each time a descriptor is required is
cumbersome. Python properties provide a concise way of adding data
descriptors to attributes. A property signature is given below:
property(fget=None, fset=None, fdel=None, doc=None) -> property
attribute
fget, fset and fdel are the getter, setter and deleter methods for such class
attributes. The process of creating properties is illustrated with the
following example.
class Account(object):
    def __init__(self):
        self._acct_num = None

    def get_acct_num(self):
        return self._acct_num

    def del_acct_num(self):
        del self._acct_num
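The accessors are then wired up with property(); a minimal completion of
the example (the setter here is an illustrative addition):

class Account(object):
    def __init__(self):
        self._acct_num = None

    def get_acct_num(self):
        return self._acct_num

    def set_acct_num(self, value):  # illustrative setter, not in the original
        self._acct_num = value

    def del_acct_num(self):
        del self._acct_num

    # wire the accessor functions into a property
    acct_num = property(get_acct_num, set_acct_num, del_acct_num, "the acct_num property")

>>> acct = Account()
>>> acct.acct_num = 1234
>>> acct.acct_num
1234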
Python also provides the @property decorator that can be used to create
read only attributes. A property object has getter, setter, and deleter
decorator methods that can be used to create a copy of the property with the
corresponding accessor function set to the decorated function. This is best
explained with an example:
class C(object):
    def __init__(self):
        self._x = None

    # the x property; the @property decorator creates a read-only property
    @property
    def x(self):
        return self._x

    # the x property setter makes the property writeable
    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x
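A brief usage sketch of the property defined above:
>>> c = C()
>>> c.x = 10    # invokes the setter
>>> c.x         # invokes the getter
10
>>> del c.x     # invokes the deleter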
from abc import ABCMeta, abstractmethod

class Vehicle(object, metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass

class Car(Vehicle):
    pass

>>> car = Car()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Car with abstract methods change_gear, start_engine
Once a class implements all abstract methods, that class becomes a
concrete class and can be instantiated by a user.
from abc import ABCMeta, abstractmethod
class Vehicle(object, metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass

class Car(Vehicle):
    def change_gear(self):
        print("Changing gear")

    def start_engine(self):
        print("Starting engine")
Abstract base classes also allow existing classes to register as part of their
hierarchy, but they perform no check on whether those classes implement
all the methods and properties that have been marked abstract. This
provides a simple solution to the second issue raised in the opening
paragraph: a proxy class can be registered with an abstract base class and
isinstance checks will then return the correct answer.
from abc import ABCMeta, abstractmethod
class Vehicle(object, metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass

class Car(object):
    def __init__(self, make, model, color):
        self.make = make
        self.model = model
        self.color = color
>>> Vehicle.register(Car)
>>> car = Car("Toyota", "Avensis", "silver")
>>> print(isinstance(car, Vehicle))
True
Abstract base classes are used extensively in the Python standard library.
They provide a means to group Python objects, such as the number types,
that have a relatively flat hierarchy. The collections module also contains
abstract base classes for various kinds of operations involving sets,
sequences and dictionaries. Whenever we want to enforce contracts between
classes in Python, just as interfaces do in Java, abstract base classes are
the way to go.
6. The Function
The function is another organizational unit of code in Python. A Python
function is a named or anonymous set of statements or expressions. In
Python, functions are first class objects; this means that there is no
restriction on the use of functions as values: introspection can be carried
out on functions, functions can be assigned to variables, functions can be
used as arguments to other functions and functions can be returned from
method or function calls just like any other Python value such as strings
and numbers.
Python also has support for anonymous functions. These functions are
created using the lambda keyword. Lambda expressions in Python are of the
form:
lambda_expr ::= "lambda" [parameter_list]: expression
Lambda expressions return function objects after evaluation and have the
same attributes as named functions. Lambda expressions are normally used
only for very simple functions in Python because a lambda definition can
contain only one expression. A lambda definition for the square function
defined above is given in the following snippet.
>>> square = lambda x: x**2
>>> for i in range(10):
...     print(square(i))
...
0
1
4
9
16
25
36
49
64
81
>>>
>>> type(square)
<class 'function'>
Like every other object, introspection on functions using the dir() function
provides a list of function attributes.
def square(x):
    """return square of given number"""
    return x**2
>>> square
<function square at 0x031AA230>
>>> dir(square)
['__annotations__', '__call__', '__class__', '__closure__', '__code__',
'__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__get__', '__getattribute__', '__globals__',
'__gt__', '__hash__', '__init__', '__kwdefaults__', '__le__', '__lt__',
'__module__', '__name__', '__ne__', '__new__', '__qualname__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__']
>>>
>>> square.__doc__
'return square of given number'
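Functions can also be defined with default arguments. The show_args
listing discussed next was not preserved in this text; a definition consistent
with the calls that follow is:
def show_args(arg, def_arg=1, def_arg2=2):
    return "arg={}, def_arg={}, def_arg2={}".format(arg, def_arg, def_arg2)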
The above function has been defined with a single normal positional
argument, arg, and two default arguments, def_arg and def_arg2. The
function can be called in any of the following ways:
>>> show_args("tranquility")
'arg=tranquility, def_arg=1, def_arg2=2'
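Default arguments that are mutable objects behave in a way that often
surprises newcomers. The original listing was not preserved; a definition
consistent with the calls below is:
def show_args_using_mutable_defaults(arg, def_arg=[]):
    def_arg.append("Hello World")
    return "arg={}, def_arg={}".format(arg, def_arg)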
>>> show_args_using_mutable_defaults("test")
"arg=test, def_arg=['Hello World']"
>>> show_args_using_mutable_defaults("test 2")
"arg=test 2, def_arg=['Hello World', 'Hello World']"
On every function call, Hello World is added to the def_arg list, so
after two function calls the default argument contains two hello world
strings. It is important to take note of this behaviour when using
mutable values as default arguments.
show_args("test")
show_args(arg="test")
show_args("test", 3)
The arguments one, two, three, four and five are all bunched together
into a tuple that can be accessed via the args argument.
If the values for a function call are in a list then these values can be
unpacked directly into the function as shown below:
>>> args = [1, 2]
>>> print_args(*args)
1
2
The normal argument must be supplied to the function but the *args and
**kwargs are optional as shown below:
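The original listing here was not preserved; a sketch consistent with the
description might be:
def show_args(arg, *args, **kwargs):
    return "arg={}, args={}, kwargs={}".format(arg, args, kwargs)

>>> show_args("mandatory")
'arg=mandatory, args=(), kwargs={}'
>>> show_args("mandatory", 1, 2, key="value")
"arg=mandatory, args=(1, 2), kwargs={'key': 'value'}"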
At function call, the normal argument(s) are supplied normally while the
optional arguments are unpacked. This kind of function definition comes in
handy when dealing with function decorators, as will be seen in the chapter
on decorators.
When nested functions reference variables from the outer function in which
they are defined, the nested function is said to be closed over the referenced
variable. The __closure__ special attribute of a function object is used to
access the closed variables as shown in the next example.
>>> cl = x.__closure__
>>> cl
(<cell at 0x029E4470: str object at 0x02A0FD90>,)
>>> cl[0].cell_contents
0
>>> c = counter()
>>> c()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in c
UnboundLocalError: local variable 'count' referenced before assignment
Python 3 introduced the nonlocal keyword that fixes this closure scoping
issue, as shown in the following snippet.
def counter():
    count = 0
    def c():
        nonlocal count
        count += 1
        return count
    return c
>>> c = counter()
>>> c()
1
>>> c()
2
>>> c()
3
Closures can be used for maintaining state (isn't that what classes are
for?) and, for some simple cases, they provide a more succinct and readable
solution than classes. A class-based version of a simple logging API is
shown in the following example.
class Log:
    def __init__(self, level):
        self._level = level

    def __call__(self, message):
        # log the message at the level bound to this instance
        print("{}: {}".format(self._level, message))

log_info = Log("info")
log_warning = Log("warning")
log_error = Log("error")
The same functionality can be implemented with function closures, as shown
in the following snippet:
def make_log(level):
def _(message):
print("{}: {}".format(level, message))
return _
log_info = make_log("info")
log_warning = make_log("warning")
log_error = make_log("error")
As can be seen, the closure-based version is more succinct and readable
even though both versions implement exactly the same functionality.
Closures also play a major role in function decorators, a widely used
construct that is explained in the chapter on meta-programming. Closures
also form the basis for the partial function, which is described in detail
in the next section. With a firm understanding of functions, a tour of some
techniques and modules for functional programming in Python follows.
Python provides built-in functions such as map and filter, as well as
functools.reduce, that aid in functional programming. A description of
these functions follows.
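A brief illustration of the three functions (a sketch; list is used below
because map and filter return iterators in Python 3):
>>> list(map(lambda x: x * 2, [1, 2, 3]))
[2, 4, 6]
>>> list(filter(lambda x: x % 2 == 0, range(10)))
[0, 2, 4, 6, 8]
>>> from functools import reduce
>>> reduce(lambda acc, x: acc + x, range(10))
45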
The above listed functions are examples of built-in higher order functions in
Python. Some of the functionality they provide can be replicated using
more common constructs. Comprehensions are one of the most popular
alternatives to these higher order functions.
Comprehensions
Python comprehensions are syntactic constructs that enable sequences to be
built from other sequences in a clear and concise manner. Python
comprehensions are of three types namely:
1. List Comprehensions.
2. Set Comprehensions.
3. Dictionary Comprehensions.
List Comprehensions
def squares(numbers):
    return map(lambda x: x*x, numbers)

>>> sq = list(squares(range(10)))
Note that in Python 3 map returns an iterator, so list is used to
materialize the result.
The same list can be created in a more concise manner by using list
comprehensions rather than the map function as in the following example.
>>> squares = [x**2 for x in range(10)]
The result of a list comprehension expression is a new list that results from
evaluating the expression in the context of the for and if clauses that follow
it. For example, to create a list of the squares of even numbers between 0
and 10, the following comprehension is used.
>>> even_squares = [i**2 for i in range(10) if i % 2 == 0]
>>> even_squares
[0, 4, 16, 36, 64]
The expression i**2 is computed in the context of the for clause that
iterates over the numbers from 0 to 10 and the if clause that filters out non-
even numbers.
Nested for loops and List Comprehensions
List comprehensions can also be used with multiple or nested for loops.
Consider for example, the simple code fragment shown below that creates a
tuple from pair of numbers drawn from the two sequences given.
>>> combs = []
>>> for x in [1,2,3]:
... for y in [3,1,4]:
... if x != y:
... combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
The above can be rewritten in a more concise and simple manner as shown
below using list comprehensions
>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
It is important to take into consideration the order of the for loops used
in a list comprehension. Careful observation of the snippets with and
without the comprehension shows that the order of the for loops in the
comprehension is the same as it would be if the loops were written without
the comprehension. The same applies to nested for loops with a nesting
depth greater than two.
Nested List Comprehensions
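List comprehensions can themselves be nested within other comprehensions.
A classic illustration is transposing a matrix; this is a sketch, as the
original example was not preserved:
>>> matrix = [[1, 2], [3, 4], [5, 6]]
>>> [[row[i] for row in matrix] for i in range(2)]
[[1, 3, 5], [2, 4, 6]]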
Set Comprehensions
In set comprehensions, braces rather than square brackets are used to create
new sets. For example, to create the set of the squares of all numbers
between 0 and 10, the following set comprehensions is used.
>>> x = {i**2 for i in range(10)}
>>> x
{0, 1, 4, 81, 64, 9, 16, 49, 25, 36}
>>>
Dict Comprehensions
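Dictionary comprehensions use braces with a key: value expression to build
new dictionaries. A brief sketch follows (the original example was not
preserved; note that dictionary display order may vary):
>>> words = ["a", "bb", "ccc"]
>>> {word: len(word) for word in words}
{'a': 1, 'bb': 2, 'ccc': 3}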
Functools
The functools module in Python contains a few higher order functions that
act on and return other functions. A few of the interesting higher order
functions that are included in this module are described.
The built-in sort() method of lists is handy here; it accepts a key
argument that can be used to customize sorting, but the key function takes
only a single argument, so the two-argument distance() function is
unsuitable. The partial function provides an elegant way of dealing with
this, as shown in the following snippet.
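The distance function and the points list used below were not preserved in
this text; a set-up consistent with the results shown is:
from functools import partial
import math

def distance(p1, p2):
    # Euclidean distance between two points
    x1, y1 = p1
    x2, y2 = p2
    return math.hypot(x2 - x1, y2 - y1)

points = [(1, 2), (3, 4), (5, 6), (7, 8)]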
>>> pt = (4, 3)
>>> points.sort(key=partial(distance,pt))
>>> points
[(3, 4), (1, 2), (5, 6), (7, 8)]
>>>
The partial function creates and returns a callable that takes a single
argument, a point. The partial object has already captured the reference
point, pt, so when the key is called with a point argument, the distance
function passed to partial is used to compute the distance between the
supplied point and the reference point.
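The functools.singledispatch decorator turns a function into a generic
function that dispatches on the type of its first argument. The base
function for the example below was not preserved here; the version from the
functools documentation is:
from functools import singledispatch

@singledispatch
def fun(arg, verbose=False):
    if verbose:
        print("Let me just say,", end=" ")
    print(arg)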
@fun.register(int)
def _(arg, verbose=False):
if verbose:
print("Strength in numbers, eh?", end=" ")
print(arg)
@fun.register(list)
def _(arg, verbose=False):
if verbose:
print("Enumerate this:")
for i, elem in enumerate(arg):
print(i, elem)
fun("Hello, world.")
fun(1, verbose=True)
fun([1, 2, 3], verbose=True)
fun((1, 2, 3), verbose=True)
Hello, world.
Strength in numbers, eh? 1
Enumerate this:
0 1
1 2
2 3
Let me just say, (1, 2, 3)
The ubiquity of sequences requires that they be represented efficiently. One
could come up with multiple ways of representing sequences; for example, a
naive way would be to store all the members of a sequence in memory, but
this has the significant drawback that sequences are then limited in size to
the RAM available on the machine. A more clever solution is to use a single
object that knows how to compute the next required element of the sequence
on the fly, just as it is needed. Python has a built-in protocol for doing
exactly this, the __iter__ protocol. This is strongly related to
generators, a brilliant feature of the language, and both are delved into in
the next chapter.
7. Iterators and Generators
In the last section of the previous chapter, the central part sequences play
in functional programming and the need for their efficient representation
were mentioned. The idea of representing a sequence as an object that
computes and returns the next value of the sequence only at the time such a
value is needed was also introduced. This may seem hard to grasp at first,
but this chapter is dedicated to explaining this wonderful idea. It begins
with a description of a profound construct that has been left out of the
discussion till now: iterators.
7.1 Iterators
An iterable in Python is any object that implements the __iter__ special
method that when called returns an iterator (the __iter__ special method is
invoked by a call to iter(obj)). Simply put, a Python iterable is any type
that can be used with a for..in loop. Python lists, tuples, dicts and
sets are all examples of built-in iterables. Iterators are objects that
implement the iterator protocol. The iterator protocol defines the
following set of methods that need to be implemented by any object that
wants to be used as an iterator:
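1. __iter__: this is called on the initialization of the iterator and must return the iterator object itself.
2. __next__: this returns the next value of the sequence on each call and raises the StopIteration exception when the sequence is exhausted.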
Any class that fully implements the iterator protocol can be used as an
iterator. This is illustrated in the following by implementing a simple
iterator that returns Fibonacci numbers up to a given maximum value.
class Fib:
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        self.a, self.b = 0, 1
        return self

    def __next__(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib

>>> for i in Fib(10):
...     print(i)
0
1
1
2
3
5
8
A custom range function for looping through numbers can also be modelled
as an iterator. The following is a simple implementation of a range function
that loops from 0 upwards.
class CustomRange:
def __init__(self, max):
self.max = max
def __iter__(self):
self.curr = 0
return self
def __next__(self):
numb = self.curr
if self.curr >= self.max:
raise StopIteration
self.curr += 1
return numb
for i in CustomRange(10):
    print(i)
0
1
2
3
4
5
6
7
8
9
Before attempting to move on, stop for a second and study both examples
carefully. The essence of an iterator is that it knows how to calculate and
return the elements of a sequence as needed, not all at once. CustomRange
does not return all the elements in the range when it is initialized;
rather, it returns an object whose __iter__ method returns an iterator that
can calculate the next element of the range using the steps defined in the
__next__ method. It is possible to define a range that returns all positive
whole numbers (an infinite sequence) by simply removing the upper bound
check in the __next__ method.
The same idea applies to the Fib iterator. This basic idea just explained
above can be seen in built-in functions that return sequences. For example,
the built-in range function does not return a list as one would intuitively
expect but returns an object that returns a range iterator object when its
__iter__ method is called. To get the sequence as expected the range
iterator object is passed to the list constructor as shown in the following
example.
>>> ran = range(0, 10)
>>> type(ran)
<class 'range'>
>>> dir(ran)
['__class__', '__contains__', '__delattr__', '__dir__', '__doc__',
'__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__',
'__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__',
'__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__reversed__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', 'count', 'index', 'start', 'step', 'stop']
>>> iter = ran.__iter__()
>>> iter
<range_iterator object at 0x1012a4090>
>>> type(iter)
<class 'range_iterator'>
>>> iter.__next__()
0
>>> iter.__next__()
1
>>> iter.__next__()
2
>>> iter.__next__()
3
>>> iter.__next__()
4
>>> iter.__next__()
5
>>> iter.__next__()
6
>>> iter.__next__()
7
>>> iter.__next__()
8
>>> iter.__next__()
9
>>> iter.__next__()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> ran=range(10)
>>> ran
range(0, 10)
>>> list(ran) # use list to calculate all values in the sequence at once
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
7.2 Generators
Generators and iterators have a very intimate relationship. In short, Python
generators are iterators and understanding generators gives one an idea of
how iterators can be implemented. This may sound quite circular but after
going through an explanation of generators, it will become clearer. PEP 255
that describes simple generators refers to generators by their full name,
generator-iterators. Generators just like the name suggests generate (or
consume) values when their __next__ method is called. Generators are used
either by explicitly calling the __next__ method on the generator object or
using the generator object in a for...in loop. Generators are of two types:
1. Generator Functions
2. Generator Expressions
Generator Functions
Generator functions are functions that contain the yield expression. Calling
a function that contains a yield expression returns a generator object. For
example, the Fibonacci iterator can be recast as a generator using the
yield keyword as shown in the following example.
def fib(max):
a, b = 0, 1
while a < max:
yield a
a, b = b, a + b
Calling the generator function returns a generator object. The generator
object executes the statements in the function body when its __next__
method is invoked, running until the yield keyword is encountered.
>>> f = fib(10)
>>> f.__next__()
0
>>> f.__next__()
1
>>> f.__next__()
1
>>> f.__next__()
2
At that point the generator suspends execution, saves its context and
returns the value of the expression following yield to the caller. When the
caller invokes the __next__() method of the generator object again,
execution of the function continues till another yield or return expression
is encountered or the end of the function is reached. This continues till
the loop condition is false, at which point a StopIteration exception is
raised to signal that there is no more data to generate.
Generator Expressions
The values of a generator can then be accessed using for...in loops or via
calls to the __next__() method of the generator object, as shown below.
>>> squares = (i**2 for i in range(10))
>>> for square in squares:
print(square)
0
1
4
9
16
25
36
49
64
81
Generator expressions, such as the one used above, create generator objects
without using the yield expression.
When the algorithm terminates, the remaining numbers not marked in the
list are all the primes below n. This rather simple algorithm, the sieve of
Eratosthenes, can be implemented elegantly using generators.
from itertools import count

def sieve(ints):
    while True:
        prime = ints.__next__()
        yield prime
        # ints is now a generator that produces integers that
        # are not multiples of prime
        ints = filter_multiples_of_n(prime, ints)
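The filter_multiples_of_n helper is not shown above; a minimal generator
consistent with the description, along with a usage sketch, might be:
def filter_multiples_of_n(n, ints):
    # lazily yield only those numbers that are not multiples of n
    for i in ints:
        if i % n != 0:
            yield i

primes = sieve(count(2))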
The above example though very simple, shows the beauty of how
generators can be chained together with the output of one acting as input to
another; think of this stacking of generators with one another as a kind of
processing pipeline. The filter_multiples_of_n function is worth
discussing a bit here because it maybe confusing at first. counts(2) when
initialized returns a generator that returns a sequence of consecutive
numbers from 2 so the line, prime=ints.__next__() returns 2 on the first
iteration. After the yield expression, ints=filter_multiples_of_n(2,
ints) is invoked creating a generator that returns a stream of numbers that
are not multiples of 2 - note that the original sequence generator is captured
within this new generator (this is very important). Now on the next iteration
of the loop within the sieve function, the ints generator is invoked. The
generator loops through the original sequence now [3, 4, 5, 6, 7,
....] yielding the first number that is not divisible by 2, 3 in this case. This
part of the pipeline is easy to understand. The prime, 3, is yielded from the
sieve function then another generator that returns non-multiples of the
prime, 3, is created and assigned to ints. This generator captures the
previous generator that produces non-multiples of 2, and that generator
captured the original generator that produces sequences of infinite
consecutive numbers. A call to the __next__() method of this generator
will loop through the previous generator that returns non-multiples of 2 and
every non-multiple of 2 returned by the generator is checked for divisibility
by 3 and if the number is not divisible by 3 it is yielded. This chaining of
generators goes on and on. The next prime is 5 so the generator excluding
the multiples of primes will loop through the generator that returns non-
multiples of 3 which in turn loops through the generator that produces non-
multiple of 2.
To fully grasp the send() method, observe that the argument passed to the
send() method of the generator will be the result of the yield expression so
in the above example, the value that send() is called with is assigned to the
variable, line. The rest of the function is straightforward to understand.
Note that calling send(None) is equivalent to calling the generator’s
__next__() method.
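A minimal sketch of send() in action (the original example was not
preserved):
def printer():
    while True:
        line = yield
        print(line)

>>> p = printer()
>>> p.send(None)        # advance the generator to the first yield
>>> p.send("Hello coroutine")
Hello coroutine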
from collections import defaultdict

class Task:
    """Wraps a coroutine: primes it on creation and forwards values to it."""
    def __init__(self, gen):
        self.gen = gen
        self.gen.send(None)  # prime the coroutine

    def run(self, value):
        self.gen.send(value)
def read(source):
for line in source:
yield line
def print_line():
while True:
text = yield
print(text)
def word_count():
word_counts = defaultdict(int)
while True:
text = yield
for word in text.split():
word_counts[word] += 1
print("Word distribution so far is ", word_counts)
def run():
f = open("data.txt")
source = read(f)
tasks = [ Task(print_line()), Task(word_count()) ]
for line in source:
for task in tasks:
try:
task.run(line)
except StopIteration:
tasks.remove(task)
if __name__ == '__main__':
run()
The real benefit of using the new yield from keyword comes from the
ability of a calling generator to send values into the delegated generator as
shown in the following example. Thus if a value is sent into a generator
yield from enables that generator to also implicitly send the same value
into the delegated generator.
>>> def accumulate():
... tally = 0
... while 1:
... next = yield
... if next is None:
... return tally
... tally += next
...
>>> def gather_tallies(tallies):
... while 1:
... tally = yield from accumulate()
... tallies.append(tally)
...
>>> tallies = []
>>> acc = gather_tallies(tallies)
>>> next(acc) # Ensure the accumulator is ready to accept values
>>> for i in range(4):
... acc.send(i)
...
>>> acc.send(None) # Finish the first tally
>>> for i in range(5):
... acc.send(i)
...
>>> acc.send(None) # Finish the second tally
>>> tallies
[6, 10]
The complete semantics for yield from is explained in PEP 380 and given
below.
1. Any values that the iterator yields are passed directly to the caller.
2. Any values sent to the delegating generator using send() are passed
directly to the iterator. If the sent value is None, the iterator’s
__next__() method is called. If the sent value is not None, the
iterator’s send() method is called. If the call raises
StopIteration`, the delegating generator is resumed. Any other
exception is propagated to the delegating generator.
3. Exceptions other than GeneratorExit thrown into the delegating
generator are passed to the throw() method of the iterator. If the call
raises StopIteration, the delegating generator is resumed. Any other
exception is propagated to the delegating generator.
4. If a GeneratorExit exception is thrown into the delegating generator,
or the close() method of the delegating generator is called, then the
close() method of the iterator is called if it has one. If this call results
in an exception, it is propagated to the delegating generator. Otherwise,
GeneratorExit is raised in the delegating generator.
5. The value of the yield from expression is the first argument to the
StopIteration exception raised by the iterator when it terminates.
6. return expr in a generator causes StopIteration(expr) to be raised
upon exit from the generator.
The initial pattern of cells on the grid constitutes the seed of the system.
The first generation is created by applying the above rules simultaneously
to every cell and the discrete moment at which this happens is sometimes
called a tick. The rules continue to be applied repeatedly to create further
generations.
ALIVE = '*'
EMPTY = '-'
TICK = object()
class Grid(object):
    def __init__(self, height, width):
        self.height = height
        self.width = width
        self.rows = []
        for _ in range(self.height):
            self.rows.append([EMPTY] * self.width)

    def __str__(self):
        output = ''
        for row in self.rows:
            for cell in row:
                output += cell
            output += '\n'
        return output

    # indexing support used below, e.g. grid[1, 1] = ALIVE; coordinates
    # wrap around the grid edges
    def __getitem__(self, position):
        y, x = position
        return self.rows[y % self.height][x % self.width]

    def __setitem__(self, position, state):
        y, x = position
        self.rows[y % self.height][x % self.width] = state
class ColumnPrinter(object):
    def __init__(self):
        self.columns = []

    def append(self, data):
        # add a column of text to be rendered side by side
        self.columns.append(data)
def __str__(self):
row_count = 1
for data in self.columns:
row_count = max(row_count, len(data.splitlines()) + 1)
rows = [''] * row_count
for j in range(row_count):
for i, data in enumerate(self.columns):
line = data.splitlines()[max(0, j - 1)]
if j == 0:
rows[j] += str(i).center(len(line))
else:
rows[j] += line
if (i + 1) < len(self.columns):
rows[j] += ' | '
return '\n'.join(rows)
grid = Grid(5, 5)
grid[1, 1] = ALIVE
grid[2, 2] = ALIVE
grid[2, 3] = ALIVE
grid[3, 3] = ALIVE
columns = ColumnPrinter()
sim = simulate(grid.height, grid.width)
for i in range(6):
columns.append(str(grid))
grid = live_a_generation(grid, sim)
print(columns)
0 | 1 | 2 | 3 | 4 | 5
----- | ----- | ----- | ----- | ----- | -----
-*--- | --*-- | --**- | --*-- | ----- | -----
--**- | --**- | -*--- | -*--- | -**-- | -----
---*- | --**- | --**- | --*-- | ----- | -----
----- | ----- | ----- | ----- | ----- | -----
Generators are a fascinating topic and this chapter has barely scratched the
surface of what is possible. David Beazley gave a series of excellent talks,
1,2 and 3, that go into great detail about very advanced usage of generators.
8. MetaProgramming and Co.
Metaprogramming is quite an interesting area of programming.
Metaprogramming deals with code that manipulates other code. It is a broad
category that covers areas such as function decorators, class decorators,
metaclasses, context managers and the use of built-ins like exec and eval.
These constructs sometimes help to prevent repetitive code and often add
new functionality to a piece of code in elegant ways. In this chapter,
decorators, metaclasses and context managers are discussed.
8.1 Decorators
A decorator is a function that wraps another function or class. It
introduces new functionality to the wrapped class or function without
altering the original functionality, so the interface of the class or
function remains the same.
Function Decorators
A good understanding of functions as first class objects is important in
order to understand function decorators. A reader will be well served by
reviewing the material on functions. When functions are first class objects
the following will apply to functions:
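1. Functions can be assigned to variables and to entries in data structures.
2. Functions can be passed as arguments to other functions.
3. Functions can be returned as values from other functions.
4. Functions can be defined inside other functions.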
The above listed properties of first class functions provide the foundation
needed to explain function decorators. Put simply, function decorators are
“wrappers” that enable the execution of code before and after the
function they decorate without modifying the function itself.
def print_full_name():
print("My name is John Doe")
In the trivial example defined above, the decorator adds a new feature,
printing some information before and after the original function call, to the
original function without altering it. The decorator, logger takes a function
to be decorated, print_full_name and returns a function, func_wrapper
that calls the decorated function, print_full_name, when it is executed.
The decoration process here is calling the decorator with the function to be
decorated as argument. The function returned, func_wrapper is closed over
the reference to the decorated function, print_full_name and thus can
invoke the decorated function when it is executing. In the above, calling
decorated_func results in print_full_name being executed in addition to
some other code snippets that implement new functionality. This ability to
add new functionality to a function without modifying the original function
is the essence of function decorators. Once this concept is understood, the
concept of decorators is understood.
Decorators in Python
Now that the essence of function decorators has been discussed, an attempt
is made to deconstruct the Python constructs that make defining decorators
easier. The previous section describes the essence of decorators, but
having to apply decorators via explicit function composition as described
is cumbersome. Python therefore introduces the @ symbol for decorating
functions. Decorating a function using the Python decorator syntax is
achieved as shown in the following example.
@decorator
def a_stand_alone_function():
    pass

is equivalent to

def a_stand_alone_function():
    pass

a_stand_alone_function = decorator(a_stand_alone_function)

Similarly, stacked decorators such as

@dec2
@dec1
def func(arg1, arg2):
    pass

are equivalent to

def func(arg1, arg2):
    pass

func = dec2(dec1(func))

except that the @ syntax avoids the intermediate rebinding of the name
func. In the above, @dec1 and @dec2 are the decorator invocations. Stop,
think carefully and ensure you understand this. dec1 and dec2 are
references to function objects and these are the actual decorators. These
names can even be replaced by any function call or value that, when
evaluated, returns a function that takes another function.
What is of paramount importance is that the name reference following the @
symbol is a reference to a function object (for this tutorial we assume this
should be a function object but in reality it should be a callable object) that
takes a function as argument. Understanding this profound fact will help in
understanding python decorators and more involved decorator topics such
as decorators that take arguments.
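A version of the logger decorator whose wrapper accepts arbitrary arguments
is used in the following example; the listing was not preserved here, so
this is a sketch consistent with the discussion:
def logger(func_to_decorate):
    def func_wrapper(*args, **kwargs):
        print("Before function call")
        result = func_to_decorate(*args, **kwargs)
        print("After function call")
        return result
    return func_wrapper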
@logger
def print_full_name(first_name, last_name):
print("My name is {} {}".format(first_name, last_name))
print_full_name("John", "Doe")
Note how the *args and **kwargs parameters are used in defining the inner
wrapper function; this is for the simple reason that it cannot be known
beforehand what functions are going to be decorated and thus the function
signature of such functions.
def decorator_maker_with_arguments(decorator_arg):
    def decorator(func_to_decorate):
        def wrapped(function_arg1, function_arg2):
            print("The decorator argument is {}".format(decorator_arg))
            return func_to_decorate(function_arg1, function_arg2)
        return wrapped
    return decorator

@decorator_maker_with_arguments("Apollo 11 Landing")
def print_name(function_arg1, function_arg2):
    print("My full name is -- {} {} --".format(function_arg1, function_arg2))
functools.wraps
Using decorators involves swapping out one function for another. A result
of this is that meta information, such as docstrings, in the swapped-out
function is lost when the function is decorated. This is illustrated below:
import datetime
@logger
def print_full_name():
"""return john doe's full name"""
print("My name is John Doe")
>>> print(print_full_name.__doc__)
None
>>> print(print_full_name.__name__)
func_wrapper
Python provides the functools.wraps decorator to fix this; it copies the
meta information of the decorated function over to the wrapper function:
from functools import wraps

def logger(func_to_decorate):
    # a wrapper function is defined on the fly
    @wraps(func_to_decorate)
    def func_wrapper(*args, **kwargs):
        return func_to_decorate(*args, **kwargs)
    return func_wrapper
@logger
def print_full_name(first_name, last_name):
"""return john doe's full name"""
print("My name is {} {}".format(first_name, last_name))
>>> print(print_full_name.__doc__)
return john doe's full name
>>> print(print_full_name.__name__)
print_full_name
Class Decorators
Like functions, classes can also be decorated. Class decorators serve the
same purpose as function decorators: introducing new functionality without
modifying the actual classes. An example is the following singleton class
decorator that ensures that only one instance of the decorated class is
ever initialised throughout the lifetime of the program.
def singleton(cls):
instances = {}
def get_instance():
if cls not in instances:
instances[cls] = cls()
return instances[cls]
return get_instance
Putting the decorator to use in the following examples shows how this
works. In the following example, the Foo class is initialized twice however
comparing the ids of both initialized objects shows that they both refer to
the same object.
@singleton
class Foo(object):
pass
>>> x = Foo()
>>> id(x)
4310648144
>>> y = Foo()
>>> id(y)
4310648144
>>> id(y) == id(x) # both x and y are the same object
True
>>>
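The same behaviour can be implemented with a metaclass. The Singleton
metaclass used below was not preserved in this text; a sketch consistent
with the session is:
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # create the instance only on the first call
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]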
class Foo(object, metaclass=Singleton):
    pass
>>> x = Foo()
>>> y = Foo()
>>> id(x)
4310648400
>>> id(y)
4310648400
>>> id(y) == id(x)
True
Instance, static and class methods can also be decorated. The important
thing is to take note of the order in which the decorators are applied to
static and class methods. A custom decorator must be placed below the
@staticmethod or @classmethod decorator, i.e. applied first, because these
method decorators do not return callable objects. A valid example of
method decorators is shown in the following example.
import time
from functools import wraps

def timethis(func):
@wraps(func)
def wrapper(*args, **kwargs):
start = time.time()
r = func(*args, **kwargs)
end = time.time()
print(end - start)
return r
return wrapper
class Spam:
    @timethis
    def instance_method(self, n):
        print(self, n)
        while n > 0:
            n -= 1

    @classmethod
    @timethis
    def class_method(cls, n):
        while n > 0:
            print(n)
            n -= 1

    @staticmethod
    @timethis
    def static_method(n):
        while n > 0:
            print(n)
            n -= 1
>>> Spam.class_method(10)
10
9
8
7
6
5
4
3
2
1
0.00019788742065429688
>>> Spam.static_method(10)
10
9
8
7
6
5
4
3
2
1
0.00014591217041015625
import logging
import functools

def log(func):
    '''Returns a wrapper that wraps func. The wrapper will log the entry
    and exit points of the function with logging.INFO level.'''
    logging.basicConfig()
    logger = logging.getLogger(func.__module__)

    @functools.wraps(func)
    def wrapper(*args, **kwds):
        logger.info("About to execute {}".format(func.__name__))
        f_result = func(*args, **kwds)
        logger.info("Finished the execution of {}".format(func.__name__))
        return f_result
    return wrapper
import collections

def cache(func):
    cache = {}
    logging.basicConfig()
    logger = logging.getLogger(func.__module__)
    logger.setLevel(10)

    @functools.wraps(func)
    def wrapper(*arg, **kwds):
        if not isinstance(arg, collections.Hashable):
            logger.info("Argument cannot be cached: {}".format(arg))
            return func(*arg, **kwds)
        if arg in cache:
            logger.info("Found precomputed result, {}, for argument, {}".format(cache[arg], arg))
            return cache[arg]
        else:
            logger.info("No precomputed result was found for argument, {}".format(arg))
            value = func(*arg, **kwds)
            cache[arg] = value
            return value
    return wrapper
def accepts(*types, **kw):
    '''Function decorator that checks that the arguments of the decorated
    function are of the expected types.
    '''
    if not kw:
        # default level: MEDIUM
        debug = 1
    else:
        try:
            debug = kw['debug']
        except KeyError as err:
            raise KeyError(str(err) + " is not a valid keyword argument")

    def decorator(f):
        def newf(*args):
            if debug == 0:
                return f(*args)
            assert len(args) == len(types)
            argtypes = tuple(map(type, args))
            if argtypes != types:
                # info is a helper (not shown) that builds the error message
                msg = info(f.__name__, types, argtypes, 0)
                if debug == 1:
                    raise TypeError(msg)
            return f(*args)
        newf.__name__ = f.__name__
        return newf
    return decorator
registry = {}

def register(cls):
    registry[cls.__clsid__] = cls
    return cls
@register
class Foo(object):
__clsid__ = ".mp3"
def bar(self):
pass
8.3 Metaclasses
“Metaclasses are deeper magic than 99% of users should ever worry
about. If you wonder whether you need them, you don’t”
– Tim Peters
All values in Python are objects, including classes, so a given class
object must itself be created by another class. Consider an instance, f, of
a user-defined class Foo. The type/class of the instance, f, can be found
by using the built-in function type, and in the case of the object f, the
type is Foo.
>>> class Foo(object):
... pass
...
>>> f = Foo()
>>> type(f)
<class '__main__.Foo'>
>>>
This introspection can also be extended to a class object to find out the
type/class of such a class. The following example shows the result of
applying the type() function to the Foo class.
class Foo(object):
pass
>>> type(Foo)
<class 'type'>
In Python, the class of all other class objects is the type class. This applies
to user defined classes as shown above as well as built-in classes as shown
in the following code example.
>>>type(dict)
<class 'type'>
A class, such as the type class, that is used to create other classes is
called a metaclass. That is all there is to it: a metaclass is a class that
is used to create other classes. Custom metaclasses are not used often in
Python, but sometimes it is necessary to control the way classes are
created, most especially when working on big projects with big teams.
The following snippet is the class definition for a simple class that every
Python user is familiar with but this is not the only way a class can be
defined.
# class definition
class Foo(object):
def __init__(self, name):
self.name = name
def print_name(self):
print(self.name)
The following snippet shows a more involved method for defining the same
class with all the syntactic sugar provided by the class keyword stripped
away. This snippet provides a better understanding of what actually goes on
under the covers during the execution of a class statement.
class_name = "Foo"
class_parents = (object,)
class_body = """
def __init__(self, name):
self.name = name
def print_name(self):
print(self.name)
"""
# a new dict is used as local namespace
class_dict = {}
#the body of the class is executed using dict from above as local
# namespace
exec(class_body, globals(), class_dict)
# viewing the class dict reveals the name bindings from class body
>>> class_dict
{'__init__': <function __init__ at 0x10066f8c8>, 'print_name': <function print_name at 0x10066fa60>}
# final step of class creation
Foo = type(class_name, class_parents, class_dict)
During the execution of a class statement, the interpreter carries out the
following procedures behind the scenes:
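1. The class name, the tuple of base classes and the class body are extracted from the class statement.
2. A new dictionary is created to serve as the local namespace for the class.
3. The class body is executed in this namespace, populating it with the methods and attributes defined in the body.
4. The class object is finally created by calling the metaclass (type by default) with the class name, the tuple of base classes and the populated namespace as arguments.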
Metaclasses in Action
It is possible to define custom metaclasses that can be used when creating
classes. These custom metaclasses will normally inherit from type and re-
implement certain methods such as the __init__ or __new__ methods.
Imagine that you are the chief architect for a shiny new project and you
have diligently read dozens of software engineering books and style guides
that have hammered on the importance of docstrings, so you want to enforce
the requirement that all non-private methods in the project must have
docstrings; how would you enforce this requirement?
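The original DocMeta listing was not preserved in this text; a metaclass
consistent with the traceback below might look like the following sketch,
which inspects every public callable in the class body for a docstring:
class DocMeta(type):
    def __init__(cls, name, bases, attr_dict):
        # check every public callable defined in the class body
        for key, value in attr_dict.items():
            if key.startswith("__"):
                continue
            if not callable(value):
                continue
            if not getattr(value, "__doc__"):
                raise TypeError("%s must have a docstring" % key)
        super().__init__(name, bases, attr_dict)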
class Car(object, metaclass=DocMeta):
    def change_gear(self):
        print("Changing gear")

    def start_engine(self):
        print("Starting engine")

Traceback (most recent call last):
  File "abc.py", line 47, in <module>
    class Car(object, metaclass=DocMeta):
File "abc.py", line 42, in __init__
raise TypeError("%s must have a docstring" % key)
TypeError: change_gear must have a docstring
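Another example is a metaclass that prevents a class from being subclassed,
i.e. one that makes a class final. The original listing was not preserved;
a sketch consistent with the session below is:
class Final(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # refuse to create a class whose bases include a final class
        for klass in bases:
            if isinstance(klass, Final):
                raise TypeError(str(klass.__name__) + " is final")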
>>> class B(object, metaclass=Final):
... pass
...
>>> class C(B):
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 6, in __init__
TypeError: B is final
In the example, the metaclass simply performs a check ensuring that the
final class is never part of the base classes for any class being created.
Another metaclass example already encountered is the ABCMeta metaclass from
the abc module, which is used to create abstract base classes.

from abc import ABCMeta, abstractmethod

class Vehicle(object, metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass
class Car(Vehicle):
    pass

Once a class implements all abstract methods, such a class becomes a
concrete class and can be instantiated by a user.
from abc import ABCMeta, abstractmethod
class Vehicle(object, metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass
class Car(Vehicle):
    def change_gear(self):
        print("Changing gear")

    def start_engine(self):
        print("Starting engine")
@staticmethod
def convert(name):
s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()
It would not be possible to modify class attributes, such as the list of
base classes or attribute names, in the __init__ method of the metaclass
because, as has been said previously, this method is called after the class
object has already been created. On the other hand, when the intent is just
to carry out initialization or validation checks, such as was done with the
DocMeta and Final metaclasses, then the __init__ method of the metaclass
should be overridden.
The with statement can be used with any object that implements the context
management protocol. This protocol defines a set of operations, __enter__
and __exit__ that are executed just before the start of execution of some
piece of code and after the end of execution of some piece of code
respectively. Generally, the definition and use of a context manager is
shown in the following snippet.
class context:
    def __enter__(self):
        # set the resource up
        return resource

    def __exit__(self, exc_type, exc_value, traceback):
        # tear the resource down
        pass
If the initialised resource is used within the context then the __enter__
method must return the resource object so that it is bound within the with
statement using the as mechanism. A resource object must not be returned
if the code being executed in the context doesn’t require a reference to the
object that is set-up. The following is a very trivial example of a class that
implements the context management protocol in a very simple fashion.
>>> import time
>>> class Timer:
... def __init__(self):
... pass
... def __enter__(self):
... self.start_time = time.time()
... def __exit__(self, type, value, traceback):
... print("Operation took {} seconds to
complete".format(time.time()-self.start_time))
...
...
>>> with Timer():
... print("Hey testing context managers")
...
Hey testing context managers
Operation took 0.00010395050048828125 seconds to complete
>>>
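The same behaviour can be obtained with a generator and the
contextlib.contextmanager decorator. The time_func listing referenced
below was not preserved; a version consistent with the description is the
following sketch:
from contextlib import contextmanager
import time

@contextmanager
def time_func():
    start_time = time.time()
    try:
        # the body of the with block runs at the point of the yield
        yield
    finally:
        print("Operation took {} seconds to complete".format(time.time() - start_time))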
This context generator function, time_func in this case, must yield exactly
one value if a value is required to be bound to a name in the with
statement's as clause. When the generator yields, the code block nested in
the with statement is executed. The generator is then resumed after the
code block finishes execution. If an exception occurs during the execution
of the block and is not handled there, the exception is re-raised inside
the generator at the point where the yield occurred. If an exception is
caught for purposes other than adequately handling it, the generator must
re-raise that exception; otherwise the generator context manager will
indicate to the with statement that the exception has been handled, and
execution will resume normally after the context block.
9.1 Modules
Modules enable the reuse of programs. A module is a file that contains a
collection of definitions and statements and has a .py extension. The
contents of a module can be used by importing the module either into
another module or into the interpreter. To illustrate this, our favourite
Account class shown in the following snippet is saved in a module called
account.py.
class Account:
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def inquiry(self):
        return self.balance
To re-use the module definitions, the import statement is used to import the
module as shown in the following snippet.
Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 5 2014, 20:42:22)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import account
>>> acct = account.Account("obi", 10)
>>> acct
<account.Account object at 0x101b6e358>
>>>
All executable statements contained within a module are executed when the
module is imported. A module is also an object that has a type - module as
such all generic operations that apply to objects can be applied to modules.
The following snippets show some unintuitive ways of manipulating
module objects.
Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 5 2014, 20:42:22)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import account
>>> type(account)
<class 'module'>
>>> getattr(account, 'Account') # access the Account class using getattr
<class 'account.Account'>
>>> account.__dict__
{'json': <module 'json' from '/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/__init__.py'>,
'__cached__': '/Users/c4obi/writings/scratch/src/__pycache__/account.cpython-34.pyc',
'__loader__': <_frozen_importlib.SourceFileLoader object at 0x10133d4e0>,
'__doc__': None,
'__file__': '/Users/c4obi/writings/scratch/src/account.py',
'Account': <class 'account.Account'>,
'__package__': '',
'__builtins__': { ...} ...
}
Each module possesses its own unique global namespace that is used by all
functions and classes defined within the module and when this feature is
properly used, it eliminates worries about name clashes from third party
modules. The dir() function without any argument can be used within a
module to find out what names are available in a module’s namespace.
As mentioned, a module can import another module; when this happens, and
depending on the form of the import, the imported module's name, some of
the names defined within it, or all the names defined within it are placed
in the namespace of the importing module. For example, from account import
Account imports and places the Account name from the account module into
the namespace; import account imports and adds the account name,
referencing the whole module, to the namespace; while from account import *
imports and adds all names in the account module, except those that start
with an underscore, to the current namespace. Using from module import * is
strongly advised against as it may import names that the developer is not
aware of and that conflict with names used in the importing module. Python
has the __all__ special variable that can be used within modules; its value
should be a list containing the names that are imported from the module
when the from module import * syntax is used. Defining this variable is
totally optional on the part of the developer. We illustrate the use of the
__all__ special variable with the following example.
__all__ = ['Account']
class Account:
num_accounts = 0
def del_account(self):
Account.num_accounts -= 1
def inquiry(self):
return self.balance
class SharedAccount:
pass
Reloading Modules
Once a module has been imported into the interpreter, any change to the
module's source is not reflected within the interpreter. However, Python
provides the importlib.reload function that can be used to re-import a
module into the current namespace.
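For example:
>>> import importlib
>>> importlib.reload(account)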
The paths of the standard library can be found by running the Python
interpreter with the -S option; this prevents the site.py initialization
that adds the third party package paths to the sys.path list. The location
of the standard library can also be overridden by defining the PYTHONHOME
environment variable, which replaces sys.prefix and sys.exec_prefix.
9.3 Packages
Just as modules provide a means for organizing statements and definitions,
packages provide a means for organizing modules. A close but imperfect
analogy for the relationship of packages to modules is that of folders to
files on computer file systems. A package, just like a folder, can be
composed of a number of module files. In Python, however, packages are
themselves modules; in fact all packages are modules but not all modules
are packages. The difference between a module and a package is the presence
of a __path__ special variable, with a non-None value, on a package object.
Packages can have sub-packages and so on; when referencing a package and
its corresponding sub-packages, the dot notation is used, so a complex
number sub-package within a mathematics package would be referenced as
math.complex.
Regular Packages
A regular package is one that consists of a group of modules in a folder
with an __init__.py module within the folder. The presence of this
__init__.py file within the folder causes the interpreter to treat the folder as
a package. An example of a package structure is the following.
parent/ <----- folder
__init__.py
one/ <------ sub-folder
__init__.py
a.py
two/ <------ sub-folder
__init__.py
b.py
The parent, one and two folders are all packages because they all contain an
__init__.py module within each of their respective folders. one and two
are sub-packages of the parent package. Whenever a package is imported,
the __init__.py module of such a package is executed. One can think of
the __init__.py as the store of attributes for the package - only symbols
defined in this module are attributes of the imported module. Assuming the
__init__.py module from the above parent package is empty and the
package, parent, is imported using import parent, the parent package
will have no module or subpackage as an attribute. The following code
listing shows this.
>>> import parent
>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__',
'__spec__', 'parent']
>>> dir(parent)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__',
'__name__', '__package__', '__path__', '__spec__']
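For example, module a can then be imported using its fully qualified name
(the original snippet was not preserved):
import parent.one.a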
When the above method is used, the fully qualified name for the module,
parent.one.a, must be used to access any symbol in the module. Note that
when using this method of import, the last symbol can only be a module or
sub-package; classes, functions or variables defined within modules are not
allowed. It is also possible to import just the module or sub-package that
is needed, as the following example shows.
# importing just required module
from parent.one import a
Symbols defined in the a module or modules in the one package can then be
referenced using dot notation with just a or one as the prefix. The import
forms, from package import * or from package.subpackage import *,
can be used to import all the modules in a package or sub-package. This
form of import should however be used carefully if ever used as it may
import some names into the namespace that may cause naming conflicts.
Packages support the __all__ variable (whose value should, by convention,
be a list) for listing the modules or names that are visible when the
package is imported using the from package import * syntax. If __all__ is
not defined, the statement from package import * does not import all
submodules from the package into the current namespace; rather, it only
ensures that the package has been imported (possibly running any
initialization code in __init__.py) and then imports whatever symbols are
defined in the __init__.py module, including any names defined and any
submodules imported there.
Namespace Packages
A namespace package is a package in which the component modules and
sub-packages of the package may reside in multiple different locations. The
various components may reside on different parts of the file system, in zip
files, on the network or on any other location searched by the interpreter
during the import process; however, when the package is imported, all the
components exist in a common namespace. To illustrate a namespace package, observe
the following directory structures containing modules; both directories,
apollo and gemini could be located on any part of the file system and not
necessarily next to each other.
apollo/
space/
test.py
gemini/
space/
test1.py
Observe that the two different package directories are now logically
regarded as a single name space and either space.test or space.test1 can
be imported as if they existed in the same package. The key to a namespace
package is the absence of the __init__.py modules in the top-level
directory that serves as the common namespace. The absence of the
__init__.py module causes the interpreter to create a list of all directories
within its sys.path variable that contain a matching directory name rather
than throw an exception. A special namespace package module is then
created and a read-only copy of the list of directories is stored in its
__path__ variable. The following code listing gives an example of this.
>>> space.__path__
_NamespacePath(['apollo/space', 'gemini/space'])
Once another directory containing a matching space sub-directory is added
to sys.path, it seamlessly merges with the other space package directories
and its contents can be imported along with any existing artefacts.
>>> import space.custom
>>> import space.test
>>> import space.test1
If the __import__ call does not find the requested module then an
ImportError is raised.
1. Built-in modules,
2. Frozen modules and
3. Path based modules - this finder handles imports that have to interact
with the import path given by the sys.path variable as shown in the
following.
The interpreter continues the search for the module by querying each finder
in the meta_path to find which can handle the module. The finder objects
must implement the find_spec method that takes three arguments: the first
is the fully qualified name of the module, the second is an import path that
is used for the module search - this is None for top level modules but for
sub-modules or sub-packages, it is the value of the parent package’s
__path__ and the third argument is an existing module object that is passed
in by the system only when a module is being reloaded.
If one of the finders locates the module, it returns a module spec that is used
by the interpreter import machinery to create and load the module (loading
is tantamount to executing the module). The loaders carry out the module
execution in the module’s global namespace. This is done by a call to the
importlib.abc.Loader.exec_module() method with the already created
module object as argument.
Customizing the import process
The import process can be customized via import hooks. There are two
types of this hook: meta hooks and import path hooks.
Meta hooks
These are called at the start of the import process immediately after the
sys.modules cache lookup and before any other process. These hooks can
override any of the default finders search processes. Meta hooks are
registered by adding new finder objects to the sys.meta_path variable.
class RestrictedImportFinder:
    def __init__(self):
        self.restr_module_names = ['os']

    def find_spec(self, name, path, target=None):
        # raise for restricted modules; return None to let other finders run
        if name in self.restr_module_names:
            raise ImportError("Importing {} is restricted".format(name))
        return None

import sys
# remove os from the sys.modules cache
del sys.modules['os']
sys.meta_path.insert(0, RestrictedImportFinder())
import os
Each hook knows how to handle a particular kind of path entry. For example,
an attempt to get a finder for one of the entries in sys.path is made in
the following snippet.
>>> sys.path_hooks
[<class 'zipimport.zipimporter'>, <function FileFinder.path_hook.<locals>.path_hook_for_FileFinder at 0x1003c1b70>]
# sys.prefix is a directory
>>> path = sys.prefix
# sys.path_hooks[0] is associated with zip files
>>> finder = sys.path_hooks[0](path)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
zipimport.ZipImportError: not a Zip file
>>> finder = sys.path_hooks[1](path)
>>> finder
FileFinder('/Library/Frameworks/Python.framework/Versions/3.4')
>>>
New import path hooks can be added by inserting new callables into the
sys.path_hooks.
Why You Probably Should Not Reload Modules…
Now that we understand that the last step of a module import is the
execution of the module code within the module's global namespace, it is
clearer why it may be a bad idea to use importlib.reload to reload modules
that have changed.
A module reload does not purge the global namespace of objects from the
module being imported. Imagine a module, Foo, that has a function,
print_name imported into another module, Bar; the function,
Foo.print_name, is referenced by a variable, x, in the module, Bar. Now if
the implementation for print_name is changed for some reason and then
reloaded in Bar, something interesting happens. Since the reload of the
module Foo will cause an exec of the module contents without any prior
clean-up, the reference that x holds to the previous implementation of
Foo.print_name will persist thus we have two implementations and this is
most probably not the behaviour expected.
For this reason, reloading a module is something that may be worth avoiding
in any sufficiently complex Python program.
A set-up script using distutils is a setup.py file. For a program with the
following package structure,
```python
parent/
__init__.py
spam.py
one/
__init__.py
a.py
two/
__init__.py
b.py
```
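The setup.py listing itself is not preserved in this text; a minimal script
consistent with this structure might be the following sketch (the name,
version and author values are illustrative):
from distutils.core import setup

setup(
    name="parent",
    version="1.0",
    description="An example distribution",
    author="Jane Doe",
    packages=["parent", "parent.one", "parent.two"],
    py_modules=[],
    scripts=[],
)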
The setup.py file must exist at the top-level directory, so in this case it
should exist at parent/setup.py. The values used in the set-up script are
mostly self-explanatory: py_modules contains the names of all single-file
Python modules, packages contains a list of all packages and scripts
contains a list of all scripts in the program. The rest of the possible
arguments, though not exhaustively covered here, are equally
self-explanatory.
Once the setup.py file is ready, the following command is used at the
command line to create an archive file for distribution.
$ python setup.py sdist
sdist will create an archive file (e.g., a tarball on Unix, a ZIP file on
Windows) containing the setup script setup.py and the program's modules and
packages. The archive file will be named parent-1.0.tar.gz (or .zip) and
will unpack into a directory parent-1.0. To install the created
distribution, the archive is unpacked and python setup.py install is run
inside the unpacked directory. This installs the package in the
site-packages directory of the Python installation.
One can also create one or more built distributions for programs. For
instance, if running a Windows machine, one can make the use of the
program easy for end users by creating an executable installer with the
bdist_wininst command. For example:
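python setup.py bdist_wininst
This creates an executable Windows installer in the dist directory.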
class Account:
    """base class for representing user accounts"""
    num_accounts = 0

    def del_account(self):
        Account.num_accounts -= 1

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)
Some of the methods from the inspect module for handling source code
include:
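inspect.getdoc returns the documentation string of an object;
inspect.getfile and inspect.getsourcefile return the file in which an
object was defined; inspect.getsourcelines returns the source lines of an
object together with the starting line number; and inspect.getsource
returns the text of the source code of an object. For example: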
>>> inspect.getfile(test.Account)
'/Users/c4obi/src/test_concat.py'
>>>
1. Signature: This can be used to represent the call signature and return
annotation of a function or method. A Signature object can be
obtained by calling the inspect.signature method with a function or
method as argument. Each parameter accepted by the function or
method is represented as a Parameter object in the parameter
collection of the Signature object. Signature objects support the
bind method for mapping from positional and keyword arguments to
parameters. The bind(*args, **kwargs) method will return a
BoundsArguments object if *args and **kwargs match the signature
else it raises a TypeError. The Signature class also has the
bind_partial(*args, **kwargs) method that works in the same way
as Signature.bind but allows the omission of some arguments.
>>> def test(a, b:int) -> int:
... return a^2+b
...
>>> inspect.signature(test)
<inspect.Signature object at 0x101b3c518>
>>> sig = inspect.signature(test)
>>> dir(sig)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__',
'__lt__', '__module__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__',
'__str__', '__subclasshook__', '_bind', '_bound_arguments_cls',
'_parameter_cls', '_parameters', '_return_annotation', 'bind',
'bind_partial', 'empty', 'from_builtin', 'from_function', 'parameters',
'replace', 'return_annotation']
>>> sig.parameters
mappingproxy(OrderedDict([('a', <Parameter at 0x101cbf708 'a'>), ('b', <Parameter at 0x101cbf828 'b'>)]))
>>> str(sig)
'(a, b:int) -> int'
>>> sig
<inspect.Signature object at 0x101b3c5c0>
>>> sig.bind(1, 2)
<inspect.BoundArguments object at 0x1019e6048>
from inspect import signature
from functools import wraps

def typeassert(*ty_args, **ty_kwargs):
    def decorate(func):
        sig = signature(func)
        bound_types = sig.bind_partial(*ty_args, **ty_kwargs).arguments

        @wraps(func)
        def wrapper(*args, **kwargs):
            bound_values = sig.bind(*args, **kwargs)
            # enforce type assertions across supplied arguments
            for name, value in bound_values.arguments.items():
                if name in bound_types:
                    if not isinstance(value, bound_types[name]):
                        raise TypeError('Argument {} must be {}'.format(name, bound_types[name]))
            return func(*args, **kwargs)
        return wrapper
    return decorate
The inspect module also provides predicates for use with such
introspection; these include isclass, ismethod, isfunction,
isgeneratorfunction, isgenerator, istraceback, isframe, iscode, isbuiltin,
isroutine, isabstract and ismethoddescriptor.