Study Guide Module 1 (Numbers, Sets and Functions)
We will assume that the notion of a set, i.e. a collection of elements, is a natural idea that
everyone understands immediately. There are four important sets we will
use throughout this course. They consist of infinite collections of numbers, with different
features. These sets are:
2 Module 1. Numbers, sets and functions
• N: the set of natural numbers. This is the set toddlers learn to partially enumerate
and consists of the positive integers, i.e. 1, 2, 3, . . . Note that it does not contain zero
(0) and is an infinite set.
• Z: the set of relative integers. This set contains the negative and positive integers,
and zero, i.e. $\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$ Note that it is an infinite set. Its symbol
comes from the German “Zahlen”, meaning “numbers”.
• Q: the set of rational numbers. This set consists of numbers that can be expressed
as a fraction $p/q$, where $p \in \mathbb{Z}$ and $q \in \mathbb{N}$. It is made of positive and negative numbers
(along with zero), that result from the division of positive and/or negative integers,
either in a finite decimal form (e.g. $3/16 = 0.1875$) or in a periodic decimal form (e.g.
$3/11 = 0.27272727\ldots = 0.\overline{27}$). It is, of course, an infinite set.
• R: the set of real numbers. This is the set we will use most often and is fundamental
for calculus. It is a set that contains all the above sets as subsets, but it is not simply
the combination of those above, as will be clear shortly.
The set of real numbers R is introduced because N, Z, Q are not adequate to correctly
describe objects or operations that are central to science and engineering, like natural laws
or sequences of operations. In fact, in some cases sets different from R can give rise to bizarre
situations, because they lack important “structural properties” of R. Although we will not
explore this in depth, some examples can clarify these ideas.
Example 1.1.1. Which of the previous sets contain $\sqrt{2}$?
Solution. R is the only one of these sets that contains $\sqrt{2}$. In fact, $\sqrt{2} = 1.4142135623\ldots$
and cannot be written as an integer or a fraction of two integers. Its decimal form is not
periodic.
In general, radical expressions or surds such as $\sqrt{2}$, $\sqrt[3]{3}$, $\sqrt[3]{5} \cdot \sqrt[4]{23}, \ldots$, belong to a specific
set, namely the set of irrational numbers or I, no part of which intersects Q.
Example 1.1.2. Where does the Euler number “e = 2.718281828459 . . .”, the natural
base of logarithms, belong? Is it possible to define it in a recursive or repeated way?
Solution. The Euler number belongs to R only. Again, this is a number that has a non-
periodic decimal form. It does not arise from any elementary operation we know, like
performing a division or taking a square root.
Although e is not a member of Q, it can be shown to result from a recursive operation “in-
side” Q. The operation always involves elements of Q, i.e. rational numbers, but ultimately
tends to an element that is not part of Q. This is a problematic occurrence.
$n = 1$, $(1 + 1/1)^{1} = 2$
$n = 2$, $(1 + 1/2)^{2} = 2.25$
$n = 3$, $(1 + 1/3)^{3} = 2.370\ldots$
$n = 4$, $(1 + 1/4)^{4} = 2.44140625$
...
$n = 500$, $(1 + 1/500)^{500} = 2.71556852065\ldots$
...
$n = 5{,}000{,}000$, $(1 + 1/5{,}000{,}000)^{5{,}000{,}000} = 2.71828155523\ldots$
The last number is close to the actual value of e = 2.718281828459 . . . and higher n’s will
produce better approximations. Note though that any previous result is a rational number,
since it is the result of a multiplication of the rational expression (1 + 1/n) n times with
itself.
Now, if we imagine taking this sequence to the limit as n becomes indefinitely large, the
result is exactly e, although e cannot be expressed as a rational number since it is
non-periodic. We will later specify in detail what we mean by taking the “limit” of an expression
like $(1 + 1/n)^n$.
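The behaviour of the sequence $(1 + 1/n)^n$ can be explored with a short Python sketch. It is only an illustration: the function name `euler_term` is our own choice, and exact rational arithmetic from the standard `fractions` module is used to stress that every term stays inside Q.

```python
from fractions import Fraction
import math


def euler_term(n: int) -> Fraction:
    """Return (1 + 1/n)^n as an exact rational number (illustrative helper)."""
    return (1 + Fraction(1, n)) ** n


# Every term is a ratio of two integers, yet the terms approach e.
for n in (1, 2, 3, 4, 500):
    term = euler_term(n)
    print(n, float(term), "distance from e:", abs(math.e - float(term)))
```

Each value returned by `euler_term` is a `Fraction`, i.e. an element of Q, while the distance from e shrinks as n grows. A finite computation of this kind suggests the limit but does not prove it.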
So, the major point is: we have a sequence or an operation always defined within Q that
converges to a number that is not in Q. This is related to the concept of completeness of a
set, i.e. Q is not complete.
A real number is a positive or negative number (or zero) with periodic or non-periodic
decimal part. The set of all real numbers R is a complete set.
Because of its definition, R will not suffer from problems such as the one described above
for Q, for the cases $\sqrt{2}$ and $e$. Saying that R is complete (whereas Q is not) means
that recursive or repeated operations like that used to define “e” always “stay” in R and
ultimately result in an element that belongs to R.
Figure 1.1. The number line representing R, with elements of Z (from $-2$ to $3$) marked.
The real line is drawn to scale so that the distance between two successive integers
is the same. A number in any of the number sets previously defined is represented by a
point on the real line. For example, $\pi = 3.14159\ldots$ is located between 3.1 and 3.2, as
shown in the figure below.
A second property of R associated with completeness is that, for any two elements on the
real line, there is also a third element in between those two, no matter how close the two
initial elements are. To understand this properly, the concept of ordering is needed.
[Figure: the position of $\pi$ on the real line, between 3.1 and 3.2.]
• If $a < b$ and $c > 0$ then $ac < bc$. For example, we have $2 < 5$ and $3 > 0$, so: $2 \cdot 3 < 5 \cdot 3$,
i.e. $6 < 15$.
• If $a < b$ and $c < 0$ then $ac > bc$. For example, we have $2 < 5$ and $-3 < 0$, so:
$2 \cdot (-3) > 5 \cdot (-3)$, i.e. $-6 > -15$.
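The two multiplication rules can be checked numerically. The following Python sketch is only an illustration (the helper name `multiply_inequality` is our own): it starts from $a < b$, multiplies both sides by $c$ and reports which relation holds afterwards.

```python
def multiply_inequality(a, b, c) -> str:
    """Given a < b, return the relation between a*c and b*c as text."""
    if not a < b:
        raise ValueError("this sketch assumes a < b")
    left, right = a * c, b * c
    if left < right:
        return f"{left} < {right}"   # direction preserved (c > 0)
    if left > right:
        return f"{left} > {right}"   # direction reversed (c < 0)
    return f"{left} = {right}"       # both sides collapse (c = 0)


print(multiply_inequality(2, 5, 3))
print(multiply_inequality(2, 5, -3))
```

Multiplying by zero is included only for completeness: it turns the strict inequality into an equality, which is why the rules require $c > 0$ or $c < 0$.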
Note: From now on, the symbol of ordinary multiplication between two elements (numbers
or letters) will be “$\cdot$”. The symbol “$\times$” must not be used to indicate ordinary
multiplication.
Example 1.1.3. Solve $3x - 4 < 8$.
1.2. Sets and their operations 5
Solution. Adding 4 to both sides of the inequality gives $3x < 12$. Now multiply both sides
of the inequality by $\frac{1}{3}$ and we have $x < 4$. All these steps, which conform with the rules
given above, are equivalent to “taking 4 to the right hand side (RHS) and dividing by 3
to find x”.
Note: From now on, to indicate the right hand side or the left hand side of an expression
we will use the abbreviations RHS and LHS, respectively.
Solution. Adding 6 to both sides gives $-2x > 3x - 2$; adding $-3x$ to both sides gives
$-5x > -2$. Multiplying both sides by $-\frac{1}{5}$ (and hence reversing the inequality) gives $x < \frac{2}{5}$.
We assume that the student knows how to solve these inequalities. More examples on
inequalities and the algebra required to manipulate them are in the “Revision” section, at
the end of this Study Guide.
As previously mentioned, there is no need to formally define what a set is: the concept of
set is an intuitive (or “primitive”) notion, like counting, that every human being develops
during childhood. Also, the idea of an element being part of a set or “belonging to” a set
will be considered as primitive, and indicated by the symbol “$\in$” (conversely, “$\notin$” if the
element is not a member of the set). Let us now concentrate on particularly useful sets in
R, i.e. intervals.
The branch of Mathematics that studies the relations between sets is called “set theory”.
For our purpose, two important types of sets are the open and closed sets, equivalently
called open and closed intervals.
Note that open intervals do not contain their end points, i.e. $a \notin (a, b)$ and $b \notin (a, b)$. An
example of how to represent a given open interval on the real line is shown below.
Figure 1.3. Open interval $(-0.5, 1.5) = \{x \in \mathbb{R} \mid -0.5 < x < 1.5\}$
With reference to the symbols used above, the expression $(a, b) = \{x \in \mathbb{R} \mid a < x < b\}$
should be read as follows: $(a, b)$ is equal (“=”) to the set of all elements x belonging to R,
such that (“|”) x is larger than a (“$a < x$”) and smaller than b (“$x < b$”). It should also be
clear from the context that the notation $(a, b)$ indicates an interval rather than an ordered
pair of numbers a and b in the Cartesian plane.
Note that the braces { and } are used to denote a set, in general. So, for instance, the set
of the first three elements of N is written as {1, 2, 3} or the set of three friends James, Alex
and Frank is {James, Alex, Frank}.
In this case, the end points are included, as shown below. The expression $[a, b]$ is interpreted
in the same way as was the case for $(a, b)$, the only difference being in the character of the
inequalities, i.e. “$\le$” now, not “$<$”.
Note also that half open (or half closed) intervals can be defined, for example [a, b) or (a, b].
Any open or closed interval (a, b) or [c, d] in R can also be indicated by capital letters A
and B. For example, we may indicate the above generic sets as follows:
$A = \{x \in \mathbb{R} \mid x \in (a, b)\}$ and $B = \{x \in \mathbb{R} \mid x \in [c, d]\}$.
As stated, this can be read as “the set A consists of all elements $x \in \mathbb{R}$ such that x belongs
to $(a, b)$” and “the set B consists of all elements $x \in \mathbb{R}$ such that x belongs to $[c, d]$”. Let
us refresh some properties of sets that should be well known.
Note: Technically, the notation $A \subseteq B$ includes the case of A being a subset of itself. We have
that $A \subseteq A$ always. We can exclude this case by asking that at least one element of B
does not belong to A. In this case A is said to be a proper subset of B and we write
$A \subset B$.
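$A \cup B = \{x \in \mathbb{R} \mid x \in A \text{ or } x \in B\}$.
$A \cap B = \{x \in \mathbb{R} \mid x \in A \text{ and } x \in B\}$.
$A - B = A \setminus B = \{x \in \mathbb{R} \mid x \in A \text{ and } x \notin B\}$.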
Note: What this means is that one can also “subtract” one set from the other set. For
example, if A = {1, 2, 3, 4, 5} and B = {1, 2, 3, 6}, A\B = {4, 5}.
$A = \emptyset$.
Note: Clearly for a generic set B, $\emptyset \cup B = B$ and $\emptyset \cap B = \emptyset$. Also, be aware that the
empty set is not the set that contains zero, i.e. $\emptyset \ne \{0\}$.
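For finite sets, the operations above map directly onto Python's built-in `set` type. The sketch below is only an illustration, reusing the finite sets from the note on set subtraction; the variable names are our own.

```python
A = {1, 2, 3, 4, 5}
B = {1, 2, 3, 6}

union = A | B          # elements in A or in B
intersection = A & B   # elements in A and in B
difference = A - B     # elements in A that are not in B
empty = set()          # the empty set (note: {} would create a dict, not a set)

print("A union B        =", union)
print("A intersection B =", intersection)
print("A minus B        =", difference)
print("empty set equals {0}?", empty == {0})
```

Intervals of R contain infinitely many elements and cannot be stored this way, so the built-in type only illustrates the definitions on finite examples.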
The properties introduced above are convenient and will be used to express sets in R that
are combinations of intervals or contain gaps or holes on the real line. Two examples are
given below.
Example 1.2.1. Rewrite the set $A = (-\infty, 1) \cup (1, +\infty)$ using the complement operation.
Solution.
$A = (-\infty, 1) \cup (1, +\infty) = \mathbb{R} \setminus \{1\}$.
Note: We introduce here the symbol $\infty$ to denote “infinity”. The symbol $\infty$ enables us
to describe easily the process of a number getting larger and larger without bound, as well
as the idea of a number set without bound on the left or right or both, as exemplified by
the following notations: $x \to \infty$, $(0, \infty)$, $(-\infty, 0)$, $(-\infty, \infty)$. We emphasise that $\infty$ is not a
number and should not be manipulated as if it were.
Example 1.2.2. Given the two intervals $A = (1, 5)$ and $B = [2, 4]$, express $A \cup B$, $A \cap B$, $A \setminus B$.
Solution.
$A \cup B = (1, 5)$
$A \cap B = B = [2, 4]$
$A \setminus B = (1, 2) \cup (4, 5)$
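The answers to Example 1.2.2 can be spot-checked by describing each interval through a membership test. The Python sketch below is an assumption-laden illustration: the function names are our own, and it only samples points on a grid, so it supports the result without proving it.

```python
def in_A(x: float) -> bool:
    """Membership test for the open interval A = (1, 5)."""
    return 1 < x < 5


def in_B(x: float) -> bool:
    """Membership test for the closed interval B = [2, 4]."""
    return 2 <= x <= 4


def in_difference(x: float) -> bool:
    """x belongs to A \\ B: in A but not in B."""
    return in_A(x) and not in_B(x)


def in_expected_difference(x: float) -> bool:
    """Membership test for the claimed answer (1, 2) U (4, 5)."""
    return (1 < x < 2) or (4 < x < 5)


# Sample points 0.00, 0.01, ..., 6.00, which include every end point.
samples = [k / 100 for k in range(0, 601)]
print(all(in_difference(x) == in_expected_difference(x) for x in samples))
```

The end points 2 and 4 belong to B and are therefore excluded from the difference, which is why both pieces of the answer are open intervals.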
The set of natural numbers N has special relevance not only for abstract or historic reasons
(it was a set well known to ancient Greek mathematicians, e.g. Pythagoras) but because
many operations that are important in real life depend on positive integers. Processes that
involve an integer number of different, successive steps are used every day in engineering,
computer science and robotics. For example, when we code we are telling a machine to “do
this first, then do that, then something else”.
In this sense, one of the most useful characteristics of N is that it is possible to prove that a
general statement is true using the induction principle, which establishes whether a certain
inequality, procedure or method is valid for every natural number.
Solution. Clearly, all of (i), (ii), (iii) and (iv) are properties, but only (i), (ii), and (iv) are
true. Property (iii) is false because n = 1 gives $-\frac{1}{3}$, which is negative. The same occurs for
n = 2, 3, whereas for n = 4 the property is undefined. In general, it is sufficient to show
that the property is not valid for a single number $N \in \mathbb{N}$ to conclude that the property is
false.
Property (iv) is true because, while the inequality is invalid for n = 1, 2, these two
numbers have been correctly excluded from N. It is thus a property valid only for a subset
of N, i.e. $\mathbb{N} \setminus \{1, 2\}$.
We are now ready to formulate one of the most powerful principles of Mathematics, which
we enunciate in the form of a “theorem”.
Suppose that a property P(n) is true for n = 1. Now suppose also that if the property is
assumed to be valid for a generic n, then this results in it also being valid for n + 1. Then
P(n) is valid for all numbers $n \in \mathbb{N}$.
1.3. The induction principle 9
Let us see how this theorem allows us to prove important relations among elements of N.
Example 1.3.2. Carl Friedrich Gauss (1777-1855) is arguably one of the greatest
mathematicians of all time. A famous (but not completely verified) anecdote tells us that one
day, when he was 8 (or 9) years old, he and the other pupils in his class were being unruly.
The teacher, to discipline the students, asked the whole class to compute the sum of the
first 100 natural numbers, i.e. the result of 1 + 2 + 3 + 4 + . . . + 100. Soon all the children,
except little Carl, started to do one addition after the other, in silence. After thinking for
less than one minute, Carl stood up, walked to the teacher and told him the solution: 5050.
The teacher was stunned. How did he do it?
Solution. The very young Gauss had probably discovered the following property P(n), valid
for every natural n:
$$P(n): \quad \sum_{k=1}^{n} k = \frac{n(n + 1)}{2}.$$
The above symbol $\sum$ indicates a sum over an element k, and, in this case, runs from k = 1
up to k = n:
$$\sum_{k=1}^{n} k = 1 + 2 + 3 + 4 + \ldots + n.$$
The symbol is also known as the “$\Sigma$-notation” or summation notation.
Applied to the first 100 natural numbers, the property gives
$$1 + 2 + 3 + 4 + \ldots + 100 = \frac{100(100 + 1)}{2} = 50(101) = 5050.$$
Using the induction principle we are able to prove this identity. The first step is to see if
P(n) is true for the first element of N, i.e. n = 1. This is often easy. In this example the
LHS of the property for n = 1 is
$$\mathrm{LHS} = 1,$$
and the RHS is
$$\mathrm{RHS} = \frac{1(1 + 1)}{2} = 1.$$
So, we have easily proved that LHS = RHS for P(1), since the “sum of integers” reduces to
1 in this case.
Now, let us suppose that the property P(n) is valid. This is often called the “induction
hypothesis”. We will use this to prove that
$$P(n + 1): \quad 1 + 2 + 3 + \ldots + n + (n + 1) = \frac{(n + 1)(n + 2)}{2}$$
is also valid. In fact, indicating the LHS of P(n + 1) as S(n + 1), we have
$$\begin{aligned}
S(n + 1) &= 1 + 2 + 3 + \ldots + n + (n + 1) \\
&= (1 + 2 + 3 + \ldots + n) + (n + 1) \\
&= \frac{n(n + 1)}{2} + (n + 1) \quad \text{[using the induction hypothesis]} \\
&= (n + 1)\left(\frac{n}{2} + 1\right) \\
&= \frac{(n + 1)(n + 2)}{2}.
\end{aligned}$$
In the last line, we first factored out (n + 1) from the sum and then re-expressed n/2 + 1 as
(n + 2)/2. The last term of this expression is clearly the RHS of property P(n + 1), so we
conclude that P(n) is valid for all natural numbers n. Note that it is unlikely that young
Gauss would have known about the induction principle, but his intuition, if the anecdote
is true, was correct and typical of a child prodigy.
Note: Alternatively, the expression resulting from applying the induction hypothesis can
be expanded and later factorised to arrive at the same result, i.e.
$$\frac{n(n + 1)}{2} + (n + 1) = \frac{n^2 + n}{2} + \frac{2n + 2}{2} = \frac{n^2 + 3n + 2}{2} = \frac{(n + 1)(n + 2)}{2}.$$
Other methods are also possible.
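The two ingredients of the induction argument for Gauss's formula (the base case and the step from n to n + 1) can be mirrored in a short Python sketch. The function names are our own, and a check over finitely many values of n is only a sanity test; it does not replace the proof.

```python
def gauss_sum(n: int) -> int:
    """Closed form n(n + 1)/2 for 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def check_base_and_step(limit: int) -> bool:
    """Check P(1), then check that adding (n + 1) to the formula for n
    gives the formula for n + 1, for every n from 1 to limit."""
    if gauss_sum(1) != 1:
        return False
    return all(gauss_sum(n) + (n + 1) == gauss_sum(n + 1)
               for n in range(1, limit + 1))


print(gauss_sum(100))             # the sum young Gauss was asked for
print(sum(range(1, 101)))         # the same sum, one addition at a time
print(check_base_and_step(1000))
```

The second `print` performs the additions one by one, as the other pupils did, while the first uses the closed form.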
Example 1.3.3. Prove that the sum of the first n consecutive odd natural numbers is
equal to $n^2$.
Solution. The general form of the property stated above, i.e. P(n), can be written as
$$P(n): \quad 1 + 3 + 5 + \ldots + (2n - 1) = n^2.$$
The LHS of P(n) gives the sum of the first n odd natural numbers, whereas the RHS of
P(n) is the required square. To realise this, let us consider, for instance, the first four sums
of odd natural numbers:
$1 = 1 = 1^2$
$1 + 3 = 4 = 2^2$
$1 + 3 + 5 = 9 = 3^2$
$1 + 3 + 5 + 7 = 16 = 4^2$
The property is thus valid for n = 1, 2, 3 and 4, but to prove it for every $n \in \mathbb{N}$ we have to
use the induction theorem.
First, the property is clearly true for n = 1, since then LHS = RHS:
$$P(1): \quad 1 = 1^2.$$
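Now, let us assume that P(n) is valid (the induction hypothesis). We seek to prove that
$$P(n + 1): \quad 1 + 3 + 5 + \ldots + (2n - 1) + (2n + 1) = (n + 1)^2.$$
Applying the induction hypothesis to the first n terms of the LHS, we have
$$\begin{aligned}
1 + 3 + 5 + \ldots + (2n - 1) + (2n + 1) &= n^2 + (2n + 1) \\
&= (n + 1)^2,
\end{aligned}$$
where we used basic algebra in the last line. This is the desired result, since the last line
corresponds to the RHS of P(n + 1). We thus conclude that P(n) is valid for all natural
numbers n.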
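Note: Using the summation notation, sums of consecutive odd numbers such as the LHS of
P(n + 1) in this example can be expressed as (note that the sum starts at zero)
$$1 + 3 + 5 + \ldots + (2n + 1) = \sum_{k=0}^{n} (2k + 1).$$
The LHS of P(n) is obtained by stopping the sum at $k = n - 1$.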
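The identity for sums of odd numbers can be spot-checked in the same spirit. In this Python sketch (the function name is our own) the index k runs from 0 to n − 1, so that exactly n odd numbers are added; a finite check supports the identity but is not a proof.

```python
def sum_first_odds(n: int) -> int:
    """Sum of the first n odd natural numbers: 1 + 3 + ... + (2n - 1)."""
    return sum(2 * k + 1 for k in range(n))   # k = 0, 1, ..., n - 1


for n in range(1, 6):
    print(n, sum_first_odds(n), n ** 2)
```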
The induction principle can be used to prove important inequalities too. An example of this
is the so-called Bernoulli’s inequality whose proof is contained in the “Extras” Appendix.
In science, engineering and many aspects of human life, it is common practice to relate
quantities together using mathematical expressions such as formulae. A quantitative relation
between two quantities is called a “function”. Although this idea may appear as primitive,
it turns out that a rigorous definition of a function is difficult to provide and could be quite
obscure and involved. For this reason, it is better to consider functions simply as “laws” or
“relations” connecting objects belonging to different sets, without attempting to produce
a rigorous definition. R is the set where functions that are of most interest to us will be
considered.
Function, domain, range and graph are the four most fundamental definitions needed when
discussing quantitative relations among sets.
Let X and Y be two subsets of the real line R, i.e. $X \subseteq \mathbb{R}$, $Y \subseteq \mathbb{R}$. A real-valued function
f of a single real variable is a certain “relation” (equivalently called a “law” or “mapping”)
between the set of inputs X and the set of outputs Y. This relation is such that to any
element from the input set $x \in X$ there corresponds exactly one element from the output
set $y \in Y$. Formally, this mapping can be written as $f : X \to Y$.
Note: Besides the notation $f : X \to Y$, it is also common to use $y = f(x)$, intending that
$x \in X$ is the generic input (or independent) variable and $y \in Y$ is the corresponding output
(or dependent) variable resulting from the “application” of f. Sometimes, the letter y is
omitted and the function is denoted simply as $f(x)$.
Note: We are sometimes a bit lazy and write $f : \mathbb{R} \to \mathbb{R}$ when we really mean $f : X \to Y$
for some (possibly proper) subsets $X, Y \subseteq \mathbb{R}$. For example, we might write $f : \mathbb{R} \to \mathbb{R}$,
$f(x) = \frac{1}{x^2}$, where 0 is not allowed as a value for x and f(x) can only be positive reals.
Mostly, this will not cause any problems, especially once we have embraced the idea of
natural domain a little later.
Note: Given a function $y = f(x)$, x is the argument of f and y is the value of f at
x. The range is the set of all values $y = f(x)$ obtained as x runs over X.
Figure 1.5. The function f maps every element from the input set (domain) X into exactly
one element from the output set (range) Y .
Note 1: The notation $\mathbb{R}^2$ indicates that both x and y = f(x) belong to R, i.e. $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$.
Providing the graph of a function f is equivalent to drawing the collection of points (x, f(x)) on
the Cartesian plane.
Note 2: The symbol $\times$ is here intended to denote what is known as a “Cartesian product”.
For example, given two sets $A = \{1, 2, 3\}$ and $B = \{4, 5\}$, we have:
$A \times B = \{(1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)\}$.
The set arising from the Cartesian product contains all the possible pairs that can be
obtained by pairing elements of A with elements of B.
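For finite sets the Cartesian product can be generated mechanically. The Python sketch below is only an illustration and uses `itertools.product` from the standard library; lists are used instead of sets so that the pairs appear in a predictable order.

```python
from itertools import product

A = [1, 2, 3]
B = [4, 5]

# Every pair (a, b) with a taken from A and b taken from B.
pairs = list(product(A, B))

print(pairs)
print(len(pairs) == len(A) * len(B))   # 3 elements times 2 elements gives 6 pairs
```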
For any given function $f : X \to Y$ and for all elements $x \in X = \mathrm{dom}(f)$, graph(f) should
have exactly one intersection point with the vertical line through x.
More formally:
1.4. Functions and their properties 13
If the vertical line test fails, then G is not the graph of a function. If we draw the graph
of a relation f and the vertical line test fails, then f is not in fact a function. According
to its definition, a function is a relation for which any one element of dom(f ) is associated
with one and only one element of ran(f ). This definition still leaves the freedom that more
than one element of dom(f ) could be associated with the same element in ran(f ), as the
next definition explains.
If f is bijective, graph(f) has only one intersection point with a horizontal line $y = b$ for
every $b \in \mathrm{ran}(f)$. This is called the horizontal line test.
Let us consider a few examples.
Example 1.4.1. Let the relation between x and y be given by
$$x^2 + y^2 = 1,$$
with $x \in X = [-1, 1]$ and $y \in Y = [-1, 1]$. Can the relation between elements of X and Y
be regarded as a function?
Solution. The graph that corresponds to $x^2 + y^2 = 1$ is given by a circle with radius one
and with the centre at the origin. Clearly, the graph does not pass the vertical line test, as
each vertical line strictly between $x = -1$ and $x = 1$ crosses the circle at two points. Therefore,
the relation is not a function.
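The failure of the vertical line test for the unit circle can also be seen numerically. The following Python sketch (the function name is our own) lists, for a given x, every y that satisfies $x^2 + y^2 = 1$; a relation is a function only if this list never contains more than one value.

```python
import math


def circle_y_values(x: float) -> list:
    """Return every y with x^2 + y^2 = 1 for the given x."""
    if abs(x) > 1:
        return []                      # the vertical line misses the circle
    y = math.sqrt(1 - x * x)
    return [y] if y == 0 else [-y, y]  # one touching point or two crossings


for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(x, circle_y_values(x))

passes_vertical_line_test = all(len(circle_y_values(x)) <= 1
                                for x in (-1.0, -0.5, 0.0, 0.5, 1.0))
print(passes_vertical_line_test)
```

Only the lines x = −1 and x = 1 meet the circle in a single point; every vertical line strictly in between produces two values of y.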
Figure 1.6. The unit circle $x^2 + y^2 = 1$ (left) fails the vertical line test. An example of a
bijective function f is given on the right. Each horizontal line intersects only one point in
the graph of the function, for every value y in the range of f .
Solution. The mapping f is a function (even though it has different rules for different
elements in its domain), because each value of x is mapped into exactly one value in the
range by the rule f. The domain of f is given by dom(f) = [0, 2], the range of f is also
[0, 2]. This kind of function, where different rules apply to x in different intervals of the
domain, is generally called a stepwise (or piecewise) function.
Figure 1.7. The graph of the stepwise function discussed in the example above.
Solution. Although their graphs are both parts of a parabola and they are represented by
the same mathematical expression, $g_1(x)$ and $g_2(x)$ have different domains and are thus
considered distinct, i.e. $g_1(x) \ne g_2(x)$.
To avoid these types of situations, it is common to define a function with the implicit
understanding that the function is defined for all values for which the defining rule makes
sense. This idea is encapsulated in the definition of natural domain.
Solution. The natural domain of f is the entire real line except for the origin x = 0:
$D = (-\infty, 0) \cup (0, +\infty)$. At x = 0 the function is not defined, because a division by zero
is impossible. Similarly, $\mathrm{ran}(f) = (-\infty, 0) \cup (0, +\infty)$, because a fraction cannot be zero
unless its numerator is zero. Another equivalent way of indicating the domain is to use the
complement operator between sets, i.e. $D = \mathbb{R} \setminus \{0\}$.
Figure 1.8. Sketch of a generic composition between two functions f and g. Note that the
natural domain of g does not necessarily coincide with the range of f , and could be smaller
or larger.
Given the mathematical expressions for two functions f (x) and g(x), it is easy to obtain
the expression for g(f (x)): it is sufficient to substitute the expression for f (x) as the
independent variable of g(x). Some extra care must be taken when considering the range
of f and the domain of g, as the following examples show.
Example 1.4.6. Suppose that $f(x) = 2x + 1$ and $g(x) = \frac{1}{x}$. What is $g(f(x))$? Is it the
same as $f(g(x))$?
Solution. We have
$$g(f(x)) = \frac{1}{f(x)} = \frac{1}{2x + 1}.$$
In general, $f(g(x)) \ne g(f(x))$. In fact, for the same f(x) and g(x), we obtain
$$f(g(x)) = 2g(x) + 1 = \frac{2}{x} + 1 \ne g(f(x)).$$
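Composition is easy to experiment with in code. The Python sketch below is only an illustration of Example 1.4.6: the helper `compose` is our own, and exact fractions are used so that the two results can be compared without rounding.

```python
from fractions import Fraction


def f(x):
    return 2 * x + 1


def g(x):
    return 1 / x   # not defined at x = 0


def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


g_of_f = compose(g, f)   # g(f(x)) = 1 / (2x + 1)
f_of_g = compose(f, g)   # f(g(x)) = 2 / x + 1

x = Fraction(3)
print(g_of_f(x), f_of_g(x))
```

Note that `g_of_f` fails at x = −1/2, where f(x) = 0 falls outside the domain of g, whereas `f_of_g` fails at x = 0. This is the extra care about the range of f and the domain of g mentioned above.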
The above definition implies that if f is injective, every element y of ran(f) comes from a
different x in dom(f).
Note: If the relation f is a function and is also injective then it is also automatically
bijective between dom(f) and ran(f), due to the definition of the term “function”.
If a function $f : X \to Y$ is injective, then its inverse is the function that associates with
every $y \in \mathrm{ran}(f)$ the unique $x \in \mathrm{dom}(f)$ for which y = f(x). This function is indicated as
$f^{-1}$ and has the following properties:
$$(f^{-1} \circ f)(x) = f^{-1}(f(x)) = x \quad \text{for every } x \in X,$$
$$(f \circ f^{-1})(y) = f(f^{-1}(y)) = y \quad \text{for every } y \in Y.$$
Note: From the two definitions above there follows the important property that
$$\mathrm{dom}(f^{-1}) = \mathrm{ran}(f) \quad \text{and} \quad \mathrm{ran}(f^{-1}) = \mathrm{dom}(f).$$
The most typical examples of invertible functions are the so-called “monotonic” functions,
i.e. those functions that are strictly increasing or decreasing. It is thus useful to give a
formal definition of an increasing (or decreasing) function.
$$\frac{f(x_1) - f(x_2)}{x_1 - x_2} \ge 0,$$
with the strict inequality valid for strictly increasing functions. The above expression is
called the incremental ratio of a function and corresponds to the slope (or gradient) when
f(x) is a line. According to our intuition, a line with a positive slope is increasing. For
decreasing functions the above expression has the symbol $\ge$ replaced with the symbol $\le$.
Solution. f is strictly increasing and thus invertible. Its inverse is given by $f^{-1}(x) = x^{1/3}$.
Note: One of the properties of the inverse function is that graph(f) is symmetrical to
graph($f^{-1}$) with respect to the line y = x. Plotting $y = f^{-1}(x)$ is equivalent to plotting
$x = f(y)$, which is graph(f) with the roles of x and y interchanged. As an example,
graphs of $y = x^3$ and $y = x^{1/3}$ are shown in the figure below.
Figure 1.9. Graphs of $y = x^3$ (in blue) and $y = x^{1/3} = \sqrt[3]{x}$ (in red). They are symmetric with
respect to the line y = x (dashed, in black).
Some important functions are not invertible, unless they are “restricted” to specific domains
where they are instead everywhere increasing or decreasing. This procedure is defined
below.
The following example clarifies this concept, which will be used shortly.
Example 1.4.9. Is the function $f(x) = x^2$ with its natural domain R invertible?
Figure 1.10. Graph of $y = x^2$ (left). This function is not injective and needs to be restricted
for its inverse to exist. On the right, graphs of $y = x^2$ (in red) restricted to $x \ge 0$ and its
inverse $y = \sqrt{x}$ (in blue) are shown. They are symmetric with respect to the line y = x
(dashed, in black).
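Consider instead $f_1(x) = x^2$, for $x \in X = [0, +\infty)$. Then $f_1(x)$ is strictly increasing and
its inverse is $f_1^{-1}(x) = \sqrt{x}$. Analogously, restricting the domain by defining $f_2(x) = x^2$, for
$x \in X = (-\infty, 0]$, implies that $f_2^{-1}(x) = -\sqrt{x}$. With these choices, we also have that:
$$\mathrm{dom}(f_1) = \mathrm{ran}(f_1^{-1}) = [0, \infty), \quad \mathrm{ran}(f_1) = \mathrm{dom}(f_1^{-1}) = [0, \infty)$$
and
$$\mathrm{dom}(f_2) = \mathrm{ran}(f_2^{-1}) = (-\infty, 0], \quad \mathrm{ran}(f_2) = \mathrm{dom}(f_2^{-1}) = [0, \infty),$$
as expected.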
Note: An expression for the inverse of a function is found by interchanging the dependent
(i.e. y) and independent (i.e. x) variables and solving the new expression for y.
Example 1.4.10. Find expressions for the two inverses of the function $f(x) = x^2 - 4x + 5$,
restricted to the following sets:
$A = \{x \in \mathbb{R} \mid x \ge 2\}$ and $B = \{x \in \mathbb{R} \mid x \le 2\}$.
$$y_{1,2} = \frac{4 \pm \sqrt{4^2 - 4(5 - x)}}{2} = 2 \pm \sqrt{4 - 5 + x} = 2 \pm \sqrt{x - 1},$$
where, in the second equality, we divided the numerator by 2 (taking 4 inside the square
root).
[Figure: graphs of the restrictions $f_A(x)$, $f_B(x)$ and of their inverses $f_A^{-1}(x)$, $f_B^{-1}(x)$.]
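Now we have to choose between the two solutions $y_1 = 2 + \sqrt{x - 1}$ and $y_2 = 2 - \sqrt{x - 1}$.
Each corresponds to a different restriction of f(x). Since $y_1 \ge 2$ and $y_2 \le 2$, it is clear that
they are inverses of the required restrictions $f_A(x)$ and $f_B(x)$ respectively, where
$$A = \{x \in \mathbb{R} \mid x \ge 2\} \quad \text{and} \quad f_A^{-1}(x) = y_1(x) = 2 + \sqrt{x - 1}$$
and
$$B = \{x \in \mathbb{R} \mid x \le 2\} \quad \text{and} \quad f_B^{-1}(x) = y_2(x) = 2 - \sqrt{x - 1},$$
as requested. Finally, note that
$$\mathrm{dom}(f_A) = \mathrm{ran}(f_A^{-1}) = [2, \infty), \quad \mathrm{ran}(f_A) = \mathrm{dom}(f_A^{-1}) = [1, \infty)$$
and
$$\mathrm{dom}(f_B) = \mathrm{ran}(f_B^{-1}) = (-\infty, 2], \quad \mathrm{ran}(f_B) = \mathrm{dom}(f_B^{-1}) = [1, \infty),$$
as expected.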
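The two inverses found in Example 1.4.10 can be verified numerically by composing each of them with f on the appropriate restriction. The Python sketch below is an illustration only; the function names are our own and the comparison uses a tolerance because of floating-point rounding.

```python
import math


def f(x: float) -> float:
    return x * x - 4 * x + 5


def f_A_inverse(x: float) -> float:
    """Inverse of f restricted to A = {x >= 2}; defined for x >= 1."""
    return 2 + math.sqrt(x - 1)


def f_B_inverse(x: float) -> float:
    """Inverse of f restricted to B = {x <= 2}; defined for x >= 1."""
    return 2 - math.sqrt(x - 1)


# Applying the inverse after f returns the starting point on each restriction.
for x in (2, 2.5, 3, 10):
    print(x, f_A_inverse(f(x)))
for x in (2, 1, 0, -3):
    print(x, f_B_inverse(f(x)))
```

Using the wrong inverse on a restriction does not return the starting point: for instance, `f_A_inverse(f(0))` gives 4, the mirror image of 0 with respect to the axis x = 2 of the parabola.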
1.5. Elementary functions and their graphs 21
The student should already be familiar with a number of elementary functions, e.g. rational,
algebraic, transcendental etc. We will revise these types of functions later on. In the
following, a number of simpler, important functions are introduced.
This function is of great pedagogical value: it shows that we do not need a relation of the
type y = f (x) to define a function. It is also important from an historical perspective,
because Dirichlet (1805-1859) was one of the pioneers in the study of functions and his
contributions helped put calculus on firm foundations. The modern definition of a function
was introduced by him.
$$D(x) = \begin{cases} 0 & x \in \mathbb{Q}, \\ 1 & x \in I. \end{cases}$$
$$f(x) = \begin{cases} x & \text{for } x \ge 0, \\ -x & \text{for } x < 0. \end{cases}$$
[Figure: graph of $y = |x|$.]
$$\mathrm{sign}(x) = \begin{cases} 1 & \text{for } x > 0, \\ 0 & \text{for } x = 0, \\ -1 & \text{for } x < 0. \end{cases}$$
The sign of a number is indicated by sign(x) (or, in some texts, sgn(x)). Thus, e.g.
$\mathrm{sign}(-3) = -1$ and $\mathrm{sign}(2) = 1$. The graph of f(x) = sign(x) is shown below. It follows
immediately from the previous definition that $|x| = x\,\mathrm{sign}(x)$.
$$H(x) = \begin{cases} 1 & \text{for } x > 0, \\ 0 & \text{for } x \le 0. \end{cases}$$
Oliver Heaviside (1850-1925) was a great mathematician and electrical engineer, who
contributed theoretical and practical ideas to the field of telecommunication engineering, in
particular. For example, he formulated vector analysis, which is extremely useful for
describing electromagnetic signals, and invented the coaxial cable. He was completely
self-taught.
Both the sign and the Heaviside function are important in signal processing, data analysis
applications and signal transmission techniques.
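The sign and Heaviside functions translate directly into code. The following Python sketch is a minimal illustration that follows the definitions given above, including the convention H(0) = 0 used in this guide (other texts choose a different value at 0); the function names are our own.

```python
def sign(x: float) -> int:
    """Sign function: 1 for x > 0, 0 for x = 0, -1 for x < 0."""
    if x > 0:
        return 1
    if x < 0:
        return -1
    return 0


def heaviside(x: float) -> int:
    """Heaviside step with the convention of this guide: H(x) = 0 for x <= 0."""
    return 1 if x > 0 else 0


for x in (-3, -0.5, 0, 2):
    # The last two columns agree, illustrating |x| = x * sign(x).
    print(x, sign(x), heaviside(x), abs(x), x * sign(x))
```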
The integer part of a number $x \in \mathbb{R}$ is the largest integer $z \in \mathbb{Z}$ such that $z \le x$. The
integer part of x is indicated as $\lfloor x \rfloor$. For example, $\lfloor 3 \rfloor = 3$, $\lfloor 2.5 \rfloor = 2$, $\lfloor \pi \rfloor = 3$. When x is
negative, some care needs to be taken because, from the definition, $\lfloor -1 \rfloor = -1$, $\lfloor -0.5 \rfloor = -1$,
$\lfloor -1.1 \rfloor = -2$ etc.
The decimal (or fractional) part of a number x is given by $\{x\} = x - \lfloor x \rfloor$. Again, some care must be taken
when dealing with negative numbers, since $\{2.6\} = 0.6$, $\{-1.3\} = 0.7$, $\{-3.2\} = 0.8$. Plots
of the integer and fractional parts are given below.
These two functions are commonly used in software engineering and IT applications. The
integer function is also known in computer science as the “floor” function.
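In Python the integer part is available as `math.floor`, and the fractional part follows from its definition. The sketch below is only an illustration (the name `fractional_part` is our own); results for the fractional part are subject to floating-point rounding, so they are close to, rather than exactly equal to, the decimal values quoted above.

```python
import math


def fractional_part(x: float) -> float:
    """Fractional part {x} = x - floor(x), always in [0, 1)."""
    return x - math.floor(x)


for x in (3, 2.5, math.pi, -1, -0.5, -1.1):
    print(x, math.floor(x))

for x in (2.6, -1.3, -3.2):
    print(x, round(fractional_part(x), 10))
```

Note that `int(-1.1)` truncates towards zero and returns −1, which differs from the floor; this is the same pitfall with negative numbers mentioned in the text.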
The simplest functions of all are constants y = f (x) = a, i.e. those functions that have
the same value a for all x’s in R. Their graphs are straight lines parallel to the x-axis.
Slightly less simple are the linear functions of the form y = f (x) = mx + c. The graph
of such a relation is a straight line which has a constant gradient (or slope) of m. The
y-intercept is c, i.e. the line intersects the y-axis at the point (0, c), whereas the x-intercept
is at $x = -\frac{c}{m}$ (for $m \ne 0$). Obviously, if m > 0 the linear function is increasing. Given two points on
the line $P_1(x_1, y_1)$, $P_2(x_2, y_2)$, the gradient is represented by the ratio of the “rise” over the
“run”, i.e.
$$m = \frac{\text{Rise}}{\text{Run}} = \frac{y_2 - y_1}{x_2 - x_1}.$$
Note that this is a special case of the more general “incremental ratio” introduced before.
Powers are the simplest functions after linear ones, with a general rule given by $f(x) = x^n$,
where $n \in \mathbb{R}$. Let us discuss their major features, which depend on the exponent n. Four
cases are relevant: (i) $n \in \mathbb{N}$, (ii) n is positive and $n \in \mathbb{Q}$, (iii) n is irrational and positive,
(iv) $n \in \mathbb{R}$ and negative.
case (i): $n \in \mathbb{N}$
• f(x) is “even” when n is even, i.e. $f(-x) = f(x)$. Even functions are not injective,
in general.
For both even and odd exponents, plots have two similar characteristics: (a) when |x| < 1,
the graph “pushes” towards the x-axis as n increases; (b) when |x| > 1, the graph becomes
steeper as n increases.
case (ii): n is positive and $n \in \mathbb{Q}$. In the case $n = \frac{p}{q}$, with p, q mutually prime (i.e. we
cannot reduce the exponent further), the following is true
• f(x) is defined all over R if q is odd but only on $[0, +\infty)$ if q is even. For example,
$f(x) = x^{1/2}, x^{3/4}, x^{5/6}$ are defined only for non-negative values of x, whereas
$f(x) = x^{1/3}, x^{2/3}, x^{3/5}$ are defined for every $x \in \mathbb{R}$. Some examples are shown in the figures
below.
Figure 1.22. Examples of $x^{p/q}$ for an odd integer q and even and odd values of the integer
p.
case (iv): the exponent $n \in \mathbb{R}$ is negative. The same properties defined above are valid,
e.g. $x^{-n}$ is even for n even, $x^{-p/q}$ is defined for all $x \in \mathbb{R}$ for q odd etc. The only important
point is that when the exponent is negative, the function is not defined at x = 0, and this
is true for every value of the exponent n.
Figure 1.23. Examples of $x^{-p/q}$ for different p and q. Note that, when q is even, the function
is only defined for positive values of x.
$$P(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0,$$
where $a_i \in \mathbb{R}$ are constants, for $i = 0, 1, \ldots, n$. The constants $a_i$ can be zero, except for the
term corresponding to i = n (i.e. $a_n \ne 0$).
Note: The degree of P(x) corresponds to the exponent n of the highest term $x^n$ in the expression above.
This term, by definition, cannot have a zero coefficient, i.e. $a_n \ne 0$.
Some examples are given by the following functions:
$P_1(x) = 2x^3 + 5x^2 + 4x - 7$ is a cubic or third-degree polynomial.
$P_2(x) = 7x^4 + 5x^3 - 6x^2 - 8x - 3$ is a fourth-degree polynomial.
$P_3(x) = x^5 + 3x^4 - 6x^2 - 5x - 7$ is a fifth-degree polynomial.
In the Revision section, it is discussed how to “reduce” a polynomial to simpler, lower
degree expressions. Note that this procedure is very useful, especially when integrating
rational functions.
Graphs of polynomial functions depend on a number of factors and do not follow the same
simple regularity as for n = 2 (quadratics, parabolas) or n = 3 (cubics). Sketches of the
examples above are given in Figure 1.24 below.
Rational functions, as the name suggests, are quotients (i.e. ratios) of two polynomial
functions.
$$R(x) = \frac{P(x)}{Q(x)} = \frac{a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0}{b_m x^m + b_{m-1} x^{m-1} + \cdots + b_2 x^2 + b_1 x + b_0}.$$
Note 1: In general, the coefficients $a_n$ and $b_m$ are unrelated, and they are not
necessarily all different.
Note 2: $R(x)$ is undefined for all those $x_0 \in \mathbb{R}$ that are solutions to $Q(x) = 0$, i.e. those
values $x_0$ satisfying
$$b_m x_0^m + b_{m-1} x_0^{m-1} + \cdots + b_2 x_0^2 + b_1 x_0 + b_0 = 0.$$
$$R_2(x) = \frac{x^2 - 5x + 7}{x^3 - x^2 + x - 1},$$
Solution. $R_1(x)$ is not defined for $x = 0$, $R_2(x)$ is not defined for $x = 1$ (which is the only
real solution for $x^3 - x^2 + x - 1 = 0$) and $R_3(x)$ is not defined for $x = 2$ (which is the only
real solution for $x^3 - x^2 + 2x - 8 = 0$). Other solutions for the denominators in $R_2(x)$ and
$R_3(x)$ are not real.
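The claims about the two cubic denominators can be checked with a short sketch. The function names `q2`, `q3` and `discriminant` are ours, and the factorisations quoted in the comments are our own working rather than part of the guide.

```python
def q2(x):
    """Denominator of R2: x^3 - x^2 + x - 1 = (x - 1)(x^2 + 1)."""
    return x**3 - x**2 + x - 1

def q3(x):
    """Denominator of R3: x^3 - x^2 + 2x - 8 = (x - 2)(x^2 + x + 4)."""
    return x**3 - x**2 + 2*x - 8

def discriminant(a, b, c):
    """Discriminant of a*x^2 + b*x + c; a negative value means no real roots."""
    return b*b - 4*a*c

print(q2(1), q3(2))                                   # 0 0
print(discriminant(1, 0, 1), discriminant(1, 1, 4))   # -4 -15
```

Both quadratic factors have a negative discriminant, so x = 1 and x = 2 are indeed the only real zeros of the respective denominators.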
It is useful to divide rational functions into two broad categories: (i) proper and (ii) improper.
If the degree of $P(x)$ ($\deg(P(x))$) is less than that of $Q(x)$ (i.e. $\deg(P(x)) <
\deg(Q(x))$), then $R(x)$ is “proper”. If $\deg(P(x)) \ge \deg(Q(x))$, then $R(x)$ is “improper”.
In the previous examples, $R_1(x)$, $R_2(x)$ are proper, whereas $R_3(x)$ is improper.
Improper rational functions can always be rewritten as the sum of two terms: a polynomial
$S(x)$ and a proper rational function $T(x)/Q(x)$. It should be intuitively clear that $\deg(S(x))
= n - m$ and $\deg(T(x)) \le m - 1$. For instance, it turns out that
To obtain S(x), T (x), the division rule between two polynomials has to be applied. A review
of the method is in the Revision section.
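The division rule can be sketched in a few lines of code. The function `poly_divmod` and the sample polynomials below are illustrative assumptions (they are not the example used in this guide); coefficients are listed from the highest power to the lowest, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def poly_divmod(P, Q):
    """Long division of P by Q; returns (S, T) with P = S*Q + T (hypothetical helper)."""
    if not Q or Q[0] == 0:
        raise ValueError("the leading coefficient of Q must be non-zero")
    rem = [Fraction(c) for c in P]
    n_quot = len(P) - len(Q) + 1          # number of coefficients of S
    if n_quot <= 0:
        return [Fraction(0)], rem         # P/Q is already proper
    quot = []
    for i in range(n_quot):
        c = rem[i] / Q[0]                 # next coefficient of the quotient
        quot.append(c)
        for j, b in enumerate(Q):         # subtract c * x^k * Q(x)
            rem[i + j] -= c * b
    return quot, rem[n_quot:]             # low-order leftovers form T

# (x^3 + 2x^2 - x + 3) / (x^2 + 1)
S, T = poly_divmod([1, 2, -1, 3], [1, 0, 1])
print(S, T)
```

Here S(x) = x + 2 and T(x) = -2x + 1, so deg(S) = n - m = 1 and deg(T) = 1, which does not exceed m - 1.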
for a certain $n \in \mathbb{N}$, where $P_0(x), P_1(x), \ldots, P_n(x)$ are polynomials. Such an equation is
named an “algebraic” equation.
$f(x) = \dfrac{1}{\sqrt{x} + 1}$ is an algebraic function, solution to the second order algebraic equation:
$$(1 - x)y^2 - 2y + 1 = 0.$$
$$(y - 3)^3 = 1 - x^2$$
or
$$y^3 - 9y^2 + 27y + x^2 - 28 = 0.$$
Note that dom(g(x)) = R.
These types of functions will be relevant in two applications: the calculation of limits of
functions containing special radicals (or surds) and the definition of “implicit” functions.
Their graphs are not generally easy to plot by hand. The two examples above are shown
in the following figure.
Figure 1.25. Graphs of algebraic functions $f(x) = \dfrac{1}{\sqrt{x} + 1}$ (in blue) and $g(x) = 3 + \sqrt[3]{1 - x^2}$
(in red).
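A quick numerical check that the two functions in the figure satisfy their algebraic equations may be useful. This is only a sketch: the names `residual_f` and `residual_g` are ours, and the real cube root is computed by hand because a fractional power of a negative float is not real in Python.

```python
import math

def f(x: float) -> float:
    return 1.0 / (math.sqrt(x) + 1.0)          # defined for x >= 0

def g(x: float) -> float:
    r = 1.0 - x * x
    # real cube root of r, valid for negative r as well
    return 3.0 + math.copysign(abs(r) ** (1.0 / 3.0), r)

def residual_f(x: float) -> float:
    y = f(x)
    return (1 - x) * y**2 - 2 * y + 1          # should vanish

def residual_g(x: float) -> float:
    y = g(x)
    return y**3 - 9 * y**2 + 27 * y + x**2 - 28   # should vanish

for x in (0.0, 0.5, 1.0, 4.0):
    print(x, residual_f(x), residual_g(x))
```

Both residuals are zero up to rounding error, and g can be evaluated for any real x, in agreement with dom(g(x)) = R.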
• The exponential is always positive. It is increasing for a > 1, decreasing for 0 < a < 1.
• Because for all $y$’s we have that $y = a^p > 0$, the logarithm is defined only for positive
values of its argument. This means that, for $a > 0$ ($a \neq 1$), the domain and range of
the exponential and logarithm are the following:
• if $a > 1$, then $\log_a x \ge 0$ for $x \ge 1$, $\log_a x < 0$ for $0 < x < 1$. The logarithm is always
increasing for $a > 1$. For a base $0 < a < 1$, the opposite statements are true.
Plots of exponential and logarithm functions are shown below. Algebraic properties of
exponentials and logarithms, including the change of base, are described in the Revision
section.
Figure 1.26. Examples of graphs of exponential functions for different bases $a$, with $a > 1$
and $0 < a < 1$. The same colour is used for reciprocal bases, i.e. $2^x$ and $(1/2)^x$ are in blue,
$e^x$ and $(1/e)^x$ are in red etc.
Figure 1.27. Examples of graphs of logarithmic functions for different bases $a$, with $a > 1$
and $0 < a < 1$. Again, the same colour is used for reciprocal bases, i.e. $\log_2 x$ and $\log_{1/2} x$
are in blue, $\ln x$ and $\log_{1/e} x$ are in red etc.
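The behaviour of logarithms with reciprocal bases can be explored numerically. The sketch below assumes the standard change of base formula $\log_a x = \ln x / \ln a$; the helper name `log_base` is hypothetical.

```python
import math

def log_base(a: float, x: float) -> float:
    """log_a(x) via change of base; requires a > 0, a != 1 and x > 0."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1 and x > 0")
    return math.log(x) / math.log(a)

for a in (2.0, math.e, 10.0):
    for x in (0.25, 1.0, 8.0):
        # reciprocal bases give mirror-image graphs: log_{1/a} x = -log_a x
        assert math.isclose(log_base(1 / a, x), -log_base(a, x), abs_tol=1e-12)

print(log_base(2.0, 8.0), log_base(0.5, 8.0))   # close to 3 and -3
```

For a > 1 the value is negative when 0 < x < 1 and non-negative when x is at least 1; swapping a for 1/a reverses both signs, which is the "opposite statements" case mentioned above.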
Trigonometric functions are defined as special properties of the unit circle. For a circle of
unit radius centred at the origin (i.e. the curve $x^2 + y^2 = 1$), the abscissa and the ordinate
of a generic point $P$ on the circle are given by $\cos\theta$ and $\sin\theta$ respectively, where $\theta$ is the
angle taken between the radius $OP$ and the $x$-axis. The tangent is defined as
$$\tan\theta = \frac{\sin\theta}{\cos\theta}.$$
Figure 1.28. In this plot, the angle is indicated by $\theta$ and the abscissa and ordinate of a
generic point $P(x, y)$ are given by $x = \cos\theta$ and $y = \sin\theta$ respectively.
• The angle is measured in radians and its value is indicated using the symbol $^c$. One
radian is indicated as $1^c$, the angle corresponding to the arc whose length is equal to
the radius of the circle. In the ordinary sexagesimal (base 60) system, a round angle
is $360^\circ$. One radian is thus the ratio between the round angle and the length of the
circumference. The case of a circle of unit radius $r = 1$, having a circumference equal
to $2\pi r = 2\pi$ gives the universal result
Two of the advantages in using radians are that: (1) the length of an arc of a circle making
an angle $\theta$ is simply $s = r\theta$, and (2) the area of the sector of a circle of angle $\theta$ is $A = \frac{1}{2} r^2 \theta$.
Unless explicitly stated, we will refer to angles from now on using radians and, unless
explicitly required, we will drop the symbol $^c$.
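The two formulas are easy to try out. The sketch below uses the standard correspondence between a round angle of 360 degrees and $2\pi$ radians; the function names are hypothetical.

```python
import math

def deg_to_rad(deg: float) -> float:
    """Convert degrees to radians using 360 degrees = 2*pi radians."""
    return deg * math.pi / 180.0

def arc_length(r: float, theta: float) -> float:
    """Length of the arc subtending an angle theta (in radians): s = r*theta."""
    return r * theta

def sector_area(r: float, theta: float) -> float:
    """Area of the sector of angle theta (in radians): A = r^2 * theta / 2."""
    return 0.5 * r**2 * theta

quarter = deg_to_rad(90.0)          # pi/2
print(arc_length(2.0, quarter))     # a quarter of the circumference 4*pi
print(sector_area(2.0, quarter))    # a quarter of the disc area 4*pi
```

Setting theta equal to a full turn recovers the familiar circumference $2\pi r$ and area $\pi r^2$, a useful sanity check on both formulas.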
• Sine, cosine and tangent are periodic. The period is $T = 2\pi$ for sine and cosine, $T = \pi$
for the tangent.
• Sine and cosine are such that $|\sin x| \le 1$, $|\cos x| \le 1$. The tangent instead is such
that $\mathrm{ran}(\tan x) = \mathbb{R}$.
• The sine and cosine functions are defined for all values $x \in \mathbb{R}$, whereas the tangent is
not defined for those $x = \frac{\pi}{2} + k\pi$, with $k \in \mathbb{Z}$. Geometrically, in the unit circle this
corresponds to $OP$ being parallel to the vertical line passing through the point with
coordinates $(1, 0)$.
• Sine and tangent are odd functions, the cosine is an even function.
Figure 1.29. Plots of the trigonometric functions cos x, sin x and tan x.
• Other functions that are useful and can be defined in terms of sine and cosine are the
secant and cosecant functions, i.e.
$$\sec x = \frac{1}{\cos x}, \qquad \mathrm{cosec}\, x = \frac{1}{\sin x}.$$
Sometimes, the cosecant is abbreviated as csc x.
Note that these functions are not defined for those values where $\cos x$ and $\sin x$ go to
zero, i.e. $x = \frac{\pi}{2} + k\pi$ and $x = k\pi$ (with $k \in \mathbb{Z}$), for sec and cosec respectively.
• By analogy with the tangent, a function called cotangent can be also introduced:
$$\cot x = \frac{\cos x}{\sin x}.$$
Given that the cotangent is not defined for those values $x$ such that $\sin x = 0$, values
$x = k\pi$ (with $k \in \mathbb{Z}$) are not part of the cotangent’s domain, a property shared with
the cosec function, for the same reason. The range of the cotangent is, similarly
to the tangent, given by $\mathrm{ran}(\cot x) = \mathbb{R}$.
$$\sec^2 x - \tan^2 x = 1.$$
1.6. Inverse trigonometric functions 35
A summary of the symmetry properties for trigonometric functions and a table with the
values of cos, sin and tan for the most common angles are presented in the Revision section.
Figure 1.30. Plots of the trigonometric functions sec x and cosec x (sometimes also
abbreviated as csc x).
Important note: For any two angles $x, y \in \mathbb{R}$, there exist formulae to express the sine and
cosine of their sums and differences (i.e. $(x + y)$ and $(x - y)$) in terms of the original sine
and cosine of $x$ and $y$. The so-called addition and subtraction formulae are given below:
From these, the so-called double-angle formulae for sin and cos are obtained by making
y = x in the expressions above.
These formulae are very useful and will be used in many examples.
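As a numerical sanity check, the sketch below tests the standard forms of these identities, namely $\sin(x \pm y) = \sin x \cos y \pm \cos x \sin y$ and $\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y$, together with the double-angle formulae obtained for y = x. The function names are ours.

```python
import math

def sin_sum(x: float, y: float) -> float:
    """Right-hand side of the addition formula for sin(x + y)."""
    return math.sin(x) * math.cos(y) + math.cos(x) * math.sin(y)

def cos_sum(x: float, y: float) -> float:
    """Right-hand side of the addition formula for cos(x + y)."""
    return math.cos(x) * math.cos(y) - math.sin(x) * math.sin(y)

x, y = 0.7, 1.9
# subtraction formulae follow by replacing y with -y (sin is odd, cos is even)
print(math.sin(x - y), sin_sum(x, -y))
print(math.cos(x - y), cos_sum(x, -y))
# double-angle formulae follow by setting y = x
print(math.sin(2 * x), 2 * math.sin(x) * math.cos(x))
print(math.cos(2 * x), math.cos(x) ** 2 - math.sin(x) ** 2)
```

Each printed pair agrees to within rounding error.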
By restricting the trigonometric functions sin x, cos x and tan x to appropriate domains, the
functions become strictly increasing (sin x, tan x) or strictly decreasing (cos x) and can be
inverted. The resulting inverse functions are very useful in a large number of applications
and, as such, are identified by the special symbols arcsin x, arccos x and arctan x. The
way to read these symbols is, for instance for arcsin x: “the arc whose sine is x”, or “the
angle whose sine is x”. Similar definitions apply to the arccos x and arctan x. Their most
important properties are described below.
The inverse sine: Given that $f(x) = \sin x$ is strictly increasing on $A = [-\pi/2, \pi/2]$, the
restriction $f_A$ is invertible.
Important note: Although $\arcsin x$ is read as: “the arc whose sine is x” what we really
mean is “the arc, in the interval $[-\pi/2, \pi/2]$, whose sine is x”. It is implicit that the original
function $\sin x$ has been restricted to the appropriate domain $[-\pi/2, \pi/2]$. This is important
because, rigorously, there are infinitely many arcs whose sine is $x$ if we do not restrict the
function $\sin x$ to the interval $[-\pi/2, \pi/2]$.
The inverse cosine: Given that $g(x) = \cos x$ is strictly decreasing on $B = [0, \pi]$, the
restriction $g_B$ is invertible.
Note: Observe how the different domains of restriction $A$, $B$ defined above for the functions
$\sin x$ and $\cos x$ provide results in different quadrants for their inverses $\arcsin x$, $\arccos x$ at
the same $x_1$ and $x_3$ considered in the examples above (i.e. fourth quadrant for $\arcsin x$,
second quadrant for $\arccos x$).
The inverse tangent: Given that $h(x) = \tan x$ is strictly increasing for $C = (-\pi/2, \pi/2)$,
the restriction $h_C$ is invertible.
The inverse tangent is thus denoted by
$$(h_C)^{-1} : \tan^{-1} x, \quad \text{or by } y = \arctan x$$
and has the following domain and range:
$$\mathrm{dom}(\arctan x) = \mathbb{R}, \qquad \mathrm{ran}(\arctan x) = (-\pi/2, \pi/2).$$
Note: Clearly, since $h(x) = \tan x = \sin x / \cos x$ is not defined at $x = \pm\pi/2$ (i.e. $\cos(\pm\pi/2) =
0$), the range for $\arctan x$ does not include these values and the domain of restriction for
$\tan x$ excludes $\pm\frac{\pi}{2}$.
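The ranges of the three inverse functions can be inspected directly with Python's `math.asin`, `math.acos` and `math.atan`, which return principal values in the same intervals used above. The sample values are arbitrary choices of ours.

```python
import math

samples = [-1.0, -0.5, 0.0, 0.5, 1.0]
for x in samples:
    a, c, t = math.asin(x), math.acos(x), math.atan(x)
    print(f"x={x:+.1f}  arcsin={a:+.4f}  arccos={c:.4f}  arctan={t:+.4f}")

# sin is not injective on all of R: 2.5 lies outside [-pi/2, pi/2],
# so arcsin(sin(2.5)) returns the arc pi - 2.5 rather than 2.5 itself.
recovered = math.asin(math.sin(2.5))
print(recovered, math.pi - 2.5)
```

For a negative argument such as x = -0.5, arcsin returns an angle in the fourth quadrant while arccos returns one in the second quadrant, consistent with the different restriction domains A and B.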
Arguments similar to those used for arcsin, arccos and arctan can be used to find the inverse
functions of cot, sec and cosec. For example, given that $h(x) = \cot x$ is strictly decreasing
for $C = (0, \pi)$, the restriction $h_C$ is invertible. The inverse cotangent can thus be denoted
by
$$(h_C)^{-1} : \cot^{-1} x, \quad \text{or by } y = \mathrm{arccot}\, x$$
and has the following domain and range:
$$\mathrm{dom}(\mathrm{arccot}\, x) = \mathbb{R}, \qquad \mathrm{ran}(\mathrm{arccot}\, x) = (0, \pi).$$
Parity of inverse trigonometric functions: The following parity properties are important:
Solution Let $\theta = \sin^{-1}(2x)$. This implies that $\sin\theta = 2x$. In a right angle triangle with an
angle $\theta < \pi/2$, it is known that
$$\sin\theta = \frac{\text{opp}}{\text{hyp}},$$
where “opp” and “hyp” indicate the opposite side and the hypotenuse, respectively. Sketching
such a triangle for the present case, we see that:
$$\sin\theta = \frac{\text{opp}}{\text{hyp}} = \frac{2x}{1}.$$
This implies that the hypotenuse has unit length and the side opposite to $\theta$ is of length $2x$.
Using Pythagoras’ theorem to find the adjacent side (“adj”),
An alternative solution can be found if we express $\cos x$ in terms of $\sin x$ using the
fundamental identity of trigonometry, i.e.:
By using the properties of inverse functions we have, for every $y \in [-1, 1]$, that
$\sin(\arcsin y) = y$. This also means that we have to reject the negative sign solution in
the equation above because $\arcsin(2x)$ belongs to the first or fourth quadrant, where cos is
positive. We can thus finally write
Solution Let $\theta = \tan^{-1}(3x)$. This implies that $\tan\theta = 3x$. In a right angle triangle with
an angle $\theta < \pi/2$, we find:
$$\tan\theta = \frac{\text{opp}}{\text{adj}} = \frac{3x}{1}.$$
This implies that the adjacent side has unit length and the opposite side is $3x$. Using
Pythagoras’ theorem, the hypotenuse has length
$$\sqrt{1 + (3x)^2} = \sqrt{1 + 9x^2}.$$
The presence of a 2 inside the argument for cos has an important effect on the final
expression, because:
$$\cos(2\tan^{-1}(3x)) = \cos(2\theta) = \cos^2\theta - \sin^2\theta = 1 - 2\sin^2\theta,$$
where we used the double angle formula for $\cos(2\theta)$. Finally, given that
$$\sin\theta = \frac{\text{opp}}{\text{hyp}},$$
we have:
$$\cos(2\tan^{-1}(3x)) = 1 - 2\left(\frac{\text{opp}}{\text{hyp}}\right)^2 = 1 - 2\left(\frac{3x}{\sqrt{1 + 9x^2}}\right)^2 = \frac{1 - 9x^2}{1 + 9x^2}.$$
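Results of this kind are easy to verify numerically. The sketch below checks the closed form just obtained, and also the expression $\cos(\sin^{-1}(2x)) = \sqrt{1 - 4x^2}$, which is what the Pythagoras step of the previous solution appears to lead to (an assumption on our part, since it requires the adjacent side to be $\sqrt{1 - 4x^2}$). All function names are hypothetical.

```python
import math

def cos_of_arcsin_2x(x: float) -> float:
    return math.cos(math.asin(2 * x))          # needs -1/2 <= x <= 1/2

def closed_form_1(x: float) -> float:
    return math.sqrt(1 - 4 * x * x)

def cos_of_2arctan_3x(x: float) -> float:
    return math.cos(2 * math.atan(3 * x))      # defined for every real x

def closed_form_2(x: float) -> float:
    return (1 - 9 * x * x) / (1 + 9 * x * x)

for x in (-0.5, -0.2, 0.0, 0.3, 0.5):
    print(x, cos_of_arcsin_2x(x), closed_form_1(x))

for x in (-4.0, -0.3, 0.0, 1.0 / 3.0, 2.0):
    print(x, cos_of_2arctan_3x(x), closed_form_2(x))
```

The first identity only makes sense when 2x lies in [-1, 1], whereas the second holds for all real x because the arctangent is defined on the whole of R.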
Given any $x \in \mathbb{R}$, the three most important hyperbolic functions are defined as follows:
hyperbolic cosine: $\cosh x = \frac{1}{2}(e^x + e^{-x})$,
hyperbolic sine: $\sinh x = \frac{1}{2}(e^x - e^{-x})$,
hyperbolic tangent: $\tanh x = \dfrac{\sinh x}{\cosh x} = \dfrac{e^x - e^{-x}}{e^x + e^{-x}}$.
Note 1: All three functions above are defined for every real number x, i.e. their domains
coincide with R. The first two are pronounced “cosh” and “shine”, whereas tanh x is
pronounced “than” or “tansh”. The tangent is also simply read as the “hyperbolic tan”.
Figure 1.32. Graphs for the parabola $\frac{1}{2}x^2 + 1$, the exponential $e^x$ and the three hyperbolic
functions $\cosh x$, $\sinh x$ and $\tanh x$. The parabola is proposed as a comparison with $\cosh x$:
they are similar but inherently different.
Note 2: Historically, hyperbolic functions were introduced as some of the first “special
functions”, i.e. mathematical functions that are very useful in specific, special problems.
In this case, the hyperbolic cosine arises from a classic engineering problem of the XVII
century, namely the “catenary” problem (in Latin, “catena” means “chain”). What is the
shape assumed by a chain under its own weight when constrained at its ends? It turns
out that the U-shaped curve that describes that shape is a hyperbolic cosine. This is not
1.7. Hyperbolic functions and their inverses 41
as trivial as it may seem. For example, the great Galileo Galilei (1564-1642) was wrongly
convinced that the shape was a parabola.
The graphs of the hyperbolic tangent, and those of hyperbolic cosine and sine along with
$\frac{1}{2}e^x$ are shown above. Note that, for every $x \in \mathbb{R}$, $\sinh x \le \frac{1}{2}e^x \le \cosh x$.
Note 3: For large positive $x$ we have $\cosh x \approx \frac{1}{2}e^x \approx \sinh x$. For large negative $x$ instead
$\cosh x \approx \frac{1}{2}e^{-x} \approx -\sinh x$ (reflection in the x-axis for large negative values). From the
graphs and their definitions, it is straightforward to prove that:
Note 4: In computer science and computational physics applications, tanh can also be
used to approximate a step function, for example the sign function previously discussed.
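A minimal sketch of this idea follows. The steepness parameter `k` and the function names are our own illustrative choices: the larger `k` is, the closer tanh(kx) gets to the sign function away from x = 0.

```python
import math

def sign(x: float) -> int:
    """Sign function: -1 for x < 0, 0 at x = 0, +1 for x > 0."""
    return (x > 0) - (x < 0)

def smooth_sign(x: float, k: float = 50.0) -> float:
    """Smooth approximation of sign(x); k (assumed > 0) controls the steepness."""
    return math.tanh(k * x)

for x in (-1.0, -0.1, 0.0, 0.1, 1.0):
    print(x, sign(x), smooth_sign(x))
```

Unlike the sign function, the approximation is continuous and differentiable at x = 0, which is often the reason for preferring it in numerical work.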
Hyperbolic functions satisfy identities similar to the trigonometric (or circular) functions
sin x and cos x. In fact
• There is a fundamental relation (or identity) similar to that for the trigonometric
case:
$$\cosh^2 x - \sinh^2 x = 1.$$
• Other relations akin to those of trigonometry can be established. For example, for
every $x, y \in \mathbb{R}$:
• Functions analogous to the trigonometric cotangent, secant and cosecant also exist:
$$\coth x = \frac{\cosh x}{\sinh x}, \qquad \mathrm{sech}\, x = \frac{1}{\cosh x}, \qquad \mathrm{cosech}\, x = \frac{1}{\sinh x}.$$
These functions arise in integration problems and are pronounced “coth”, “shec” and
“coshec”. A relation similar to that between sec and tan exists between tanh and
sech:
$$\mathrm{sech}^2 x + \tanh^2 x = 1.$$
• There is a rule, known as Osborn(e)’s Rule (the ’e’ seems to be optional), which can be
used to turn any identity relating trigonometric functions into a similar identity for
hyperbolic functions. The rule states that we can take the trig identity and simply
change each trig function to the analogous hyperbolic function and then make some
sign changes as follows: We count the number of sines in each term. For every product
of two sines (including squares of course), we change the sign of that term. Thus a
term with three sines changes sign but one with four sines picks up a double-sign
change, so stays the same. The rule applies also to sines hidden inside tans or as
reciprocals (cosecs).
• Analogously to the geometric relations between trigonometric functions and points
on the unit circle, there exists a correspondence between the hyperbolic functions and
points on the unit hyperbola. It is discussed in the second example below.
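Osborn's rule can be tried out on a few standard trigonometric identities. The sketch below states each trigonometric identity in a comment, applies the rule by hand, and checks the resulting hyperbolic identity numerically; the function name `osborn_checks` and the test points are our own choices.

```python
import math

def osborn_checks(x: float, y: float) -> bool:
    close = lambda a, b: math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    # cos 2x = 1 - 2 sin^2 x: one product of two sines, so the sign flips
    c1 = close(math.cosh(2 * x), 1 + 2 * math.sinh(x) ** 2)
    # sin(x+y) = sin x cos y + cos x sin y: no product of two sines, unchanged
    c2 = close(math.sinh(x + y),
               math.sinh(x) * math.cosh(y) + math.cosh(x) * math.sinh(y))
    # cos(x+y) = cos x cos y - sin x sin y: the sin*sin term changes sign
    c3 = close(math.cosh(x + y),
               math.cosh(x) * math.cosh(y) + math.sinh(x) * math.sinh(y))
    # 1 + tan^2 x = sec^2 x: tan^2 hides sin^2, so the sign flips
    c4 = close(1 - math.tanh(x) ** 2, 1 / math.cosh(x) ** 2)
    return c1 and c2 and c3 and c4

print(all(osborn_checks(x, y) for x, y in ((0.3, 1.2), (-1.0, 0.5), (2.0, -2.0))))
```

The last check reproduces the relation between sech and tanh quoted above.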
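Solution
If we calculate the squares of hyperbolic functions
$$\cosh^2 x = \left[\frac{1}{2}(e^x + e^{-x})\right]^2 = \frac{1}{4}\left(e^{2x} + 2 + e^{-2x}\right),$$
$$\sinh^2 x = \left[\frac{1}{2}(e^x - e^{-x})\right]^2 = \frac{1}{4}\left(e^{2x} - 2 + e^{-2x}\right),$$
it is easy to subtract and prove that
$$\cosh^2 x - \sinh^2 x = \frac{1}{4}\left(e^{2x} + 2 + e^{-2x} - (e^{2x} - 2 + e^{-2x})\right) = \frac{1}{4}(4) = 1.$$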
Note that this result also follows from Osborn’s rule applied to $\cos^2 x + \sin^2 x = 1$.
Figure 1.33. The relation between hyperbolic functions $x = \cosh t$ and $y = \sinh t$, and the
unit hyperbola $x^2 - y^2 = 1$.
Example 1.7.2. Investigate the relation between the unit hyperbola and the functions
sinh x, cosh x and tanh x.
Solution If we draw the unit hyperbola, i.e. the curve corresponding to the equation
$x^2 - y^2 = 1$ and consider a generic point $P(x, y)$ with $x \ge 1$, we can introduce a parameter
$t \in \mathbb{R}$ such that
$$x = \cosh t, \qquad y = \sinh t.$$
As illustrated in the figure above, varying $t \in (-\infty, \infty)$ corresponds to sliding the point $P$
along the branch of the hyperbola for $x \ge 1$ (i.e. the right branch). The arrow shows the
direction of increasing $t$. Note how the fundamental identity results in the equation of the
unit hyperbola:
$$x^2 - y^2 = \cosh^2 t - \sinh^2 t = 1.$$
Segments OQ and P Q respectively correspond to cosh a and sinh a, and it can be proven
(try!) that RV is tanh a.
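The parametrisation can be checked numerically with a short sketch; the helper name `hyperbola_point` and the sampled values of t are hypothetical choices.

```python
import math

def hyperbola_point(t: float) -> tuple:
    """Point (cosh t, sinh t) on the right branch of x^2 - y^2 = 1."""
    return math.cosh(t), math.sinh(t)

for t in (-3.0, -1.0, 0.0, 0.5, 2.0):
    x, y = hyperbola_point(t)
    # x stays >= 1, y has the sign of t, and x^2 - y^2 is 1 up to rounding
    print(f"t={t:+.1f}  x={x:.4f}  y={y:+.4f}  x^2-y^2={x*x - y*y:.12f}")
```

As t increases, y increases monotonically while x first decreases to its minimum value 1 at t = 0 and then grows again, which is the direction of travel along the right branch.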
Hyperbolic functions, properly restricted, admit inverses that are useful in applications,
especially in integration.
The inverse hyperbolic cosine: Given that $f(x) = \cosh x$ is strictly increasing for
$A = [0, \infty)$, the restriction $f_A$ is invertible. This corresponds to the right “half” of the
hyperbolic cosine with respect to the y axis.
The inverse hyperbolic cosine is thus denoted by
$$(f_A)^{-1} : \cosh^{-1} x, \quad \text{or by } y = \mathrm{arccosh}\, x$$
and, according to the properties of $\cosh x$, must have the following domain and range:
$$0 < x - \sqrt{x^2 - 1} \le 1,$$
implying that
$$\ln\left(x - \sqrt{x^2 - 1}\right) < 0, \qquad x > 1.$$
This is in contradiction with our initial assumption $y \ge 0$. In other words, choosing the
negative sign is wrong because
Figure 1.34. The graphs of the three inverse hyperbolic functions y = arccosh x, y =
arcsinh x and y = arctanh x.
$$\mathrm{dom}(\mathrm{arctanh}\, x) = \mathrm{ran}(\tanh x) = (-1, 1)$$
Note that $\dfrac{1 + x}{1 - x}$ is positive for $-1 < x < 1$, and $\mathrm{ran}(\mathrm{arctanh}\, x) = \mathbb{R}$.
Graphs of the three fundamental inverse hyperbolic functions show that the inverse sinh
and inverse tanh are odd, whereas the inverse cosh is neither even nor odd, similarly to the
case of inverse trigonometric functions.
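The inverse hyperbolic functions can be compared with their standard logarithmic expressions, $\mathrm{arccosh}\, x = \ln(x + \sqrt{x^2 - 1})$, $\mathrm{arcsinh}\, x = \ln(x + \sqrt{x^2 + 1})$ and $\mathrm{arctanh}\, x = \frac{1}{2}\ln\frac{1+x}{1-x}$. The sketch below is only a numerical check against Python's `math.acosh`, `math.asinh` and `math.atanh`; the function names are ours.

```python
import math

def arccosh_log(x: float) -> float:
    return math.log(x + math.sqrt(x * x - 1))      # requires x >= 1

def rejected_branch(x: float) -> float:
    return math.log(x - math.sqrt(x * x - 1))      # the negative-sign choice

def arcsinh_log(x: float) -> float:
    return math.log(x + math.sqrt(x * x + 1))      # any real x

def arctanh_log(x: float) -> float:
    return 0.5 * math.log((1 + x) / (1 - x))       # requires -1 < x < 1

print(arccosh_log(2.0), math.acosh(2.0))
print(rejected_branch(2.0))          # negative, so it cannot equal arccosh
print(arcsinh_log(-1.5), math.asinh(-1.5))
print(arctanh_log(0.5), math.atanh(0.5))
```

The rejected branch returns exactly the opposite of arccosh x, since the two arguments of the logarithm multiply to 1; this is why only the positive sign is compatible with a non-negative range.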
Finally, note that inverse functions for $\coth x$, $\mathrm{sech}\, x$ and $\mathrm{cosech}\, x$ can be introduced in
a similar way as for the cases discussed above. These inverse functions are respectively
indicated as $\mathrm{arccoth}\, x$, $\mathrm{arcsech}\, x$ and $\mathrm{arccosech}\, x$.