PPL 1
PPL 1
Imperative Core
1.1.1 Assignment
x = y;
y = 3;
x = x + 1;
x + 2 = y + 5;
are not.
To understand what happens when you execute the statement x = t; sup-
pose that within the recesses of your computer’s memory, there is a com-
partment labelled x. Executing the statement x = t; consists of filling this
compartment with the value of the expression t. The value previously contained
in compartment x is erased. If the expression t is a constant, for example 3,
its value is the same constant. If it is an expression with no variables, such as
3 + 4, its value is obtained by carrying out mathematical operations, in this
case, addition. If expression t contains variables, the values of these variables
must be looked up in the computer’s memory. The whole of the contents of the
computer’s memory is called a state.
Let us consider, initially, that expressions, such as x + 3, and statements,
such as y = x + 3;, form two disjoint categories. Later, however, we shall be
brought to revise this premise.
In these examples, the values of expressions are integers. Computers can
only store integers within a finite interval. In Java, integers must be between
-231 and 231 - 1, so there are 232 possible values. When a mathematical op-
eration produces a value outside of this interval, the result is kept within the
interval by taking its modulo 232 remainder. Thus, by adding 1 to 231 - 1, that
is to say 2147483647, we leave the interval and then return to it by removing
232 , which gives -231 or -2147483648.
Exercise 1.1
What is the value of the variable x after executing the following state-
ment?
x = 2 * 1500000000;
The integers are of type byte, short, int or long corresponding to the
intervals [-27 , 27 - 1], [-215 , 215 - 1], [-231 , 231 - 1] and [-263 ,
263 - 1], Respectively. Constants are written in base 10, for example, -666.
Decimal numbers are of type float or double. Constants are written in sci-
entific notation, for example 3.14159, 666 or 6.02E23.
Booleans are of type boolean. Constants are written as false and true.
Characters are of type char. Constants are written between apostrophes, for
example ‘b’.
To declare a variable of type T, replace the type int with T. The general
form of a declaration is thus {T x = t; p}.
4 1. Imperative Core
The operations for decimal numbers are +, -, *, /, along with some transcen-
dental functions: Math.sin, Math.cos, ...
For all data types, the expression (b) ? t : u evaluates to the value of t if
the boolean expression b has the value true, and evaluates to the value of u
if the boolean expression b has the value false.
Character strings are of type String. Constants are written inside quotation
marks, for example "Principles of Programming Languages".
let y = ref 4
in let x = ref 5
in let x = ref 6
in y := !x
and this program assigns the value 6 to the variable y, so it is the most recent
declaration of x that is used. We say that the first declaration of x is hidden by
the second.
Java, Caml and C allow the creation of variables with an initial value that
can never be changed. This type of variable is called a constant variable. A
variable that is not constant is called a mutable variable. Java assumes that
all variables are mutable unless you specify otherwise. To declare a constant
variable in Java, you precede the variable type with the keyword final, for
example
final int x = 4;
y = x + 1;
The following statement is not valid, because an attempt is made to alter the
value of a constant variable
final int x = 4;
x = 5;
In Caml, to indicate that the variable x is a constant variable, write let x
= t in p instead of writing let x = ref t in p. When using constant vari-
ables, you do not write !x to express its value, but simply x. So, you can write
let x = 4 in y := x + 1, while the statement let x = 4 in x := 5 is in-
valid. In C, you indicate that a variable is a constant variable by preceding its
type with the keyword const.
1.1.3 Sequence
1.1.4 Test
1.1.5 Loop
finite expression. And the fact that a loop may fail to terminate is a consequence
of the fact that it is an infinite object.
In Caml, this statement is written while b do p. In C, it is written as it
is in Java.
1.2.1 Input
Input constructs in Java are fairly complex, so we will use an extension of Java
created specially for this book: the class Ppl1 .
Evaluation of the expression Ppl.readInt() waits for the user to type a
number on her/his keyboard, and returns this number as the value of the
expression. A typical usage is n = Ppl.readInt();. The class Ppl also contains
the construction Ppl.readDouble which allows decimal numbers to be read
from the keyboard, and the construction Ppl.readChar which allows characters
to be read.
1.2.2 Output
Exercise 1.3
Write a Java program that reads an integer n from the keyboard, and
outputs a boolean indicating whether the number is prime or not.
We define an infinite set Var whose elements are called variables. We also define
the set Val of values which are integers, booleans, etc. A state is a function that
associates elements of a finite subset of Var to elements of the set Val.
For example, the state [x = 5, y = 6] associates the value 5 to the vari-
able x and the value 6 to the variable y. On the set of states, we define an
update function + such that the state s + (x = v) is identical to the state s,
except for the variable x, which now becomes associated with the value v. This
operation is always defined, whether x is originally in the domain of s or not.
We can then simply define a function called Θ, which for each pair (t,s)
composed of an expression t and a state s, produces the value of this expression
in this state. For example, Θ(x + 3,[x = 5, y = 6]) = 8.
This is a partial function, because a state is a function with a finite domain
while the set of variables is infinite. For example, the expression z + 3 has no
1.3 The Semantics of the Imperative Core 9
A state s is a function that maps a finite subset of Var to the set Val. It will be
helpful for the next chapter if we decompose this function as the composition
of two other functions of finite domains: the first is known as the environment,
which maps a finite subset of the set Var to an intermediate set Ref, whose
elements are called references and the second, is called the memory state, which
maps a finite subset of the set Ref to the set Val.
Var Ref Val
e m
This brings us to propose two infinite sets, Var and Ref, and a set Val of
values. The set of environments is defined as the set of functions that map a
finite subset of the set Var to the set Ref. The set of memory states is defined as
the set of functions mapping a finite subset of the set Ref to the set Val. For the
set of environments, we define an update function + such that the environment
e + (x = r) is identical to e, except at x, which now becomes associated with
10 1. Imperative Core
the reference r. For the set of memory states, we define an update function +
such that the memory state m + (r = v) is identical to m, except at r, which
now becomes associated with the value v.
However, constant variables complicate things a little bit. For one, the envi-
ronment must keep track of which variables are constant and which are mutable.
So, we define an environment to be a function mapping a finite subset of the
set Var to the set {constant, mutable} × Ref. We will, however, continue
to write e(x) to mean the reference associated to x in the environment e.
Then, at the point of execution of the declaration of a constant variable
x, we directly associate the variable to a value in the environment, instead of
associating it to a reference which is then associated to a value in the mem-
ory state. The idea is that the memory state contains information that can be
modified by an assignment, while the environment contains information that
cannot. To avoid having a target set for the environment function that is overly
complicated, we propose that Ref is a subset of Val, which brings us to pro-
pose that the environment is a function that maps a finite subset of Var to
{constant, mutable} × Val and the memory state is a function that maps
a finite subset of Ref to Val.
Var Val
Ref
a x b
Even though each label is associated with a unique reference, nothing prevents
two labels from being associated with the same reference, since an environment
is a function, but not necessarily an injective function. Finally, we represent
the memory state by filling each square with a value.
a x b
4 5
At first glance, this definition may seem circular, since to define the value
of an expression of the form t + u, we use the value of expressions t and u.
But the size of these expressions is smaller than that of t + u. This definition
is therefore a definition by induction on the size of expressions.
The first clause of this definition indicates that the value of an expression
that is a mutable variable is m(e(x)). We apply the function e to the variable x,
which produces a reference, and the function m to this reference, which produces
a value. If the variable is a constant variable, on the other hand, we find its
value directly in the environment.
The definition of the function Θ for Caml is identical, except in the case of
variables, where we have the unique clause
– Θ(x,e,m) = e(x),
where the variable x is either mutable or constant.
For example, if e is the environment [x = r] and m is the memory state
[r = 4] and that the variable x is mutable in e, the value Θ(x,e,m) is 4 in
Java, but is r in Caml.
Caml also has a construct ! such that
– Θ(!t,e,m) = m(Θ(t,e,m)).
If x is a variable, then the value of !x is Θ(!x,e,m) = m(Θ(x,e,m)) =
m(e(x)) that is the value of x in Java. This explains why we write y := !x +
1 in Caml, where we write y = x + 1; in Java.
In Caml, references that can be associated to an integer in memory are of
the type int ref. For example, the variable x and the value r from this example
are of the type int ref. In contrast to the variable x, the expressions !x, !x +
1, ... are of the type int.
The definition of the function Θ for C is the same as the definition used for
Java.
1.3 The Semantics of the Imperative Core 13
Exercise 1.4
Give the definition of the function Θ for expressions of the form t & u
and t | u.
Unlike the boolean operator & that evaluates its two arguments, the
operator && evaluates its second argument only if the first argument
evaluates to true. Give the definition of the function Θ for expressions
of the form t && u.
Answer the same question for the boolean operator ||, which only eval-
uates its second argument if the first argument evaluates to false.
– When the statement p is a test of the form if (b) p1 else p2 , the function
Σ is defined as follows. If Θ(b,e,m) = true then
14 1. Imperative Core
– This brings us to the case where the statement p is a loop of the form while
(b) q. We have seen that introducing the imaginary statement skip; such
that Σ(skip;,e,m) = m, we can define the statement while (b) q as a
shorthand for the infinite statement
When dealing with these types of infinite constructs, we often try to ap-
proach them as limits of finite approximations. We therefore introduce an
imaginary statement called giveup; such that the function Σ is never de-
fined on (giveup;,e,m). We can define a sequence of finite approximations
of the statement while (b) q.
p0 = if (b) giveup; else skip;
p1 = if (b) {q if (b) giveup; else skip;} else skip;
...
pn+1 = if (b) {q pn } else skip;.
The statement pn tries to execute the statement while (b) q by completing
a maximum of n complete trips through the loop. If, after n loops, it has not
terminated on its own, it gives up.
If isn’t hard to prove that for every integer n and state e, m, if Σ(pn ,e,m)
is defined, then for all n’ greater than n, Σ(pn ,e,m) is also defined, and
Σ(pn ,e,m) = Σ(pn ,e,m). This formalises the fact that if the statement
while (b) q terminates when the maximum number of loops is n, then it
also terminates, and to the same state, when the maximum number of loops
is n’.
There are therefore two possibilities for the sequence Σ(pn ,e,m): either it is
never defined, or it is defined beyond a certain point, and in this case, it is
constant over its domain. In the second case, we call the value it takes over
its domain the limit of the sequence. In contrast, the sequence does not have
1.3 The Semantics of the Imperative Core 15
a limit if it is never defined. We can now define the function Σ in the case
where the statement p is of the form while (b) q
Note that the statements pi are not always shorter than p, but if p contains
k nested while loops, pi contains k - 1. The definition of the function Σ is
thus a double induction on the number of nested while loops, and on the size
of the statement.
Exercise 1.5
What is the memory state Σ(x = 7;,[x = r],[r = 5])?
The definition of the function Σ for Caml is not very different from the
definition used for Java. In Caml, any expression that evaluates to a reference
can be placed to the left of the sign :=, while in Java, only a variable can appear
to the left of the sign =. The value of the function Σ of Caml for the statement
t := u is defined below:
– Σ(t := u,e,m) = m + (Θ(t,e,m) = Θ(u,e,m))
In the case where the expression t is a variable x, we have Σ(x := u,e,m)
= m + (Θ(x,e,m) = Θ(u,e,m)) = m + (e(x) = Θ(u,e,m)) and we end up
with the same definition of Σ used for Java.
The definition of the function Σ for C is not very different from the defini-
tion used for Java. The main difference is in case of variable declaration
Σ({T x = t; q},e,m) = (Σ(q,e+(x=r),m + (r = Θ(t,e,m))))|Ref−{r}
where r is a new reference that does not appear in e or m, and the notation
m|Ref−{r} designates the memory state m in which we have removed the ordered
pair r = v if it existed. Thus, if we execute the statement {int x = 4; p} q
in the state e, m, we execute the statement p in the state e + (x = r), m +
(r = 4) in C as in Java. In contrast, we execute the statement q in the state
e, m + (r = 4) in Java and in the state e, m in C.
As, in the environment e, there is no variable that allows the reference r
to be accessed, the ordered pair r = 4 no longer serves a purpose sitting in
memory. Thus, whether it is is left alone, as in Java or Caml, or deleted, as
in C, is immaterial. However, we will see, in Exercise 2.17, that this choice in
C is a source of difficulty when the language contains other constructs.
Exercise 1.6
The incomplete test allows the creation of a statement composed of a
boolean expression and a statement. This statement is written if (b)
p. The value of the function Σ for this statement is defined as follows. If
Θ(b,e,m) = true then
16 1. Imperative Core
Exercise 1.9
Give the definition of the Σ function for the declaration of a variable
without an initial value.
Exercise 1.10
Imagine an environment e — which cannot be created in Java — [x =
r, y = r], m, the memory state [r = 4], p, the statement x = x + 1;,
and m’, the memory Σ(p,e,m). What is the value associated with y in
the state e, m’? Answer the same question for the environment [x =
r1 , y = r2 ] and memory [r1 = 4, r2 = 4].
Draw these two states.
Exercise 1.11
Imagine that all memory states have a special reference: out. Define the
function Σ for the output construct System.out.print from the Section
1.2.
Exercise 1.12
In this exercise, imagine a data type that allows integers to be of any
size. To each statement p in the imperative core of Java, we associate the
1.3 The Semantics of the Imperative Core 17