Course Notes
Course Notes
Object-Oriented Programming 1
Single-object abstractions . . . . . . . . . . . . . . . . . . . . . . . . . 1
Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Multi-object abstractions (= entity-relationship abstractions) . . . . . 2
Advanced topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
i
ii CONTENTS
Inheritance 83
Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Static type checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
CONTENTS iii
Typecasts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Dynamic binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Class Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Interfaces 101
Iterators 149
iv CONTENTS
Generics 161
Basic concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Approach 1: Duplicate collection classes . . . . . . . . . . . . . . 161
Approach 2: Using subtype polymorphism . . . . . . . . . . . . . 164
Aproach 3: Generics . . . . . . . . . . . . . . . . . . . . . . . . . 166
Bounded type parameters . . . . . . . . . . . . . . . . . . . . . . . . . 168
Invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Wildcards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Upper-bounded wildcards . . . . . . . . . . . . . . . . . . . . . . 171
Lower-bounded wildcards . . . . . . . . . . . . . . . . . . . . . . 171
Generic methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Object-Oriented
Programming
Single-object abstractions
• Introduction: Topic of the course
• Principles of programming in Java
– Concepts: values, variables, types, methods, parameters, arguments,
return values, classes, fields, objects, object creation, object references,
method activations and the call stack
• First steps in modular programming Part 1 Part 2
– Example: squareroot
– Example: max3
– Example: timeofday
– Concepts: Using Eclipse, creating JUnit test cases, creating classes,
instance methods, encapsulation, private versus public, using Git,
constructors
• Managing complexity through modularity and abstraction
– Concepts: modularity, abstraction, API, client module, importance
of documentation, information hiding, procedural abstraction, data
abstraction, immutable versus mutable abstractions, abstract value/s-
tate of an object, Java’s built-in datatypes and operators
• Representation objects and representation exposure
– Concepts: representation object, representation exposure
• How to properly document single-object abstractions
– Concepts: defensive programming, contractual programming, precon-
ditions, postconditions, class representation invariants (= private class
invariants), class abstract state invariants (= public class invariants),
getters, mutators
Inheritance
• Polymorphism and dynamic binding
1
2 OBJECT-ORIENTED PROGRAMMING
Advanced topics
(Students of course H02C5A can ignore this material.)
• Iterators
– Concepts: (external) iterators, iterables, nested classes, inner classes,
local classes, anonymous classes, enhanced for loop, internal itera-
tors, lambda expressions, capturing outer variables, effectively final
variables
ADVANCED TOPICS 3
5
6 INTRODUCTION: TOPIC OF THE COURSE
comprehensive documentation for APIs. For the sake of clarity and precision,
we will supplement informal documentation with formal specifications.
Principles of programming
in Java
7
8 PRINCIPLES OF PROGRAMMING IN JAVA
The JLearner Programming
Language
In this document, we define the syntax of JLearner programs and what it means
to execute them. We assume that the reader recognizes these concepts from
their introductory programming course; this document serves to further clarify
these concepts and to offer a reasonably precise vocabulary for discussing and
reasoning about JLearner (and Java) programs and their execution.
Contents:
9
10 THE JLEARNER PROGRAMMING LANGUAGE
Classes
JLearner class declarations are of the form class ClassName { FieldDeclarations },
where FieldDeclarations is a sequence of zero or more field declarations. A JLearner
field declaration is of the form Type FieldName;.
The Heap
The JLearner heap at each point during the execution of a program contains the
objects created by the program so far. For each instance O of a class C created
by the program and for each field F of C, it stores the value of O.F. (We also
say that it binds O.F to some value.) For each array A created by the program,
it stores the element type and the length N of the array, and for each of the
components of the array (identified by an index I between 0, inclusive, and N,
exclusive), it stores the value of A[I]. (We also say that it binds A[I] to some
METHODS 11
value.) We often refer to the values stored by the components of an array as the
array’s elements.
For example, consider the following snapshot from an execution of an example
program in the JLearner environment:
In this snapshot (showing a particular program execution state), the heap contains
four objects: one array, with element type Person and length 3, and three instances
of class Person. The first component of the array stores a reference to a Person
instance whose age field currently stores the value 11. The second component
stores a reference to another Person instance, whose age field currently stores the
value 21.
The JLearner heap is always well-typed:
• for any instance O of a class C and for any field F of C, the value of O.F
is a value of the declared type of F
• for any array A with element type T, the value of any component A[I] of
A is a value of type T
Methods
The JLearner methods are of the form ReturnType MethodName(ParameterDeclarations) { Statements }
where ReturnType is either void or a type, and ParameterDeclarations is a comma-
separated sequence of zero or more parameter declarations of the form
Type ParameterName.
local variables that are currently in scope to a JLearner value. The variable
environment is used to evaluate variable expressions.
A method call suspends the current method activation and starts a new method
activation. It pushes a corresponding activation record onto the method activation
stack, also known as the call stack. The call stack contains an activation record
for each method activation that is in progress. (At any point during the execution
of a program, only one method activation, the one at the top of the activation
stack, is active; the other method activations are suspended.)
For each activation, the corresponding activation record stores the current
variable environment and the program counter (also known as the instruction
pointer), which indicates which part of the body of the method being executed
will be executed next. (JLearner shows the instruction pointer of the active
activation by highlighting the program element in green; it does not show the
instruction pointers of the suspended activations.)
When a method activation is finished, the corresponding activation record is
removed (or popped) from the activation stack, and execution of the caller is
resumed.
Consider again the same execution snapshot shown earlier (repeated here). The
method activation stack currently contains three activation records. The record
at the top of the stack (shown at the bottom in the picture; it is customary
in computer science that stacks are depicted upside-down) corresponds to an
activation of method incrementAge. This is the method that is currently active.
When this activation is finished, the suspended activation of method incrementAges
is resumed. The activation record for incrementAge binds method parameter person
to a reference to the Person object with identification number 12. The activation
record for incrementAges binds method parameter persons to a reference to the
array object, and local variable i to the integer value 2.
The method activation stack is always well-typed: if an activation record binds
a method parameter or local variable of declared type T to a value V, then value
VARIABLES AND VARIABLE DECLARATIONS 13
V is of type T.
JLearner has four kinds of variables: object fields, array components, method
parameters, and local variables. Variables are created at some point during a
program execution and, from that point on, store a value. They can be mutated
to store a different value by means of assignments.
Variable declarations, on the other hand, are parts of the program text and exist
independently of any execution of the program. JLearner has three kinds of
variable declarations: field declarations, method parameter declarations, and
local variable declarations.
• for each field declaration of a class C, there is one object field in the heap
per instance of class C
• for each method parameter declaration of a method M, there is one method
parameter on the method activation stack for each currently active or
suspended activation of method M
• similarly, for each local variable declaration in a method M, there is one
local variable on the method activation stack for each currently active or
suspended activation of method M whose instruction pointer is within the
scope of the local variable declaration
• for the single field declaration int value of class Node, there are three corre-
sponding object fields, one in each of the three instances of Node
• for the single method parameter declaration Node node of method sum, there
are three corresponding method parameters, one in each of the three
currently active or suspended activations of method sum
• for the single local variable declaration int value in method sum, there
are two corresponding local variables, one in each of the two currently
suspended activations of method sum. (There is no corresponding local
variable in the active activation of method sum because its instruction
pointer is not in the scope of the local variable declaration.)
• for the method parameter declaration int x of method foo, there are no
corresponding method parameters on the method activation stack, because
there is no current activation of method foo.
Expressions
An expression is a part of a program that can be evaluated in a given variable
environment to produce a result value. Evaluation of an expression may also
produce side effects. In JLearner, the side effects are the object creations and the
variable mutations. That is, in addition to producing a result value, evaluation
of an expression may create objects, bind different values to object fields and
array components in the heap, and bind different values to method parameters
and local variables in the active method activation record.
Some expressions can be evaluated to a variable: field selection expressions can
be evaluated to an object field, array component selection expressions can be
evaluated to an array component, and variable expressions can be evaluated to a
method parameter or local variable. These are the expressions that can be used
as the left-hand side of an assignment expression. To evaluate these expressions
EXPRESSIONS 15
to produce a value, they are first evaluated to a variable, and then the variable
is inspected to obtain its current value.
The JLearner expressions are:
• the literal expressions, e.g. true, 42, null
• the operator expressions, e.g. 3 + 5 or x = y * 3
• the variable expressions, e.g. x or myVariable
• the object creation expressions, e.g. new Rectangle() or new int[7]
• the field selection expressions, e.g. myRectangle.width
• the array component selection expressions, e.g. myArray[5]
• the method call expressions, e.g. myMethod(42, false)
• the parenthesized expressions, e.g. (7 - 1)
Literal expressions
The literal expressions are:
• the Boolean literal expressions, true and false
• the integer literal expressions, e.g. 10, 42, 739
• the null literal expression, null
Evaluation of a literal expression has no side-effects and produces as a result
value the value denoted by the expression.
Operator expressions
The operator expressions are:
• the unary operator expressions, e.g. -3, ++x, y--, -myMethod(77)
• the binary operator expressions, e.g. myVariable / myMethod(10),
myVariable = 33
a variable Var, then looks up the value V1 of Var, then evaluates E2 to value
V2, then computes the result V of applying op to V1 and V2, then mutates Var
to store V, and then completes with value V.
Other expressions
The variable expressions are the method parameter names and the local variable
names.
Evaluation of a class instance creation expression new C() creates a new instance
O of class C in the heap, then for each field F declared by C initializes object
field O.F with the default value of the declared type of F, and then completes
with a reference to O as the result value. The default value of type boolean is
false, the default value of type int is 0, and the default value of a reference type
is null.
The field selection expressions are the expressions of the form Expression.FieldName.
The array component selection expressions are the expressions of the form
Expression[Expression].
The method call expressions are the expressions of the form MethodName(Expressions),
where Expressions (known as the argument expressions of the method call
expression) is a comma-separated sequence of zero or more expressions.
Evaluation of a method call expression M(Expressions) first evaluates the argument
expressions Expressions, from left to right, to obtain argument values Args. It
must be the case that the number of arguments equals the number of declared
parameters of method M. Then, a new activation record is pushed onto the
method activation stack, and for each method parameter declaration P of
method M, a method parameter P is added to the new activation record’s
variable environment, initialized with the corresponding argument value. The
new activation record’s instruction pointer is initialized to point to the first
statement of the body of method M. Program execution then proceeds by
executing the body of M. When the new activation finishes with a result value
V, its activation record is removed from the activation stack and evaluation of
the method call expression completes with value V.
The parenthesized expressions are the expressions of the form (Expression).
Operators Associativity
Unary (-, ++, --)
* / % Left
+ - Left
== != < <= > >=
&&
||
= op= Right
Statements
The JLearner statements are:
STATEMENTS 19
• the local variable declaration statements, of the form Type VariableName = Expression;
or Type[] VariableName = ArrayInitializer;, where ArrayInitializer is of the
form { Expressions }, where Expressions is a comma-separated sequence of
zero or more expressions
• the expression statements, of the form Expression;
• the if statements, of the form if (Expression) Statement or if (Expression) Statement else Statement
• the while statements, of the form while (Expression) Statement
• the for statements, of the form for (Initialization; Condition; Update) Statement,
where Initialization is either nothing or of the form Type VariableName = Expression
or of the form VariableName = Expression, Condition is either nothing or an
expression, and Update is either nothing or an expression
• the block statements, of the form { Statements }, where Statements is a
sequence of zero or more statements
• the return statements, of the form return; or return Expression;
• the assert statements, of the form assert Expression;
Execution of a local variable declaration statement T X = E; first evaluates initial-
izer E to a value V (which must be of type T), and then adds a new local variable
named X to the active activation record’s variable environment, initialized with
V.
Execution of a local variable declaration statement T[] X = {Expressions}; first
evaluates Expressions, from left to right, to a sequence of values Vs, then creates a
new array A in the heap with element type T and length N equal to the number
of values, then initializes each component of A with the corresponding value
from Vs, and then adds a new local variable named X to the active activation
record’s variable environment, initialized with a reference to A.
Execution of an expression statement E; evaluates E.
Execution of an if statement if (E) S first evaluates E to a Boolean value V. If
V is true, it then executes S.
Execution of an if statement if (E) S1 else S2 first evaluates E to a Boolean
value V. If V is true, it then executes S1; otherwise, it then executes S2.
Execution of a while statement while (E) S first evaluates E to a Boolean value
V. If V is true, it then executes S and while (E) S, in that order.
evaluates E to a value V and then causes the active method activation to finish
with result value V.
Execution of an assert statement assert E; evaluates E to a value, which must
be the value true.
Further reading
The syntax and meaning of the full Java programming language is specified in
the Java Language Specification.
First Steps in Modular
Programming (Part I)
Contents
• Installing Eclipse and FSC4J
• Building and running our first program in Eclipse
• The problem
• Encapsulating the fields of class Interval
• Moving the methods inside class Interval
• Enforcing encapsulation: accessibility modifiers
• Instance methods
In the previous lecture, we used the JLearner tool to make acquaintance with
the Java programming language. We used variables, arrays, and class objects to
store and process data. From this perspective, Java is not much different from
Python. The main difference is that Java is statically typed, which means that
we need to specify the type of each local variable, method parameter, method
result, array element, and class field in our program. The types we have seen are
int, the type of integers, int[], the type of arrays of integers, int[][], the type
of arrays of arrays of integers, and the various class types corresponding to the
classes we declared, such as Node and List.
In this lecture, we will move on to modular programming. In modular program-
ming, we split our program into modules, with the goal of being able to build,
understand, verify, and evolve each module separately and in parallel with the
other modules of the program. To achieve this, it is necessary that we clearly
specify the API between a module and its clients (the modules that use the
module).
To illustrate this, let’s write a program that stores and manipulates intervals of
integers. An interval is defined by a lower bound and an upper bound, so we
declare a class Interval, with fields lowerBound and upperBound:
class Interval {
int lowerBound;
int upperBound;
21
22 FIRST STEPS IN MODULAR PROGRAMMING (PART I)
source code files end with .java.) Any .java file outside the src folder will be
ignored by Eclipse.
In Java, each class must be in its own file. Let’s create a source code file for the
Interval class. Right-click the interval project’s src node in the Project Explorer
and choose New -> Class. In the New Java Class dialog box, enter Interval as
the class name and choose Finish. You will see that a new node corresponding
to a new source file Interval.java appears in the Project Explorer, and an editor
appears for editing the new source files. Notice that the source file node is below
a new node interval. This node represents the package interval. Indeed, Eclipse
by default adds new classes in a project xyz to a package named xyz. In Java,
packages serve to group classes. We will learn more about packages later.
The initial contents of the Interval.java source file are as follows:
package interval;
For now, replace the given class declaration by the one shown above:
package interval;
class Interval {
int lowerBound;
int upperBound;
}
We now want to write the code that uses the Interval class. In JLearner,
statements and expressions can be written directly in the respective boxes. In
real Java, however, all statements and expressions must be inside methods, and
all methods must be inside classes. Therefore, we will create a class to hold
our program. In the Package Explorer, right-click on the interval package and
choose New -> JUnit Test Case. A JUnit test case is a class intended to just
contain some code that we want to run. Enter IntervalTest as the name for the
test case, and choose Finish. If Eclipse asks whether to add JUnit to the build
path, choose OK.
The initial contents of source file IntervalTest are as follows:
package interval;
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
24 FIRST STEPS IN MODULAR PROGRAMMING (PART I)
We can write our code inside the test method. Replace the existing code by the
code we wrote above:
package interval;
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
Interval interval = new Interval();
interval.lowerBound = 3;
interval.upperBound = 7;
We are now ready to run our code. Right-click on the IntervalTest.java node
in the Package Explorer, and choose Run As -> JUnit Test. The JUnit view
should appear and show a green bar, indicating that our program was executed
successfully, without detecting any errors.
To see what happens if an error is detected, change width == 4 to width == 5
and run the program again. A shortcut for running the same program again
is to simply click the green Play button in the Eclipse toolbar. Notice that
you now get a red bar in the JUnit view. The Failure Trace part of the JUnit
view tells you what went wrong. In this case an AssertionError occurred at
IntervalTest.java:16, meaning line 16 of file IntervalTest.java. Double-click this
message in the Failure Trace to highlight this line in the editor. If we change
width == 5 back to width == 4 and run the program again, we again get a green
bar.
The problem
We have now written a program consisting of two source files: Interval.java
and IntervalTest.java. However, our program is not modular: we cannot change
the structure of class Interval without breaking the IntervalTest program. For
example, suppose we decide that it is better to store intervals by storing their
ENCAPSULATING THE FIELDS OF CLASS INTERVAL 25
lower bound and their width, rather than their lower bound and their upper
bound. Update file Interval.java as follows:
package interval;
class Interval {
int lowerBound;
int width;
}
Even though, conceptually, the class stores the same information as before, just
in a different form, we have now broken our client code (i.e. the code that uses
our class). Indeed, Eclipse now shows a red wavy line below the references to
field upperBound in IntervalTest.java; these references are now broken since this
field no longer exists. These red wavy lines are known as compilation errors.
They are errors that the programming environment detects even before we run
the program. Errors detected only while running a program are called run-time
errors. We here see a major advantage of statically typed programming languages:
many errors can be detected even before we run the program.
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
Interval interval = new Interval();
interval.lowerBound = 3;
interval.upperBound = 7;
Then, in the program, use these methods (known as getters and setters) instead
of accessing the fields of the Interval object directly:
package interval;
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
ENCAPSULATING THE FIELDS OF CLASS INTERVAL 27
Notice that Eclipse does not show any compilation errors anymore. Furthermore,
if we run the program, we get a green bar. More importantly, though, if we now
decide to change the structure of class Interval again, we only need to update
the relevant getters and setters, and we do not need to touch our client program
in method test at all. Indeed, replace field int width by field int upperBound in
Interval.java. Eclipse will show compilation errors in the places where field width
is referenced in file IntervalTest.java, but notice that these references are all
inside the getters and setters. Updating those eliminates the errors:
package interval;
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
Interval interval = new Interval();
setLowerBound(interval, 3);
28 FIRST STEPS IN MODULAR PROGRAMMING (PART I)
setUpperBound(interval, 7);
class Interval {
int lowerBound;
int upperBound;
(You will understand later why we did not need to write the static keyword
when we defined the getters and setters in class IntervalTest originally.)
ENFORCING ENCAPSULATION: ACCESSIBILITY MODIFIERS 29
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
Interval interval = new Interval();
Interval.setLowerBound(interval, 3);
Interval.setUpperBound(interval, 7);
class Interval {
private int lowerBound;
private int upperBound;
By making the fields of class Interval private, we now get the guarantee that any
client code of class Interval that has no compilation errors will not break if we
change the way we store the interval properties. If our module is used by many
clients around the world, this is an extremely valuable guarantee.
Instance methods
Notice that all of the methods of class Interval take a reference to an Interval
object as their first argument. This is of course a very common phenomenon.
For that reason, Java supports a more concise notation for this case, known as
instance methods. An instance method is a method declared without the static
keyword. (A method that does have the static keyword is known as a static
method.) An instance method declared in a class C has an implicit parameter of
type C, called the receiver. To refer to the receiver in the body of an instance
method, use the keyword this. When calling an instance method, the receiver
object is written before the method name, with a dot in between the two:
package interval;
class Interval {
private int lowerBound;
private int upperBound;
int getLowerBound() {
return this.lowerBound;
}
int getUpperBound() {
return this.upperBound;
}
INSTANCE METHODS 31
int getWidth() {
return this.upperBound - this.lowerBound;
}
package interval;
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
Interval interval = new Interval();
interval.setLowerBound(3);
interval.setUpperBound(7);
package interval;
class Interval {
private int lowerBound;
private int upperBound;
int getLowerBound() {
return lowerBound;
}
int getUpperBound() {
return upperBound;
}
int getWidth() {
return upperBound - lowerBound;
}
So, again, this program has exactly the same meaning as the previous ones, and
will behave in exactly the same way; it’s just shorter. Notice that Eclipse tells
you whether it interprets a name as a reference to a local variable or a reference
to a field: references to fields are shown in blue, references to local variables in
black.
First Steps in Modular
Programming (Part II)
Constructors
At this point, to use an Interval object to store a given interval, we must first
create the object using a new Interval() expression, and then set its properties
through the setLowerBound, setUpperBound, and/or setWidth setters. It is often
convenient, and sometimes necessary, to allow client code to create an object
and initialize its properties to desired values in one step. For this purpose, Java
allows us to declare constructors. For example, we can allow clients to create
an Interval object storing the interval with lower bound 3 and upper bound 7
using expression new Interval(3, 7) by inserting a constructor declaration into
class Interval as follows:
package interval;
class Interval {
private int lowerBound;
private int upperBound;
int getLowerBound() {
return lowerBound;
}
int getUpperBound() {
return upperBound;
}
int getWidth() {
return upperBound - lowerBound;
}
33
34 FIRST STEPS IN MODULAR PROGRAMMING (PART II)
lowerBound = value;
}
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
Interval interval = new Interval(3, 7);
Overloading
But what if we want to initialize an Interval object with a given lower bound
and a given width? That is, we would like to additionally declare the following
constructor:
Interval(int lowerBound, int width) {
this.lowerBound = lowerBound;
API SEMANTICS: DOCUMENTATION 35
Java does not allow this, because there would be no way to decide which
constructor to use to execute an instance creation expression new Interval(3, 7).
Java does allow a class to declare multiple constructors, but each constructor
must have either a different number of parameters or different parameter types.
So, one way to add our additional constructor is as follows:
Interval(int lowerBound, int width, Void dummy) {
this.lowerBound = lowerBound;
this.upperBound = lowerBound + width;
}
(Void is a built-in class with no instances; the only value of type Void is null.)
This works: now, we can replace expression new Interval(3, 7) by the equivalent
expression new Interval(3, 4, null).
Declaring multiple members with the same name is known as overloading. Java
supports overloading of constructors, as well as methods. For example, we can
extend our class with a method setWidth with two parameters that updates the
lower bound or the upper bound, depending on the second argument:
void setWidth(int value, boolean updateLowerBound) {
if (updateLowerBound)
lowerBound = upperBound - value;
else
upperBound = lowerBound + value;
}
package interval;
import org.junit.jupiter.api.Test;
class IntervalTest {
@Test
void test() {
Interval interval = new Interval(3, 4, null);
interval.setUpperBound(8);
assert interval.getLowerBound() == 3;
}
}
36 FIRST STEPS IN MODULAR PROGRAMMING (PART II)
It checks that updating an interval’s upper bound leaves its lower bound un-
changed. If we were to update method setUpperBound in class Interval so that it
updates the lower bound and leaves the width unchanged, this change would
break this client: the assert statement would fail. Who would be to blame?
To eliminate this error, who would have to change their code? The author of
Interval or the author of IntervalTest?
In this case, it is reasonable to say that the client is to blame: they assumed that
method setUpperBound would leave the lower bound unchanged, but the author of
class Interval has made no such promises. Unless module authors make specific
guarantees about how a method will behave, it is always safest to assume the
worst and not rely on anything that is not explicitly guaranteed.
So, if we, as a module author, want to allow clients to assume certain facts
about our methods’ behavior, we have to state those facts clearly in the module’s
documentation. Writing clear and precise documentation for modules will be a
central focus of this course. For maximum clarity and precision, we will write
statements in documentation both informally and formally. A fully documented
version of class Interval looks like this:
package interval;
/**
* An object of this class stores an interval of integers.
*
* @invar This interval's lower bound is not greater than its upper bound.
* | getLowerBound() <= getUpperBound()
* @invar This interval's width equals the difference between its upper bound
* and its lower bound.
* | getWidth() == getUpperBound() - getLowerBound()
*/
class Interval {
/**
* @invar This interval's lower bound is not greater than its upper bound.
* | lowerBound <= upperBound
*/
private int lowerBound;
private int upperBound;
int getLowerBound() {
return lowerBound;
}
int getUpperBound() {
return upperBound;
}
int getWidth() {
return upperBound - lowerBound;
}
/**
* Initializes this interval with the given lower bound and upper bound.
API SEMANTICS: DOCUMENTATION 37
*
* @pre The given lower bound is not greater than the given upper bound.
* | lowerBound <= upperBound
* @post This interval's lower bound equals the given lower bound.
* | getLowerBound() == lowerBound
* @post This interval's upper bound equals the given upper bound.
* | getUpperBound() == upperBound
*/
Interval(int lowerBound, int upperBound) {
this.lowerBound = lowerBound;
this.upperBound = upperBound;
}
/**
* Initializes this interval with the given lower bound and width.
*
* @pre The given width is nonnegative.
* | 0 <= width
* @post This interval's lower bound equals the given lower bound.
* | getLowerBound() == lowerBound
* @post This interval's width equals the given width.
* | getWidth() == width
*/
Interval(int lowerBound, int width, Void dummy) {
this.lowerBound = lowerBound;
this.upperBound = lowerBound + width;
}
/**
* Sets this interval's lower bound to the given value.
*
* @pre The given value is not greater than this interval's upper bound.
* | value <= getUpperBound()
* @post This interval's lower bound equals the given value.
* | getLowerBound() == value
* @post This interval's upper bound has remained unchanged.
* | getUpperBound() == old(getUpperBound())
*/
void setLowerBound(int value) {
lowerBound = value;
}
/**
* Sets this interval's upper bound to the given value.
*
* @pre The given value is not less than this interval's lower bound.
* | getLowerBound() <= value
* @post This interval's upper bound equals the given value.
* | getUpperBound() == value
* @post This interval's lower bound has remained unchanged.
* | getLowerBound() == old(getLowerBound())
*/
void setUpperBound(int value) {
upperBound = value;
}
/**
38 FIRST STEPS IN MODULAR PROGRAMMING (PART II)
/**
* Sets this interval's width to the given value.
*
* @pre The given value is nonnegative.
* | 0 <= value
* @post This interval's width equals the given value.
* | getWidth() == value
* @post If the caller specified that the lower bound should be updated, the
* upper bound has remained unchanged.
* | !updateLowerBound || getUpperBound() == old(getUpperBound())
* @post If the caller specified that the lower bound should not be updated,
* the lower bound has remained unchanged.
* | updateLowerBound || getLowerBound() == old(getLowerBound())
*/
void setWidth(int value, boolean updateLowerBound) {
if (updateLowerBound)
lowerBound = upperBound - value;
else
upperBound = lowerBound + value;
}
documentation for the module author’s own use to facilitate their reasoning
about the correctness of their module.
• Public invariants, indicated by tag @invar in the documentation comment
for the class declaration itself, state conditions on the values returned by the
getters of an object that must be true whenever no constructor or method
of the object is being executed. It is a module author’s responsibility to
ensure this.
40 FIRST STEPS IN MODULAR PROGRAMMING (PART II)
Managing Complexity
through Modularity and
Abstraction
The question addressed by this course is: how can we manage the complexity of
the task of building large software systems?
The approach we teach is an instance of a general approach for any complex
task: divide and conquer. This means that we try to split the problem up into
subproblems, such that 1) each subproblem is easier to solve than the overall
problem, and 2) solving all of the subproblems yields a solution for the overall
problem.
In the case of software development, this means trying to decompose the system
development task into a set of simpler subsystem development tasks, such that
the resulting subsystems can be composed to yield a system that solves the
overall task.
One of the most effective ways to come up with such a decomposition is to try
to come up with effective abstractions. In this course, we teach two main types
of abstractions: procedural abstractions and data abstractions.
Procedural abstractions
In the case of procedural abstractions, we try to identify operations such that it
would make the system development task easier if those operations were built
into the programming language.
Only a very limited number of operations are built into the Java programming
language: +, -, *, /, %, <, >, <=, >=, ==, !=, &&, ||, and the bitwise operators &, |, ^,
<<, >>, and >>>. Anything else you want to do in a Java program, you need to
implement in terms of these basic operations.
For example, if an application needs to perform square root computations, it will
41
42 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION
module developers’ life would not have been made much easier. Again
they would be confronted with the complexity of the implementation. A
module’s documentation must hide the implementation complexity as much
as possible and define the meaning of the API as simply as possible.
/**
* Returns the square root of the given nonnegative integer, rounded down.
*
* @pre The given integer is nonnegative.
* | 0 <= x
* @post The result is nonnegative.
* | 0 <= result
* @post The square of the result is not greater than the given integer.
* | (long)result * result <= x
* @post The square of one more than the result is greater than the given integer.
* | x < ((long)result + 1) * ((long)result + 1)
*/
public static int squareRoot(int x) {
int result = 0;
while ((long)result * result <= x)
result++;
return result - 1;
}
(Note that we need to cast the int result to long before computing its square or
adding one, to avoid arithmetic overflow.)
Notice that the documentation does not reveal the algorithm used to compute
the square root. (The algorithm shown is a very slow and naive one; better-
performing ones undoubtedly exist. Faster algorithms would undoubtedly be
more complex and further increase the complexity reduction achieved by the
layer of abstraction.)
Proper documentation for a module, then, simplifies both the task of developing
the module itself, and the task of developing client modules that use the module:
• The developers that implement the module need to look only at the
module’s API documentation to learn what the module is supposed to do.
• The developers that implement the client modules (i.e. the modules that
use the module) need to look only at the module’s API documentation to
learn what the module does.
44 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION
Data abstractions
Procedural abstractions allow clients to work with a more powerful programming
language, that has more operations built in.
Similarly, application development would often benefit from having more
datatypes built in besides the ones that are built into the Java programming
language itself.
The datatypes built into Java itself are the following:
Datatype Values
byte The integers between -128 and 127
short The integers between -32768 and 32767
int The integers between -231 and 231 - 1
long The integers between -263 and 263 - 1
char The integers between 0 and 65535
boolean true and false
float The single-precision floating-point numbers
double The double-precision floating-point numbers
DATA ABSTRACTIONS 45
• Values of type char are used to represent text characters. The notation 'X'
denotes the char value assigned to character X by the Unicode character
encoding standard. For example, 'A' denotes the char value 65.
• The arithmetic operations (+, -, *, /, %) do not operate directly on types
byte, short, or char; instead, operands of type byte, short, or char are first
promoted to type int. For example, the type of expression 'A'- 'Z' is int
and its value is -25.
• An expression that applies an arithmetic operator to two operands of
type int is of type int itself, even though the mathematical result of the
operation might not be within the limits of type int. In that case (known
as arithmetic overflow), the value 232 is added to or subtracted from the
result as many times as necessary to bring it within the limits of the type.
For example: the expression 2000000000 + 2000000000 is of type int and its
value is -294967296 (= 4000000000 - 232 ).
• If either operand of an arithmetic expression is of type long and the other
operand is of any integer type, the expression itself is of type long and 264
is added to or subtracted from the result as many times as necessary to
bring it within the limits of type long.
• Type float has three kinds of values:
1. The real values, of the form M × 2E where mantissa M is an integer
between -224 + 1 and 224 - 1, and exponent E is an integer between
-149 and 104. Note, however, that there are two distinct "zero" values:
positive zero (0f) and negative zero (-0f).
2. The infinities, Float.POSITIVE_INFINITY and Float.NEGATIVE_INFINITY.
3. The Not-a-Number (NaN) values; these are used to represent the
result of operations that have no well-defined result in mathematics,
such as 0f/0f and Float.POSITIVE_INFINITY/Float.POSITIVE_INFINITY. You
can tell whether a float value is a Not-a-Number value using method
Float.isNaN.
• Analogously, type double has three kinds of values:
1. The real values, of the form M × 2E where mantissa M is an integer
between -253 + 1 and 253 - 1, and exponent E is an integer between
-1074 and 971. Again, there are two distinct zero values: positive zero
(0.0) and negative zero (-0.0).
2. The infinities, Double.POSITIVE_INFINITY and Double.NEGATIVE_INFINITY.
3. The NaN values. You can tell whether a double value is a NaN value
using method Double.isNaN.
• This means that float values can be used to represent positive numbers
between 10-37 and 1038 with 6 decimal digits of precision (as well as their
negation), and double values can be used to represent positive numbers
between 10-307 and 10308 with 15 decimal digits of precision (as well as
their negation).
• Floating-point numbers can be written using scientific notation:
1.00001e-37f denotes the float value closest to 1.00001 × 10 , and
-37
46 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION
/**
* Each instance of this class represents a rational number.
*
* @immutable
*
* @invar The denominator is positive.
* | 0 < getDenominator()
* @invar The numerator is greater than the minimum {@code long} value.
* | Long.MIN_VALUE < getNumerator()
* @invar The fraction is irreducible: the greatest common divisor of
* the absolute value of the numerator and the denominator is one.
* | MoreMath.gcd(Math.abs(getNumerator()), getDenominator()) == 1
*/
public class Fraction {
/**
* @invar | 0 < denominator
48 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION
/**
* An object that represents the number zero.
*/
public static final Fraction ZERO = new Fraction(0, 1);
/**
* Returns an object representing the rational number defined by the given
* numerator and denominator.
*
* @throws IllegalArgumentException if the given denominator is zero.
* | denominator == 0
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post The result is not null.
* | result != null
* @post The rational number represented by the result equals the rational
* number defined by the given numerator and denominator.
* | BigInteger.valueOf(result.getNumerator())
* | .multiply(BigInteger.valueOf(denominator)).equals(
* | BigInteger.valueOf(numerator).multiply(
* | BigInteger.valueOf(result.getDenominator())))
*/
public static Fraction of(long numerator, long denominator) {
if (denominator == 0)
throw new IllegalArgumentException("denominator is zero");
if (numerator == 0)
return ZERO;
long gcd = MoreMath.gcd(
MoreMath.absExact(numerator),
MoreMath.absExact(denominator));
if (denominator < 0)
gcd = -gcd;
return new Fraction(numerator / gcd, denominator / gcd);
}
/**
* Returns whether this object and the given object represent the same
* rational number.
*
* @throws IllegalArgumentException if {@code other} is null.
* | other == null
* @post
* | result == (
DATA ABSTRACTIONS 49
/**
* Returns an object representing the rational number obtained by adding
* the rational number represented by this object to the rational number
* represented by the given object.
*
* @throws IllegalArgumentException if {@code other} is null.
* | other == null
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post The result is not null.
* | result != null
* @post a/b == c/d + e/f if and only if adf == cbf + ebd.
* | BigInteger.valueOf(result.getNumerator()).
* | multiply(BigInteger.valueOf(this.getDenominator())).
* | multiply(BigInteger.valueOf(other.getDenominator())).
* | equals(
* | BigInteger.valueOf(this.getNumerator()).
* | multiply(BigInteger.valueOf(result.getDenominator())).
* | multiply(BigInteger.valueOf(other.getDenominator())).
* | add(BigInteger.valueOf(other.getNumerator()).
* | multiply(BigInteger.valueOf(result.getDenominator())).
* | multiply(BigInteger.valueOf(this.getDenominator()))))
*/
public Fraction plus(Fraction other) {
if (other == null)
throw new IllegalArgumentException("other is null");
long gcd = MoreMath.gcd(this.denominator, other.denominator);
long numerator = Math.addExact(
Math.multiplyExact(this.numerator, other.denominator / gcd),
Math.multiplyExact(other.numerator, this.denominator / gcd));
long denominator =
Math.multiplyExact(this.denominator, other.denominator / gcd);
return Fraction.of(numerator, denominator);
}
/**
* Returns the absolute value of the given number.
*
50 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION
/**
* Returns whether the first given number divides the second given number.
*
* @pre The first given number is not zero.
* | a != 0
* @post | result == (b % a == 0)
*/
public static boolean divides(long a, long b) {
return b % a == 0;
}
/**
* Returns the greatest common divisor of the two given integers.
*
* @pre The given numbers are nonnegative.
* | 0 <= a && 0 <= b
* @pre At least one given number is nonzero.
* | a != 0 || b != 0
* @post The result is positive.
* | 0 < result
* @post The result divides both given numbers.
* | divides(result, a) && divides(result, b)
* @post No greater number divides both given numbers.
* | LongStream.range(result, Math.max(a, b)).allMatch(x ->
* | !(divides(x + 1, a) && divides(x + 1, b)))
*/
public static long gcd(long a, long b) {
if (a == 0) return b;
if (b == 0) return a;
if (a < b) return gcd(b % a, a);
return gcd(a % b, b);
}
Notice that class Fraction does not expose any means for clients to mutate the
association between a Fraction instance and the rational number it represents.
Classes like this, where clients cannot mutate an instance’s abstract value, are
called immutable classes. We say they implement an immutable value abstraction.
In order to allow clients to understand that a class is immutable without having
to check all methods, we include the @immutable tag in the Javadoc comment for
the class.
DATA ABSTRACTIONS 51
Notice, furthermore, that class Fraction does not expose a constructor. Instead,
it exposes a static method of to allow clients to obtain a Fraction instance that
represents a given abstract value. Exposing such a method, typically called of or
valueOf, is common practice for immutable classes; it allows the implementation
to avoid the creation of a new instance in some cases. For example, class Fraction
reuses an existing object whenever the client requests an object that represents
abstract value zero. Of course, such reuse is safe only if the class is immutable.
Notice also that the documentation for Fraction uses Java’s BigInteger class to
avoid arithmetic overflow inside the documentation.
Client modules can use the Fraction class to perform financial computations in a
safe and simple way. For example: the assert statement in the following code
snippet succeeds:
Fraction tenCents = Fraction.of(10, 100);
Fraction total = Fraction.of(0, 100);
for (int i = 0; i < 10; i++)
total = total.plus(tenCents);
assert total.equals(Fraction.of(100, 100));
/**
* Each instance of this class stores, at each point in time, a rational number.
*
* @invar The denominator is positive.
* | 0 < getDenominator()
* @invar The numerator is greater than the minimum {@code long} value.
* | Long.MIN_VALUE < getNumerator()
* @invar The fraction is irreducible: the greatest common divisor of
* the absolute value of the numerator and the denominator is one.
* | MoreMath.gcd(Math.abs(getNumerator()), getDenominator()) == 1
*/
public class FractionContainer {
/**
* @invar | 0 < denominator
* @invar | Long.MIN_VALUE < numerator
* @invar | MoreMath.gcd(Math.abs(numerator), denominator) == 1
*/
private long numerator;
private long denominator;
/**
* Returns whether the rational number stored by this object
52 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION
/**
* Initializes this object so that it stores the number zero.
* @post | getNumerator() == 0
*/
public FractionContainer() {
denominator = 1;
}
/**
* Mutates this object so that it stores the rational number defined
* by the given numerator and denominator.
*
* @throws IllegalArgumentException if the given denominator is zero.
* | denominator == 0
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post The rational number stored by this object equals the rational
* number defined by the given numerator and denominator.
* | BigInteger.valueOf(getNumerator())
* | .multiply(BigInteger.valueOf(denominator)).equals(
* | BigInteger.valueOf(numerator)
* | .multiply(BigInteger.valueOf(getDenominator())))
*/
public void set(long numerator, long denominator) {
if (denominator == 0)
throw new IllegalArgumentException("denominator is zero");
long gcd = MoreMath.gcd(
MoreMath.absExact(numerator),
MoreMath.absExact(denominator));
if (denominator < 0)
gcd = -gcd;
this.numerator = numerator / gcd;
this.denominator = denominator / gcd;
DATA ABSTRACTIONS 53
/**
* Mutates this object so that it stores the rational number obtained by adding
* the old rational number stored by this object to the rational number
* defined by the given numerator and denominator.
*
* @throws IllegalArgumentException if the given denominator is zero.
* | denominator == 0
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post a/b == c/d + e/f if and only if adf == cbf + ebd.
* | BigInteger.valueOf(getNumerator()).
* | multiply(BigInteger.valueOf(old(getDenominator()))).
* | multiply(BigInteger.valueOf(denominator)).
* | equals(BigInteger.valueOf(old(getNumerator())).
* | multiply(BigInteger.valueOf(getDenominator())).
* | multiply(BigInteger.valueOf(denominator)).add(
* | BigInteger.valueOf(numerator).
* | multiply(BigInteger.valueOf(getDenominator())).
* | multiply(BigInteger.valueOf(old(getDenominator())))))
*/
public void add(long numerator, long denominator) {
if (denominator == 0)
throw new IllegalArgumentException("denominator is zero");
long gcd = MoreMath.gcd(this.denominator, MoreMath.absExact(denominator));
if (denominator < 0)
gcd = -gcd;
long newNumerator = Math.addExact(
Math.multiplyExact(this.numerator, denominator / gcd),
Math.multiplyExact(numerator, this.denominator / gcd));
long newDenominator =
Math.multiplyExact(this.denominator, denominator / gcd);
set(newNumerator, newDenominator);
}
Notice that the association between the FractionContainer instance and the rational
number it stores changes 11 times during this computation: initially, it stores
the value 0, then, after the first add call, the value 1/10, then the value 2/10,
etc. Therefore, it is not correct to say that an instance of FractionContainer is a
rational number; rather, we can only say that it stores, at any given point in
time, a particular rational number.
The example shows both the main advantage and the main disadvantage of
mutable classes compared to immutable classes: when working with mutable
54 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION
classes, performance is often better because the program creates fewer objects.
On the other hand, when working with immutable classes, reasoning about the
program is generally easier because there is no need to distinguish between an
object and its abstract value; indeed, it is fine to think of object Fraction.ZERO as
being the number zero.
To highlight the fact that it is not correct to think of a FractionContainer instance
as being a fraction, we have named the class FractionContainer rather than Fraction;
however, in practice mutable classes are very often called after the type of abstract
values they store. For example, in the Java Collections API, class ArrayList is a
mutable class for storing lists of objects; it is not correct to think of an ArrayList
object as being a list of objects.
Representation Objects and
Representation Exposure
In a data abstraction, each instance of the class that implements the abstraction is
associated conceptually with a particular abstract state. In the case of a mutable
abstraction, this association can change over time; in the case of an immutable
abstraction, it is fixed upon creation of the object. Clients can use getter
methods to inspect an object’s abstract state. In order for the class to be able to
implement these getter methods, it must store some concrete representation of
the object’s abstract state in the computer’s memory. For example, in order for
class Interval to be able to implement methods getLowerBound(), getUpperBound(),
and getWidth(), it must store sufficient information to be able to derive an Interval
object’s lower bound, upper bound, and width in the computer’s memory. One
example implementation of class Interval that we have seen stores an Interval
object’s lower bound and width in fields of the object. We call these fields, and
the way they relate to the object’s abstract state, the object’s representation.
Usually, there are many different possible ways to design the represention for a
given abstraction: for example, another example implementation of class Interval
that we have seen stores an Interval object’s lower bound and upper bound,
rather than its lower bound and width.
In this chapter, we first use the example of a String class to introduce the concept
of representation objects, and to show how representation exposure can break
immutability. Then, we use a FractionList example to show how representation
exposure can break consistency of an abstraction’s representation. Finally, we
use a ByteBuffer example to show how representation exposure can break modular
reasoning.
55
56 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE
Strings
For example, consider (an extract from) the API of class String from the Java
Platform API:
/**
* Each instance of this class represents a sequence of text characters.
*
* @immutable
*/
public class String {
/**
* Returns the length of the sequence of characters represented by this object.
*
* @post | 0 <= result
*/
public int length()
/**
* Returns the character at the given index in the sequence of characters
* represented by this object. The first character is at index 0.
*
* @throws IndexOutOfBoundsException if the given index is less than zero or
* not less than the length of this object.
* | index < 0 || length() <= index
*/
public char charAt(int index)
/**
* Returns a `String` object of length 1 containing the single given character.
*
* @post | result != null
* @post | result.length() == 1
* @post | result.charAt(0) == c
*/
public static String valueOf(char c)
/**
* Returns a `String` object that represents the sequence of characters
* obtained by concatenating the sequence of characters represented by this
* object and the given object, respectively.
*
* @throws NullPointerException | other == null
* @post | result != null
* @post | result.length() == length() + other.length()
* @post | IntStream.range(0, length()).allMatch(i ->
* | result.charAt(i) == charAt(i))
* @post | IntStream.range(0, other.length()).allMatch(i ->
* | result.charAt(length() + i) == other.charAt(i))
*/
public String concat(String other)
yields a String object that represents the sequence of characters Hi!. (An
equivalent String object can be written more concisely using the expres-
sion "Hi!". Also, concatenation of String objects can be written more
concisely using the + operator; another equivalent expression is therefore
String.valueOf('H') + String.valueOf('i') + String.valueOf('!').)
It is impossible to represent the abstract value of a String object using just the
fields of the object: a class with N fields can store at most N values, and a
String object must be able to store arbitrarily many characters. Therefore, any
implementation of class String must necessarily use auxiliary objects to help
represent its state.
The following implementation of class String uses an auxiliary array object to
store the characters of the string:
/**
* Each instance of this class represents a sequence of text characters.
*
* @immutable
*/
public class String {
/**
* @invar | characters != null
*
* @representationObject
*/
private final char[] characters;
/**
* Returns the length of the sequence of characters represented by this object.
*
* @post | 0 <= result
*/
public int length() {
return characters.length;
}
/**
* Returns the character at the given index in the sequence of characters
* represented by this object. The first character is at index 0.
*
* @throws IndexOutOfBoundsException if the given index is less than zero
* or not less than the length of this object.
* | index < 0 || length() <= index
*/
public char charAt(int index) {
if (index < 0 || length() <= index)
throw new IndexOutOfBoundsException();
return characters[index];
}
58 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE
/**
* Returns a `String` object of length 1 containing the single given character.
*
* @post | result != null
* @post | result.length() == 1
* @post | result.charAt(0) == c
*/
public static String valueOf(char c) {
return new String(new char[] {c});
}
/**
* Returns a `String` object that represents the sequence of characters
* obtained by concatenating the sequence of characters represented by this
* object and the given object, respectively.
*
* @throws NullPointerException | other == null
* @post | result != null
* @post | result.length() == length() + other.length()
* @post | IntStream.range(0, length()).allMatch(i ->
* | result.charAt(i) == charAt(i))
* @post | IntStream.range(0, other.length()).allMatch(i ->
* | result.charAt(length() + i) == other.charAt(i))
*/
public String concat(String other) {
char[] cs = new char[characters.length + other.characters.length];
System.arraycopy(characters, 0, cs, 0, characters.length);
System.arraycopy(other.characters, 0, cs, characters.length,
other.characters.length);
return new String(cs);
}
After a String object S has been initialized, field S.characters points to an array
object, let’s call it A, that stores the abstract value of S. We call A a representation
object of S.
Notice that the existence of A is completely invisible to clients of S: the API of
class String provides no means for clients to obtain a reference to an instance’s
representation object. We say the representation object is encapsulated.
Representation Exposure
Let’s see what happens if we break encapsulation. Here is a first attempt to add
a method toCharArray to class String:
/**
* Returns an array containing the sequence of characters represented by this
* object.
*
* @post | result != null
* @post | result.length == length()
* @post | IntStream.range(0, length()).allMatch(i ->
REPRESENTATION EXPOSURE 59
* | result[i] == charAt(i))
*/
public char[] toCharArray() {
return characters; // WRONG! Representation exposure
}
Although this method satisfies its postconditions, it is still wrong. This is because
it leaks the receiver object’s representation object; it exposes the representation
object to clients. This is wrong because it allows clients to perform inappropriate
mutations of the abstract value of the String object by mutating the representation
object. That is, it allows clients to break the immutability of the String object:
String a = String.valueOf('a');
assert a.charAt(0) == 'a'; // Succeeds
a.toCharArray()[0] = 'b';
assert a.charAt(0) == 'a'; // Fails
This method passes the client-supplied array characters to the String constructor,
which installs its argument as the representation object for the new String object.
As a result, the client can mutate the new String object’s abstract state by
mutating the array object:
char[] cs = {'a'};
String a = String.valueOf(cs);
assert a.charAt(0) == 'a'; // Succeeds
cs[0] = 'b';
assert a.charAt(0) == 'a'; // Fails
Here, too, arrays are used for two different purposes: to represent a String
object’s abstract state, and to contain a sequence of characters to be passed as
an argument to a method call. It is crucial to always use separate objects for
these two different purposes.
/**
* @invar | elements != null
* @invar | Arrays.stream(elements).allMatch(element -> element != null)
*
* @representationObject
*/
private Fraction[] elements;
/**
* Returns the number of elements in the list of fractions stored by this
* object.
*
* @post | 0 <= result
*/
public int getSize() {
return elements.length;
}
/**
* Returns the element at the given index in the list of fractions stored by
* this object.
*
* @throws IndexOutOfBoundsException | 0 < index || index <= getSize()
*/
public Fraction getElementAt(int index) {
if (index < 0 || getSize() <= index)
throw new IndexOutOfBoundsException();
return elements[index];
}
/**
* Returns the sum of the elements of the list of fractions stored by this
62 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE
* object.
*
* @post | Objects.equals(result,
* | IntStream.range(0, getSize())
* | .mapToObj(i -> getElementAt(i))
* | .reduce(Fraction.ZERO, (x, y) -> x.plus(y)))
*/
public Fraction getSum() {
return Arrays.stream(elements).reduce(Fraction.ZERO, (x, y) -> x.plus(y));
}
/**
* Initializes this object to store an empty list of fractions.
*
* @post | getSize() == 0
*/
public FractionList() {
elements = new Fraction[0];
}
/**
* Adds the given element to the end of the list of fractions stored by this
* object.
*
* @throws NullPointerException | element == null
* @mutates | this
* @post | getSize() == old(getSize()) + 1
* @post | Arrays.equals(
* | IntStream.range(0, old(getSize()))
* | .mapToObj(i -> getElementAt(i)).toArray(),
* | old(IntStream.range(0, getSize())
* | .mapToObj(i -> getElementAt(i)).toArray()))
* @post | Objects.equals(getElementAt(old(getSize())), element)
*/
public void add(Fraction element) {
if (element == null)
throw new IllegalArgumentException("element is null");
Suppose, now, that we want to add a method for retrieving an array containing
a FractionList object’s list of fractions. Here is a first attempt at adding such a
method:
public Fraction[] getElements() {
return elements; // WRONG! Leaks representation object.
}
This method leaks the receiver object’s representation object to the client. Even
though class FractionList is a mutable class, this is still wrong, because this allows
MODULAR REASONING: @MUTATES 63
Method getSum relies on the receiver’s representation invariants for its safe execu-
tion; indeed, running this method after breaking the representation invariants
causes the method to crash.
As before, we can fix this leak by copying the array:
/**
* Returns a new array containing the list of fractions stored by this object.
*
* @creates | result
* @post | Arrays.equals(result, IntStream.range(0, getSize())
* | .mapToObj(i -> getElementAt(i)).toArray())
*/
public Fraction[] getElements() {
return elements.clone();
}
/**
* @invar | bytes != null
* @invar | 0 <= offset
64 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE
*/
private byte[] bytes;
private int offset;
/**
* Returns an array containing the sequence of bytes stored by this object.
*
* @post | result != null
*/
public byte[] getBytes() { return bytes; }
/**
* Returns the offset stored by this object.
*
* @post | 0 <= result
*/
public int getOffset() { return offset; }
/**
* Initializes this object so that it stores the given sequence of bytes and
* offset zero.
*
* @throws IllegalArgumentException if the given array is null
* | bytes == null
* @post | Arrays.equals(getBytes(), bytes)
* @post | getOffset() == 0
*/
public ByteBuffer(byte[] bytes) {
if (bytes == null)
throw new IllegalArgumentException("bytes is null");
this.bytes = bytes;
}
/**
* Writes the given byte into the sequence of bytes stored by this object
* at the current offset, and increments the offset.
*
* @throws ArrayIndexOutOfBoundsException if the current offset is outside
* the bounds of the sequence of bytes stored by this object.
* | getBytes().length <= getOffset()
* @mutates | this // WRONG!
* @post | getOffset() == old(getOffset()) + 1
* @post | getBytes().length == old(getBytes().length)
* @post | getBytes()[old(getOffset())] == b
* @post | IntStream.range(0, getBytes().length).allMatch(i ->
* | i == old(getOffset()) || getBytes()[i] == old(getBytes())[i])
*/
public void put(byte b) {
this.bytes[offset] = b;
offset++;
}
Calling method put on myBuffer mutates myBytes. Based on the documentation for
method put, this is completely unexpected; indeed, method put’s @mutates clause
asserts that the method mutates only this, i.e. myBuffer itself.
There are (at least) two possible ways to fix the ByteBuffer class, corresponding
with two different design decisions, both of which are reasonable and useful and
occur in practice. The relevant design question here is: do we want to hide the
array that backs the ByteBuffer object from the client, or do we want the client
to be aware of it? Both options have advantages and disadvantages: the former
option leads to a more abstract and arguably simpler API; the latter option has
better performance because it avoids the need to copy the array.
We show elaborations of both options below.
ByteBuffer (opaque)
In this version of class ByteBuffer, the backing array is hidden from the client.
This class is similar to Java’s ByteArrayOutputStream class.
/**
* Each instance of this class stores a sequence of bytes and an offset into
* that sequence.
*/
public class ByteBuffer {
/**
* @invar | bytes != null
* @invar | 0 <= offset
*
* @representationObject
*/
private byte[] bytes;
private int offset;
/**
* Returns a new array containing the sequence of bytes stored by this object.
*
* @creates | result
* @post | result != null
*/
public byte[] getBytes() { return bytes.clone(); }
/**
* Returns the offset stored by this object.
*
* @post | 0 <= result
*/
public int getOffset() { return offset; }
66 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE
/**
* Initializes this object so that it stores the given sequence of bytes
* and offset zero.
*
* @throws NullPointerException if the given array is null
* | bytes == null
* @inspects | bytes
* @post | Arrays.equals(getBytes(), bytes)
* @post | getOffset() == 0
*/
public ByteBuffer(byte[] bytes) {
this.bytes = bytes.clone();
}
/**
* Writes the given byte into the sequence of bytes stored by this object
* at the current offset, and increments the offset.
*
* @throws ArrayIndexOutOfBoundsException if the current offset is outside
* the bounds of the sequence of bytes stored by this object.
* | getBytes().length <= getOffset()
* @mutates | this
* @post | getOffset() == old(getOffset()) + 1
* @post | getBytes().length == old(getBytes().length)
* @post | getBytes()[old(getOffset())] == b
* @post | IntStream.range(0, getBytes().length).allMatch(i ->
* | i == old(getOffset()) || getBytes()[i] == old(getBytes())[i])
*/
public void put(byte b) {
this.bytes[offset] = b;
offset++;
}
Notice that in this version, the @mutates clause of method put still only mentions
this. This is fine, because the meaning of @mutates | O is that the method can
mutate object O as well as any representation object of O (as well as the represen-
tation objects of those objects, if they have any, and so on). Since field bytes is
marked as @representationObject, the array referenced by this field is considered
a representation object of the ByteBuffer object. Since the representation objects
of an object O must never become exposed to clients of O, the constructor and
method getBytes must make the necessary copies. In return, the client can safely
assume that when calling method myBuffer.put, none of the array objects it has
references to are mutated:
byte[] myBytes = {1, 2, 3};
ByteBuffer myBuffer = new ByteBuffer(myBytes);
assert myBytes[0] == 1; // Succeeds
myBuffer.put(4);
assert myBytes[0] == 1; // Succeeds
ByteBuffer (transparent)
In this version of class ByteBuffer, the client is aware of the backing array.
Therefore, the backing array is not a representation object and the contents of
the backing array are not part of the state of the ByteBuffer object. Here, the
ByteBuffer object does not store a sequence of bytes; it merely stores a reference
to an array object.
This class is similar to Java’s ByteBuffer class.
/**
* Each instance of this class stores a reference to a byte array and an offset
* into that array.
*/
public class ByteBuffer {
/**
* @invar | array != null
* @invar | 0 <= offset
*/
private byte[] array;
private int offset;
/**
* Returns the array reference stored by this object.
*
* @post | result != null
* @immutable This object is associated with the same array reference
* throughout its lifetime.
*/
public byte[] getArray() { return array; }
/**
* Returns the offset stored by this object.
*
* @post | 0 <= result
*/
public int getOffset() { return offset; }
/**
* Initializes this object to store the given array reference and offset zero.
*
* @throws IllegalArgumentException if the given array reference is null
* | array == null
* @post | getArray() == array
* @post | getOffset() == 0
*/
public ByteBuffer(byte[] array) {
if (array == null)
throw new IllegalArgumentException("array is null");
this.array = array;
}
68 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE
/**
* Writes the given byte into the referenced array at the current offset, and
* increments the offset.
*
* @throws ArrayIndexOutOfBoundsException if the current offset is outside
* the bound of the referenced array.
* | getArray().length <= getOffset()
* @mutates | this, getArray()
* @post | getOffset() == old(getOffset()) + 1
* @post | getArray()[old(getOffset())] == b
* @post The elements of the array referenced by this object, except for the
* element at the old offset, have remained unchanged.
* | IntStream.range(0, getArray().length).allMatch(i ->
* | i == old(getOffset())
* | || getArray()[i] == old(getArray().clone())[i])
*/
public void put(byte b) {
this.array[offset] = b;
offset++;
}
Notice that method put must mention both this and getArray() in its @mutates
clause; this way, the client will not be surprised when its array object is mutated:
byte[] myBytes = {1, 2, 3};
ByteBuffer myBuffer = new ByteBuffer(myBytes);
assert myBytes[0] == 1; // Succeeds
myBuffer.put(4);
assert myBytes[0] == 4; // Succeeds, as expected:
// `myBuffer.put()` mutates `myBuffer.getArray()`
// a.k.a. `myBytes`
Conclusion
In this chapter, we introduced the notion of representation objects: objects used
internally by an abstraction to help represent an instance’s abstract state. We
showed by means of three examples (String, FractionList, and ByteBuffer) that it
is crucial that representation objects never be exposed to clients, since doing
so can break immutability of the abstraction, consistency of the abstraction’s
representation, and/or modular reasoning about the abstraction by clients. If
an object used by an abstraction is exposed to clients, it is not a representation
object of the abstraction, its state is not a part of the state of the abstraction,
and methods that mutate it must declare this explicitly in their @mutates clause.
How to properly document
single-object abstractions
Concepts
Data abstraction definition and implementation
In Java, a data abstraction is defined by giving its public aspects: the class name,
the names, parameter types, and return types of the class’ public methods, the pa-
rameter types of the class’ public constructors, and the documentation comments
associated with the class, the public methods, and the public constructors.
For example, the following defines a Point abstraction:
/**
* Each instance of this class represents a point with integer coordinates
* in a two-dimensional coordinate system.
*/
public class Point {
/**
* Initializes this instance so that it represents the given point.
* @post | getX() == x
* @post | getY() == y
*/
public Point(int x, int y)
69
70 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS
That is, a Point class implements the data abstraction defined above if its public
aspects are as given above. Generally, many different possible implementations
exist for any given data abstraction. For example, here are two implementations
of the Point abstraction:
/**
* Each instance of this class represents a point with integer coordinates
* in a two-dimensional coordinate system.
*/
public class Point {
private int x;
private int y;
/**
* Initializes this instance so that it represents the given point.
* @post | getX() == x
* @post | getY() == y
*/
public Point(int x, int y) {
this.x = x;
this.y = y;
}
/**
* Each instance of this class represents a point with integer coordinates
* in a two-dimensional coordinate system.
*/
public class Point {
/**
* Stores the distance between this point and the origin of the coordinate system
* (that is, the r component of the polar coordinates of this point).
*/
private double radius;
/**
* Stores the angle, in radians, made by the origin of the
* coordinate system, this point, and the positive X axis
* (that is, the theta component of the polar coordinates of this point).
*/
private double angle;
/**
* Initializes this instance so that it represents the given point.
* @post | getX() == x
* @post | getY() == y
*/
public Point(int x, int y) {
radius = Math.sqrt((double)x * x + (double)y * y);
angle = Math.atan2(y, x);
}
Notice that both implementations have exactly the same public aspects; that is,
they implement exactly the same abstraction. A client cannot tell whether they
are using one or the other.
Abstract state
The abstract state of a given instance O of a data abstraction at a given point in
time is given by the return values of the getters of O at that time. For example, if
at some point in time, we have, for some Interval object O, O.getLowerBound() == 3
and O.getWidth() == 4 and O.getUpperBound() == 7, then O’s abstract state at that
point in time could be written as (getLowerBound(): 3, getWidth(): 4, getUpperBound():
7).
If we then call O.setWidth(10), we change O’s abstract state. After the call returns,
O’s new abstract state is (getLowerBound(): 3, getWidth(): 10, getUpperBound(): 13).
(If we fix the order of the getters, we can write the abstract state as a tuple.
Using this notation, O.setWidth(10) changes O’s abstract state from (3, 4, 7) to
(3, 10, 13).)
Generally, not all elements of an abstraction’s abstract state space are mean-
ingful abstract states of the abstraction. For example, the abstract state
(getLowerBound(): 10, getWidth(): 4, getUpperBound(): 0) is not a meaningful ab-
stract state. We say the abstract state is not valid. For the Interval abstraction,
the valid abstract states are exactly the abstract states (L, W, U) where the
constraints U = L + W and 0 ≤ W hold.
72 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS
If we then call O.setWidth(10), we change O’s concrete state. After the call returns,
O’s new concrete state is (lowerBound: 3, width: 10).
If we fix the order of the fields, we can write the concrete state as a tuple. Using
this notation, O.setWidth(10) changes O’s concrete state from (3, 4) to (3, 10).
Class documentation
If a class implements an immutable abstraction, which implies that the properties
of an instance of the class do not change throughout the lifetime of the instance,
then you shall include an @immutable tag in the Javadoc comment for the class.
If not all abstract states in the abstract state space of a class are valid, then you
shall describe the set of valid abstract states by specifying one or more public
class invariants. A public class invariant for a class can be specified in the form
of an @invar clause in the Javadoc comment for the class or in the form of a
@post clause in the Javadoc comment for a public getter of the class; the choice
between these two forms is a matter of style.
For example, see the following documentation for an immutable TimeOfDay class:
/**
* Each instance of this class represents a time of day, at one-minute resolution.
*
* @immutable
* @invar This object's hours are between 0 and 23
* | 0 <= getHours() && getHours() <= 23
* @invar This object's minutes are between 0 and 59
* | 0 <= getMinutes() && getMinutes <= 59
*/
public class TimeOfDay {
/**
* @invar | 0 <= hours && hours <= 23
* @invar | 0 <= minutes && minutes <= 59
*/
private final int hours;
private final int minutes;
/**
* Initializes this object with the given hours and minutes.
*
* @throws IllegalArgumentException if the given hours are not between 0
* and 23
* | !(0 <= hours && hours <= 23)
* @throws IllegalArgumentException if the given minutes are not between 0
* and 59
* | !(0 <= minutes && minutes <= 59)
74 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS
*
* @post This object's hours equal the given hours
* | getHours() == hours
* @post This object's minutes equal the given minutes
* | getMinutes() == minutes
*/
public TimeOfDay(int hours, int minutes) {
if (!(0 <= hours && hours <= 23))
throw new IllegalArgumentException("hours out of range");
if (!(0 <= minutes && minutes <= 59))
throw new IllegalArgumentException("minutes out of range");
this.hours = hours;
this.minutes = minutes;
}
/**
* Returns whether this time is before the given time.
*
* @pre Argument {@code other} is not {@code null}.
* | other != null
* @post
* The result is {@code true} iff either this object's hours are less
* than the given object's hours, or this object's hours equal the
* given object's hours and this object's minutes are less than the
* given object's minutes.
* | result == (
* | getHours() < other.getHours() ||
* | getHours() == other.getHours() &&
* | getMinutes() < other.getMinutes()
* | )
*/
public boolean isBefore(TimeOfDay other) {
return
hours < other.hours ||
hours == other.hours && minutes < other.minutes;
}
}
Note: the fields in the above example are marked as final. This causes the Java
compiler to check that they are mutated only in the class’ constructor. It is
recommended to mark the fields of immutable classes as final.
If a class exists only to contain static methods, and is not intended to be
instantiated, you shall declare a private constructor to ensure that no public
constructor is implicitly generated.
Fields documentation
All fields of all classes shall be marked as private.
If not all concrete states in the concrete state space of a class are valid, then
you shall describe the set of valid concrete states by means of one or more @invar
clauses in the Javadoc comments for the fields of the class. These specify the
CONSTRUCTORS AND METHODS DOCUMENTATION 75
class’ private class invariants, also called its representation invariants. (It does
not matter which @invar clauses are in the Javadoc comment for which field;
a reasonable approach is to specify all @invar clauses in the Javadoc comment
preceding the entire block of fields.)
The above TimeOfDay example illustrates this.
Defensive programming
In defensive programming, you write code at the start of the method body to
check if the arguments are legal. If not, you throw an IllegalArgumentException
object. You include one or more @throws clauses in the constructor or method’s
Javadoc comment to specify the conditions under which the constructor or
method throws particular types of exceptions.
The constructor in the example above is programmed defensively.
Callers can rely on a method’s @throws clauses: if the condition specified by a
@throws clause is satisfied, the method must throw the specified exception; it is
not allowed to return normally.
Contractual programming
In contractual programming, you do not write any code in the method body
to check whether the arguments are legal; instead, you implement the method
under the assumption that the arguments are legal. You include one or more
@pre clauses in the Javadoc comment for the constructor or method to specify the
conditions that must hold at the start of an execution of the method. These are
called preconditions. If any of these conditions do not hold, the resulting behavior
of the method is unspecified. It may crash or exhibit arbitrary undesired behavior.
It is the caller’s obligation to ensure that the called method’s preconditions are
satisfied.
The method isBefore in the example above is programmed contractually.
Defensive programming is safer than contractual programming, because it keeps
programming errors in one module from leading to failures inside another module.
Programming errors are detected earlier and are easier to diagnose. Therefore,
it is generally the recommended approach. However, in some cases the perfor-
mance cost of the defensive checks is unacceptable. In these cases, contractual
programming is necessary.
76 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS
Postconditions
You shall precisely specify the result and the side-effects of each constructor
and each method, by including one or more @post clauses in the constructor or
method’s Javadoc comment that specify the conditions that must hold when
the constructor or method returns. In these conditions, you can refer to the
method’s result as result. If the method or constructor mutates any objects, you
must fully specify these objects’ new abstract state. Often, to do so you need to
refer to their old abstract state. You can do so using old(E) expressions. The
body of an old(E) expression is evaluated at the start of the execution of the
method.
The rule above implies that you must always explicitly declare at least one
constructor for each public class. Otherwise, Java would implicitly add a default
constructor, and there would be no way for you to document its behavior.
For example:
import java.util.Arrays;
import java.util.stream.IntStream;
/**
* Each instance of this class stores a list of text strings.
*/
public class StringList {
/**
* @invar | elements != null
* @invar | Arrays.stream(elements).allMatch(e -> e != null)
* @representationObject
*/
private String[] elements;
CONSTRUCTORS AND METHODS DOCUMENTATION 77
/**
* @creates | result
* @post The result is not {@code null}
* | result != null
* @post The result's elements are not {@code null}
* | Arrays.stream(result).allMatch(e -> e != null)
*/
public String[] getElements() {
return elements.clone();
}
/**
* Initializes this object so that it stores an empty list of text strings.
*
* @post This object's list of text strings is empty.
* | getElements().length == 0
*/
public StringList() {
elements = new String[0];
}
/**
* Replaces each element of this object by the text string obtained by
* replacing each character of the element by its corresponding uppercase
* letter.
*
* @mutates | this
* @post This object's number of elements equals its old number of elements.
* | getElements().length == old(getElements()).length
* @post Each of this object's elements equals its old element at the same
* index after replacing each character by the corresponding
* uppercase letter.
* | IntStream.range(0, getElements().length).allMatch(i ->
* | getElements()[i].equals(old(getElements())[i].toUpperCase()))
*/
public void allToUpperCase() {
for (int i = 0; i < elements.length; i++)
elements[i] = elements[i].toUpperCase();
}
/**
* Adds the given text strings to the end of this object's list of
* text strings.
*
* @mutates | this
* @inspects | other
*
* @throws IllegalArgumentException if argument {@code other} is
* {@code null}
* | other == null
* @throws IllegalArgumentException if any of the elements of the given array
* are {@code null}
* | Arrays.stream(other).anyMatch(e -> e == null)
*
* @post This object's number of elements equals its old number of
* elements plus the number of given text strings.
78 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS
• In this example, the constraint that method getElements’s result is not null
and that its elements are not null is specified in the form of postconditions
of method getElement. A correct alternative would be to specify it in the
form of public class invariants. The choice between inspector method
postconditions and public class invariants (or a combination of both) is a
matter of style.
• Arrays.stream(array).allMatch(e -> C) expresses that condition C holds for
each element e of array array. Similarly, Arrays.stream(array).anyMatch(e -> C)
expresses that condition C holds for at least one element e of array array.
• IntStream.range(a, b).allMatch(i -> C) expresses that condition C holds for
each integer i between a (inclusive) and b (exclusive).
• Arrays.equals(array1, from1, to1, array2, from2, to2) expresses that the list
of elements of array1 at indices from1 (inclusive) to to1 (exclusive) equals
the list of elements of array2 at indices from2 (inclusive) to to2 (exclusive).
• string1.equals(string2) expresses that String objects string1 and string2 rep-
resent the same list of characters. In contrast, string1 == string2 expresses
that string1 and string2 refer to the same String object.
• The notation {@code text} in an informal part of a Javadoc comment is used
to indicate that text should be formatted as code rather than as natural
language text.
EXPRESSIONS ALLOWED INSIDE JAVADOC FORMAL PARTS 79
Advanced topics
You need not read or apply this section if you are not going for a top score.
To achieve a complete specification of the behavior of a method or constructor,
the @mutates and @inspects clauses should be used to indicate which pre-existing
objects are mutated and inspected by the method or constructor, and the @creates
clause should be used to indicate which objects that are visible to the client after
the method or constructor finishes were created by the method or constructor.
Mutates clauses
A method must not mutate any pre-existing objects not mentioned in its @mutates
clause and not used to represent the state of any objects mentioned in its @mutates
clause. That is, an execution of a method may mutate an object O if and only if
either O was newly created during the method execution (i.e. it did not exist
when the method execution started), or O is mentioned in the method’s @mutates
80 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS
Inspects clauses
Similarly, a method must not inspect the state of any pre-existing mutable objects
not mentioned in its @inspects or @mutates clause and not used to represent the
state of any objects mentioned in its @inspects or @mutates clause.
Instances of immutable classes need not be mentioned in @inspects clauses.
(Documenting which objects are inspected by a method is important for at least
two reasons: 1) the caller must ensure that the inspected objects are in a valid
state, i.e. their representation invariants hold; 2) in a multithreaded program,
no other thread must mutate the inspected objects.)
Defaults
If no @mutates or @inspects clause is specified for a given method or constructor,
the default is that it may inspect and mutate any object that is not an instance
of an immutable class. Exception: if an instance method’s name starts with get
or is, the default is that it may mutate no object and that it may inspect this.
A constructor is always allowed to mutate this.
Creates clauses
By specifying an object in a @creates clause, you indicate that the object was
created during the execution of the method, and furthermore, that the object
is distinct from any direct or indirect representation objects of any objects
mentioned in any of the method’s @inspects or @mutates clauses.
The purpose is to allow the client to conclude that the object will not be inspected
or mutated by any future method calls that mutate or inspect pre-existing objects.
Objects created by a method or constructor that do not become visible to
the client when the method or constructor finishes need not (and cannot) be
ADVANCED TOPICS 81
Polymorphism
Suppose we need to develop a drawing application. A drawing consists of a
number of triangles and circles, so we define a class where each instance represents
a triangle and a class where each instance represents a circle:
package drawings;
public Triangle(int x1, int y1, int x2, int y2, int x3, int y3) {
this.x1 = x1;
this.y1 = y1;
this.x2 = x2;
this.y2 = y2;
this.x3 = x3;
this.y3 = y3;
}
83
84 INHERITANCE
package drawings;
We also say that Triangle and Circle extend Shape and that they inherit from
Shape. Shape is the superclass of Triangle and Circle.
This also means we can use an array of type Shape[] to store both triangles and
circles:
Shape[] shapes = {new Triangle(0, 0, 10, 0, 5, 10), new Circle(5, 10, 5)};
• The static type of a variable reference is the declared type of the variable.
For example, the static type of an expression shape1 that appears in the
scope of a variable declaration Shape shape1 is Shape.
• Analogously, the static type of a field reference is the declared type of the
field, and the static type of a method call is the declared return type of
the method.
Java’s static type-checking rules ensure that whenever an expression E evaluates
to a value V at run time, then V is a value of the static type of E. For example,
Java’s static type checker allows an assignment x = E only if the static type of
expression E is compatible with the declared type of variable x. This ensures that
at any point during the execution of a program, the value stored in a variable is
a value of the declared type of the variable.
To prevent bad method calls or field accesses, Java’s static type checker allows a
method call E.m(...) only if the static type of E is a class that declares a method
named m. Similarly, it allows a field access E.f only if the static type of E is a
class that declares a field named f.
Java’s static type-checker is incomplete, in the technical sense that in some cases
it rejects a program even though no execution of that program would go wrong
at run time.
For example, the following program never goes wrong, but it is rejected by Java’s
static type checker:
Shape myShape = new Triangle(0, 0, 10, 0, 5, 10);
Triangle myTriangle = myShape;
int x1 = myTriangle.getX1();
Java’s type checker rejects the assignment myTriangle = myShape because the static
type of myShape is Shape and class Shape is not a subclass of class Triangle.
Typecasts
To allow programmers to work around the incompleteness of Java’s static type-
checker, Java supports typecasts. The following program is accepted by Java’s
static type checker:
Shape myShape = new Triangle(0, 0, 10, 0, 5, 10);
Triangle myTriangle = (Triangle)myShape;
int x1 = myTriangle.getX1();
Even though the static type of expression myShape is Shape, the static type of
expression (Triangle)myShape is Triangle. The computer checks at run time, when
evaluating expression (Triangle)myShape, that the value of myShape is in fact a
reference to an instance of class Triangle. If not, a ClassCastException is thrown.
We can use instanceof tests and typecasts to compute the drawing commands
for a drawing given by an array of Shapes:
DYNAMIC BINDING 87
Dynamic binding
The implementation of method getDrawingCommands shown above for composing the
drawing commands for a drawing works, but it is not optimal from a modularity
point of view: if we extend our drawing application to also support rectangles,
we need to update method getDrawingCommands.
We can avoid this by using dynamic binding. Since each subclass of class Shape
is supposed to declare a method getDrawingCommand(), we can declare an abstract
method getDrawingCommand() in class Shape:
public abstract class Shape {
Java’s static type checker checks that each class that extends Shape declares a
method named getDrawingCommand that takes no parameters and has return type
String. Correspondingly, since class Shape now declares a method getDrawingCommand,
we can now call getDrawingCommand directly on an expression of static type Shape:
public class Drawing {
}
}
Class Object
If, when declaring a class, we do not explicitly specify a superclass using an
extends clause, the class implicitly extends class java.lang.Object. This means
that every class is a direct or indirect subclass of class java.lang.Object. (Since
package java.lang is always imported implicitly, we can simply write Object.)
This also means that a variable of type Object can be used to store a reference
to any object of any class.
Class Object declares a number of methods:
package java.lang;
/**
* Returns the Class object for this object's class.
*/
public Class getClass() { /* ... */ }
/**
* Returns a number suitable for use as a hash code when using this object as
* a key in a hash table.
*
* Note: two objects that are equal according to the `equals(Object)` method
* must have the same hash code.
*
* The implementation of this method in class java.lang.Object returns a hash
* code based on the identity of this object. That is, this implementation
* usually returns a different number for different objects, although this is
* not guaranteed.
*/
public int hashCode() { /* ... */ }
/**
* Returns a textual representation of this object.
CLASS OBJECT 89
*
* The implementation of this method in class java.lang.Object is based on the
* name of this object's class and this object's identity-based hash code.
*/
public String toString() {
return this.getClass().getName() + "@"
+ Integer.toHexString(this.hashCode());
}
/**
* Returns whether this object is conceptually equal to the given object.
*
* The implementation of this method in class java.lang.Object returns whether
* this object and the given object are the same object.
*/
public boolean equals(Object other) { return other == this; }
// ...
Methods equals, hashCode, and toString are often overridden by immutable classes.
For example:
public class Point {
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + x;
result = prime * result + y;
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
90 INHERITANCE
if (getClass() != obj.getClass())
return false;
Point other = (Point) obj;
if (x != other.x)
return false;
if (y != other.y)
return false;
return true;
}
@Override
public String toString() {
return "Point [x=" + x + ", y=" + y + "]";
}
String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}
class Program {
91
92 MODULAR REASONING ABOUT DYNAMIC BINDING
Modular reasoning
The solution, of course, is to perform modular reasoning instead. In the modular
reasoning approach, we assign a specification to each method, which specifies
the correct behaviors of the method. This way, we define a notion of correctness
of a method: a method M is correct if and only if all of its behaviors are allowed
by its specification, assuming that the method calls performed by M behave in
accordance with the called methods’ specifications.
To convince ourselves of the correctness of a program, we simply need to check
that each method complies with its specification, under the assumption that
called methods comply with theirs. If each of a program’s methods is correct in
MODULAR REASONING 93
this way, and the main method’s specification expresses the allowed behaviors of
the program as a whole, then the correctness of the program as a whole follows
as a corollary.
If we have verified a program modularly, and then we modify one of its methods,
we only need to re-check that that one method still complies with its original
specification. If so, we can immediately conclude that the correctness of the
program as a whole is preserved.
If the changed method no longer complies with its original specification, we
need to update its specification and re-verify its callers as well. This "change
propagation" stops when we reach a method whose original specification is
preserved.
In the example, there are at least two possible specifications that we can assign
to method getLocations:
• We can assign a strong specification:
/**
* @post | result != null
* @post | result.length == 3
* @post | Arrays.stream(result).allMatch(e -> e != null)
*/
String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}
Dynamic binding
Before we discuss modular reasoning about programs that use dynamic binding,
we briefly review the concept of dynamic binding.
For each method call in a Java program, the Java compiler checks, before program
execution starts, that there is a corresponding method in the program. It looks
for this method by considering the static type of the target expression of the call
(the expression before the dot) and the argument expressions. This process is
known as method call resolution; we call the method found this way the resolved
method.
Consider the following example program:
abstract class Company {
String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}
class Program {
/**
* @post | result != null
* @post | result.length == 3
* @post | Arrays.stream(result).allMatch(e -> e != null)
*/
String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}
This approach is modular, in the sense that if we modify a method, and the
modified method still complies with its original specification, we can immediately
conclude that the correctness of the program as a whole is preserved.
However, this approach does not deal optimally with another type of program
modification: adding a new class that extends an existing class.
Indeed, suppose we extend the example program with the following class:
By adding this class, we have enlarged the set of possible callees of call
company.getLocations() in method printLocations. As a result, we need to re-verify
method printLocations. In this case, we discover that it is not correct as-is.
In general, when applying the simple modular reasoning approach defined above
to a program with dynamically bound calls, after adding a class to the program
we need to re-check all methods that perform dynamically bound calls.
We conclude that the basic modular reasoning approach defined above is not
adequate for reasoning about programs that use dynamic binding.
MODULAR REASONING ABOUT DYNAMIC BINDING 97
For example, in the following sequence of specifications for a method abs, each
next specification strengthens the preceding one:
/**
* @pre | false
* @post | true
*/
public static int abs(int x)
/**
* @pre | 0 <= x
* @post | true
*/
public static int abs(int x)
/**
* @pre | 0 <= x
* @post | 0 <= result
*/
public static int abs(int x)
/**
* @pre | true
* @post | 0 <= result
*/
public static int abs(int x)
/**
* @pre | true
* @post | false
*/
public static int abs(int x)
The first specification is the weakest possible one: it does not allow any calls of
the method, so the method is free to crash or exhibit any behavior whatsoever.
The last one is the strongest possible one, because it is unimplementable: since
there exists no method implementation that satisfies postcondition false, it is
vacuously true that every such implementation has the desired behavior (for any
definition of "desired behavior" whatsoever).
Inheritance of specifications
In the literature on behavioral subtyping, some authors propose that the effective
specification of a method should be the conjunction of the declared specification
and any inherited specifications. (The conjunction of a specification with precon-
dition P1 and postcondition Q1 and a specification with precondition P2 and
postcondition Q2 is the specification with precondition "P1 or P2" and postcon-
dition "(if old(P1) then Q1) and (if old(P2) then Q2)".) In this approach, called
specification inheritance, if the implementation of each method of a program
complies with its effective specification, the program automatically complies with
behavioral subtyping.
In this course, however, we do not apply specification inheritance; a method’s
effective specification is exactly its declared specification. (Therefore, we can
speak simply of "a method’s specification" without introducing ambiguity.)
We make one exception to this rule: if the Javadoc comment associated with a
method M does not contain any specification clauses at all (i.e. no @pre, @post,
100 MODULAR REASONING ABOUT DYNAMIC BINDING
Further reading
Note: the material in this section is outside the scope of the course and is
provided for the information of interested students only.
For a much more detailed, formal, academic treatment of the approach to
modular reasoning about programs that involve dynamically-bound method
calls and the notion of behavioral subtyping described in this note, we refer
interested students to Matthew J. Parkinson. Local reasoning for Java. PhD
Thesis, Cambridge University, 2005.
Work on behavioral notions of subtyping in object-oriented programming goes
back to the early nineties with work by America, Liskov and Wing, Leavens,
and others; see the bibliography in Parkinson’s thesis. Some of this early work,
however, confusingly discusses the problem in terms of substituting subtype
objects for supertype objects. This is not a good way to think of the problem of
modular reasoning about object-oriented programs. Most importantly, it invites
confusing behavioral types with implementations. We repeat that a behavioral
type is defined entirely by the type’s documentation, not by any implementations.
The correct image is that of a single object that is of two behavioral types
(one of which is a behavioral subtype of the other). One example of where the
substitution narrative breaks down is if the supertype is an abstract class (or an
interface): in that case, no object can exist whose class is the supertype.
Interfaces
101
102 INTERFACES
// ...
public Value and(Value leftOperand, Value rightOperand) {
return BooleanValue.of(((BooleanValue)leftOperand).value
& ((BooleanValue)rightOperand).value);
}
}
public class IntType extends Type, AddableType, AndableType { // ERROR
// ...
public Value add(Value leftOperand, Value rightOperand) {
return IntValue.of(((IntValue)leftOperand).value
+ ((IntValue)rightOperand).value);
}
public Value and(Value leftOperand, Value rightOperand) {
return IntValue.of((IntValue)leftOperand).value
& ((IntValue)rightOperand).value);
}
}
public class StringType extends Type, AddableType { // ERROR
// ...
public Value add(Value leftOperand, Value rightOperand) {
return StringValue.of(((StringValue)leftOperand).value
+ ((StringValue)rightOperand).value);
}
}
public class Interpreter {
public static Value evaluate(Value value1, char operator, Value value2) {
Type type = value1.getType();
if (type != value2.getType())
throw new UnsupportedOperationException(
"The operand types do not match");
switch (operator) {
case '+':
if (!(type instanceof AddableType))
throw new UnsupportedOperationException(
"Type " + type + " does not support the + operator");
return ((AddableType)type).add(value, value2);
case '&':
if (!(type instanceof AndableType))
throw new UnsupportedOperationException(
"Type " + type + " does not support the & operator");
return ((AndableType)type).and(value1, value2);
// ...
}
}
}
Unfortunately, this is not valid Java code: Java does not allow a class to extend
multiple superclasses. (Some other programming languages, such as C++, do
allow such multiple inheritance.) However, Java does allow a restricted form of
multiple inheritance: it allows a class to extend from one superclass and zero
or more interfaces. An interface, declared using the interface keyword, is in
most ways just like a class, except that it is not allowed to declare any instance
(i.e. non-static) fields and it is not allowed to declare any constructors. We can
therefore turn the incorrect program above into the correct program below:
104 INTERFACES
Consider the following class Color, whose instances represent colors defined by
their red, green, and blue components:
public class Color {
public final int red, green, blue;
public Color(int red, int green, int blue) {
this.red = red;
this.green = green;
this.blue = blue;
}
public int getHue() { /* ... */ }
public int getSaturation() { /* ... */ }
public int getValue() { /* ... */ }
@Override
public String toString() {
return "rgb(" + red + ", " + green + ", " + blue + ")";
}
@Override
public boolean equals(Object other) {
return
other.getClass() == getClass() &&
((Color)other).red == red &&
((Color)other).green == green &&
((Color)other).blue == blue;
}
@Override
public int hashCode() {
return Objects.hash(red, green, blue);
}
}
107
108 IMPLEMENTATION INHERITANCE
@Override
public String toString() {
return
"rgba(" + red + ", " + green + ", " + blue + ", " + transparency + ")";
}
@Override
public boolean equals(Object other) {
return
super.equals(other) &&
((TransparentColor)other).transparency == transparency;
}
@Override
public int hashCode() {
return Objects.hash(super.hashCode(), transparency);
}
}
Lists
If a Java program needs to store a sequence of values, an array can be used.
However, the number of operations available for working with arrays is limited.
For example, it is not possible to add or remove elements. To solve this problem,
we can define a list abstraction, with operations for adding and removing elements
in arbitrary positions.
There are multiple ways to implement such an abstraction, that each have differ-
ent performance characteristics. For example, implementing the list abstraction
by storing the elements in an array supports looking up an element at a given
index and removing the last element in constant time, and adding an element
to the end in amortized constant time, but adding an element to the front
or removing the first element takes time proportional to the size of the list.
Implementing the list abstraction by storing the elements in a doubly-linked list
data structure, on the other hand, supports adding an element to the front or
the back and removing an element from the front or the back in constant time,
but looking up the element at a given index takes time proportional to the size of
109
110 LISTS, SETS, AND MAPS
the list in the worst case. As a result, different implementations are appropriate
for different applications. Therefore, we will implement both an ArrayList class
and a LinkedList class. At the same time, code that manipulates instances of the
abstraction should be able to work with either implementation. Therefore, we
will introduce an interface List that generalizes over the implementations. Code
that accepts objects of type List can be used with ArrayList objects, LinkedList
objects, or any other implementation of the List interface we may build in the
future.
We define interface List as follows:
package collections;
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;
/**
* @invar | Arrays.stream(toArray()).allMatch(e -> e != null)
*/
public interface List {
/**
* @inspects | this
* @creates | result
* @post | result != null
*/
Object[] toArray();
/**
* @inspects | this
* @post | result == toArray().length
*/
int size();
/**
* @pre | 0 <= index
* @pre | index < size()
* @post | result == toArray()[index]
*/
Object get(int index);
/**
* @pre | value != null
* @inspects | this
* @post | result == Arrays.stream(toArray()).anyMatch(e -> e.equals(value))
*/
boolean contains(Object value);
/**
* @pre | 0 <= index
* @pre | index <= size()
* @pre | value != null
* @mutates | this
LISTS 111
/**
* @pre | value != null
* @mutates | this
* @post | size() == old(size()) + 1
* @post | Arrays.equals(toArray(), 0, old(size()),
* | old(toArray()), 0, old(size()))
* @post | get(old(size())).equals(value)
*/
default void add(Object value) { add(size(), value); }
/**
* @pre | 0 <= index
* @pre | index < size()
* @mutates | this
* @post | Arrays.equals(toArray(), 0, index, old(toArray()), 0, index)
* @post | Arrays.equals(toArray(), index, size(),
* | old(toArray()), index + 1, old(size()))
*/
void remove(int index);
/**
* @pre | value != null
* @mutates | this
* @post | Arrays.equals(toArray(),
* | IntStream.range(0, old(size()))
* | .filter(i -> i != old(IntStream.range(0, size())
* | .filter(i -> get(i).equals(value))
* | .findFirst().orElse(-1)))
* | .mapToObj(i -> old(toArray())[i]).toArray())
*/
void remove(Object value);
import java.util.Arrays;
/**
* @invar | elements != null
* @invar | 0 <= size
* @invar | size <= elements.length
* @invar | Arrays.stream(elements, 0, size).allMatch(e -> e != null)
* @invar | Arrays.stream(elements, size, elements.length)
* | .allMatch(e -> e == null)
112 LISTS, SETS, AND MAPS
*
* @representationObject
*/
private Object[] elements = new Object[10];
private int size;
/**
* @post | size() == 0
*/
public ArrayList() {}
We can also implement the same API by storing the elements in a doubly-linked
chain of Node objects:
package collections;
/**
* @invar | sentinel != null
* @invar | size == sentinel.next.getLength()
*/
private int size;
/** @representationObject */
private Node sentinel;
/**
* @post | size() == 0
*/
public LinkedList() {
sentinel = new Node();
sentinel.previous = sentinel;
sentinel.next = sentinel;
}
int i = 0;
for (Node node = sentinel.next; node != sentinel; node = node.next)
result[i++] = node.element;
return result;
}
Sets
Checking whether an ArrayList or LinkedList object contains a given value and
removing a given value from an ArrayList or LinkedList take time proportional to
the size of the list. If those operations are important in a given application, a
hash table may be more appropriate. Assuming a good hash function is used,
checking for the presence of an element, adding an element if it is not yet present,
and removing an element take (amortized) constant expected time. Therefore,
hash tables are ideal for implementing a set abstract data type.
We first define interface Set as follows:
package collections;
import java.util.Arrays;
import java.util.stream.Stream;
/**
* @invar The set has no null elements.
* | Arrays.stream(toArray()).allMatch(e -> e != null)
* @invar {@code toArray()} does not contain duplicates.
* | Arrays.stream(toArray()).distinct().count() == size()
*/
public interface Set {
/**
* @inspects | this
* @creates | result
* @post | result != null
*/
Object[] toArray();
/**
* @inspects | this
* @post | result == toArray().length
*/
int size();
/**
* @pre | value != null
* @inspects | this
* @post | result == Arrays.stream(toArray()).anyMatch(e -> e.equals(value))
*/
boolean contains(Object value);
/**
* @pre | value != null
* @mutates | this
* @post The given value is in the set.
116 LISTS, SETS, AND MAPS
/**
* @pre | value != null
* @mutates | this
* @post The given value is no longer in the set.
* | Arrays.stream(toArray()).noneMatch(e -> e.equals(value))
* @post No elements, other than the given value, have disappeared
* from the set.
* | Arrays.stream(old(toArray())).allMatch(eo -> eo.equals(value) ||
* | Arrays.stream(toArray()).anyMatch(e -> e.equals(eo)))
* @post No elements have been added to the set.
* | Arrays.stream(toArray()).allMatch(e ->
* | Arrays.stream(old(toArray())).anyMatch(eo -> e.equals(eo)))
*/
void remove(Object value);
A Set object implemented using a hash table with capacity K distributes its
elements among K buckets. For each bucket, we need some other Set implemen-
tation to store the elements that belong to that bucket. For this purpose, we
first define a simple ArrayList-based Set implementation:
package collections;
/**
* @invar | elements != null
* @invar | elements.stream().distinct().count() == elements.size()
*
* @representationObject
*/
private ArrayList elements = new ArrayList();
elements.add(value);
}
Notice that the contains, add, and remove operations of an ArraySet take time
proportional to the size of the set. This will not matter, because assuming a
good hash function is used, and assuming the hash table is not overloaded, the
expected number of elements in each bucket is independent of the total number
of elements in the hash table.
We can now define the hash table-based Set implementation:
package collections;
import java.util.Arrays;
import java.util.stream.IntStream;
/**
* @invar | buckets != null
* @invar | Arrays.stream(buckets).allMatch(b -> b != null)
* @invar | IntStream.range(0, buckets.length).allMatch(i ->
* | buckets[i].stream().allMatch(e ->
* | Math.floorMod(e.hashCode(), buckets.length) == i))
*
* @representationObject
* @representationObjects Each bucket is a representation object
*/
private Set[] buckets;
/**
* @pre | 0 < capacity
* @post | size() == 0
*/
public HashSet(int capacity) {
buckets = new Set[capacity];
for (int i = 0; i < buckets.length; i++)
buckets[i] = new ArraySet();
}
}
return result;
}
Notice that the HashSet object simply delegates the contains, add, and remove
operations to the element’s bucket, which is determined by retrieving its hash
code using the hashCode() method and deriving the bucket index by taking the
remainder of dividing the hash code by the number of buckets. If method
hashCode() implements a good hash function, which means that the hash codes
of the elements are uniformly distributed, then the expected number of elements
in each bucket is proportional to the load factor, which is the total number of
elements divided by the number of buckets. This means that if the load factor is
bounded, then so is the expected number of elements in each bucket, and as a
result, the expected time taken by contains, add, and remove is independent of the
number of elements in the HashSet object. The HashSet implementation shown
above has a fixed capacity; this works well if the maximum load (i.e. number
of elements) of the HashSet object is known beforehand. If not, code should be
added to increase the capacity and rehash (i.e. copy the elements from the old
buckets to the new buckets) when the load factor exceeds some fixed threshold
value.
Maps
Applications often need to store a set of key-value pairs (where distinct pairs
have distinct keys) and efficiently add a pair and retrieve the value associated
with a given key. In Java, such sets are known as maps and a key-value pair is
called a map entry.
We first define the Map interface:
package collections;
import java.util.Objects;
MAPS 119
/** @immutable */
class Entry {
/**
* @invar | key != null
* @invar | value != null
*/
private final Object key;
private final Object value;
/**
* @pre | key != null
* @pre | value != null
* @post | getKey() == key
* @post | getValue() == value
*/
public Entry(Object key, Object value) {
this.key = key;
this.value = value;
}
@Override
public boolean equals(Object other) {
return other instanceof Entry
&& key.equals(((Entry)other).getKey())
&& value.equals(((Entry)other).getValue());
}
@Override
public int hashCode() {
return Objects.hash(key, value);
}
/**
* @inspects | this
* @creates | result
* @post | result != null
* @post | result.stream().allMatch(e -> e instanceof Entry)
* @post No key appears twice.
* | result.stream().map(e -> ((Entry)e).getKey()).distinct().count()
* | == result.size()
*/
Set entrySet();
/**
* @post | result == entrySet().stream()
* | .filter(e -> ((Entry)e).getKey().equals(key))
120 LISTS, SETS, AND MAPS
/**
* @pre | key != null
* @pre | value != null
* @mutates | this
* @post The given entry is in the entry set.
* | entrySet().contains(new Entry(key, value))
* @post No entries, except for the updated one, have disappeared from the
* entry set.
* | old(entrySet()).stream().allMatch(e ->
* | ((Entry)e).getKey().equals(key) || entrySet().contains(e))
* @post No entries, except for the updated one, have been added to the entry
* set.
* | entrySet().stream().allMatch(e ->
* | ((Entry)e).getKey().equals(key) || old(entrySet()).contains(e))
*/
void put(Object key, Object value);
/**
* @pre | key != null
* @mutates | this
* @post All entries in the entry set were already in the entry set and
* have a key that is different from the given key.
* | entrySet().stream().allMatch(e -> !((Entry)e).getKey().equals(key)
* | && old(entrySet()).contains(e))
* @post All entries that were in the entry set, except for the specified one,
* are still in the entry set.
* | old(entrySet()).stream().allMatch(e ->
* | ((Entry)e).getKey().equals(key) || entrySet().contains(e))
*/
void remove(Object key);
We can implement this interface efficiently using a hash table. Again, we first need
a separate Map implementation for storing the entries that belong to a particular
bucket. For this purpose, we define a simple ArrayList-based implementation:
package collections;
/**
* @invar | entries != null
* @invar | entries.stream().allMatch(e -> e instanceof Entry)
* @invar | entries.stream().map(e -> ((Entry)e).getKey()).distinct().count()
* | == entries.size()
*
* @representationObject
*/
private ArrayList entries = new ArrayList();
/**
* @post | entrySet().size() == 0
*/
public ArrayMap() {}
import java.util.Arrays;
import java.util.stream.IntStream;
/**
* @invar | buckets != null
* @invar | Arrays.stream(buckets).allMatch(b -> b != null)
* @invar | IntStream.range(0, buckets.length).allMatch(i ->
* | buckets[i].entrySet().stream().allMatch(e ->
* | Math.floorMod(((Entry)e).getKey().hashCode(),
* | buckets.length) == i))
122 LISTS, SETS, AND MAPS
*
* @representationObject
* @representationObjects
*/
private Map[] buckets;
/**
* @pre | 0 < capacity
* @post | entrySet().size() == 0
*/
public HashMap(int capacity) {
buckets = new Map[capacity];
for (int i = 0; i < capacity; i++)
buckets[i] = new ArrayMap();
}
(Note: the object allocation involved in wrapping a primitive value into an object
may cause a significant overhead in terms of time and space.)
You can iterate over any collection object using the enhanced for loop:
Collection<Student> myStudents = ...;
for (Student student : myStudents)
System.out.println(student.getName());
The following data structures require that the elements be comparable. Specifi-
THE JAVA COLLECTIONS FRAMEWORK 125
OOP Teams
Suppose I wanted to write a Java program to manage the Object-Oriented
Programming course. In particular, I would want to track team compositions
for the project. Students are encouraged to work in teams of two students, but
they can work alone if they really want to.
The information I want the program to keep track of, then, is, mathematically
speaking, the set S of students, along with the relation is-teammate-of on this
set. (One can think of this relation as a subset of the set of pairs of students S
× S.)
For example, suppose the course has five students: S = {Alice, Bob, Carol, Dan,
Eve}. Suppose Alice and Bob constitute a team, and so do Carol and Dan, but
127
128 ENTITY-RELATIONSHIP ABSTRACTIONS
Eve works alone. Then we have is-teammate-of = {(Alice, Bob), (Bob, Alice),
(Carol, Dan), (Dan, Carol)}. (is-teammate-of must always be a symmetric
relation: if X is a teammate of Y, then Y is necessarily a teammate of X as
well.)
Generalizing from this example, it is often the case that we want to use Java
programs to store information of this form, which we call entity graphs. In the
general case, an entity graph consists of some number of sets of entities, and some
number of relations between particular sets of entities. Furthermore, the graph
may associate particular values with each element of some set of entities. For
example, we may want to store each student’s study programme; mathematically,
this corresponds to a function that maps each element of S to some element of
the set P of study programmes. We call each such function, that associates a
value with each entity from some entity set, an attribute. In general, then, an
entity graph consists of some sets of entities, some relations between the sets of
entities, and some attributes.
Writing a program for managing the OOP course in Java includes the complexity
of correctly storing and manipulating OOP team composition graphs. Writing
such a program would be easier if it could be written in a programming language,
let’s call it Java++, that has a built-in way to store these graphs, that is, that
has a built-in type whose values represent OOP students, and built-in operations
for setting and getting a student’s teammate and study programme.
We can achieve this complexity reduction by splitting the task of writing a Java
program for managing the OOP course into two subtasks:
• Writing a client module for managing the OOP course, written in the
language Java++ that has built-in types and operations for storing OOP
team composition graphs
• Writing an OOP team composition graphs module that implements the
extra datatypes and operations of Java++ (or, in other words, the OOP
team composition graph abstraction) in terms of the regular Java datatypes
and constructs such as classes, fields, and methods.
By composing the OOP team composition graphs module and the client module,
we obtain a program for managing the OOP course written in Java.
A natural way to write a module for storing a particular type of entity graph
is to introduce a Java class for each type of entity, a getter in class E for each
attribute of entity type E, and getters in classes E and F for each relation
between entity types E and F.
For example, here is one way to implement the OOP team composition graph
abstraction in Java:
package teams;
/**
* Each instance of this class represents an Object-Oriented Programming student
* in an OOP team composition graph.
*
* @invar | getStudyProgramme() != null
* @invar | getTeammate() == null || getTeammate().getTeammate() == this
*/
public class OOPStudent {
/**
* @invar | studyProgramme != null
* @invar | teammate == null || teammate.teammate == this
*/
private final StudyProgramme studyProgramme;
/** @peerObject */
private OOPStudent teammate;
/** @immutable */
public StudyProgramme getStudyProgramme() {
return studyProgramme;
}
/**
* Returns this student's teammate, or {@code null} if this student has no
* teammate.
*
* @peerObject
*/
public OOPStudent getTeammate() {
return teammate;
}
/**
* Initializes this object to represent a student with the given study
* programme and no teammate.
*
* @throws IllegalArgumentException if {@code studyProgramme} is null
* | studyProgramme == null
* @post This student's study programme equals the given study programme
* | getStudyProgramme() == studyProgramme
* @post This student has no teammate
* | getTeammate() == null
*/
public OOPStudent(StudyProgramme studyProgramme) {
if (studyProgramme == null)
throw new IllegalArgumentException("`studyProgramme` is null");
this.studyProgramme = studyProgramme;
}
/**
* Registers the fact that this student is teaming up with the given student.
*
130 ENTITY-RELATIONSHIP ABSTRACTIONS
/**
* Registers the fact that this student is splitting up with their teammate.
*
* @throws IllegalStateException if this student has no teammate
* | getTeammate() == null
* @mutates | this
* @post This student has no teammate
* | getTeammate() == null
* @post This student's old teammate has no teammate
* | old(getTeammate()).getTeammate() == null
*/
public void clearTeammate() {
if (teammate == null)
throw new IllegalStateException(
"This student does not have a teammate");
this.teammate.teammate = null;
this.teammate = null;
}
}
The second setTeammate call is incorrect and should cause an exception. Otherwise,
after this call bob.getTeammate() returns alice while alice.getTeammate() returns
carol, which is inconsistent.
Peer groups
This interdependency between the states of the objects that constitute the
entity-relationship abstraction is reflected in the representation invariants: the
representation invariant for an OOPStudent object s1 where s1.teammate == s2 talks
not just about the fields of s1 but about the fields of s2 as well: in particular,
it specifies that s2.teammate == s1. This means that changing the state of object
s2 can break the representation invariant of s1. This, in turn, means that when
implementing the methods of class OOPStudent, it is crucial to remember not just
to preserve the validity of the representation of the objects directly involved in
the call, but also to remember to preserve the validity of the representation of the
other objects whose representation invariants involve the objects directly involved
in the call. For example, in method clearTeammate, to preserve the representation
invariants of this, it is sufficient to set this.teammate = null. The extra assignment
this.teammate.teammate = null is necessary to preserve the representation invariant
of the receiver’s old teammate.
Generalizing from this example, in order to avoid inadvertently breaking the
representation invariants of objects, when implementing methods we need to
know which objects’ representations invariants we may break when mutating
particular objects. For this reason, we introduce the following rules:
• The representation invariants for an object X may mention a non-final
field or non-immutable property of an object Y only if:
132 ENTITY-RELATIONSHIP ABSTRACTIONS
– X = Y, or
– Y is a representation object of X, or
– Y is a peer object of X.
• If X is a peer object of Y, then Y must be a peer object of X.
We define the set of representation objects of an object X as the set of objects
reachable via the fields marked @representationObject of X. For example, the
array object used by an IntList object to store the elements of the list is a
representation object of the IntList object. Representation objects of an object
X must be encapsulated inside object X; that is, it must not be possible to access
the representation objects of X outside the class of X.
We define the set of peer objects of an object X as the set of objects reachable via
the fields marked @peerObject of X. For example, the teammate of an OOPStudent
object is a peer object of the OOPStudent object. This allows the representation
invariant to mention the teammate’s teammate field.
We refer to a set of objects, each of which is a direct or indirect peer object of
the others, as a peer group. The objects of a peer group together constitute an
entity-relationship abstraction. It does not make sense to consider each member
of a peer group as a separate abstraction; the notions of representation validity
and abstract state apply meaningfully only to a peer group as a whole.
The @peerObject tags on fields are internal documentation for internal use by a
module author to reason about the correctness of the module. However, it is
also important for client module authors to know which objects constitute a peer
group. Therefore, we also define an object’s peer group publicly by marking
appropriate public getters as @peerObject as well, as in the example above.
Since the objects of a peer group do not have independent representations
or abstract states, we interpret an object mentioned in a @mutates clause as
representing that object’s entire peer group. This is why, in the documentation
for method clearTeammate(), it is sufficient to write @mutates | this: the method
mutates object this.teammate as well, but since that object is a peer object of
this, it need not be mentioned separately.
Multi-class
entity-relationship
abstractions
133
134 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS
public ProjectCourseStudent() {}
this.team = team;
team.members.add(this);
}
team.members.remove(this);
team = null;
}
}
package bigteams;
public Team() {}
package bigteams;
import org.junit.jupiter.api.Test;
class BigTeamsTest {
@Test
void test() {
ProjectCourseStudent student1 = new ProjectCourseStudent();
ProjectCourseStudent student2 = new ProjectCourseStudent();
Team team = new Team();
student1.join(team);
assertEquals(team, student1.getTeam());
assertEquals(Set.of(student1), team.getMembers());
student2.join(team);
assertEquals(team, student2.getTeam());
assertEquals(Set.of(student1, student2), team.getMembers());
student1.leaveTeam();
assertEquals(Set.of(student2), team.getMembers());
student2.leaveTeam();
assertEquals(Set.of(), team.getMembers());
}
Notice that we use a HashSet to store the members of a team. Class HashSet is
a generic class; it takes the type of its elements as a type parameter. In the
parameterized type HashSet<ProjectCourseStudent>, ProjectCourseStudent is the type
argument for generic class HashSet.
Type HashSet<T> is a subtype of type Set<T>. This means that every instance of
HashSet<T> is also an instance of Set<T>.
Notice also that we implement a bidirectional association between classes
ProjectCourseStudent and class Team, and we make sure that its consistency is
preserved at all times: if a ProjectCourseStudent instance S refers to a Team in-
stance T, then T also refers to S, and vice versa.
Unfortunately, the code above is rejected by Java’s static type checker. Indeed,
class ProjectCourseStudent accesses class Team’s private field members. The private
136 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS
fields of a class may be accessed by the class itself only. We could make field
members public but this would break the abstraction’s encapsulation: clients would
see the abstraction’s internal implementation details and could modify the field
without respecting the abstraction’s representation invariants. For example, a
client could set the members field of some Team instance T to null or could add a
ProjectCourseStudent instance S to its HashSet without updating S’s team field to
point to T, thus breaking the consistency of the bidirectional association and
the validity of the abstraction’s representation.
import logicalcollections.LogicalSet;
/**
* Each instance of this class represents a student in a project course,
* as part of a student-team graph.
*
* @invar If a student is in a team, it is among its members.
* | getTeam() == null || getTeam().getMembers().contains(this)
*/
public class ProjectCourseStudent {
/**
* @invar | true // Phase 1 representation invariant
* @invar | team == null || team.members.contains(this) // Phase 2 rep. inv.
*
* @peerObject
*/
Team team;
/**
* Returns this student's team, or {@code null} if they are not in a team.
*
* @peerObject
137
* @basic
*/
public Team getTeam() { return team; }
/**
* Initializes this object as representing a student who is not in a team.
*/
public ProjectCourseStudent() {}
/**
* Make this student a member of the given team.
*
* @throws IllegalArgumentException if {@code team} is null.
* | team == null
* @throws IllegalStateException if this student is already in a team.
* | getTeam() != null
*
* @mutates_properties | this.getTeam(), team.getMembers()
*
* @post The given team's members equal its old members plus this student.
* | team.getMembers().equals(LogicalSet.plus(old(team.getMembers()), this))
*/
public void join(Team team) {
if (team == null)
throw new IllegalArgumentException("team is null");
if (this.team != null)
throw new IllegalStateException("this student is already in a team");
this.team = team;
team.members.add(this);
}
/**
* Make this student no longer be a member of their team.
*
* @throws IllegalStateException if this student is not in a team.
* | getTeam() == null
*
* @mutates_properties | this.getTeam(), this.getTeam().getMembers()
*
* @post This student is not in a team.
* | getTeam() == null
* @post This student's old team's members are its old members minus this
* student.
* | old(getTeam()).getMembers().equals(
* | LogicalSet.minus(old(getTeam().getMembers()), this))
*/
public void leaveTeam() {
if (this.team == null)
throw new IllegalStateException("this student is not in a team");
team.members.remove(this);
team = null;
}
}
package bigteams;
138 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS
import java.util.HashSet;
import java.util.Set;
/**
* Each instance of this class represents a team in a student-team graph.
*
* @invar Each of this team's members has this team as its team.
* | getMembers().stream().allMatch(s -> s != null && s.getTeam() == this)
*/
public class Team {
/**
* @invar | members != null // Phase 1 representation invariant
* @invar | members.stream().allMatch(s ->
* | s != null && s.team == this) // Phase 2 rep. inv.
*
* @representationObject
* @peerObjects
*/
HashSet<ProjectCourseStudent> members = new HashSet<>();
/**
* Returns this team's set of members.
*
* @post | result != null
* @creates | result
* @peerObjects
* @basic
*/
public Set<ProjectCourseStudent> getMembers() { return Set.copyOf(members); }
/**
* Initializes this object as representing an empty team.
*
* @mutates | this
* @post This team has no members.
* | getMembers().isEmpty()
*/
public Team() {}
package bigteams.tests;
import java.util.Set;
import org.junit.jupiter.api.Test;
import bigteams.ProjectCourseStudent;
import bigteams.Team;
class BigTeamsTest {
@Test
139
void test() {
ProjectCourseStudent student1 = new ProjectCourseStudent();
ProjectCourseStudent student2 = new ProjectCourseStudent();
Team team = new Team();
student1.join(team);
assertEquals(team, student1.getTeam());
assertEquals(Set.of(student1), team.getMembers());
student2.join(team);
assertEquals(team, student2.getTeam());
assertEquals(Set.of(student1, student2), team.getMembers());
student1.leaveTeam();
assertEquals(Set.of(student2), team.getMembers());
student2.leaveTeam();
assertEquals(Set.of(), team.getMembers());
}
conveniently specify that joinTeam(team) mutates the peer group of this and
the peer group of team, and that for any object O in either of these peer
groups, and for any basic getter M of O, we have either that (O, M) is in the
set {(this, getTeam), (team, getMembers)} or Objects.equals(O.M(), old(O.M())).
That is, from the client’s point of view, joinTeam(team) mutates only
this.getTeam() and team.getMembers() and leaves all other properties of both
peer groups involved (as given by the peer group objects’ basic getters)
unchanged. (A getter is basic if it is marked using the @basic tag.)
import logicalcollections.LogicalSet;
/**
* Each instance of this class represents a student in a project course,
* as part of a student-team graph.
*
* @invar If a student is in a team, it is among its members.
* | getTeam() == null || getTeam().getMembers().contains(this)
*/
public class ProjectCourseStudent {
/**
* @invar | getTeamInternal() == null
* | || getTeamInternal().getMembersInternal().contains(this)
*
* @peerObject
*/
Team getTeamInternal() { return team; }
/**
* Returns this student's team, or {@code null} if they are not in a team.
NESTING CLASS-LEVEL AND PACKAGE-LEVEL ABSTRACTIONS 141
*
* @peerObject
* @basic
*/
public Team getTeam() { return team; }
/**
* Initializes this object as representing a student who is not in a team.
*/
public ProjectCourseStudent() {}
/**
* Make this student a member of the given team.
*
* @throws IllegalArgumentException if {@code team} is null.
* | team == null
* @throws IllegalStateException if this student is already in a team.
* | getTeam() != null
*
* @mutates_properties | this.getTeam(), team.getMembers()
*
* @post The given team's members equal its old members plus this student.
* | team.getMembers().equals(LogicalSet.plus(old(team.getMembers()), this))
*/
public void join(Team team) {
if (team == null)
throw new IllegalArgumentException("team is null");
if (this.team != null)
throw new IllegalStateException("this student is already in a team");
this.team = team;
team.addMember(this);
}
/**
* Make this student no longer be a member of their team.
*
* @throws IllegalStateException if this student is not in a team.
* | getTeam() == null
*
* @mutates_properties | this.getTeam(), this.getTeam().getMembers()
*
* @post This student is not in a team.
* | getTeam() == null
* @post This student's old team's members are its old members minus this
* student.
* | old(getTeam()).getMembers().equals(
* | LogicalSet.minus(old(getTeam().getMembers()), this))
*/
public void leaveTeam() {
if (this.team == null)
throw new IllegalStateException("this student is not in a team");
team.removeMember(this);
team = null;
}
}
142 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS
package bigteams;
import java.util.HashSet;
import java.util.Set;
import logicalcollections.LogicalSet;
/**
* Each instance of this class represents a team in a student-team graph.
*
* @invar Each of this team's members has this team as its team.
* | getMembers().stream().allMatch(s -> s != null && s.getTeam() == this)
*/
public class Team {
/**
* @invar | members != null
* @invar | members.stream().allMatch(s -> s != null)
*
* @representationObject
*/
private HashSet<ProjectCourseStudent> members = new HashSet<>();
/**
* Returns this team's set of members.
*
* @invar | getMembersInternal().stream().allMatch(s ->
* | s.getTeamInternal() == this)
*
* @post | result != null && result.stream().allMatch(s -> s != null)
* @peerObjects
*/
Set<ProjectCourseStudent> getMembersInternal() { return Set.copyOf(members); }
/**
* Returns this team's set of members.
*
* @post | result != null
* @creates | result
* @peerObjects
* @basic
*/
public Set<ProjectCourseStudent> getMembers() { return Set.copyOf(members); }
/**
* Initializes this object as representing an empty team.
*
* @mutates | this
* @post This team has no members.
* | getMembers().isEmpty()
*/
public Team() {}
/**
* Adds the given student to this team's set of students.
*
* @throws IllegalArgumentException if {@code student} is null
NESTING CLASS-LEVEL AND PACKAGE-LEVEL ABSTRACTIONS 143
* | student == null
* @mutates | this
* @post This team's set of members equals its old set of members plus the
* given student.
* | getMembersInternal().equals(
* | LogicalSet.plus(old(getMembersInternal()), student))
*/
void addMember(ProjectCourseStudent student) {
if (student == null)
throw new IllegalArgumentException("student is null");
members.add(student);
}
/**
* Removes the given student from this team's set of students.
*
* @throws IllegalArgumentException if {@code student} is null
* | student == null
* @mutates | this
* @post This team's set of members equals its old set of members minus the
* given student.
* | getMembersInternal().equals(
* | LogicalSet.minus(old(getMembersInternal()), student))
*/
void removeMember(ProjectCourseStudent student) {
if (student == null)
throw new IllegalArgumentException("student is null");
members.remove(student);
}
nonprivate getters.
• We specify a package-level abstraction’s representation invariants using
@invar tags in the Javadoc comments for the package’s package-accessible
fields or getters.
• We specify a package-level abstraction’s abstract state invariants using
@invar tags in the Javadoc comments for the package’s classes and/or using
@post tags in the Javadoc comments for the package’s public getters.
• A public constructor’s author must ensure that when the constructor
returns, all of the constructed object’s class-level and package-level repre-
sentation invariants hold. (Those of its peer objects must hold as well, but
usually a newly constructed object does not yet have any peer objects.)
• At every call of a package-accessible method, it is the caller’s responsibility
to ensure that for each member O of the class-level peer group of each object
inspected or mutated by the method, all of O’s class-level representation
invariants hold. It is the method author’s responsibility to ensure that all
of these invariants hold again when the method returns.
• At every call of a public method, it is the caller’s responsibility to ensure
that for each member O of the package-level peer group of each object
inspected or mutated by the method, all of O’s class-level and package-level
representation invariants hold. It is the method author’s responsibility to
ensure that all of these invariants hold again when the method returns.
• This means that it is not correct to call package-accessible getters inside a
class-level representation invariant or public getters inside any representa-
tion invariant, because this would introduce a circularity: the getter would
be allowed to assume the invariants in whose definition it is involved. It
is, however, allowed (and generally necessary) to call package-accessible
getters in package-level representation invariants, because these getters
can assume only the class-level representation invariants.
• We introduce package-accessible mutators addMember and removeMember into
class Team to allow methods join and leaveTeam in class ProjectCourseStudent
to restore the consistency of the package-level bidirectional association.
How to properly document
multi-object abstractions
Fields documentation
If an @invar clause for an object O mentions a field of an object O’, then O’ must
be a representation object of O or a peer object of O.
An @invar clause must be well-defined (i.e. must not crash or loop forever) for
arbitrary (concrete) states of the objects involved, with the following exceptions:
• An @invar clause is evaluated only if preceding @invar clauses of the same
object evaluated to true. So, if, for example, the first clause says that some
field is non-null, the second clause is allowed to call a method on the object
referenced by that field.
• The N’th @invar clause of an object O that is a member of a peer group is
evaluated only if, for each member O’ of the same peer group, and each I
< N, the I’th @invar clause of O’ evaluated to true. So, for example, if the
second @invar clause of some class calls a method on the foo field of a peer
object, this is fine provided that the first @invar clause of the peer object’s
class says that foo is non-null.
Advanced topics
You need not read or apply this section if you are not going for a top score.
To achieve a complete specification of the behavior of a method or constructor,
the @peerObject and @peerObjects clauses should be used to define an object’s peer
group, and the @mutates, @mutates_properties, and @inspects clauses should be used
to indicate which pre-existing peer groups are mutated and inspected by the
method or constructor, and the @creates clause should be used to indicate which
145
146 HOW TO DOCUMENT MULTI-OBJECT ABSTRACTIONS
peer groups that are visible to the client after the method or constructor finishes
were created by the method or constructor.
For more information about peer groups, see the course notes on Entity-
relationship abstractions.
Mutates clauses
A method must not mutate any pre-existing peer groups not mentioned in its
@mutates clause and not used to represent the state of any peer groups mentioned
in its @mutates clause. That is, an execution of a method may mutate an object
O if and only if either O was newly created during the method execution (i.e.
it did not exist when the method execution started), or a member of O’s peer
group is mentioned in the method’s @mutates clause, or O is a member of the
peer group of a representation object of a member of the peer group of an object
mentioned in the method’s @mutates clause, or O is a member of the peer group
of a representation object of a member of the peer group of a representation
object of a member of the peer group of an object mentioned in the method’s
@mutates clause, and so forth.
Inspects clauses
Similarly, a method must not inspect the state of any pre-existing mutable
objects that are not in the peer group of any object mentioned in its @inspects
or @mutates clause and not used to represent the state of any objects mentioned
in its @inspects or @mutates clause.
Defaults
If no @mutates, @mutates_properties, or @inspects clause is specified for a given
method or constructor, the default is that it may inspect and mutate any object
ADVANCED TOPICS 147
Creates clauses
By specifying an object in a @creates clause, you indicate that every member of
the object’s peer group was created during the execution of the method, and
furthermore, that the peer group is disjoint from that of any direct or indirect
representation object of any peer group mentioned in any of the method’s
@inspects or @mutates clauses.
/**
* @inspects | lists, ...lists
*/
static boolean anyIsEmpty(StringList[] lists) {
for (StringList list : lists)
if (list.getElements().length == 0)
return true;
return false;
}
class Rectangle {
/** @basic */
public int getWidth() { return width; }
/** @basic */
public int getHeight() { return height; }
/**
* @mutates_properties | getWidth()
* @post | getWidth() == newWidth
*/
148 HOW TO DOCUMENT MULTI-OBJECT ABSTRACTIONS
/**
* @inspects rectangles
* @mutates_properties | (...rectangles).getWidth()
* @post | Arrays.stream(rectangles).allMatch(r -> r.getWidth() == newWidth)
*/
public static void allSetWidth(Rectangle[] rectangles, int newWidth) {
for (Rectangle rectangle : rectangles)
rectangle.setWidth(newWidth);
}
}
Iterators
The problem
We illustrate the problem by means of the example of two data structures for
storing a collection of objects, and a client that iterates over both of them:
public class ArrayList {
149
150 ITERATORS
System.out.println(node.value);
}
Notice also that the client program has to duplicate the logic of printing all
elements for the two data structures. The question we consider in this text is:
how can we introduce an API that generalizes the two data structures, and
allows the client to write the logic for printing all elements only once and apply
it unchanged to both data structures, without hurting performance?
(We could introduce a method into both data structures that returns the elements
as an array; the client could call it and then iterate over the array to print all
elements. However, this would increase the memory usage of the client: it
would need an amount of extra memory linear in the number of elements in the
collection.)
To see the solution, carefully consider both printAll methods and notice that they
follow a similar pattern: they use some piece of data to track where they are in
the data structure; they perform a test on the piece of data to see whether they
have reached the end of the data structure; they retrieve the current element
pointed to by the piece of data; and they update the piece of data to point to
the next element.
ITERATORS 151
Iterators
The solution, then, is to encapsulate this piece of data into an object, which
we will call an iterator, and provide methods to allow clients to test whether
the iterator has reached the end of the data structure, to retrieve the current
element, and to mutate the iterator so that it points to the next element.
Such an iterator API can take a few different forms, depending on whether and
how some of these functionalities are combined into a single method. Whereas
most programming languages define some kind of standard iterator API, they
do so in different ways. However, in the end all of these forms are essentially
equivalent.
We show the styles used by some of the most popular programming languages
(translated into Java syntax) below:
// Java
public interface Iterator {
boolean hasNext();
/**
* Mutates the iterator to point to the next element and returns the current
* element.
*/
Object next();
}
// C#
public interface Enumerator {
Object getCurrent();
/**
* Mutates the enumerator to point to the next element, or returns `false` if
* the end has been reached.
*/
boolean moveNext();
}
// C++
public interface Iterator {
Object getCurrent(); // syntax in C++: *iterator
void moveNext(); // syntax in C++: iterator++
// in C++, to tell whether you have reached the end of the data structure, you
// have to test equality with a special "one-past-the-end" iterator
boolean equals(Iterator other);
}
// Python
public interface Iterator {
/**
* Throws a StopIteration exception if the end of the data structure has been
* reached.
*/
Object next();
}
// JavaScript
public interface Iterator {
152 ITERATORS
This way, the client can now reuse the logic for printing all elements:
public class ClientProgram {
Iterables
We can perform a further step of generalization, and further simplify the client
code, by introducing an interface to be implemented by any collection that
supports iteration:
public interface Iterable {
/** Returns a new iterator that points to the start of the data structure. */
Iterator iterator();
}
We can easily update classes ArrayList and LinkedList to implement this interface.
For example, we can update class ArrayList as follows:
public class ArrayList implements Iterable {
}
154 ITERATORS
Notice that the client no longer even needs to know that it is dealing with an
array-based list and a linked list.
Inner classes
We can simplify the implementation of class ArrayList by turning class IteratorImpl
into a nonstatic nested class, more commonly referred to as an inner class. In
the same way that a nonstatic method takes an implicit this argument that
points to an instance of the enclosing class, an inner class has an implicit
EnclosingClass.this field that points to an instance of the enclosing class. Fur-
thermore, each constructor takes an implicit EnclosingClass.this argument and
uses it to initialize the implicit field:
public class ArrayList implements Iterable {
APPLYING NESTED CLASSES 155
private IteratorImpl() {}
In fact, inside an inner class, we can refer to members of the enclosing class using
their simple names; in case of nonstatic members, they are implicitly prefixed by
EnclosingClass.this. This means we can further simplify the implementation of
class ArrayList as follows:
Local classes
In fact, since class IteratorImpl is only referred to inside method iterator(), it is
cleaner to define it as a local class inside method iterator():
public class ArrayList implements Iterable {
Note: local classes are not visible outside the enclosing method, so there would
be no point in adding the private keyword, and in fact this is not allowed.
As we will see later, an additional advantage of local classes is that they can
(under certain conditions) refer to the parameters and local variables of the
enclosing method.
Anonymous classes
We can apply one more simplification: since in our example class IteratorImpl is
referred to in only one place, to create an instance of it, we can replace it by an
anonymous class. This relieves us from the need to invent a name for it:
public class ArrayList implements Iterable {
};
As you can see, an anonymous class instance creation expression consists of:
• the keyword new
• the name of a class or interface to be implemented by the anonymous class
• a constructor argument list
• a class body
});
}
Notice that the client program implements interface Consumer with an anonymous
class whose accept method prints the element to the screen.
Lambda expressions
We can write this client program even more concisely using a lambda expression:
public class ClientProgram {
INTERNAL ITERATORS: FOREACH METHODS 159
}
160 ITERATORS
The Consumer object that results from evaluating the lambda expression has an
implicit field that holds a copy of parameter condition.
Java allows only effectively final variables to be captured this way. A variable is
effectively final if it is not mutated after initialization.
Generics
Basic concept
In this note, we introduce Java’s support for generic classes, interfaces, and
methods. We motivate and illustrate the concepts by means of the example task
of implementing a class University that has the following methods:
class University {
/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() { ... }
/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() { ... }
161
162 GENERICS
staff members. A data structure for storing a set of students could look like this:
class LinkedListOfStudent implements IterableOfStudent {
Node firstNode;
IteratorOfStudent iterator();
interface IteratorOfStudent {
boolean hasNext();
Student next();
}
BASIC CONCEPT 163
/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() {
int result = 0;
for (IteratorOfStudent iterator = students.iterator();
iterator.hasNext(); ) {
Student student = iterator.next();
if (student.nbCredits >= 120)
result++;
}
return result;
}
/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() {
int nbStaff = 0;
int totalNbPubs = 0;
for (IteratorOfStaff iterator = staffMembers.iterator();
iterator.hasNext(); ) {
Staff staff = iterator.next();
nbStaff++;
totalNbPubs += staff.nbPubs;
}
return totalNbPubs / nbStaff;
}
type.
interface Iterable {
Iterator iterator();
interface Iterator {
boolean hasNext();
Object next();
Node firstNode;
We can use class LinkedList to implement class University. Note, however, that
we do have to insert a typecast in methods getNbFinishers() and getAvgNbPubs():
class University {
/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() {
int result = 0;
for (Iterator iterator = students.iterator(); iterator.hasNext(); ) {
Student student = (Student)iterator.next(); // Typecast!
if (student.nbCredits >= 120)
result++;
}
return result;
}
/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() {
int nbStaff = 0;
int totalNbPubs = 0;
for (Iterator iterator = staffMembers.iterator(); iterator.hasNext(); ) {
Staff staff = (Staff)iterator.next(); // Typecast!
nbStaff++;
totalNbPubs += staff.nbPubs;
}
return totalNbPubs / nbStaff;
}
}
166 GENERICS
Note also that with this approach, we lose much of the benefit of Java’s static
type checker: many programming errors that would be caught by the static type
checker before we run the program when using Approach 1 are only detected, if
at all, during execution of the program when using Approach 2. This includes
the following errors:
• Adding the student to staffMembers instead of students in method addStudent.
This would lead to a ClassCastException in getAvgNbPubs() if and when that
method gets called.
• Calling contains on staffMembers instead of students in method hasStudent.
This would not cause an exception; instead, it would silently produce wrong
results.
• Iterating over staffMembers instead of students in method getNbFinishers. This
would lead to a ClassCastException in getNbFinishers(), except if staffMembers
is empty. In the latter case, it would silently produce wrong results.
• The corresponding errors in addStaff, hasStaff, and getAvgNbPubs.
Aproach 3: Generics
We can achieve reuse without sacrificing static type checking by defining types
Iterator, Iterable, and LinkedList as generic types with a type parameter T:
interface Iterable<T> {
Iterator<T> iterator();
interface Iterator<T> {
boolean hasNext();
T next();
Node<T> firstNode;
return true;
return false;
}
class University {
/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() {
int result = 0;
for (Iterator<Student> iterator = students.iterator();
iterator.hasNext(); ) {
Student student = iterator.next();
if (student.nbCredits >= 120)
result++;
}
return result;
}
return staffMembers.contains(staff);
}
/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() {
int nbStaff = 0;
int totalNbPubs = 0;
for (Iterator<Staff> iterator = staffMembers.iterator();
iterator.hasNext(); ) {
Staff staff = iterator.next();
nbStaff++;
totalNbPubs += staff.nbPubs;
}
return totalNbPubs / nbStaff;
}
Note: if the type arguments for an instance creation expression can be de-
rived from the context, they can be omitted: in the example above, instead of
new LinkedList<Student>(), we can simply write new LinkedList<>(); this is known
as diamond notation.
/**
* Returns a negative number, zero, or a positive number if this object
* compares as less than, equal to, or greater than {@code other}.
*/
int compareTo(T other);
We can let the static type checker enforce this by declaring Comparable<T> as an
upper bound of the type parameter of SortedLinkedList:
class SortedLinkedList<T extends Comparable<T>> extends LinkedList<T> {
@Override
void addFirst(T element) {
Node<T> sentinel = new Node<T>(null, firstNode);
Node<T> node = sentinel;
INVARIANCE 169
Notice that thanks to the upper bound, the static type checker allows us to call
the compareTo method on expressions of type T.
We can update class University to use class SortedLinkedList by updating the field
declarations as follows:
private LinkedList<Student> = new SortedLinkedList<>();
private LinkedList<Staff> = new SortedLinkedList<>();
The static type checker will allow this only after we extend classes Student and
Staff to implement interface Comparable:
int nbCredits;
int nbPubs;
(The term "upper bound" refers to the image of superclasses being "above"
subclasses.)
Invariance
Suppose, now, that we want to extend class University with a method getMembers()
that returns a collection containing all members of the university. First, we
introduce class Member as a superclass of Student and Staff:
class Member {}
class University {
// ...
LinkedList<Member> getMembers() {
LinkedList<Member> members = new LinkedList<>();
members.addAll(students);
staffMembers.copyInto(members);
return members;
}
We make a first attempt at defining methods addAll and copyInto in class LinkedList
as follows:
class LinkedList<T> implements Iterable<T> {
// ...
• Secondly, the static type checker also complains about the call of copyInto
in method getMembers(): its argument is of type LinkedList<Member>, which
is not assignable to parameter other of type LinkedList<Staff>. Indeed,
LinkedList<Member> is not a subtype of LinkedList<Staff>, even though Member
is a supertype of Staff. In other words, LinkedList<T> is not contravariant
in T.
The reason for this is obvious: a LinkedList<Staff> must contain only Staff
objects, whereas a LinkedList<Member> object may additionally contain other
Member objects, such as Student objects.
In summary, generic types are neither covariant nor contravariant in their type
parameter. In other words, they are invariant.
Wildcards
Upper-bounded wildcards
Even though LinkedList<Student> is not a subtype of LinkedList<Member>, it is in
fact safe for the call of members.addAll to pass students as an argument: indeed,
addAll only retrieves elements from its argument; it does not add new elements
to it. For that reason, it is safe for addAll to take as an argument a LinkedList<U>
object, for any subtype U of T. We can express this by using an upper-bounded
wildcard:
void addAll(LinkedList<? extends T> other) {
for (Iterator<? extends T> iterator = other.iterator(); iterator.hasNext(); )
addFirst(iterator.next());
}
Wildcard type LinkedList<? extends T> generalizes over all types LinkedList<U>, for
all subtypes U of T, as well as LinkedList<T> itself. It could be proncounced "linked
list of some type that extends T".
Lower-bounded wildcards
Even though LinkedList<Member> is not a subtype of LinkedList<Staff>, it is in
fact safe for the call of staffMembers.copyInto to pass members as an argument:
indeed, copyInto only puts elements into its argument; it does not retrieve any
elements from it. For that reason, it is safe for copyInto to take as an argument
a LinkedList<U> object, for any supertype U of T. We can express this by using a
lower-bounded wildcard:
void copyInto(LinkedList<? super T> other) {
for (Iterator<T> iterator = this.iterator(); iterator.hasNext(); )
other.addFirst(iterator.next());
}
172 GENERICS
Wildcard type LinkedList<? super T> generalizes over all types LinkedList<U>, for
all supertypes U of T, as well as LinkedList<T> itself. It could be pronounced "linked
list of some supertype of T".
Generic methods
Suppose we add the following static method to class LinkedList:
Notice that we can specify the type argument for the type parameter of the
method being called by putting it between angle brackets in front of the method
name.
In most cases, however, the type argument can be inferred from the argument
types or from the context of the call; in these cases, we can omit the type
argument:
LinkedList<Member> getMembers() {
LinkedList<Member> members = new LinkedList<>();
LinkedList.copyInto(students, members);
LinkedList.copyInto(staffMembers, members);
return members;
}
We can also use a method-level type parameter to link the return type to a
parameter type:
LIMITATIONS 173
Limitations
After the static type checker finishes checking a Java program, but before it is
executed, all generics are erased from it. That is, in generic type declarations,
each type parameter is replaced by its upper bound (or Object if it has no explicit
upper bound), and type arguments are simply removed. Typecasts are inserted
as necessary to preserve well-typedness. For example, after erasing the example
program from Approach 3, we obtain the example program from Approach 2.
This involves inserting typecasts that cast the result of iterator.next() to the
expected type.
This approach, called erasure, has the following implications:
• Type arguments must be subtypes of Object; using a primitive type (like
int) as a type argument is not allowed. To store primitive values in a
generic collection, you must first box them. For example, ArrayList<int> is
not allowed, but ArrayList<Integer> is allowed. Java will convert int values
to Integer objects and vice versa automatically. (However, the memory
allocation involved in autoboxing may have a significant performance
impact.)
• Since type arguments are not available at run time, they cannot be used
in ways that affect the run-time type of an object or otherwise affect the
program’s run-time behavior. For example, if T is a type parameter, the
expressions new T(), new T[], T.class, or ... instanceof T are not allowed.
However, new LinkedList<T>() is allowed, since the type argument <T> is
removed during erasure anyway.
• Casts to types that are not run-time types (i.e. types that are different from
their erasure, such as T (whose erasure is Object) or LinkedList<T> (whose
erasure is LinkedList)) are allowed, but it may not be possible to fully check
them at run time. In those cases, the static type checker generates an
unchecked cast warning. For example, suppose object is a variable of type
Object. The cast (LinkedList<Student>)object generates an unchecked cast
warning, since it cannot be determined at run time whether the LinkedList
object was created as a LinkedList<Student>, a LinkedList<Staff>, or with yet
some other type argument. If object points to a LinkedList object that
contains a Staff object, then ((LinkedList<Student>)object).iterator().next()
will throw a ClassCastException when the next() call returns, because this
expression’s erasure is (Student)((LinkedList)object).iterator().next() and
the return value of the next() call is a Staff object.
174 GENERICS