[go: up one dir, main page]

0% found this document useful (1 vote)
322 views180 pages

Course Notes

This document discusses object-oriented programming concepts like single-object abstractions, inheritance, and multi-object abstractions. It also covers the JLearner programming language, principles of Java programming, and modular programming techniques. The document contains chapters on managing complexity through modularity and abstraction, representation objects, inheritance, and modular reasoning about dynamic binding.

Uploaded by

Thomas Meylaers
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (1 vote)
322 views180 pages

Course Notes

This document discusses object-oriented programming concepts like single-object abstractions, inheritance, and multi-object abstractions. It also covers the JLearner programming language, principles of Java programming, and modular programming techniques. The document contains chapters on managing complexity through modularity and abstraction, representation objects, inheritance, and modular reasoning about dynamic binding.

Uploaded by

Thomas Meylaers
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 180

Object-Oriented Programming

Prof. Dr. Bart Jacobs

Department of Computer Science, KU Leuven

December 15, 2020


2
Contents

Object-Oriented Programming 1
Single-object abstractions . . . . . . . . . . . . . . . . . . . . . . . . . 1
Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Multi-object abstractions (= entity-relationship abstractions) . . . . . 2
Advanced topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

Introduction: Topic of the course 5

Principles of programming in Java 7

The JLearner Programming Language 9


Values and Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
The Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
The Method Activation Stack . . . . . . . . . . . . . . . . . . . . . . . 11
Variables and Variable Declarations . . . . . . . . . . . . . . . . . . . 13
Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Literal expressions . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Operator expressions . . . . . . . . . . . . . . . . . . . . . . . . . 15
Other expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Precedence and associativity . . . . . . . . . . . . . . . . . . . . . 18
Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

First Steps in Modular Programming (Part I) 21


Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Installing Eclipse and FSC4J . . . . . . . . . . . . . . . . . . . . . . . 22
Building and running our first program in Eclipse . . . . . . . . . . . . 22
The problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Encapsulating the fields of class Interval . . . . . . . . . . . . . . . . . 25
Moving the methods inside class Interval . . . . . . . . . . . . . . . . . 28
Enforcing encapsulation: accessibility modifiers . . . . . . . . . . . . . 29
Instance methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

i
ii CONTENTS

First Steps in Modular Programming (Part II) 33


Constructors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Overloading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
API Semantics: Documentation . . . . . . . . . . . . . . . . . . . . . . 35

Managing Complexity: Modularity & Abstraction 41


Procedural abstractions . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Modular software development . . . . . . . . . . . . . . . . . . . . . . 44
Data abstractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Mutable versus immutable data abstractions . . . . . . . . . . . . 51

Representation Objects and Representation Exposure 55


Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Representation Exposure . . . . . . . . . . . . . . . . . . . . . . . . . 58
Representation Exposure without Leaking . . . . . . . . . . . . . . . . 60
Representation Exposure Breaks Consistency . . . . . . . . . . . . . . 61
Modular Reasoning: @mutates . . . . . . . . . . . . . . . . . . . . . . 63
ByteBuffer: first attempt (INCORRECT) . . . . . . . . . . . . . 63
ByteBuffer (opaque) . . . . . . . . . . . . . . . . . . . . . . . . . 65
ByteBuffer (transparent) . . . . . . . . . . . . . . . . . . . . . . . 67
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

How to document single-object abstractions 69


Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Data abstraction definition and implementation . . . . . . . . . . 69
Abstract state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Valid abstract states . . . . . . . . . . . . . . . . . . . . . . . . . 71
Representation, concrete state . . . . . . . . . . . . . . . . . . . . 72
Valid concrete states . . . . . . . . . . . . . . . . . . . . . . . . . 72
Relationship between concrete states and abstract states . . . . . 72
Class documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Fields documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Constructors and methods documentation . . . . . . . . . . . . . . . . 75
Defensive programming . . . . . . . . . . . . . . . . . . . . . . . 75
Contractual programming . . . . . . . . . . . . . . . . . . . . . . 75
Postconditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Expressions allowed inside Javadoc formal parts . . . . . . . . . . . . . 79
Advanced topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Mutates clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Inspects clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Defaults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Creates clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

Inheritance 83
Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Static type checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
CONTENTS iii

Typecasts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Dynamic binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Class Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Modular reasoning about dynamic binding 91


Modular reasoning about programs . . . . . . . . . . . . . . . . . . . . 91
Non-modular reasoning . . . . . . . . . . . . . . . . . . . . . . . 91
Modular reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Dynamic binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Modular reasoning about dynamic binding . . . . . . . . . . . . . . . . 95
Applying basic modular reasoning . . . . . . . . . . . . . . . . . 95
Modular reasoning about dynamic binding: basic principle . . . . 97
Derived principle: strengthening of specifications . . . . . . . . . 97
Derived principle: behavioral subtyping . . . . . . . . . . . . . . 98
Inheritance of specifications . . . . . . . . . . . . . . . . . . . . . . . . 99
Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

Interfaces 101

Implementation inheritance 107

Lists, sets, and maps 109


Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
The Java Collections Framework . . . . . . . . . . . . . . . . . . . . . 122
Other notable data structures . . . . . . . . . . . . . . . . . . . . 124

Entity-relationship abstractions 127


OOP Teams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Consistency of bidirectional associations . . . . . . . . . . . . . . . . . 131
Peer groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

Multi-class entity-relationship abstractions 133


Nesting class-level and package-level abstractions . . . . . . . . . . . . 140

How to document multi-object abstractions 145


Fields documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Advanced topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Mutates clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
@mutates_properties and @basic clauses . . . . . . . . . . . . . 146
Inspects clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Defaults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Creates clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Specifying collections of objects . . . . . . . . . . . . . . . . . . . 147

Iterators 149
iv CONTENTS

The problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149


Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Iterables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Applying nested classes . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Static nested classes . . . . . . . . . . . . . . . . . . . . . . . . . 154
Inner classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Local classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Anonymous classes . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Enhanced for loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Internal iterators: forEach methods . . . . . . . . . . . . . . . . . . . 157
Lambda expressions . . . . . . . . . . . . . . . . . . . . . . . . . 158
Capturing outer variables . . . . . . . . . . . . . . . . . . . . . . 159

Generics 161
Basic concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Approach 1: Duplicate collection classes . . . . . . . . . . . . . . 161
Approach 2: Using subtype polymorphism . . . . . . . . . . . . . 164
Aproach 3: Generics . . . . . . . . . . . . . . . . . . . . . . . . . 166
Bounded type parameters . . . . . . . . . . . . . . . . . . . . . . . . . 168
Invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Wildcards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Upper-bounded wildcards . . . . . . . . . . . . . . . . . . . . . . 171
Lower-bounded wildcards . . . . . . . . . . . . . . . . . . . . . . 171
Generic methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Object-Oriented
Programming

Single-object abstractions
• Introduction: Topic of the course
• Principles of programming in Java
– Concepts: values, variables, types, methods, parameters, arguments,
return values, classes, fields, objects, object creation, object references,
method activations and the call stack
• First steps in modular programming Part 1 Part 2
– Example: squareroot
– Example: max3
– Example: timeofday
– Concepts: Using Eclipse, creating JUnit test cases, creating classes,
instance methods, encapsulation, private versus public, using Git,
constructors
• Managing complexity through modularity and abstraction
– Concepts: modularity, abstraction, API, client module, importance
of documentation, information hiding, procedural abstraction, data
abstraction, immutable versus mutable abstractions, abstract value/s-
tate of an object, Java’s built-in datatypes and operators
• Representation objects and representation exposure
– Concepts: representation object, representation exposure
• How to properly document single-object abstractions
– Concepts: defensive programming, contractual programming, precon-
ditions, postconditions, class representation invariants (= private class
invariants), class abstract state invariants (= public class invariants),
getters, mutators

Inheritance
• Polymorphism and dynamic binding

1
2 OBJECT-ORIENTED PROGRAMMING

– Concepts: polymorphism, subclassing, inheritance, instanceof, the


static type checker, static/dynamic type of a variable or an expression,
typecasts; dynamic binding, method overriding, @Override; class Object,
Object.equals, Object.hashCode, Object.toString, Object.getClass
• Behavioral subtyping: modular reasoning about programs that use dynamic
binding
– Example: intlist_inheritance
– Concepts: Non-modular reasoning, modular reasoning, method spec-
ifications, correctness of methods; method call resolution, resolved
method vs called method, static versus dynamic method call binding;
strenghtening of specifications, behavioral types, behavioral subtyping
• Interfaces
– Concepts: interfaces, multiple inheritance, static fields, the Singleton
pattern
• Implementation inheritance
– Concepts: Inheritance of fields and methods, super constructor calls,
super method calls
• Lists, sets, and maps
– Concepts: the List, Set, and Map abstract datatypes (ADTs); the
ArrayList, LinkedList, HashSet, and HashMap data structures

Multi-object abstractions (= entity-relationship


abstractions)
• Single-class entity-relationship abstractions
– Example: html
– Concepts: entity graphs, multi-object abstractions, bidirectional asso-
ciations, consistency of bidirectional associations, peer objects, peer
groups
• Multi-class entity-relationship abstractions
– Concepts: packages, package-accessible fields/constructors/method-
s/classes, package representation invariants, package abstract state
invariants, HashSet
• How to properly document multi-object abstractions

Advanced topics
(Students of course H02C5A can ignore this material.)
• Iterators
– Concepts: (external) iterators, iterables, nested classes, inner classes,
local classes, anonymous classes, enhanced for loop, internal itera-
tors, lambda expressions, capturing outer variables, effectively final
variables
ADVANCED TOPICS 3

• Streams (on the web)


– Concepts: streams, sources, map, filter, reduce, collect, parallel
streams
• Generics
– Concepts: generic class, generic interface, type parameter, type argu-
ment, generic type instantiation, parameterized type, bounded type
parameter, covariance, contravariance, invariance, upper-bounded
wildcard, lower-bounded wildcard, generic method, erasure, unchecked
cast warning
4 OBJECT-ORIENTED PROGRAMMING
Introduction: Topic of the
course

The official title of the course is "Object-oriented programming". However, a


better title would be "Modular programming", because while you will in fact
learn the Java programming language and object-oriented programming, the
most important concept of the course is "Modular programming".
Modular programming is the principal method by which software engineers
manage the complexity of the task of building, maintaining, and evolving large
software systems. In this approach, a decomposition of the software system
into a number of modules is defined; for each module, a clear and abstract
specification is defined of the syntax and semantics of the program elements or
API it exports for use by other modules; and each module is implemented such
that its correctness (that is, its compliance with its specification) depends only
on the specifications, not the implementations, of the modules whose program
elements it uses (that is, whose program elements it imports).
This approach allows each module to be developed, understood, verified, and
evolved independently from, and in parallel with, other modules, and therefore
allows much larger and more complex software systems to be built, maintained
and evolved than would otherwise be possible.
The focus of this course will be on how to design and clearly define the syntax
and semantics of module APIs. Only if APIs are well-defined can the benefits of
modular programming be realized fully.
A major reason why we choose Java as the programming language for this course
is that it has good support for clearly defining the syntax of APIs. In particular,
it has strong static typing, and strong support for encapsulation: the compiler
and virtual machine (Java’s execution environment) together enforce that a
module is accessed only through its official API, and that the correct number
and types of arguments are passed to each API call.
Besides precisely defining the syntax of APIs, it is crucial to also precisely define
their semantics (meaning), in terms of the behavior generated by API calls.
Therefore, in this course, we put a strong emphasis on how to write clear and

5
6 INTRODUCTION: TOPIC OF THE COURSE

comprehensive documentation for APIs. For the sake of clarity and precision,
we will supplement informal documentation with formal specifications.
Principles of programming
in Java

We introduce the core concepts of the Java programming language: values,


variables, object references, arrays, classes, the call stack, and the heap, by
means of a series of simple example programs. We do so using the JLearner
environment for visualizing the execution of Java code snippets.
See The JLearner Programming Language for a detailed description of the syntax
and meaning of JLearner programs. We recommend that students study this
document to acquire a precise vocabulary for discussing and reasoning about
JLearner (and Java) programs and their execution.

7
8 PRINCIPLES OF PROGRAMMING IN JAVA
The JLearner Programming
Language

JLearner is a browser-hosted environment to be used by students of Java for ex-


perimenting with and stepping through simple Java programs. The programming
language supported by JLearner, called the JLearner programming language
or simply JLearner in this document, is a small subset of Java. It includes
enough features of Java to be able to serve as a vehicle for conveying the essential
principles of programming in Java, while remaining small enough to be able to
be presented fully to students taking a second programming course.

In this document, we define the syntax of JLearner programs and what it means
to execute them. We assume that the reader recognizes these concepts from
their introductory programming course; this document serves to further clarify
these concepts and to offer a reasonably precise vocabulary for discussing and
reasoning about JLearner (and Java) programs and their execution.

To further clarify the details of program execution, we recommend that students


step through example programs in the JLearner environment while studying this
document.

Contents:

• Values and Types


• Classes
• The Heap
• Methods
• The Method Activation Stack
• Variables and Variable Declarations
• Expressions
• Statements
• Further Reading

9
10 THE JLEARNER PROGRAMMING LANGUAGE

Values and Types


The result of evaluating a JLearner expression is a JLearner value. The JLearner
values are:
• the Boolean values, true and false
• the integer values, the integers between -2147483648 and 2147483647
• the null reference, null
• the object references, which uniquely identify an object in the heap
An object is either a class instance or an array.
The JLearner types are:
• type boolean
• type int
• for every class named ClassName declared by the JLearner program, the
class type ClassName
• for every type T, the array type T[]. The type T is called the element type
of array type T[].
The values of type boolean are the Boolean values.
The values of type int are the integer values.
The values of a class type C are the null reference and the object references that
refer to an instance of class C.
The values of an array type T[] are the null reference and the object references
that refer to an array with element type T.
The class types and the array types are collectively called the reference types.
Notice that the null reference is a value of every reference type.

Classes
JLearner class declarations are of the form class ClassName { FieldDeclarations },
where FieldDeclarations is a sequence of zero or more field declarations. A JLearner
field declaration is of the form Type FieldName;.

The Heap
The JLearner heap at each point during the execution of a program contains the
objects created by the program so far. For each instance O of a class C created
by the program and for each field F of C, it stores the value of O.F. (We also
say that it binds O.F to some value.) For each array A created by the program,
it stores the element type and the length N of the array, and for each of the
components of the array (identified by an index I between 0, inclusive, and N,
exclusive), it stores the value of A[I]. (We also say that it binds A[I] to some
METHODS 11

value.) We often refer to the values stored by the components of an array as the
array’s elements.
For example, consider the following snapshot from an execution of an example
program in the JLearner environment:

In this snapshot (showing a particular program execution state), the heap contains
four objects: one array, with element type Person and length 3, and three instances
of class Person. The first component of the array stores a reference to a Person
instance whose age field currently stores the value 11. The second component
stores a reference to another Person instance, whose age field currently stores the
value 21.
The JLearner heap is always well-typed:
• for any instance O of a class C and for any field F of C, the value of O.F
is a value of the declared type of F
• for any array A with element type T, the value of any component A[I] of
A is a value of type T

Methods
The JLearner methods are of the form ReturnType MethodName(ParameterDeclarations) { Statements }
where ReturnType is either void or a type, and ParameterDeclarations is a comma-
separated sequence of zero or more parameter declarations of the form
Type ParameterName.

The Method Activation Stack


Evaluation of JLearner expressions and execution of JLearner statements occurs
in the context of a variable environment that binds the method parameters and
12 THE JLEARNER PROGRAMMING LANGUAGE

local variables that are currently in scope to a JLearner value. The variable
environment is used to evaluate variable expressions.
A method call suspends the current method activation and starts a new method
activation. It pushes a corresponding activation record onto the method activation
stack, also known as the call stack. The call stack contains an activation record
for each method activation that is in progress. (At any point during the execution
of a program, only one method activation, the one at the top of the activation
stack, is active; the other method activations are suspended.)
For each activation, the corresponding activation record stores the current
variable environment and the program counter (also known as the instruction
pointer), which indicates which part of the body of the method being executed
will be executed next. (JLearner shows the instruction pointer of the active
activation by highlighting the program element in green; it does not show the
instruction pointers of the suspended activations.)
When a method activation is finished, the corresponding activation record is
removed (or popped) from the activation stack, and execution of the caller is
resumed.

Consider again the same execution snapshot shown earlier (repeated here). The
method activation stack currently contains three activation records. The record
at the top of the stack (shown at the bottom in the picture; it is customary
in computer science that stacks are depicted upside-down) corresponds to an
activation of method incrementAge. This is the method that is currently active.
When this activation is finished, the suspended activation of method incrementAges
is resumed. The activation record for incrementAge binds method parameter person
to a reference to the Person object with identification number 12. The activation
record for incrementAges binds method parameter persons to a reference to the
array object, and local variable i to the integer value 2.
The method activation stack is always well-typed: if an activation record binds
a method parameter or local variable of declared type T to a value V, then value
VARIABLES AND VARIABLE DECLARATIONS 13

V is of type T.

Variables and Variable Declarations

JLearner has four kinds of variables: object fields, array components, method
parameters, and local variables. Variables are created at some point during a
program execution and, from that point on, store a value. They can be mutated
to store a different value by means of assignments.

Variable declarations, on the other hand, are parts of the program text and exist
independently of any execution of the program. JLearner has three kinds of
variable declarations: field declarations, method parameter declarations, and
local variable declarations.

At any point during an execution of a program, a single variable declaration may


correspond to zero, one, or many variables:

• for each field declaration of a class C, there is one object field in the heap
per instance of class C
• for each method parameter declaration of a method M, there is one method
parameter on the method activation stack for each currently active or
suspended activation of method M
• similarly, for each local variable declaration in a method M, there is one
local variable on the method activation stack for each currently active or
suspended activation of method M whose instruction pointer is within the
scope of the local variable declaration

For array components, there is no corresponding declaration.

Consider the following snapshot from an execution in the JLearner environment


of an example program that uses a recursive method sum to compute the sum of
the values stored by a linked list:
14 THE JLEARNER PROGRAMMING LANGUAGE

• for the single field declaration int value of class Node, there are three corre-
sponding object fields, one in each of the three instances of Node
• for the single method parameter declaration Node node of method sum, there
are three corresponding method parameters, one in each of the three
currently active or suspended activations of method sum
• for the single local variable declaration int value in method sum, there
are two corresponding local variables, one in each of the two currently
suspended activations of method sum. (There is no corresponding local
variable in the active activation of method sum because its instruction
pointer is not in the scope of the local variable declaration.)
• for the method parameter declaration int x of method foo, there are no
corresponding method parameters on the method activation stack, because
there is no current activation of method foo.

Expressions
An expression is a part of a program that can be evaluated in a given variable
environment to produce a result value. Evaluation of an expression may also
produce side effects. In JLearner, the side effects are the object creations and the
variable mutations. That is, in addition to producing a result value, evaluation
of an expression may create objects, bind different values to object fields and
array components in the heap, and bind different values to method parameters
and local variables in the active method activation record.
Some expressions can be evaluated to a variable: field selection expressions can
be evaluated to an object field, array component selection expressions can be
evaluated to an array component, and variable expressions can be evaluated to a
method parameter or local variable. These are the expressions that can be used
as the left-hand side of an assignment expression. To evaluate these expressions
EXPRESSIONS 15

to produce a value, they are first evaluated to a variable, and then the variable
is inspected to obtain its current value.
The JLearner expressions are:
• the literal expressions, e.g. true, 42, null
• the operator expressions, e.g. 3 + 5 or x = y * 3
• the variable expressions, e.g. x or myVariable
• the object creation expressions, e.g. new Rectangle() or new int[7]
• the field selection expressions, e.g. myRectangle.width
• the array component selection expressions, e.g. myArray[5]
• the method call expressions, e.g. myMethod(42, false)
• the parenthesized expressions, e.g. (7 - 1)

Literal expressions
The literal expressions are:
• the Boolean literal expressions, true and false
• the integer literal expressions, e.g. 10, 42, 739
• the null literal expression, null
Evaluation of a literal expression has no side-effects and produces as a result
value the value denoted by the expression.

Operator expressions
The operator expressions are:
• the unary operator expressions, e.g. -3, ++x, y--, -myMethod(77)
• the binary operator expressions, e.g. myVariable / myMethod(10),
myVariable = 33

The unary operator expressions are:


• the negation expressions, of the form -Expression, i.e. a negation sign
followed by an expression.
• the pre-increment expressions, of the form ++Expression
• the pre-decrement expressions, of the form --Expression
• the post-increment expressions, of the form Expression++
• the post-decrement expressions, of the form Expression--
The subexpression of a unary expression, e.g. the expression E in negation
expression -E, is called the operand expression of the unary expression.
Evaluation of a negation expression first evaluates its operand expression. If this
yields a value V, evaluation of the negation expression completes with result -V.
Evaluation of a pre-increment expression first evaluates its operand expression
to a variable Var. Then, it looks up Var’s current value V. Then, it mutates Var
to store value V + 1. Then, evaluation completes with result value V + 1.
16 THE JLEARNER PROGRAMMING LANGUAGE

In contrast, evaluation of a post-increment expression first evaluates its operand


expression to a variable Var. Then, it looks up Var’s current value V. Then, it
mutates Var to store value V + 1. Then, evaluation completes with result value
V (not V + 1).
Evaluation of a pre-decrement or post-decrement expression proceeds analogously.
The binary operator expressions are:
• the addition expressions, of the form Expression + Expression
• the subtraction expressions, of the form Expression - Expression
• the multiplication expressions, of the form Expression * Expression
• the division expressions, of the form Expression / Expression
• the remainder expressions, of the form Expression % Expression
• the equality expressions, of the form Expression == Expression
• the inequality expressions, of the form Expression != Expression
• the less-than expressions, of the form Expression < Expression
• the less-than-or-equals expressions, of the form Expression <= Expression
• the greater-than expressions, of the form Expression > Expression
• the greater-than-or-equals expressions, of the form Expression >= Expression
• the conjunction expressions, of the form Expression && Expression
• the disjunction expressions, of the form Expression || Expression
• the assignment expressions, of the form Expression = Expression
• the compound assignment expressions, of the form Expression op= Expression,
where op is one of +, -, *, /, %. For example: x += 3 or y *= 5
The subexpressions of a binary operator expression, e.g. the expressions E1
and E2 in addition expression E1 + E2, are called the left operand expression (or
left-hand side, abbreviated LHS) and the right operand expression (or right-hand
side, abbreviated RHS) of the binary operator expression, respectively.
Evaluation of a binary operator expression E1 op E2, where op is +, -, *, /, %, ==,
!=, <, <=, >, or >=, first evaluates E1 to a value V1, then evaluates E2 to a value
V2, and then completes with the result value obtained by applying operator op
to V1 and V2.
Evaluation of a conjunction expression E1 && E2 first evaluates E1 to a Boolean
value V1. If V1 is false, evaluation completes with result value false. Otherwise,
E2 is evaluated to a value V2 and evaluation completes with result value V2.
Analogously, evaluation of a disjunction expression E1 || E2 first evaluates E1 to
a Boolean value V1. If V1 is true, evaluation completes with result value true.
Otherwise, E2 is evaluated to a value V2 and evaluation completes with result
value V2.
Evaluation of an assignment expression E1 = E2 first evaluates E1 to a variable
Var, then evaluates E2 to a value V, then mutates Var to store V, and then
completes with result value V.
Evaluation of a compound assignment expression E1 op= E2 first evaluates E1 to
EXPRESSIONS 17

a variable Var, then looks up the value V1 of Var, then evaluates E2 to value
V2, then computes the result V of applying op to V1 and V2, then mutates Var
to store V, and then completes with value V.

Other expressions
The variable expressions are the method parameter names and the local variable
names.

Evaluation of a variable expression to a variable yields the variable of the given


name in the active activation record’s variable environment.

The object creation expressions are:

• the class instance creation expressions, of the form new ClassName()


• the array creation expressions, of the form new Type[Expression]

Evaluation of a class instance creation expression new C() creates a new instance
O of class C in the heap, then for each field F declared by C initializes object
field O.F with the default value of the declared type of F, and then completes
with a reference to O as the result value. The default value of type boolean is
false, the default value of type int is 0, and the default value of a reference type
is null.

Evaluation of an array creation expression new T[E] first evaluates E to a value V


(which must be a nonnegative integer), then creates a new array A with element
type T and length V in the heap, then initializes each component of the array
with the default value of type T, and then completes with a reference to A as
the result value.

The field selection expressions are the expressions of the form Expression.FieldName.

Evaluation of a field selection expression E.F to a variable first evaluates the


target expression E to a value V (which must be a reference O to an instance of
a class C that declares a field named F), and then completes with variable O.F
as its result.

However, an expression E.length, where E is of an array type, is an array length


expression. It is evaluated by first evaluating E to a value V, which must be a
reference to an array A. It then completes with result value N, where N is the
length of A.

The array component selection expressions are the expressions of the form
Expression[Expression].

Evaluation of an array component selection expression E1[E2] first evaluates the


target expression E1 to a value V1 (which must be a reference to an array A),
then evaluates the index expression E2 to a value V2 (which must be a valid
index I for array A), and then completes with variable A[I] as its result.
18 THE JLEARNER PROGRAMMING LANGUAGE

The method call expressions are the expressions of the form MethodName(Expressions),
where Expressions (known as the argument expressions of the method call
expression) is a comma-separated sequence of zero or more expressions.
Evaluation of a method call expression M(Expressions) first evaluates the argument
expressions Expressions, from left to right, to obtain argument values Args. It
must be the case that the number of arguments equals the number of declared
parameters of method M. Then, a new activation record is pushed onto the
method activation stack, and for each method parameter declaration P of
method M, a method parameter P is added to the new activation record’s
variable environment, initialized with the corresponding argument value. The
new activation record’s instruction pointer is initialized to point to the first
statement of the body of method M. Program execution then proceeds by
executing the body of M. When the new activation finishes with a result value
V, its activation record is removed from the activation stack and evaluation of
the method call expression completes with value V.
The parenthesized expressions are the expressions of the form (Expression).

Evaluation of a parenthesized expression (E) first evaluates E to a value V, and


then completes with value V.

Precedence and associativity


The expression 1 + 2 * 3 is an addition expression whose right operand is a
multiplication expression; it is not a multiplication expression whose left operand
is an addition expression. This is because the multiplication operator has higher
precedence than the addition operator.
Furthermore, 1 - 2 - 3 is a subtraction expression whose left operand is a sub-
traction expression; it is not a subtraction expression whose right operand is a
subtraction expression. This is because the negation operator is left-associative.
The following table lists the operators in order of decreasing precedence.

Operators Associativity
Unary (-, ++, --)
* / % Left
+ - Left
== != < <= > >=
&&
||
= op= Right

Statements
The JLearner statements are:
STATEMENTS 19

• the local variable declaration statements, of the form Type VariableName = Expression;
or Type[] VariableName = ArrayInitializer;, where ArrayInitializer is of the
form { Expressions }, where Expressions is a comma-separated sequence of
zero or more expressions
• the expression statements, of the form Expression;
• the if statements, of the form if (Expression) Statement or if (Expression) Statement else Statement
• the while statements, of the form while (Expression) Statement
• the for statements, of the form for (Initialization; Condition; Update) Statement,
where Initialization is either nothing or of the form Type VariableName = Expression
or of the form VariableName = Expression, Condition is either nothing or an
expression, and Update is either nothing or an expression
• the block statements, of the form { Statements }, where Statements is a
sequence of zero or more statements
• the return statements, of the form return; or return Expression;
• the assert statements, of the form assert Expression;
Execution of a local variable declaration statement T X = E; first evaluates initial-
izer E to a value V (which must be of type T), and then adds a new local variable
named X to the active activation record’s variable environment, initialized with
V.
Execution of a local variable declaration statement T[] X = {Expressions}; first
evaluates Expressions, from left to right, to a sequence of values Vs, then creates a
new array A in the heap with element type T and length N equal to the number
of values, then initializes each component of A with the corresponding value
from Vs, and then adds a new local variable named X to the active activation
record’s variable environment, initialized with a reference to A.
Execution of an expression statement E; evaluates E.
Execution of an if statement if (E) S first evaluates E to a Boolean value V. If
V is true, it then executes S.
Execution of an if statement if (E) S1 else S2 first evaluates E to a Boolean
value V. If V is true, it then executes S1; otherwise, it then executes S2.
Execution of a while statement while (E) S first evaluates E to a Boolean value
V. If V is true, it then executes S and while (E) S, in that order.

Execution of a for statement for (Init; Cond; Update) S is equivalent to execution


of a block { Init; while (Cond) { S Update; } }.

Execution of a block statement { Statements } executes the given statements,


from left to right, and then removes all local variables from the active activation
record’s variable environment that were added since the start of the execution
of the block.
Execution of a return statement return; causes the active method activation to
finish without a result value. Execution of a return statement return E; first
20 THE JLEARNER PROGRAMMING LANGUAGE

evaluates E to a value V and then causes the active method activation to finish
with result value V.
Execution of an assert statement assert E; evaluates E to a value, which must
be the value true.

Further reading
The syntax and meaning of the full Java programming language is specified in
the Java Language Specification.
First Steps in Modular
Programming (Part I)

Contents
• Installing Eclipse and FSC4J
• Building and running our first program in Eclipse
• The problem
• Encapsulating the fields of class Interval
• Moving the methods inside class Interval
• Enforcing encapsulation: accessibility modifiers
• Instance methods
In the previous lecture, we used the JLearner tool to make acquaintance with
the Java programming language. We used variables, arrays, and class objects to
store and process data. From this perspective, Java is not much different from
Python. The main difference is that Java is statically typed, which means that
we need to specify the type of each local variable, method parameter, method
result, array element, and class field in our program. The types we have seen are
int, the type of integers, int[], the type of arrays of integers, int[][], the type
of arrays of arrays of integers, and the various class types corresponding to the
classes we declared, such as Node and List.
In this lecture, we will move on to modular programming. In modular program-
ming, we split our program into modules, with the goal of being able to build,
understand, verify, and evolve each module separately and in parallel with the
other modules of the program. To achieve this, it is necessary that we clearly
specify the API between a module and its clients (the modules that use the
module).
To illustrate this, let’s write a program that stores and manipulates intervals of
integers. An interval is defined by a lower bound and an upper bound, so we
declare a class Interval, with fields lowerBound and upperBound:
class Interval {
int lowerBound;
int upperBound;

21
22 FIRST STEPS IN MODULAR PROGRAMMING (PART I)

Here is a program that manipulates an interval:


Interval interval = new Interval();
interval.lowerBound = 3;
interval.upperBound = 7;

int width = interval.upperBound - interval.lowerBound;


assert width == 4;

The assert statement checks if a given expression evaluates to true. If not, it


reports an error.
We can run this program directly in JLearner, but since JLearner does not
support Java’s modularity features, at this point, we will switch to a more
complex but more powerful tool: the Eclipse IDE (Integrated Development
Environment).

Installing Eclipse and FSC4J


We recommend that you use the Eclipse Installer to install the latest Eclipse
IDE for Java Developers; it will also install a matching Java Development Kit
(JDK) if one is not yet present on your system.
Once you have installed Eclipse, we recommend that you install the Formal
Specifications Checker for Java (FSC4J), that we are developing. It is a modified
version of the Java Development Tools component of Eclipse that gives you
feedback about the formal documentation you write. To install it, just follow
the instructions on the FSC4J website.

Building and running our first program in Eclipse


Once Eclipse is installed, start it by double-clicking the Eclipse program. First, if
you see the Welcome to the Eclipse IDE for Java Developers screen, uncheck the
Always show Welcome at start up box in the bottom right corner of the screen,
and then click the Workbench button in the top right corner of the screen.
In Eclipse, to create a program you must first create a project. To do so, in the
File menu, choose New -> Java Project. Enter interval as the project name and
click Finish. In the Create module-info.java dialog box, choose Don’t Create.
The project will now appear in your Package Explorer view. If you do not see
the Package Explorer view, in the Window menu, choose Show View -> Package
Explorer.
In the Package Explorer, if the node for project interval is collapsed, expand it
by clicking the node. You will see that the project has an src folder. This is
where you will store all of the program’s source code files. (In Java, names of
BUILDING AND RUNNING OUR FIRST PROGRAM IN ECLIPSE 23

source code files end with .java.) Any .java file outside the src folder will be
ignored by Eclipse.
In Java, each class must be in its own file. Let’s create a source code file for the
Interval class. Right-click the interval project’s src node in the Project Explorer
and choose New -> Class. In the New Java Class dialog box, enter Interval as
the class name and choose Finish. You will see that a new node corresponding
to a new source file Interval.java appears in the Project Explorer, and an editor
appears for editing the new source files. Notice that the source file node is below
a new node interval. This node represents the package interval. Indeed, Eclipse
by default adds new classes in a project xyz to a package named xyz. In Java,
packages serve to group classes. We will learn more about packages later.
The initial contents of the Interval.java source file are as follows:
package interval;

public class Interval {

For now, replace the given class declaration by the one shown above:
package interval;

class Interval {
int lowerBound;
int upperBound;
}

We now want to write the code that uses the Interval class. In JLearner,
statements and expressions can be written directly in the respective boxes. In
real Java, however, all statements and expressions must be inside methods, and
all methods must be inside classes. Therefore, we will create a class to hold
our program. In the Package Explorer, right-click on the interval package and
choose New -> JUnit Test Case. A JUnit test case is a class intended to just
contain some code that we want to run. Enter IntervalTest as the name for the
test case, and choose Finish. If Eclipse asks whether to add JUnit to the build
path, choose OK.
The initial contents of source file IntervalTest are as follows:
package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

@Test
void test() {
24 FIRST STEPS IN MODULAR PROGRAMMING (PART I)

fail("Not yet implemented");


}

We can write our code inside the test method. Replace the existing code by the
code we wrote above:
package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

@Test
void test() {
Interval interval = new Interval();
interval.lowerBound = 3;
interval.upperBound = 7;

int width = interval.upperBound - interval.lowerBound;


assert width == 4;
}

We are now ready to run our code. Right-click on the IntervalTest.java node
in the Package Explorer, and choose Run As -> JUnit Test. The JUnit view
should appear and show a green bar, indicating that our program was executed
successfully, without detecting any errors.
To see what happens if an error is detected, change width == 4 to width == 5
and run the program again. A shortcut for running the same program again
is to simply click the green Play button in the Eclipse toolbar. Notice that
you now get a red bar in the JUnit view. The Failure Trace part of the JUnit
view tells you what went wrong. In this case an AssertionError occurred at
IntervalTest.java:16, meaning line 16 of file IntervalTest.java. Double-click this
message in the Failure Trace to highlight this line in the editor. If we change
width == 5 back to width == 4 and run the program again, we again get a green
bar.

The problem
We have now written a program consisting of two source files: Interval.java
and IntervalTest.java. However, our program is not modular: we cannot change
the structure of class Interval without breaking the IntervalTest program. For
example, suppose we decide that it is better to store intervals by storing their
ENCAPSULATING THE FIELDS OF CLASS INTERVAL 25

lower bound and their width, rather than their lower bound and their upper
bound. Update file Interval.java as follows:
package interval;

class Interval {
int lowerBound;
int width;
}

Even though, conceptually, the class stores the same information as before, just
in a different form, we have now broken our client code (i.e. the code that uses
our class). Indeed, Eclipse now shows a red wavy line below the references to
field upperBound in IntervalTest.java; these references are now broken since this
field no longer exists. These red wavy lines are known as compilation errors.
They are errors that the programming environment detects even before we run
the program. Errors detected only while running a program are called run-time
errors. We here see a major advantage of statically typed programming languages:
many errors can be detected even before we run the program.

Encapsulating the fields of class Interval


This experiment shows that we should never access the fields of another class
directly. Instead, we should access the information we need from an object
through methods. This is known as encapsulation. Let’s update our program in
IntervalTest to use methods to access the properties of Interval objects. First,
define methods for inspecting and updating the properties of an interval:
package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

int getLowerBound(Interval interval) {


return interval.lowerBound;
}

int getUpperBound(Interval interval) {


return interval.lowerBound + interval.width;
}

int getWidth(Interval interval) {


return interval.width;
}

void setLowerBound(Interval interval, int value) {


interval.lowerBound = value;
}
26 FIRST STEPS IN MODULAR PROGRAMMING (PART I)

void setUpperBound(Interval interval, int value) {


interval.width = value - interval.lowerBound;
}

void setWidth(Interval interval, int value) {


interval.width = value;
}

@Test
void test() {
Interval interval = new Interval();
interval.lowerBound = 3;
interval.upperBound = 7;

int width = interval.upperBound - interval.lowerBound;


assert width == 4;
}

Then, in the program, use these methods (known as getters and setters) instead
of accessing the fields of the Interval object directly:
package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

int getLowerBound(Interval interval) {


return interval.lowerBound;
}

int getUpperBound(Interval interval) {


return interval.lowerBound + interval.width;
}

int getWidth(Interval interval) {


return interval.width;
}

void setLowerBound(Interval interval, int value) {


interval.lowerBound = value;
}

void setUpperBound(Interval interval, int value) {


interval.width = value - interval.lowerBound;
}

void setWidth(Interval interval, int value) {


interval.width = value;
}

@Test
void test() {
ENCAPSULATING THE FIELDS OF CLASS INTERVAL 27

Interval interval = new Interval();


setLowerBound(interval, 3);
setUpperBound(interval, 7);

int width = getWidth(interval);


assert width == 4;
}

Notice that Eclipse does not show any compilation errors anymore. Furthermore,
if we run the program, we get a green bar. More importantly, though, if we now
decide to change the structure of class Interval again, we only need to update
the relevant getters and setters, and we do not need to touch our client program
in method test at all. Indeed, replace field int width by field int upperBound in
Interval.java. Eclipse will show compilation errors in the places where field width
is referenced in file IntervalTest.java, but notice that these references are all
inside the getters and setters. Updating those eliminates the errors:
package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

int getLowerBound(Interval interval) {


return interval.lowerBound;
}

int getUpperBound(Interval interval) {


return interval.upperBound;
}

int getWidth(Interval interval) {


return interval.upperBound - interval.lowerBound;
}

void setLowerBound(Interval interval, int value) {


interval.lowerBound = value;
}

void setUpperBound(Interval interval, int value) {


interval.upperBound = value;
}

void setWidth(Interval interval, int value) {


interval.upperBound = interval.lowerBound + value;
}

@Test
void test() {
Interval interval = new Interval();
setLowerBound(interval, 3);
28 FIRST STEPS IN MODULAR PROGRAMMING (PART I)

setUpperBound(interval, 7);

int width = getWidth(interval);


assert width == 4;
}

Moving the methods inside class Interval


We now have two modules: one module consists of file Interval.java plus the
getters and setters in file IntervalTest.java, and the other module consists of
method test in file IntervalTest.java. We can change the way we store the interval
properties in the first module without breaking the second module. However, it
would of course be much better if each module is in its own file. For this reason,
Java allows us to define the methods that belong together with a given class
inside the class itself. To move a method into a class, however, we need to prefix
it with the keyword static:
package interval;

class Interval {
int lowerBound;
int upperBound;

static int getLowerBound(Interval interval) {


return interval.lowerBound;
}

static int getUpperBound(Interval interval) {


return interval.upperBound;
}

static int getWidth(Interval interval) {


return interval.upperBound - interval.lowerBound;
}

static void setLowerBound(Interval interval, int value) {


interval.lowerBound = value;
}

static void setUpperBound(Interval interval, int value) {


interval.upperBound = value;
}

static void setWidth(Interval interval, int value) {


interval.upperBound = interval.lowerBound + value;
}

(You will understand later why we did not need to write the static keyword
when we defined the getters and setters in class IntervalTest originally.)
ENFORCING ENCAPSULATION: ACCESSIBILITY MODIFIERS 29

Furthermore, in file IntervalTest.java, we now need to tell Java that it needs to


look inside class Interval to find the getters and setters. We do so by putting
the class name and a dot in front of the method name:
package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

@Test
void test() {
Interval interval = new Interval();
Interval.setLowerBound(interval, 3);
Interval.setUpperBound(interval, 7);

int width = Interval.getWidth(interval);


assert width == 4;
}

Enforcing encapsulation: accessibility modifiers


We have now cleanly separated our interval module and our client module into
separate files and separate classes. If we want to change the way we store the
interval properties again, we only need to update the Interval class; this is thanks
to the fact that the IntervalTest class accesses the interval properties only via
the getters and setters, not by directly accessing the fields of class Interval.
However, there is currently nothing that prevents the authors of IntervalTest
from (perhaps accidentally) changing the program to access the fields of Interval
directly, thus breaking modularity. Fortunately, Java offers a way to make this
impossible: it allows us to indicate that certain elements defined inside a given
class (called members of the class) are only for use by other members of the
same class, by using the private keyword:
package interval;

class Interval {
private int lowerBound;
private int upperBound;

static int getLowerBound(Interval interval) {


return interval.lowerBound;
}

static int getUpperBound(Interval interval) {


return interval.upperBound;
}
30 FIRST STEPS IN MODULAR PROGRAMMING (PART I)

static int getWidth(Interval interval) {


return interval.upperBound - interval.lowerBound;
}

static void setLowerBound(Interval interval, int value) {


interval.lowerBound = value;
}

static void setUpperBound(Interval interval, int value) {


interval.upperBound = value;
}

static void setWidth(Interval interval, int value) {


interval.upperBound = interval.lowerBound + value;
}

Now, if we replace Interval.setUpperBound(interval, 7) by interval.upperBound = 7


in class IntervalTest, we immediately get a compilation error: The field
Interval.upperBound is not visible. Fix the error by restoring the setter call.

By making the fields of class Interval private, we now get the guarantee that any
client code of class Interval that has no compilation errors will not break if we
change the way we store the interval properties. If our module is used by many
clients around the world, this is an extremely valuable guarantee.

Instance methods
Notice that all of the methods of class Interval take a reference to an Interval
object as their first argument. This is of course a very common phenomenon.
For that reason, Java supports a more concise notation for this case, known as
instance methods. An instance method is a method declared without the static
keyword. (A method that does have the static keyword is known as a static
method.) An instance method declared in a class C has an implicit parameter of
type C, called the receiver. To refer to the receiver in the body of an instance
method, use the keyword this. When calling an instance method, the receiver
object is written before the method name, with a dot in between the two:
package interval;

class Interval {
private int lowerBound;
private int upperBound;

int getLowerBound() {
return this.lowerBound;
}

int getUpperBound() {
return this.upperBound;
}
INSTANCE METHODS 31

int getWidth() {
return this.upperBound - this.lowerBound;
}

void setLowerBound(int value) {


this.lowerBound = value;
}

void setUpperBound(int value) {


this.upperBound = value;
}

void setWidth(int value) {


this.upperBound = this.lowerBound + value;
}

package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

@Test
void test() {
Interval interval = new Interval();
interval.setLowerBound(3);
interval.setUpperBound(7);

int width = interval.getWidth();


assert width == 4;
}

So, Interval.setLowerBound(interval, 3)becomes interval.setLowerBound(3). Here,


intervalis the receiver for the call of method setLowerBound. Since interval is of
type Interval, Java will look in class Interval to find method setLowerBound.
Notice that both the method declarations and the method calls become much
more concise this way. But remember that the meaning is exactly the same as
before; it’s just a shorter way to write the same program.
In fact, we can use an additional Java shorthand to write class Interval even more
concisely: instead of writing this.lowerBound, we can just write lowerBound. When
Java encounters the name lowerBound, used as an expression in an instance method,
and the method declares no parameter or local variable named lowerBound, then
it will look for a field called lowerBound in the receiver object. This means we can
write class Interval as follows:
32 FIRST STEPS IN MODULAR PROGRAMMING (PART I)

package interval;

class Interval {
private int lowerBound;
private int upperBound;

int getLowerBound() {
return lowerBound;
}

int getUpperBound() {
return upperBound;
}

int getWidth() {
return upperBound - lowerBound;
}

void setLowerBound(int value) {


lowerBound = value;
}

void setUpperBound(int value) {


upperBound = value;
}

void setWidth(int value) {


upperBound = lowerBound + value;
}

So, again, this program has exactly the same meaning as the previous ones, and
will behave in exactly the same way; it’s just shorter. Notice that Eclipse tells
you whether it interprets a name as a reference to a local variable or a reference
to a field: references to fields are shown in blue, references to local variables in
black.
First Steps in Modular
Programming (Part II)

Constructors
At this point, to use an Interval object to store a given interval, we must first
create the object using a new Interval() expression, and then set its properties
through the setLowerBound, setUpperBound, and/or setWidth setters. It is often
convenient, and sometimes necessary, to allow client code to create an object
and initialize its properties to desired values in one step. For this purpose, Java
allows us to declare constructors. For example, we can allow clients to create
an Interval object storing the interval with lower bound 3 and upper bound 7
using expression new Interval(3, 7) by inserting a constructor declaration into
class Interval as follows:
package interval;

class Interval {
private int lowerBound;
private int upperBound;

int getLowerBound() {
return lowerBound;
}

int getUpperBound() {
return upperBound;
}

int getWidth() {
return upperBound - lowerBound;
}

Interval(int initialLowerBound, int initialUpperBound) {


lowerBound = initialLowerBound;
upperBound = initialUpperBound;
}

void setLowerBound(int value) {

33
34 FIRST STEPS IN MODULAR PROGRAMMING (PART II)

lowerBound = value;
}

void setUpperBound(int value) {


upperBound = value;
}

void setWidth(int value) {


upperBound = lowerBound + value;
}

Notice the following:


• The name of a constructor is always simply the name of the class.
• A constructor declaration does not specify a return type.
Notice that Eclipse now marks expression new Interval() in IntervalTest.java as
incorrect. Indeed, this expression refers to a constructor with zero parameters.
If a class C does not explicitly declare a constructor, Java implicitly generates a
default constructor C() {}. Since we now explicitly declare a constructor, Java
no longer generates this constructor and the expression new Interval() no longer
works. To fix the error, we replace this expression with new Interval(3, 7). We
can now remove the setter calls:
package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

@Test
void test() {
Interval interval = new Interval(3, 7);

int width = interval.getWidth();


assert width == 4;
}

Overloading
But what if we want to initialize an Interval object with a given lower bound
and a given width? That is, we would like to additionally declare the following
constructor:
Interval(int lowerBound, int width) {
this.lowerBound = lowerBound;
API SEMANTICS: DOCUMENTATION 35

this.upperBound = lowerBound + width;


}

Java does not allow this, because there would be no way to decide which
constructor to use to execute an instance creation expression new Interval(3, 7).
Java does allow a class to declare multiple constructors, but each constructor
must have either a different number of parameters or different parameter types.
So, one way to add our additional constructor is as follows:
Interval(int lowerBound, int width, Void dummy) {
this.lowerBound = lowerBound;
this.upperBound = lowerBound + width;
}

(Void is a built-in class with no instances; the only value of type Void is null.)
This works: now, we can replace expression new Interval(3, 7) by the equivalent
expression new Interval(3, 4, null).
Declaring multiple members with the same name is known as overloading. Java
supports overloading of constructors, as well as methods. For example, we can
extend our class with a method setWidth with two parameters that updates the
lower bound or the upper bound, depending on the second argument:
void setWidth(int value, boolean updateLowerBound) {
if (updateLowerBound)
lowerBound = upperBound - value;
else
upperBound = lowerBound + value;
}

API Semantics: Documentation


Consider the following modified version of class IntervalTest:

package interval;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class IntervalTest {

@Test
void test() {
Interval interval = new Interval(3, 4, null);

interval.setUpperBound(8);
assert interval.getLowerBound() == 3;
}

}
36 FIRST STEPS IN MODULAR PROGRAMMING (PART II)

It checks that updating an interval’s upper bound leaves its lower bound un-
changed. If we were to update method setUpperBound in class Interval so that it
updates the lower bound and leaves the width unchanged, this change would
break this client: the assert statement would fail. Who would be to blame?
To eliminate this error, who would have to change their code? The author of
Interval or the author of IntervalTest?

In this case, it is reasonable to say that the client is to blame: they assumed that
method setUpperBound would leave the lower bound unchanged, but the author of
class Interval has made no such promises. Unless module authors make specific
guarantees about how a method will behave, it is always safest to assume the
worst and not rely on anything that is not explicitly guaranteed.
So, if we, as a module author, want to allow clients to assume certain facts
about our methods’ behavior, we have to state those facts clearly in the module’s
documentation. Writing clear and precise documentation for modules will be a
central focus of this course. For maximum clarity and precision, we will write
statements in documentation both informally and formally. A fully documented
version of class Interval looks like this:
package interval;

/**
* An object of this class stores an interval of integers.
*
* @invar This interval's lower bound is not greater than its upper bound.
* | getLowerBound() <= getUpperBound()
* @invar This interval's width equals the difference between its upper bound
* and its lower bound.
* | getWidth() == getUpperBound() - getLowerBound()
*/
class Interval {

/**
* @invar This interval's lower bound is not greater than its upper bound.
* | lowerBound <= upperBound
*/
private int lowerBound;
private int upperBound;

int getLowerBound() {
return lowerBound;
}

int getUpperBound() {
return upperBound;
}

int getWidth() {
return upperBound - lowerBound;
}

/**
* Initializes this interval with the given lower bound and upper bound.
API SEMANTICS: DOCUMENTATION 37

*
* @pre The given lower bound is not greater than the given upper bound.
* | lowerBound <= upperBound
* @post This interval's lower bound equals the given lower bound.
* | getLowerBound() == lowerBound
* @post This interval's upper bound equals the given upper bound.
* | getUpperBound() == upperBound
*/
Interval(int lowerBound, int upperBound) {
this.lowerBound = lowerBound;
this.upperBound = upperBound;
}

/**
* Initializes this interval with the given lower bound and width.
*
* @pre The given width is nonnegative.
* | 0 <= width
* @post This interval's lower bound equals the given lower bound.
* | getLowerBound() == lowerBound
* @post This interval's width equals the given width.
* | getWidth() == width
*/
Interval(int lowerBound, int width, Void dummy) {
this.lowerBound = lowerBound;
this.upperBound = lowerBound + width;
}

/**
* Sets this interval's lower bound to the given value.
*
* @pre The given value is not greater than this interval's upper bound.
* | value <= getUpperBound()
* @post This interval's lower bound equals the given value.
* | getLowerBound() == value
* @post This interval's upper bound has remained unchanged.
* | getUpperBound() == old(getUpperBound())
*/
void setLowerBound(int value) {
lowerBound = value;
}

/**
* Sets this interval's upper bound to the given value.
*
* @pre The given value is not less than this interval's lower bound.
* | getLowerBound() <= value
* @post This interval's upper bound equals the given value.
* | getUpperBound() == value
* @post This interval's lower bound has remained unchanged.
* | getLowerBound() == old(getLowerBound())
*/
void setUpperBound(int value) {
upperBound = value;
}

/**
38 FIRST STEPS IN MODULAR PROGRAMMING (PART II)

* Sets this interval's width to the given value.


*
* @pre The given value is nonnegative.
* | 0 <= value
* @post This interval's width equals the given value.
* | getWidth() == value
* @post This interval's lower bound has remained unchanged.
* | getLowerBound() == old(getLowerBound())
*/
void setWidth(int value) {
upperBound = lowerBound + value;
}

/**
* Sets this interval's width to the given value.
*
* @pre The given value is nonnegative.
* | 0 <= value
* @post This interval's width equals the given value.
* | getWidth() == value
* @post If the caller specified that the lower bound should be updated, the
* upper bound has remained unchanged.
* | !updateLowerBound || getUpperBound() == old(getUpperBound())
* @post If the caller specified that the lower bound should not be updated,
* the lower bound has remained unchanged.
* | updateLowerBound || getLowerBound() == old(getLowerBound())
*/
void setWidth(int value, boolean updateLowerBound) {
if (updateLowerBound)
lowerBound = upperBound - value;
else
upperBound = lowerBound + value;
}

Notice that we write documentation structured into four kinds of clauses:


• Postconditions, indicated by tag @post, state conditions that are promised
by the module author to be true after execution of the method, provided
that the method’s preconditions are true before execution of the method.
• Preconditions, indicated by tag @pre, state conditions that must be true
at the start of an execution of the method. If at the start of a particular
execution of a method, the method’s preconditions are not true, then the
module author is not required to ensure that the method’s postconditions
are true at the end of this execution.
• Private invariants, also known as representation invariants, indicated by
tag @invar in a documentation comment preceding the private fields of a
class, state conditions on the values of the fields of an object that must be
true for the object to be considered to be in a valid state. It is a module
author’s responsibility to ensure that an object is in a valid state whenever
no constructor or method of the object is being executed. Note: the private
invariants are not part of the API specification; they serve only as internal
API SEMANTICS: DOCUMENTATION 39

documentation for the module author’s own use to facilitate their reasoning
about the correctness of their module.
• Public invariants, indicated by tag @invar in the documentation comment
for the class declaration itself, state conditions on the values returned by the
getters of an object that must be true whenever no constructor or method
of the object is being executed. It is a module author’s responsibility to
ensure this.
40 FIRST STEPS IN MODULAR PROGRAMMING (PART II)
Managing Complexity
through Modularity and
Abstraction

The question addressed by this course is: how can we manage the complexity of
the task of building large software systems?
The approach we teach is an instance of a general approach for any complex
task: divide and conquer. This means that we try to split the problem up into
subproblems, such that 1) each subproblem is easier to solve than the overall
problem, and 2) solving all of the subproblems yields a solution for the overall
problem.
In the case of software development, this means trying to decompose the system
development task into a set of simpler subsystem development tasks, such that
the resulting subsystems can be composed to yield a system that solves the
overall task.
One of the most effective ways to come up with such a decomposition is to try
to come up with effective abstractions. In this course, we teach two main types
of abstractions: procedural abstractions and data abstractions.

Procedural abstractions
In the case of procedural abstractions, we try to identify operations such that it
would make the system development task easier if those operations were built
into the programming language.
Only a very limited number of operations are built into the Java programming
language: +, -, *, /, %, <, >, <=, >=, ==, !=, &&, ||, and the bitwise operators &, |, ^,
<<, >>, and >>>. Anything else you want to do in a Java program, you need to
implement in terms of these basic operations.
For example, if an application needs to perform square root computations, it will

41
42 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION

have to implement a square root operation in terms of addition, subtraction, etc.


The complexity of building such an application, then, includes the complexity of
implementing a square root computation.
The life of the application’s developers would have been easier if square root
had been a primitive operation in the Java programming language. Building the
application would have been a simpler task.
Procedural abstraction, then, means that we split the task of building such an
application into two parts:
• building a client module that implements the application on top of not Java,
but Java++, an extension of Java with a built-in square root operation
(whose syntax is MyMath.squareRoot(...)).
• building a square root module whose only job is to implement the square
root operation (i.e. to implement method MyMath.squareRoot) in terms of
Java’s built-in operations.
Each of these two tasks is easier than the overall task:
• The client module’s developers do not have to worry about how to imple-
ment a square root operation.
• The square root module’s developers do not have to worry about the
application. They need not care if the application is implementing a web
shop, a game, a bookkeeping system, or anything else. All they need to
worry about is correctly implementing the square root operation.
Note that this decomposition will only be effective if the abstraction is defined
sufficiently precisely and abstractly.
• If the only way the client module’s developers can figure out what
MyMath.squareRoot does is by looking at its implementation, then their life
has not been made much simpler. They are still confronted with the
complexity of the square root implementation. For this reason, it is
crucial that the abstraction implemented by a module and used by the
client application (also known as the module’s Application Programming
Interface or API ) be properly documented. The API documentation must
document sufficiently precisely and abstractly the operations provided by
the module to its clients.
• Analogously, if the only way the square root module developers can figure
out what the application expects MyMath.squareRoot to do is by looking at the
application’s implementation and by understanding what the application
as a whole is supposed to do, then they are still confronted with the
complexity of the entire application. In conclusion, proper documentation
is necessary to achieve true reduction of complexity and separation of
concerns.
• A module’s documentation should not just be sufficiently precise. It should
also be sufficiently abstract. If the documentation for the square root
operation simply showed the implementation algorithm, again the client
PROCEDURAL ABSTRACTIONS 43

module developers’ life would not have been made much easier. Again
they would be confronted with the complexity of the implementation. A
module’s documentation must hide the implementation complexity as much
as possible and define the meaning of the API as simply as possible.

See below a proper way to document the square root module:

/**
* Returns the square root of the given nonnegative integer, rounded down.
*
* @pre The given integer is nonnegative.
* | 0 <= x
* @post The result is nonnegative.
* | 0 <= result
* @post The square of the result is not greater than the given integer.
* | (long)result * result <= x
* @post The square of one more than the result is greater than the given integer.
* | x < ((long)result + 1) * ((long)result + 1)
*/
public static int squareRoot(int x) {
int result = 0;
while ((long)result * result <= x)
result++;
return result - 1;
}

(Note that we need to cast the int result to long before computing its square or
adding one, to avoid arithmetic overflow.)

Notice that the documentation does not reveal the algorithm used to compute
the square root. (The algorithm shown is a very slow and naive one; better-
performing ones undoubtedly exist. Faster algorithms would undoubtedly be
more complex and further increase the complexity reduction achieved by the
layer of abstraction.)

It includes a precondition: the module promises to implement its abstraction


correctly only if the client adheres to the precondition. The postconditions
state the conditions that must hold when the method returns, for the method’s
behavior to be considered correct.

Proper documentation for a module, then, simplifies both the task of developing
the module itself, and the task of developing client modules that use the module:

• The developers that implement the module need to look only at the
module’s API documentation to learn what the module is supposed to do.
• The developers that implement the client modules (i.e. the modules that
use the module) need to look only at the module’s API documentation to
learn what the module does.
44 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION

Modular software development


Splitting a system up into modules with a clearly defined, abstract API between
them is an effective way of splitting a software development task into a set of
simpler subtasks. Each module is not just easier to build than the system as a
whole; it is also easier for someone to come in, read the codebase, and understand
what is going on, because each module can be understood separately. This is
because there is now a well-defined notion, for each module separately, of what
that module is supposed to do (i.e. there is a well-defined notion of correctness
for each module separately): a module is correct if it implements its API in such
a way that it complies with the module’s API documentation, assuming that
the lower-level modules it builds upon comply with their API documentation.
Having a notion of correctness for each module separately means we can verify
each module separately: we can perform testing and code review on each module
independently from its clients.
It also means the modules of the system can be built in parallel, by independent
software development teams. Furthermore, the developers of a module can
release new versions of the module that fix bugs, improve performance, or
add new features. So long as the new versions of the module comply with the
original API documentation, the old version can be replaced by the new version
in the overall system without adversely impacting the overall system’s correct
operation.

Data abstractions
Procedural abstractions allow clients to work with a more powerful programming
language, that has more operations built in.
Similarly, application development would often benefit from having more
datatypes built in besides the ones that are built into the Java programming
language itself.
The datatypes built into Java itself are the following:

Datatype Values
byte The integers between -128 and 127
short The integers between -32768 and 32767
int The integers between -231 and 231 - 1
long The integers between -263 and 263 - 1
char The integers between 0 and 65535
boolean true and false
float The single-precision floating-point numbers
double The double-precision floating-point numbers
DATA ABSTRACTIONS 45

Important remarks about these datatypes:

• Values of type char are used to represent text characters. The notation 'X'
denotes the char value assigned to character X by the Unicode character
encoding standard. For example, 'A' denotes the char value 65.
• The arithmetic operations (+, -, *, /, %) do not operate directly on types
byte, short, or char; instead, operands of type byte, short, or char are first
promoted to type int. For example, the type of expression 'A'- 'Z' is int
and its value is -25.
• An expression that applies an arithmetic operator to two operands of
type int is of type int itself, even though the mathematical result of the
operation might not be within the limits of type int. In that case (known
as arithmetic overflow), the value 232 is added to or subtracted from the
result as many times as necessary to bring it within the limits of the type.
For example: the expression 2000000000 + 2000000000 is of type int and its
value is -294967296 (= 4000000000 - 232 ).
• If either operand of an arithmetic expression is of type long and the other
operand is of any integer type, the expression itself is of type long and 264
is added to or subtracted from the result as many times as necessary to
bring it within the limits of type long.
• Type float has three kinds of values:
1. The real values, of the form M × 2E where mantissa M is an integer
between -224 + 1 and 224 - 1, and exponent E is an integer between
-149 and 104. Note, however, that there are two distinct "zero" values:
positive zero (0f) and negative zero (-0f).
2. The infinities, Float.POSITIVE_INFINITY and Float.NEGATIVE_INFINITY.
3. The Not-a-Number (NaN) values; these are used to represent the
result of operations that have no well-defined result in mathematics,
such as 0f/0f and Float.POSITIVE_INFINITY/Float.POSITIVE_INFINITY. You
can tell whether a float value is a Not-a-Number value using method
Float.isNaN.
• Analogously, type double has three kinds of values:
1. The real values, of the form M × 2E where mantissa M is an integer
between -253 + 1 and 253 - 1, and exponent E is an integer between
-1074 and 971. Again, there are two distinct zero values: positive zero
(0.0) and negative zero (-0.0).
2. The infinities, Double.POSITIVE_INFINITY and Double.NEGATIVE_INFINITY.
3. The NaN values. You can tell whether a double value is a NaN value
using method Double.isNaN.
• This means that float values can be used to represent positive numbers
between 10-37 and 1038 with 6 decimal digits of precision (as well as their
negation), and double values can be used to represent positive numbers
between 10-307 and 10308 with 15 decimal digits of precision (as well as
their negation).
• Floating-point numbers can be written using scientific notation:
1.00001e-37f denotes the float value closest to 1.00001 × 10 , and
-37
46 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION

9.99999999999999e307 denotes the double value closest to 9.99999999999999


× 10307 .
• Java’s comparison operators have potentially surprising behavior on
floating-point numbers: for example, 0.0 == -0.0 returns true and
0.0/0.0 == 0.0/0.0 returns false. To tell whether two float values are
identical, compare the results of calling Float.floatToRawIntBits on both of
them; similarly, to tell whether two double values are identical, compare
the results of calling Double.doubleToRawLongBits on both of them.
• The arithmetic operations on floating-point values perform implicit round-
ing. In computations involving multiple arithmetic operations, these round-
ing errors can accumulate and lead to very large errors in the final result.
Constructing correct floating-point computations is a very challenging and
specialized skill that is outside the scope of this course.
• For further reading on floating-point numbers, see here and here.
• Underscores can be used to make numbers more readable: 2_000_000_000,
9_000_000_000_000_000_000L, 9.999_999_999_999_99. For integers, Java supports
hexadecimal notation (e.g. 0xf denotes 15 and 0x10 denotes 16), binary
notation (e.g. 0b1000_0000 denotes 128), and octal notation (e.g. 0100
denotes 64).
• If you installed AdoptOpenJDK JDK 9 or newer, you can play with Java’s
datatypes and operators by opening a Command Prompt (on Windows:
right-click the Start button and choose Command Prompt; on macOS:
Applications -> Utilities -> Terminal) and entering jshell. Then enter e.g.
2_000_000_000 + 2_000_000_000 to see the result of evaluating the expression.
To see the type of the result, enter /vars.
However, for many applications, these datatypes are not sufficient. For
example, using floating-point numbers to count money in financial ap-
plications is a bad idea. Indeed, not all decimal fractions can be
represented exactly as a binary floating-point number. For example:
0.10 + 0.10 + 0.10 + 0.10 + 0.10 + 0.10 + 0.10 + 0.10 + 0.10 + 0.10 == 1.00 yields
false!

Financial applications would be much easier to write in Java if Java had a


built-in type of fractions. Fortunately, Java supports data abstraction by means
of classes. By defining classes, we can extend Java with new datatypes. This
way, we can split the task of developing a financial application in Java into two
simpler subtasks:
• The development of a financial application, not in Java but in Java++, an
extension of Java with a type Fraction whose values are (some subset of)
the rational numbers*.
• The development of a fractions module, that implements datatype Fraction
as a class that internally uses Java’s built-in datatypes.
(*) Unfortunately, in Java the values of a class type such as Fraction always
include the special value null, known as the null reference, in addition to the
object references that refer to instances of class Fraction. Tony Hoare, who
DATA ABSTRACTIONS 47

originally introduced null references in the programming language Algol, calls


this his "billion-dollar mistake". With new programming languages such as
Kotlin that do not suffer from this issue gaining popularity, the industry is slowly
eliminating this scourge.
We call the application written in Java++ a client module of the fractions module.
Composing the client module with the fractions module yields a system that
implements the financial application in Java.
Again, these two subtasks are simpler and easier than the overall application
development task:
• The developers of the client module are working in a more powerful
language, with more datatypes built in. They need not worry about how
to implement fractions in Java.
• The developers of the fractions module only need to worry about correctly
implementing the fractions abstraction. They need not worry about how
it will be used or what it will be used for.
Again, the full benefit of this decomposition is obtained only if sufficiently precise
and abstract documentation is provided for the Fraction datatype’s API :
• If the developers of the financial application need to look inside the imple-
mentation of the Fraction datatype to understand its behavior, then they
are still exposed to the complexity of implementing fractions in terms of
Java’s built-in datatypes.
• Conversely, if the developers of the fractions module need to inspect the
client module to understand the client’s expectations with respect to the
Fraction datatype’s behavior, then they are still exposed to the complexity
of the entire financial application.
Here is an example of a properly documented implementation of the fractions
module:
import java.math.BigInteger;

/**
* Each instance of this class represents a rational number.
*
* @immutable
*
* @invar The denominator is positive.
* | 0 < getDenominator()
* @invar The numerator is greater than the minimum {@code long} value.
* | Long.MIN_VALUE < getNumerator()
* @invar The fraction is irreducible: the greatest common divisor of
* the absolute value of the numerator and the denominator is one.
* | MoreMath.gcd(Math.abs(getNumerator()), getDenominator()) == 1
*/
public class Fraction {

/**
* @invar | 0 < denominator
48 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION

* @invar | Long.MIN_VALUE < numerator


* @invar | MoreMath.gcd(Math.abs(numerator), denominator) == 1
*/
private final long numerator;
private final long denominator;

public long getNumerator() { return numerator; }


public long getDenominator() { return denominator; }

private Fraction(long numerator, long denominator) {


this.numerator = numerator;
this.denominator = denominator;
}

/**
* An object that represents the number zero.
*/
public static final Fraction ZERO = new Fraction(0, 1);

/**
* Returns an object representing the rational number defined by the given
* numerator and denominator.
*
* @throws IllegalArgumentException if the given denominator is zero.
* | denominator == 0
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post The result is not null.
* | result != null
* @post The rational number represented by the result equals the rational
* number defined by the given numerator and denominator.
* | BigInteger.valueOf(result.getNumerator())
* | .multiply(BigInteger.valueOf(denominator)).equals(
* | BigInteger.valueOf(numerator).multiply(
* | BigInteger.valueOf(result.getDenominator())))
*/
public static Fraction of(long numerator, long denominator) {
if (denominator == 0)
throw new IllegalArgumentException("denominator is zero");
if (numerator == 0)
return ZERO;
long gcd = MoreMath.gcd(
MoreMath.absExact(numerator),
MoreMath.absExact(denominator));
if (denominator < 0)
gcd = -gcd;
return new Fraction(numerator / gcd, denominator / gcd);
}

/**
* Returns whether this object and the given object represent the same
* rational number.
*
* @throws IllegalArgumentException if {@code other} is null.
* | other == null
* @post
* | result == (
DATA ABSTRACTIONS 49

* | getNumerator() == other.getNumerator() &&


* | getDenominator() == other.getDenominator()
* | )
*/
public boolean equals(Fraction other) {
if (other == null)
throw new IllegalArgumentException("other is null");
return numerator == other.numerator
&& denominator == other.denominator;
}

/**
* Returns an object representing the rational number obtained by adding
* the rational number represented by this object to the rational number
* represented by the given object.
*
* @throws IllegalArgumentException if {@code other} is null.
* | other == null
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post The result is not null.
* | result != null
* @post a/b == c/d + e/f if and only if adf == cbf + ebd.
* | BigInteger.valueOf(result.getNumerator()).
* | multiply(BigInteger.valueOf(this.getDenominator())).
* | multiply(BigInteger.valueOf(other.getDenominator())).
* | equals(
* | BigInteger.valueOf(this.getNumerator()).
* | multiply(BigInteger.valueOf(result.getDenominator())).
* | multiply(BigInteger.valueOf(other.getDenominator())).
* | add(BigInteger.valueOf(other.getNumerator()).
* | multiply(BigInteger.valueOf(result.getDenominator())).
* | multiply(BigInteger.valueOf(this.getDenominator()))))
*/
public Fraction plus(Fraction other) {
if (other == null)
throw new IllegalArgumentException("other is null");
long gcd = MoreMath.gcd(this.denominator, other.denominator);
long numerator = Math.addExact(
Math.multiplyExact(this.numerator, other.denominator / gcd),
Math.multiplyExact(other.numerator, this.denominator / gcd));
long denominator =
Math.multiplyExact(this.denominator, other.denominator / gcd);
return Fraction.of(numerator, denominator);
}

It uses the following library of math methods:


import java.util.stream.LongStream;

public class MoreMath {

/**
* Returns the absolute value of the given number.
*
50 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION

* @throws ArithmeticException if arithmetic overflow occurs.


* | x == Long.MIN_VALUE
* @post The result is nonnegative.
* | 0 <= result
* @post The result equals either the argument or its negation.
* | result == x || result == -x
*/
public static long absExact(long x) {
if (x == Long.MIN_VALUE)
throw new ArithmeticException("Arithmetic overflow");
return Math.abs(x);
}

/**
* Returns whether the first given number divides the second given number.
*
* @pre The first given number is not zero.
* | a != 0
* @post | result == (b % a == 0)
*/
public static boolean divides(long a, long b) {
return b % a == 0;
}

/**
* Returns the greatest common divisor of the two given integers.
*
* @pre The given numbers are nonnegative.
* | 0 <= a && 0 <= b
* @pre At least one given number is nonzero.
* | a != 0 || b != 0
* @post The result is positive.
* | 0 < result
* @post The result divides both given numbers.
* | divides(result, a) && divides(result, b)
* @post No greater number divides both given numbers.
* | LongStream.range(result, Math.max(a, b)).allMatch(x ->
* | !(divides(x + 1, a) && divides(x + 1, b)))
*/
public static long gcd(long a, long b) {
if (a == 0) return b;
if (b == 0) return a;
if (a < b) return gcd(b % a, a);
return gcd(a % b, b);
}

Notice that class Fraction does not expose any means for clients to mutate the
association between a Fraction instance and the rational number it represents.
Classes like this, where clients cannot mutate an instance’s abstract value, are
called immutable classes. We say they implement an immutable value abstraction.
In order to allow clients to understand that a class is immutable without having
to check all methods, we include the @immutable tag in the Javadoc comment for
the class.
DATA ABSTRACTIONS 51

Notice, furthermore, that class Fraction does not expose a constructor. Instead,
it exposes a static method of to allow clients to obtain a Fraction instance that
represents a given abstract value. Exposing such a method, typically called of or
valueOf, is common practice for immutable classes; it allows the implementation
to avoid the creation of a new instance in some cases. For example, class Fraction
reuses an existing object whenever the client requests an object that represents
abstract value zero. Of course, such reuse is safe only if the class is immutable.
Notice also that the documentation for Fraction uses Java’s BigInteger class to
avoid arithmetic overflow inside the documentation.
Client modules can use the Fraction class to perform financial computations in a
safe and simple way. For example: the assert statement in the following code
snippet succeeds:
Fraction tenCents = Fraction.of(10, 100);
Fraction total = Fraction.of(0, 100);
for (int i = 0; i < 10; i++)
total = total.plus(tenCents);
assert total.equals(Fraction.of(100, 100));

Mutable versus immutable data abstractions


An alternative abstraction that one could introduce for simplifying the task of
building financial applications, is a mutable class for calculating with fractions:
import java.math.BigInteger;

/**
* Each instance of this class stores, at each point in time, a rational number.
*
* @invar The denominator is positive.
* | 0 < getDenominator()
* @invar The numerator is greater than the minimum {@code long} value.
* | Long.MIN_VALUE < getNumerator()
* @invar The fraction is irreducible: the greatest common divisor of
* the absolute value of the numerator and the denominator is one.
* | MoreMath.gcd(Math.abs(getNumerator()), getDenominator()) == 1
*/
public class FractionContainer {

/**
* @invar | 0 < denominator
* @invar | Long.MIN_VALUE < numerator
* @invar | MoreMath.gcd(Math.abs(numerator), denominator) == 1
*/
private long numerator;
private long denominator;

public long getNumerator() { return numerator; }


public long getDenominator() { return denominator; }

/**
* Returns whether the rational number stored by this object
52 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION

* equals the rational number defined by the given numerator


* and denominator.
*
* @throws IllegalArgumentException if the given denominator is zero.
* | denominator == 0
* @post
* | result ==
* | BigInteger.valueOf(getNumerator())
* | .multiply(BigInteger.valueOf(denominator))
* | .equals(
* | BigInteger.valueOf(numerator)
* | .multiply(BigInteger.valueOf(this.getDenominator())))
*/
public boolean equals(long numerator, long denominator) {
if (denominator == 0)
throw new IllegalArgumentException("denominator is zero");
if (denominator % this.denominator != 0)
return false;
long factor = denominator / this.denominator;
if (numerator % factor != 0)
return false;
return numerator / factor == this.numerator;
}

/**
* Initializes this object so that it stores the number zero.
* @post | getNumerator() == 0
*/
public FractionContainer() {
denominator = 1;
}

/**
* Mutates this object so that it stores the rational number defined
* by the given numerator and denominator.
*
* @throws IllegalArgumentException if the given denominator is zero.
* | denominator == 0
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post The rational number stored by this object equals the rational
* number defined by the given numerator and denominator.
* | BigInteger.valueOf(getNumerator())
* | .multiply(BigInteger.valueOf(denominator)).equals(
* | BigInteger.valueOf(numerator)
* | .multiply(BigInteger.valueOf(getDenominator())))
*/
public void set(long numerator, long denominator) {
if (denominator == 0)
throw new IllegalArgumentException("denominator is zero");
long gcd = MoreMath.gcd(
MoreMath.absExact(numerator),
MoreMath.absExact(denominator));
if (denominator < 0)
gcd = -gcd;
this.numerator = numerator / gcd;
this.denominator = denominator / gcd;
DATA ABSTRACTIONS 53

/**
* Mutates this object so that it stores the rational number obtained by adding
* the old rational number stored by this object to the rational number
* defined by the given numerator and denominator.
*
* @throws IllegalArgumentException if the given denominator is zero.
* | denominator == 0
* @may_throw ArithmeticException if arithmetic overflow occurs.
* | true
* @post a/b == c/d + e/f if and only if adf == cbf + ebd.
* | BigInteger.valueOf(getNumerator()).
* | multiply(BigInteger.valueOf(old(getDenominator()))).
* | multiply(BigInteger.valueOf(denominator)).
* | equals(BigInteger.valueOf(old(getNumerator())).
* | multiply(BigInteger.valueOf(getDenominator())).
* | multiply(BigInteger.valueOf(denominator)).add(
* | BigInteger.valueOf(numerator).
* | multiply(BigInteger.valueOf(getDenominator())).
* | multiply(BigInteger.valueOf(old(getDenominator())))))
*/
public void add(long numerator, long denominator) {
if (denominator == 0)
throw new IllegalArgumentException("denominator is zero");
long gcd = MoreMath.gcd(this.denominator, MoreMath.absExact(denominator));
if (denominator < 0)
gcd = -gcd;
long newNumerator = Math.addExact(
Math.multiplyExact(this.numerator, denominator / gcd),
Math.multiplyExact(numerator, this.denominator / gcd));
long newDenominator =
Math.multiplyExact(this.denominator, denominator / gcd);
set(newNumerator, newDenominator);
}

We could implement our example financial computation using class


FractionContainer as follows:

FractionContainer container = new FractionContainer();


for (int i = 0; i < 10; i++)
container.add(10, 100);
assert container.equals(100, 100);

Notice that the association between the FractionContainer instance and the rational
number it stores changes 11 times during this computation: initially, it stores
the value 0, then, after the first add call, the value 1/10, then the value 2/10,
etc. Therefore, it is not correct to say that an instance of FractionContainer is a
rational number; rather, we can only say that it stores, at any given point in
time, a particular rational number.
The example shows both the main advantage and the main disadvantage of
mutable classes compared to immutable classes: when working with mutable
54 MANAGING COMPLEXITY: MODULARITY & ABSTRACTION

classes, performance is often better because the program creates fewer objects.
On the other hand, when working with immutable classes, reasoning about the
program is generally easier because there is no need to distinguish between an
object and its abstract value; indeed, it is fine to think of object Fraction.ZERO as
being the number zero.
To highlight the fact that it is not correct to think of a FractionContainer instance
as being a fraction, we have named the class FractionContainer rather than Fraction;
however, in practice mutable classes are very often called after the type of abstract
values they store. For example, in the Java Collections API, class ArrayList is a
mutable class for storing lists of objects; it is not correct to think of an ArrayList
object as being a list of objects.
Representation Objects and
Representation Exposure

In a data abstraction, each instance of the class that implements the abstraction is
associated conceptually with a particular abstract state. In the case of a mutable
abstraction, this association can change over time; in the case of an immutable
abstraction, it is fixed upon creation of the object. Clients can use getter
methods to inspect an object’s abstract state. In order for the class to be able to
implement these getter methods, it must store some concrete representation of
the object’s abstract state in the computer’s memory. For example, in order for
class Interval to be able to implement methods getLowerBound(), getUpperBound(),
and getWidth(), it must store sufficient information to be able to derive an Interval
object’s lower bound, upper bound, and width in the computer’s memory. One
example implementation of class Interval that we have seen stores an Interval
object’s lower bound and width in fields of the object. We call these fields, and
the way they relate to the object’s abstract state, the object’s representation.
Usually, there are many different possible ways to design the represention for a
given abstraction: for example, another example implementation of class Interval
that we have seen stores an Interval object’s lower bound and upper bound,
rather than its lower bound and width.

In the example data abstraction implementations we have seen so far, such as


Interval, TimeOfDay, Fraction, and FractionContainer, the class that implements the
abstraction exclusively uses the fields of an instance to store its abstract state.
However, in most cases, the fields of an object are not sufficient to store the
object’s abstract state. In those cases, the class must use auxiliary objects,
known as representation objects, to help represent the object’s abstract state.

In this chapter, we first use the example of a String class to introduce the concept
of representation objects, and to show how representation exposure can break
immutability. Then, we use a FractionList example to show how representation
exposure can break consistency of an abstraction’s representation. Finally, we
use a ByteBuffer example to show how representation exposure can break modular
reasoning.

55
56 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE

Strings
For example, consider (an extract from) the API of class String from the Java
Platform API:
/**
* Each instance of this class represents a sequence of text characters.
*
* @immutable
*/
public class String {

/**
* Returns the length of the sequence of characters represented by this object.
*
* @post | 0 <= result
*/
public int length()

/**
* Returns the character at the given index in the sequence of characters
* represented by this object. The first character is at index 0.
*
* @throws IndexOutOfBoundsException if the given index is less than zero or
* not less than the length of this object.
* | index < 0 || length() <= index
*/
public char charAt(int index)

/**
* Returns a `String` object of length 1 containing the single given character.
*
* @post | result != null
* @post | result.length() == 1
* @post | result.charAt(0) == c
*/
public static String valueOf(char c)

/**
* Returns a `String` object that represents the sequence of characters
* obtained by concatenating the sequence of characters represented by this
* object and the given object, respectively.
*
* @throws NullPointerException | other == null
* @post | result != null
* @post | result.length() == length() + other.length()
* @post | IntStream.range(0, length()).allMatch(i ->
* | result.charAt(i) == charAt(i))
* @post | IntStream.range(0, other.length()).allMatch(i ->
* | result.charAt(length() + i) == other.charAt(i))
*/
public String concat(String other)

For example, the expression String.valueOf('H').concat(String.valueOf('i')).concat(String.valueOf('!'))


STRINGS 57

yields a String object that represents the sequence of characters Hi!. (An
equivalent String object can be written more concisely using the expres-
sion "Hi!". Also, concatenation of String objects can be written more
concisely using the + operator; another equivalent expression is therefore
String.valueOf('H') + String.valueOf('i') + String.valueOf('!').)

It is impossible to represent the abstract value of a String object using just the
fields of the object: a class with N fields can store at most N values, and a
String object must be able to store arbitrarily many characters. Therefore, any
implementation of class String must necessarily use auxiliary objects to help
represent its state.
The following implementation of class String uses an auxiliary array object to
store the characters of the string:
/**
* Each instance of this class represents a sequence of text characters.
*
* @immutable
*/
public class String {

/**
* @invar | characters != null
*
* @representationObject
*/
private final char[] characters;

private String(char[] characters) {


this.characters = characters;
}

/**
* Returns the length of the sequence of characters represented by this object.
*
* @post | 0 <= result
*/
public int length() {
return characters.length;
}

/**
* Returns the character at the given index in the sequence of characters
* represented by this object. The first character is at index 0.
*
* @throws IndexOutOfBoundsException if the given index is less than zero
* or not less than the length of this object.
* | index < 0 || length() <= index
*/
public char charAt(int index) {
if (index < 0 || length() <= index)
throw new IndexOutOfBoundsException();
return characters[index];
}
58 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE

/**
* Returns a `String` object of length 1 containing the single given character.
*
* @post | result != null
* @post | result.length() == 1
* @post | result.charAt(0) == c
*/
public static String valueOf(char c) {
return new String(new char[] {c});
}

/**
* Returns a `String` object that represents the sequence of characters
* obtained by concatenating the sequence of characters represented by this
* object and the given object, respectively.
*
* @throws NullPointerException | other == null
* @post | result != null
* @post | result.length() == length() + other.length()
* @post | IntStream.range(0, length()).allMatch(i ->
* | result.charAt(i) == charAt(i))
* @post | IntStream.range(0, other.length()).allMatch(i ->
* | result.charAt(length() + i) == other.charAt(i))
*/
public String concat(String other) {
char[] cs = new char[characters.length + other.characters.length];
System.arraycopy(characters, 0, cs, 0, characters.length);
System.arraycopy(other.characters, 0, cs, characters.length,
other.characters.length);
return new String(cs);
}

After a String object S has been initialized, field S.characters points to an array
object, let’s call it A, that stores the abstract value of S. We call A a representation
object of S.
Notice that the existence of A is completely invisible to clients of S: the API of
class String provides no means for clients to obtain a reference to an instance’s
representation object. We say the representation object is encapsulated.

Representation Exposure
Let’s see what happens if we break encapsulation. Here is a first attempt to add
a method toCharArray to class String:
/**
* Returns an array containing the sequence of characters represented by this
* object.
*
* @post | result != null
* @post | result.length == length()
* @post | IntStream.range(0, length()).allMatch(i ->
REPRESENTATION EXPOSURE 59

* | result[i] == charAt(i))
*/
public char[] toCharArray() {
return characters; // WRONG! Representation exposure
}

Although this method satisfies its postconditions, it is still wrong. This is because
it leaks the receiver object’s representation object; it exposes the representation
object to clients. This is wrong because it allows clients to perform inappropriate
mutations of the abstract value of the String object by mutating the representation
object. That is, it allows clients to break the immutability of the String object:
String a = String.valueOf('a');
assert a.charAt(0) == 'a'; // Succeeds
a.toCharArray()[0] = 'b';
assert a.charAt(0) == 'a'; // Fails

An immutable class is only considered properly encapsulated if it protects its


immutability; that is, it must not allow clients to mutate its abstract value
in any way. This implies, among other things, that it must encapsulate its
representation objects; that is, it must not leak or expose them to its clients.
A correct way to implement method toCharArray is by returning a copy of the
representation object:
/**
* Returns a new array containing the sequence of characters represented by
* this object.
*
* @creates | result
* @post | result != null
* @post | result.length == length()
* @post | IntStream.range(0, length()).allMatch(i ->
* | result[i] == charAt(i))
*/
public char[] toCharArray() {
return characters.clone();
}

Notice the following:


• Any array can be copied by calling its clone() method. This creates a new
array with the same element type and the same elements.
• We made it explicit in the documentation of method toCharArray that the
returned array has been newly created by the method. Tag @creates | result
means that result has been newly created by the method, and furthermore
that it is not a representation object of the receiver object or of any other
object. The client can therefore safely mutate it without affecting any
other object’s abstract state.
• Our implementation of class String uses array objects for two very different
purposes: to represent a String object’s abstract state, and as a container
60 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE

for a sequence of characters to be returned as the result of a method call.


It is crucial to always use separate objects for these two different purposes.

Representation Exposure without Leaking


A class must never leak its representation objects. But not leaking representation
objects is not sufficient to prevent representation exposure. Additionally, a class
must never use pre-existing client-visible objects as representation objects. For
example, here is a first attempt to add a method valueOf(char[]) to class String:
/**
* Returns a `String` object whose sequence of characters equals the
* sequence of characters stored in the given array.
*
* @throws NullPointerException | characters == null
* @inspects | characters
* @post | result != null
* @post | result.length() == characters.length
* @post | IntStream.range(0, characters.length)
* | .allMatch(i -> result.charAt(i) == characters[i])
*/
public static String valueOf(char[] characters) {
return new String(characters); // WRONG! Client-supplied object
// used as representation object.
}

This method passes the client-supplied array characters to the String constructor,
which installs its argument as the representation object for the new String object.
As a result, the client can mutate the new String object’s abstract state by
mutating the array object:
char[] cs = {'a'};
String a = String.valueOf(cs);
assert a.charAt(0) == 'a'; // Succeeds
cs[0] = 'b';
assert a.charAt(0) == 'a'; // Fails

A correct way to implement the valueOf(char[]) method is by copying the argu-


ment:
/**
* Returns a `String` object whose sequence of characters equals the
* sequence of characters stored in the given array.
*
* @throws NullPointerException | characters == null
* @inspects | characters
* @post | result != null
* @post | result.length() == characters.length
* @post | IntStream.range(0, characters.length)
* | .allMatch(i -> result.charAt(i) == characters[i])
*/
public static String valueOf(char[] characters) {
return new String(characters.clone());
REPRESENTATION EXPOSURE BREAKS CONSISTENCY 61

Here, too, arrays are used for two different purposes: to represent a String
object’s abstract state, and to contain a sequence of characters to be passed as
an argument to a method call. It is crucial to always use separate objects for
these two different purposes.

FractionLists: Representation Exposure Breaks


Consistency
We have seen that immutable classes must encapsulate their representation
object to protect their immutability. There is another important reason why
classes must encapsulate their representation objects. We illustrate it by means
of the following FractionList example.
/**
* Each instance of this class stores a list of fractions.
*/
public class FractionList {

/**
* @invar | elements != null
* @invar | Arrays.stream(elements).allMatch(element -> element != null)
*
* @representationObject
*/
private Fraction[] elements;

/**
* Returns the number of elements in the list of fractions stored by this
* object.
*
* @post | 0 <= result
*/
public int getSize() {
return elements.length;
}

/**
* Returns the element at the given index in the list of fractions stored by
* this object.
*
* @throws IndexOutOfBoundsException | 0 < index || index <= getSize()
*/
public Fraction getElementAt(int index) {
if (index < 0 || getSize() <= index)
throw new IndexOutOfBoundsException();

return elements[index];
}

/**
* Returns the sum of the elements of the list of fractions stored by this
62 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE

* object.
*
* @post | Objects.equals(result,
* | IntStream.range(0, getSize())
* | .mapToObj(i -> getElementAt(i))
* | .reduce(Fraction.ZERO, (x, y) -> x.plus(y)))
*/
public Fraction getSum() {
return Arrays.stream(elements).reduce(Fraction.ZERO, (x, y) -> x.plus(y));
}

/**
* Initializes this object to store an empty list of fractions.
*
* @post | getSize() == 0
*/
public FractionList() {
elements = new Fraction[0];
}

/**
* Adds the given element to the end of the list of fractions stored by this
* object.
*
* @throws NullPointerException | element == null
* @mutates | this
* @post | getSize() == old(getSize()) + 1
* @post | Arrays.equals(
* | IntStream.range(0, old(getSize()))
* | .mapToObj(i -> getElementAt(i)).toArray(),
* | old(IntStream.range(0, getSize())
* | .mapToObj(i -> getElementAt(i)).toArray()))
* @post | Objects.equals(getElementAt(old(getSize())), element)
*/
public void add(Fraction element) {
if (element == null)
throw new IllegalArgumentException("element is null");

Fraction[] newElements = new Fraction[elements.length + 1];


System.arraycopy(elements, 0, newElements, 0, elements.length);
newElements[elements.length] = element;
elements = newElements;
}

Suppose, now, that we want to add a method for retrieving an array containing
a FractionList object’s list of fractions. Here is a first attempt at adding such a
method:
public Fraction[] getElements() {
return elements; // WRONG! Leaks representation object.
}

This method leaks the receiver object’s representation object to the client. Even
though class FractionList is a mutable class, this is still wrong, because this allows
MODULAR REASONING: @MUTATES 63

clients to break the FractionList object’s representation invariants. In other


words, it allows clients to bring the FractionList object into an inconsistent state.
Specifically, it allows clients to introduce null elements into the representation
object:
FractionList myList = new FractionList();
myList.add(Fraction.ZERO);
Fraction[] elements = myList.getElements();
elements[0] = null;
// Object myList is now in an inconsistent state
myList.getSum(); // crashes with a NullPointerException

Method getSum relies on the receiver’s representation invariants for its safe execu-
tion; indeed, running this method after breaking the representation invariants
causes the method to crash.
As before, we can fix this leak by copying the array:
/**
* Returns a new array containing the list of fractions stored by this object.
*
* @creates | result
* @post | Arrays.equals(result, IntStream.range(0, getSize())
* | .mapToObj(i -> getElementAt(i)).toArray())
*/
public Fraction[] getElements() {
return elements.clone();
}

Modular Reasoning: @mutates


If a class is mutable and its representation invariants do not mention the mutable
state of its representation object(s), then exposing representation object(s) does
not endanger its immutability or its consistency. But even then, exposing
representation object(s) is wrong, because it breaks modular reasoning, as we
will illustrate next.

ByteBuffer: first attempt (INCORRECT)


Consider the following INCORRECT attempt at designing and implementing
an API for a ByteBuffer class.
/**
* Each instance of this class stores a sequence of bytes and an offset into that
* sequence.
*/
public class ByteBuffer {

/**
* @invar | bytes != null
* @invar | 0 <= offset
64 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE

*/
private byte[] bytes;
private int offset;

/**
* Returns an array containing the sequence of bytes stored by this object.
*
* @post | result != null
*/
public byte[] getBytes() { return bytes; }

/**
* Returns the offset stored by this object.
*
* @post | 0 <= result
*/
public int getOffset() { return offset; }

/**
* Initializes this object so that it stores the given sequence of bytes and
* offset zero.
*
* @throws IllegalArgumentException if the given array is null
* | bytes == null
* @post | Arrays.equals(getBytes(), bytes)
* @post | getOffset() == 0
*/
public ByteBuffer(byte[] bytes) {
if (bytes == null)
throw new IllegalArgumentException("bytes is null");
this.bytes = bytes;
}

/**
* Writes the given byte into the sequence of bytes stored by this object
* at the current offset, and increments the offset.
*
* @throws ArrayIndexOutOfBoundsException if the current offset is outside
* the bounds of the sequence of bytes stored by this object.
* | getBytes().length <= getOffset()
* @mutates | this // WRONG!
* @post | getOffset() == old(getOffset()) + 1
* @post | getBytes().length == old(getBytes().length)
* @post | getBytes()[old(getOffset())] == b
* @post | IntStream.range(0, getBytes().length).allMatch(i ->
* | i == old(getOffset()) || getBytes()[i] == old(getBytes())[i])
*/
public void put(byte b) {
this.bytes[offset] = b;
offset++;
}

Consider now the following client code:


byte[] myBytes = {1, 2, 3};
MODULAR REASONING: @MUTATES 65

ByteBuffer myBuffer = new ByteBuffer(myBytes);


assert myBytes[0] == 1; // Succeeds
myBuffer.put(4);
assert myBytes[0] == 1; // Fails

Calling method put on myBuffer mutates myBytes. Based on the documentation for
method put, this is completely unexpected; indeed, method put’s @mutates clause
asserts that the method mutates only this, i.e. myBuffer itself.
There are (at least) two possible ways to fix the ByteBuffer class, corresponding
with two different design decisions, both of which are reasonable and useful and
occur in practice. The relevant design question here is: do we want to hide the
array that backs the ByteBuffer object from the client, or do we want the client
to be aware of it? Both options have advantages and disadvantages: the former
option leads to a more abstract and arguably simpler API; the latter option has
better performance because it avoids the need to copy the array.
We show elaborations of both options below.

ByteBuffer (opaque)
In this version of class ByteBuffer, the backing array is hidden from the client.
This class is similar to Java’s ByteArrayOutputStream class.
/**
* Each instance of this class stores a sequence of bytes and an offset into
* that sequence.
*/
public class ByteBuffer {

/**
* @invar | bytes != null
* @invar | 0 <= offset
*
* @representationObject
*/
private byte[] bytes;
private int offset;

/**
* Returns a new array containing the sequence of bytes stored by this object.
*
* @creates | result
* @post | result != null
*/
public byte[] getBytes() { return bytes.clone(); }

/**
* Returns the offset stored by this object.
*
* @post | 0 <= result
*/
public int getOffset() { return offset; }
66 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE

/**
* Initializes this object so that it stores the given sequence of bytes
* and offset zero.
*
* @throws NullPointerException if the given array is null
* | bytes == null
* @inspects | bytes
* @post | Arrays.equals(getBytes(), bytes)
* @post | getOffset() == 0
*/
public ByteBuffer(byte[] bytes) {
this.bytes = bytes.clone();
}

/**
* Writes the given byte into the sequence of bytes stored by this object
* at the current offset, and increments the offset.
*
* @throws ArrayIndexOutOfBoundsException if the current offset is outside
* the bounds of the sequence of bytes stored by this object.
* | getBytes().length <= getOffset()
* @mutates | this
* @post | getOffset() == old(getOffset()) + 1
* @post | getBytes().length == old(getBytes().length)
* @post | getBytes()[old(getOffset())] == b
* @post | IntStream.range(0, getBytes().length).allMatch(i ->
* | i == old(getOffset()) || getBytes()[i] == old(getBytes())[i])
*/
public void put(byte b) {
this.bytes[offset] = b;
offset++;
}

Notice that in this version, the @mutates clause of method put still only mentions
this. This is fine, because the meaning of @mutates | O is that the method can
mutate object O as well as any representation object of O (as well as the represen-
tation objects of those objects, if they have any, and so on). Since field bytes is
marked as @representationObject, the array referenced by this field is considered
a representation object of the ByteBuffer object. Since the representation objects
of an object O must never become exposed to clients of O, the constructor and
method getBytes must make the necessary copies. In return, the client can safely
assume that when calling method myBuffer.put, none of the array objects it has
references to are mutated:
byte[] myBytes = {1, 2, 3};
ByteBuffer myBuffer = new ByteBuffer(myBytes);
assert myBytes[0] == 1; // Succeeds
myBuffer.put(4);
assert myBytes[0] == 1; // Succeeds

byte[] moreBytes = myBuffer.getBytes();


assert moreBytes[1] == 2; // Succeeds
myBuffer.put(5);
MODULAR REASONING: @MUTATES 67

assert moreBytes[1] == 2; // Succeeds

ByteBuffer (transparent)
In this version of class ByteBuffer, the client is aware of the backing array.
Therefore, the backing array is not a representation object and the contents of
the backing array are not part of the state of the ByteBuffer object. Here, the
ByteBuffer object does not store a sequence of bytes; it merely stores a reference
to an array object.
This class is similar to Java’s ByteBuffer class.
/**
* Each instance of this class stores a reference to a byte array and an offset
* into that array.
*/
public class ByteBuffer {

/**
* @invar | array != null
* @invar | 0 <= offset
*/
private byte[] array;
private int offset;

/**
* Returns the array reference stored by this object.
*
* @post | result != null
* @immutable This object is associated with the same array reference
* throughout its lifetime.
*/
public byte[] getArray() { return array; }

/**
* Returns the offset stored by this object.
*
* @post | 0 <= result
*/
public int getOffset() { return offset; }

/**
* Initializes this object to store the given array reference and offset zero.
*
* @throws IllegalArgumentException if the given array reference is null
* | array == null
* @post | getArray() == array
* @post | getOffset() == 0
*/
public ByteBuffer(byte[] array) {
if (array == null)
throw new IllegalArgumentException("array is null");
this.array = array;
}
68 REPRESENTATION OBJECTS AND REPRESENTATION EXPOSURE

/**
* Writes the given byte into the referenced array at the current offset, and
* increments the offset.
*
* @throws ArrayIndexOutOfBoundsException if the current offset is outside
* the bound of the referenced array.
* | getArray().length <= getOffset()
* @mutates | this, getArray()
* @post | getOffset() == old(getOffset()) + 1
* @post | getArray()[old(getOffset())] == b
* @post The elements of the array referenced by this object, except for the
* element at the old offset, have remained unchanged.
* | IntStream.range(0, getArray().length).allMatch(i ->
* | i == old(getOffset())
* | || getArray()[i] == old(getArray().clone())[i])
*/
public void put(byte b) {
this.array[offset] = b;
offset++;
}

Notice that method put must mention both this and getArray() in its @mutates
clause; this way, the client will not be surprised when its array object is mutated:
byte[] myBytes = {1, 2, 3};
ByteBuffer myBuffer = new ByteBuffer(myBytes);
assert myBytes[0] == 1; // Succeeds
myBuffer.put(4);
assert myBytes[0] == 4; // Succeeds, as expected:
// `myBuffer.put()` mutates `myBuffer.getArray()`
// a.k.a. `myBytes`

byte[] moreBytes = myBuffer.getArray();


assert moreBytes[1] == 2; // Succeeds
myBuffer.put(5);
assert moreBytes[1] == 5; // Succeeds, as expected

Conclusion
In this chapter, we introduced the notion of representation objects: objects used
internally by an abstraction to help represent an instance’s abstract state. We
showed by means of three examples (String, FractionList, and ByteBuffer) that it
is crucial that representation objects never be exposed to clients, since doing
so can break immutability of the abstraction, consistency of the abstraction’s
representation, and/or modular reasoning about the abstraction by clients. If
an object used by an abstraction is exposed to clients, it is not a representation
object of the abstraction, its state is not a part of the state of the abstraction,
and methods that mutate it must declare this explicitly in their @mutates clause.
How to properly document
single-object abstractions

Concepts
Data abstraction definition and implementation
In Java, a data abstraction is defined by giving its public aspects: the class name,
the names, parameter types, and return types of the class’ public methods, the pa-
rameter types of the class’ public constructors, and the documentation comments
associated with the class, the public methods, and the public constructors.
For example, the following defines a Point abstraction:
/**
* Each instance of this class represents a point with integer coordinates
* in a two-dimensional coordinate system.
*/
public class Point {

/** Returns this instance's X coordinate. */


public int getX()

/** Returns this instance's Y coordinate. */


public int getY()

/**
* Initializes this instance so that it represents the given point.
* @post | getX() == x
* @post | getY() == y
*/
public Point(int x, int y)

An implementation of a given data abstraction is defined by the remaining


elements of the class, called the private aspects: the private fields, the bodies of
the public methods and constructors, any nonpublic methods and constructors,
and any documentation comments associated with the nonpublic fields.

69
70 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS

That is, a Point class implements the data abstraction defined above if its public
aspects are as given above. Generally, many different possible implementations
exist for any given data abstraction. For example, here are two implementations
of the Point abstraction:
/**
* Each instance of this class represents a point with integer coordinates
* in a two-dimensional coordinate system.
*/
public class Point {

private int x;
private int y;

/** Returns this instance's X coordinate. */


public int getX() { return x; }

/** Returns this instance's Y coordinate. */


public int getY() { return y; }

/**
* Initializes this instance so that it represents the given point.
* @post | getX() == x
* @post | getY() == y
*/
public Point(int x, int y) {
this.x = x;
this.y = y;
}

/**
* Each instance of this class represents a point with integer coordinates
* in a two-dimensional coordinate system.
*/
public class Point {

/**
* Stores the distance between this point and the origin of the coordinate system
* (that is, the r component of the polar coordinates of this point).
*/
private double radius;

/**
* Stores the angle, in radians, made by the origin of the
* coordinate system, this point, and the positive X axis
* (that is, the theta component of the polar coordinates of this point).
*/
private double angle;

/** Returns this instance's X coordinate. */


public int getX() { return (int)(radius * Math.cos(angle)); }

/** Returns this instance's Y coordinate. */


public int getY() { return (int)(radius * Math.sin(angle)); }
CONCEPTS 71

/**
* Initializes this instance so that it represents the given point.
* @post | getX() == x
* @post | getY() == y
*/
public Point(int x, int y) {
radius = Math.sqrt((double)x * x + (double)y * y);
angle = Math.atan2(y, x);
}

Notice that both implementations have exactly the same public aspects; that is,
they implement exactly the same abstraction. A client cannot tell whether they
are using one or the other.

Abstract state
The abstract state of a given instance O of a data abstraction at a given point in
time is given by the return values of the getters of O at that time. For example, if
at some point in time, we have, for some Interval object O, O.getLowerBound() == 3
and O.getWidth() == 4 and O.getUpperBound() == 7, then O’s abstract state at that
point in time could be written as (getLowerBound(): 3, getWidth(): 4, getUpperBound():
7).

If we then call O.setWidth(10), we change O’s abstract state. After the call returns,
O’s new abstract state is (getLowerBound(): 3, getWidth(): 10, getUpperBound(): 13).

(If we fix the order of the getters, we can write the abstract state as a tuple.
Using this notation, O.setWidth(10) changes O’s abstract state from (3, 4, 7) to
(3, 10, 13).)

Valid abstract states


The abstract state space of a data abstraction with getters getA() with return
type TA and getB() with return type TB is the set of tuples of the form (VA,
VB), where VA is a value of type TA and VB is a value of type TB. For example,
the abstract state space of the Interval abstraction is the set of triples (L, W, U)
where L, W, and U are arbitrary int values.

Generally, not all elements of an abstraction’s abstract state space are mean-
ingful abstract states of the abstraction. For example, the abstract state
(getLowerBound(): 10, getWidth(): 4, getUpperBound(): 0) is not a meaningful ab-
stract state. We say the abstract state is not valid. For the Interval abstraction,
the valid abstract states are exactly the abstract states (L, W, U) where the
constraints U = L + W and 0 ≤ W hold.
72 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS

Representation, concrete state


The representation of a data abstraction implementation is the way the imple-
mentation stores an instance’s abstract state in computer memory. In simple
cases, sufficient information to uniquely determine the current abstract state is
stored in the instance’s fields.
For example, a possible representation for the Interval data abstraction is the
one where the implementation stores an Interval instance’s abstract state by
storing the lower bound in a field lowerBound and the width in a field width. (The
upper bound need not be stored separately since it can be computed from the
lower bound and the width.)
The concrete state of a given instance O of a data abstraction at a given point
in time is (in simple cases) given by the values of the fields of O at that time.
For example, suppose class Interval is implemented using fields lowerBound and
width. Then, if at some point in time, we have O.lowerBound == 3 and O.width == 4,
then O’s concrete state at that point in time could be written as (lowerBound: 3,
width: 4).

If we then call O.setWidth(10), we change O’s concrete state. After the call returns,
O’s new concrete state is (lowerBound: 3, width: 10).
If we fix the order of the fields, we can write the concrete state as a tuple. Using
this notation, O.setWidth(10) changes O’s concrete state from (3, 4) to (3, 10).

Valid concrete states


The concrete state space of a data abstraction implementation with fields f1 of
type T1 and f2 of type T2 is the set of tuples of the form (V1, V2) where V1 is a
value of type T1 and V2 is a value of type T2. For example, the concrete state
space of the Interval implementation with fields lowerBound and width of type int
is the set of pairs (L, W) where L and W are arbitrary int values.
Generally, not all elements of a data abstraction implementation’s concrete state
space are meaningful concrete states of the implementation. For example, the
concrete state (lowerBound: 5, width: -4) is not a meaningful concrete state. We
say the concrete state is not valid. For this implementation, the valid concrete
states are exactly the pairs (L, W) where 0 ≤ W holds.

Relationship between concrete states and abstract states


Each implementation of a given data abstraction fixes a particular representation
and therefore fixes a particular concrete state space as well as a particular
relationship between an instance’s concrete state and its abstract state. This
relationship is defined by the implementations of the getters. Specifically, an
implementation’s getters compute, given a concrete state, a corresponding ab-
stract state. (In the literature, this correspondence is called the implementation’s
abstraction function or abstraction relation.)
CLASS DOCUMENTATION 73

A correct implementation’s getters compute from each valid concrete state a


valid abstract state. (If a getter is called when the instance’s concrete state is
invalid, the getter need not produce a valid abstract state. It may even crash.)
For example, the getters of the Interval implementation with fields lowerBound
and width map the concrete state (lowerBound: 3, width: 4) to the abstract state
(getLowerBound(): 3, getWidth(): 4, getUpperBound(): 7).

Class documentation
If a class implements an immutable abstraction, which implies that the properties
of an instance of the class do not change throughout the lifetime of the instance,
then you shall include an @immutable tag in the Javadoc comment for the class.
If not all abstract states in the abstract state space of a class are valid, then you
shall describe the set of valid abstract states by specifying one or more public
class invariants. A public class invariant for a class can be specified in the form
of an @invar clause in the Javadoc comment for the class or in the form of a
@post clause in the Javadoc comment for a public getter of the class; the choice
between these two forms is a matter of style.
For example, see the following documentation for an immutable TimeOfDay class:
/**
* Each instance of this class represents a time of day, at one-minute resolution.
*
* @immutable
* @invar This object's hours are between 0 and 23
* | 0 <= getHours() && getHours() <= 23
* @invar This object's minutes are between 0 and 59
* | 0 <= getMinutes() && getMinutes <= 59
*/
public class TimeOfDay {

/**
* @invar | 0 <= hours && hours <= 23
* @invar | 0 <= minutes && minutes <= 59
*/
private final int hours;
private final int minutes;

public int getHours() { return hours; }


public int getMinutes() { return minutes; }

/**
* Initializes this object with the given hours and minutes.
*
* @throws IllegalArgumentException if the given hours are not between 0
* and 23
* | !(0 <= hours && hours <= 23)
* @throws IllegalArgumentException if the given minutes are not between 0
* and 59
* | !(0 <= minutes && minutes <= 59)
74 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS

*
* @post This object's hours equal the given hours
* | getHours() == hours
* @post This object's minutes equal the given minutes
* | getMinutes() == minutes
*/
public TimeOfDay(int hours, int minutes) {
if (!(0 <= hours && hours <= 23))
throw new IllegalArgumentException("hours out of range");
if (!(0 <= minutes && minutes <= 59))
throw new IllegalArgumentException("minutes out of range");
this.hours = hours;
this.minutes = minutes;
}

/**
* Returns whether this time is before the given time.
*
* @pre Argument {@code other} is not {@code null}.
* | other != null
* @post
* The result is {@code true} iff either this object's hours are less
* than the given object's hours, or this object's hours equal the
* given object's hours and this object's minutes are less than the
* given object's minutes.
* | result == (
* | getHours() < other.getHours() ||
* | getHours() == other.getHours() &&
* | getMinutes() < other.getMinutes()
* | )
*/
public boolean isBefore(TimeOfDay other) {
return
hours < other.hours ||
hours == other.hours && minutes < other.minutes;
}
}

Note: the fields in the above example are marked as final. This causes the Java
compiler to check that they are mutated only in the class’ constructor. It is
recommended to mark the fields of immutable classes as final.
If a class exists only to contain static methods, and is not intended to be
instantiated, you shall declare a private constructor to ensure that no public
constructor is implicitly generated.

Fields documentation
All fields of all classes shall be marked as private.

If not all concrete states in the concrete state space of a class are valid, then
you shall describe the set of valid concrete states by means of one or more @invar
clauses in the Javadoc comments for the fields of the class. These specify the
CONSTRUCTORS AND METHODS DOCUMENTATION 75

class’ private class invariants, also called its representation invariants. (It does
not matter which @invar clauses are in the Javadoc comment for which field;
a reasonable approach is to specify all @invar clauses in the Javadoc comment
preceding the entire block of fields.)
The above TimeOfDay example illustrates this.

Constructors and methods documentation


If not all possible values of the types of the parameters of a constructor or
method are legal values, or if not all possible invocations of a constructor or
method are legal for other reasons, you shall deal with this either defensively or
contractually.

Defensive programming
In defensive programming, you write code at the start of the method body to
check if the arguments are legal. If not, you throw an IllegalArgumentException
object. You include one or more @throws clauses in the constructor or method’s
Javadoc comment to specify the conditions under which the constructor or
method throws particular types of exceptions.
The constructor in the example above is programmed defensively.
Callers can rely on a method’s @throws clauses: if the condition specified by a
@throws clause is satisfied, the method must throw the specified exception; it is
not allowed to return normally.

Contractual programming
In contractual programming, you do not write any code in the method body
to check whether the arguments are legal; instead, you implement the method
under the assumption that the arguments are legal. You include one or more
@pre clauses in the Javadoc comment for the constructor or method to specify the
conditions that must hold at the start of an execution of the method. These are
called preconditions. If any of these conditions do not hold, the resulting behavior
of the method is unspecified. It may crash or exhibit arbitrary undesired behavior.
It is the caller’s obligation to ensure that the called method’s preconditions are
satisfied.
The method isBefore in the example above is programmed contractually.
Defensive programming is safer than contractual programming, because it keeps
programming errors in one module from leading to failures inside another module.
Programming errors are detected earlier and are easier to diagnose. Therefore,
it is generally the recommended approach. However, in some cases the perfor-
mance cost of the defensive checks is unacceptable. In these cases, contractual
programming is necessary.
76 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS

Even when programming contractually, it is recommended to check the pre-


conditions at run time during development, and to turn the checks off only
when necessary. Java supports this approach through the assert statement:
execution of an assert E; statement, where E is a boolean expression, evaluates E
and throws an AssertionError if it evaluates to false, but only if assertions are
enabled. Assertions are enabled by default when using "Run as JUnit Test" in
Eclipse, but they are disabled by default when using "Run as Java Application".
To enable assertions, open the Run Configuration and add -enableassertions or
-ea to the VM Arguments. It is recommended to disable assertions only if they
cause unacceptable performance degradation.
It is possible to disable only the assertions in a particular class, by using the
VM Arguments -ea -da:mypackage.MyClass, or in a particular package (and its
subpackages), by using the VM Arguments -ea -da:mypackage....
When using the Formal Specifications Checker for Java, assert statements that
check a method or constructor’s preconditions and postconditions are added to
the method or constructor body implicitly.

Postconditions
You shall precisely specify the result and the side-effects of each constructor
and each method, by including one or more @post clauses in the constructor or
method’s Javadoc comment that specify the conditions that must hold when
the constructor or method returns. In these conditions, you can refer to the
method’s result as result. If the method or constructor mutates any objects, you
must fully specify these objects’ new abstract state. Often, to do so you need to
refer to their old abstract state. You can do so using old(E) expressions. The
body of an old(E) expression is evaluated at the start of the execution of the
method.
The rule above implies that you must always explicitly declare at least one
constructor for each public class. Otherwise, Java would implicitly add a default
constructor, and there would be no way for you to document its behavior.
For example:
import java.util.Arrays;
import java.util.stream.IntStream;

/**
* Each instance of this class stores a list of text strings.
*/
public class StringList {

/**
* @invar | elements != null
* @invar | Arrays.stream(elements).allMatch(e -> e != null)
* @representationObject
*/
private String[] elements;
CONSTRUCTORS AND METHODS DOCUMENTATION 77

/**
* @creates | result
* @post The result is not {@code null}
* | result != null
* @post The result's elements are not {@code null}
* | Arrays.stream(result).allMatch(e -> e != null)
*/
public String[] getElements() {
return elements.clone();
}

/**
* Initializes this object so that it stores an empty list of text strings.
*
* @post This object's list of text strings is empty.
* | getElements().length == 0
*/
public StringList() {
elements = new String[0];
}

/**
* Replaces each element of this object by the text string obtained by
* replacing each character of the element by its corresponding uppercase
* letter.
*
* @mutates | this
* @post This object's number of elements equals its old number of elements.
* | getElements().length == old(getElements()).length
* @post Each of this object's elements equals its old element at the same
* index after replacing each character by the corresponding
* uppercase letter.
* | IntStream.range(0, getElements().length).allMatch(i ->
* | getElements()[i].equals(old(getElements())[i].toUpperCase()))
*/
public void allToUpperCase() {
for (int i = 0; i < elements.length; i++)
elements[i] = elements[i].toUpperCase();
}

/**
* Adds the given text strings to the end of this object's list of
* text strings.
*
* @mutates | this
* @inspects | other
*
* @throws IllegalArgumentException if argument {@code other} is
* {@code null}
* | other == null
* @throws IllegalArgumentException if any of the elements of the given array
* are {@code null}
* | Arrays.stream(other).anyMatch(e -> e == null)
*
* @post This object's number of elements equals its old number of
* elements plus the number of given text strings.
78 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS

* | getElements().length == old(getElements()).length + other.length


* @post This object's old elements have remained unchanged.
* | Arrays.equals(getElements(), 0, old(getElements()).length,
* | old(getElements()), 0, old(getElements()).length)
* @post The given list of text strings is a suffix of this object's list
* of text strings.
* | Arrays.equals(
* | getElements(), old(getElements()).length, getElements().length,
* | other, 0, other.length)
*/
public void addAll(String[] other) {
if (other == null)
throw new IllegalArgumentException("other is null");
for (int i = 0; i < other.length; i++)
if (other[i] == null)
throw new IllegalArgumentException("other[" + i + "] is null");

String[] newElements = new String[elements.length + other.length];


System.arraycopy(elements, 0, newElements, 0, elements.length);
System.arraycopy(other, 0, newElements, elements.length, other.length);
elements = newElements;
}
}

Notice the following:

• In this example, the constraint that method getElements’s result is not null
and that its elements are not null is specified in the form of postconditions
of method getElement. A correct alternative would be to specify it in the
form of public class invariants. The choice between inspector method
postconditions and public class invariants (or a combination of both) is a
matter of style.
• Arrays.stream(array).allMatch(e -> C) expresses that condition C holds for
each element e of array array. Similarly, Arrays.stream(array).anyMatch(e -> C)
expresses that condition C holds for at least one element e of array array.
• IntStream.range(a, b).allMatch(i -> C) expresses that condition C holds for
each integer i between a (inclusive) and b (exclusive).
• Arrays.equals(array1, from1, to1, array2, from2, to2) expresses that the list
of elements of array1 at indices from1 (inclusive) to to1 (exclusive) equals
the list of elements of array2 at indices from2 (inclusive) to to2 (exclusive).
• string1.equals(string2) expresses that String objects string1 and string2 rep-
resent the same list of characters. In contrast, string1 == string2 expresses
that string1 and string2 refer to the same String object.
• The notation {@code text} in an informal part of a Javadoc comment is used
to indicate that text should be formatted as code rather than as natural
language text.
EXPRESSIONS ALLOWED INSIDE JAVADOC FORMAL PARTS 79

Expressions allowed inside Javadoc formal parts


Any side-effect-free, terminating boolean Java expressions are allowed in Javadoc
formal parts. This includes calling side-effect-free methods and constructors of
the program being documented. Of course, to achieve complete documentation
those methods should themselves be documented properly. Also, be careful not
to create infinite recursions this way.
In this context, an expression, method or constructor is side-effect-free if it does
not mutate any pre-existing objects. Creating and initializing new objects is
allowed. (Also, of course a constructor is allowed to mutate this.)
Note, however, that in the Javadoc comment for class or class member X, you
can refer to another class or class member Y only if Y is visible to any code
that can see X. For example, in the Javadoc comment for a public method of a
public class, you cannot refer to private fields or methods of that class or to any
non-public classes, constructors or methods in the same package.
Note also that evaluation of Javadoc formal parts must never crash, i.e. throw
an Exception. If evaluation of a Javadoc formal part crashes, the documentation
is considered incorrect. For example, removing the first @throws clause in the
documentation for method addAll above would lead to incorrect documentation,
because calling addAll with a null argument would then cause the Arrays.stream
call in the second @throws clause to throw a NullPointerException.
(Note: This rule implies that evaluation of a Javadoc formal part must never lead
to calling a method with arguments that violate that method’s preconditions.
Indeed, the behavior of such a call is completely unspecified so it might throw an
Exception. This means that it is not the case that a method M automatically
inherits the preconditions of methods M’ called in M’s preconditions.)

Advanced topics
You need not read or apply this section if you are not going for a top score.
To achieve a complete specification of the behavior of a method or constructor,
the @mutates and @inspects clauses should be used to indicate which pre-existing
objects are mutated and inspected by the method or constructor, and the @creates
clause should be used to indicate which objects that are visible to the client after
the method or constructor finishes were created by the method or constructor.

Mutates clauses
A method must not mutate any pre-existing objects not mentioned in its @mutates
clause and not used to represent the state of any objects mentioned in its @mutates
clause. That is, an execution of a method may mutate an object O if and only if
either O was newly created during the method execution (i.e. it did not exist
when the method execution started), or O is mentioned in the method’s @mutates
80 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS

clause, or O is a representation object of an object mentioned in the method’s


@mutates clause, or O is a representation object of a representation object of an
object mentioned in the method’s @mutates clause, and so forth.
An object O is a representation object of another object O’ if a field marked
@representationObject of O’ holds a reference to O. For example, a StringList
object’s elements array is a representation object of the StringList object. This is
why method allToUpperCase can mutate the array object, even though it is not
mentioned by the method’s @mutates clause.

Inspects clauses
Similarly, a method must not inspect the state of any pre-existing mutable objects
not mentioned in its @inspects or @mutates clause and not used to represent the
state of any objects mentioned in its @inspects or @mutates clause.
Instances of immutable classes need not be mentioned in @inspects clauses.
(Documenting which objects are inspected by a method is important for at least
two reasons: 1) the caller must ensure that the inspected objects are in a valid
state, i.e. their representation invariants hold; 2) in a multithreaded program,
no other thread must mutate the inspected objects.)

Defaults
If no @mutates or @inspects clause is specified for a given method or constructor,
the default is that it may inspect and mutate any object that is not an instance
of an immutable class. Exception: if an instance method’s name starts with get
or is, the default is that it may mutate no object and that it may inspect this.
A constructor is always allowed to mutate this.

Each of these clauses takes a comma-separated list of zero or more expressions


that should evaluate to an object. If an expression evaluates to null, it is ignored.
Obviously, it is an error to specify an instance of an immutable class in a @mutates
clause.

Creates clauses
By specifying an object in a @creates clause, you indicate that the object was
created during the execution of the method, and furthermore, that the object
is distinct from any direct or indirect representation objects of any objects
mentioned in any of the method’s @inspects or @mutates clauses.
The purpose is to allow the client to conclude that the object will not be inspected
or mutated by any future method calls that mutate or inspect pre-existing objects.
Objects created by a method or constructor that do not become visible to
the client when the method or constructor finishes need not (and cannot) be
ADVANCED TOPICS 81

mentioned in a @creates clause.


82 HOW TO DOCUMENT SINGLE-OBJECT ABSTRACTIONS
Inheritance

Polymorphism
Suppose we need to develop a drawing application. A drawing consists of a
number of triangles and circles, so we define a class where each instance represents
a triangle and a class where each instance represents a circle:
package drawings;

public class Triangle {

private final int x1, y1, x2, y2, x3, y3;

public int getX1() { return x1; }


public int getY1() { return y1; }
public int getX2() { return x2; }
public int getY2() { return y2; }
public int getX3() { return x3; }
public int getY3() { return y3; }

public Triangle(int x1, int y1, int x2, int y2, int x3, int y3) {
this.x1 = x1;
this.y1 = y1;
this.x2 = x2;
this.y2 = y2;
this.x3 = x3;
this.y3 = y3;
}

public boolean contains(int x, int y) {


return
Math.signum(((long)x - x1) * ((long)y2 - y1)
- ((long)y - y1) * ((long)x2 - x1)) *
Math.signum(((long)x3 - x1) * ((long)y2 - y1)
- ((long)y3 - y1) * ((long)x2 - x1)) >= 0 &&
Math.signum(((long)x - x2) * ((long)y3 - y2)
- ((long)y - y2) * ((long)x3 - x2)) *
Math.signum(((long)x1 - x2) * ((long)y3 - y2)
- ((long)y1 - y2) * ((long)x3 - x2)) >= 0 &&
Math.signum(((long)x - x3) * ((long)y1 - y3)
- ((long)y - y3) * ((long)x1 - x3)) *
Math.signum(((long)x2 - x3) * ((long)y1 - y3)

83
84 INHERITANCE

- ((long)y2 - y3) * ((long)x1 - x3)) >= 0;


}

public String getDrawingCommand() {


return "triangle "
+ x1 + " " + y1 + " " + x2 + " " + y2 + " " + x3 + " " + y3;
}

package drawings;

public class Circle {

private final int x, y;


private final int radius;

public int getX() { return x; }


public int getY() { return y; }
public int getRadius() { return radius; }

public Circle(int x, int y, int radius) {


this.x = x;
this.y = y;
this.radius = radius;
}

public boolean contains(int x, int y) {


long dx = (long)x - this.x;
long dy = (long)y - this.y;
return dx * dx + dy * dy <= (long)radius * radius;
}

public String getDrawingCommand() {


return "circle " + x + " " + y + " " + radius;
}

A drawing, then, is a sequence of shapes, where a shape is either a triangle or a


circle. The order among the shapes is important in case of overlaps: the shapes
are drawn in reverse order so that the first shape appears on top. How can we
store this sequence? We can store it using an array or an ArrayList, but what
should be the element type?
What we need is a type Shape that is conceptually the union of types Triangle
and Circle: what we want is that an object is an instance of Shape if and only if
it is either an instance of Triangle or an instance of Circle.
For this purpose, Java has the concepts of abstract classes and inheritance. We
can declare an abstract class Shape and declare classes Triangle and Circle as
subclasses of Shape:
public abstract class Shape {
STATIC TYPE CHECKING 85

public class Triangle extends Shape {


// class body unchanged
}

public class Circle extends Shape {


// class body unchanged
}

We also say that Triangle and Circle extend Shape and that they inherit from
Shape. Shape is the superclass of Triangle and Circle.

A variable of type Shape is polymorphic: we can use it to store a reference to a


Triangle instance or a reference to a Circle instance:

Shape shape1 = new Triangle(0, 0, 10, 0, 5, 10);


Shape shape2 = new Circle(5, 10, 5);

This also means we can use an array of type Shape[] to store both triangles and
circles:
Shape[] shapes = {new Triangle(0, 0, 10, 0, 5, 10), new Circle(5, 10, 5)};

Once we have stored an object reference into a polymorphic variable, we can


test whether it refers to an instance of a particular class using an instanceof
expression:
assertEquals(true, shape1 instanceof Triangle);
assertEquals(true, shape2 instanceof Circle);
assertEquals(false, shape1 instanceof Circle);
assertEquals(false, shape2 instanceof Triangle);

Static type checking


Java is a statically typed programming language. This means that the computer
will refuse to execute any Java program that is not statically well-typed. Specifi-
cally, before the computer starts executing a Java program, it first type-checks it.
If the type-check fails, the program is not executed. The purpose of type-checking
is to ensure that certain types of problems will never occur at run time. In
particular, Java’s static type system ensures that a well-typed program never
tries to call a method named M on an object whose class does not declare a
method named M, and that it never tries to access a field named F of an object
whose class does not declare a field named F.
To ensure this, the Java type-checker assigns a static type to each expression of
the program, based on a simple analysis of the program text. In particular:
86 INHERITANCE

• The static type of a variable reference is the declared type of the variable.
For example, the static type of an expression shape1 that appears in the
scope of a variable declaration Shape shape1 is Shape.
• Analogously, the static type of a field reference is the declared type of the
field, and the static type of a method call is the declared return type of
the method.
Java’s static type-checking rules ensure that whenever an expression E evaluates
to a value V at run time, then V is a value of the static type of E. For example,
Java’s static type checker allows an assignment x = E only if the static type of
expression E is compatible with the declared type of variable x. This ensures that
at any point during the execution of a program, the value stored in a variable is
a value of the declared type of the variable.
To prevent bad method calls or field accesses, Java’s static type checker allows a
method call E.m(...) only if the static type of E is a class that declares a method
named m. Similarly, it allows a field access E.f only if the static type of E is a
class that declares a field named f.
Java’s static type-checker is incomplete, in the technical sense that in some cases
it rejects a program even though no execution of that program would go wrong
at run time.
For example, the following program never goes wrong, but it is rejected by Java’s
static type checker:
Shape myShape = new Triangle(0, 0, 10, 0, 5, 10);
Triangle myTriangle = myShape;
int x1 = myTriangle.getX1();

Java’s type checker rejects the assignment myTriangle = myShape because the static
type of myShape is Shape and class Shape is not a subclass of class Triangle.

Typecasts
To allow programmers to work around the incompleteness of Java’s static type-
checker, Java supports typecasts. The following program is accepted by Java’s
static type checker:
Shape myShape = new Triangle(0, 0, 10, 0, 5, 10);
Triangle myTriangle = (Triangle)myShape;
int x1 = myTriangle.getX1();

Even though the static type of expression myShape is Shape, the static type of
expression (Triangle)myShape is Triangle. The computer checks at run time, when
evaluating expression (Triangle)myShape, that the value of myShape is in fact a
reference to an instance of class Triangle. If not, a ClassCastException is thrown.
We can use instanceof tests and typecasts to compute the drawing commands
for a drawing given by an array of Shapes:
DYNAMIC BINDING 87

public class Drawing {

private Shape[] shapes;

public String getDrawingCommands() {


String drawingCommands = "";
for (int i = shapes.length - 1; 0 <= i; i--) {
Shape shape = shapes[i];
if (shape instanceof Triangle)
drawingCommands += ((Triangle)shape).getDrawingCommand();
else
drawingCommands += ((Circle)shape).getDrawingCommand();
drawingCommands += "\n";
}
return drawingCommands;
}
}

Dynamic binding
The implementation of method getDrawingCommands shown above for composing the
drawing commands for a drawing works, but it is not optimal from a modularity
point of view: if we extend our drawing application to also support rectangles,
we need to update method getDrawingCommands.
We can avoid this by using dynamic binding. Since each subclass of class Shape
is supposed to declare a method getDrawingCommand(), we can declare an abstract
method getDrawingCommand() in class Shape:
public abstract class Shape {

public abstract String getDrawingCommand();

Java’s static type checker checks that each class that extends Shape declares a
method named getDrawingCommand that takes no parameters and has return type
String. Correspondingly, since class Shape now declares a method getDrawingCommand,
we can now call getDrawingCommand directly on an expression of static type Shape:
public class Drawing {

private Shape[] shapes;

public String getDrawingCommands() {


String drawingCommands = "";
for (int i = shapes.length - 1; 0 <= i; i--) {
Shape shape = shapes[i];
drawingCommands += shape.getDrawingCommand();
drawingCommands += "\n";
}
return drawingCommands;
88 INHERITANCE

}
}

When the computer executes the method call shape.getDrawingCommand(), it de-


termines which method body to execute based on the dynamic type of the
receiver object: if shape evaluates to a reference to an instance of Triangle, then
the implementation of getDrawingCommand() in class Triangle is executed; if shape
evaluates to a reference to an instance of Circle, then the implementation of
getDrawingCommand() in class Circle is executed. This is known as dynamic binding
of method calls.
If a method declared by a subclass has the same name and the same number and
types of parameters as a method declared by its superclass, we say it overrides
the superclass method. Calls of the method on an object of the subclass will
execute the overriding method instead of the overridden method.

Class Object
If, when declaring a class, we do not explicitly specify a superclass using an
extends clause, the class implicitly extends class java.lang.Object. This means
that every class is a direct or indirect subclass of class java.lang.Object. (Since
package java.lang is always imported implicitly, we can simply write Object.)
This also means that a variable of type Object can be used to store a reference
to any object of any class.
Class Object declares a number of methods:
package java.lang;

public class Object {

/**
* Returns the Class object for this object's class.
*/
public Class getClass() { /* ... */ }

/**
* Returns a number suitable for use as a hash code when using this object as
* a key in a hash table.
*
* Note: two objects that are equal according to the `equals(Object)` method
* must have the same hash code.
*
* The implementation of this method in class java.lang.Object returns a hash
* code based on the identity of this object. That is, this implementation
* usually returns a different number for different objects, although this is
* not guaranteed.
*/
public int hashCode() { /* ... */ }

/**
* Returns a textual representation of this object.
CLASS OBJECT 89

*
* The implementation of this method in class java.lang.Object is based on the
* name of this object's class and this object's identity-based hash code.
*/
public String toString() {
return this.getClass().getName() + "@"
+ Integer.toHexString(this.hashCode());
}

/**
* Returns whether this object is conceptually equal to the given object.
*
* The implementation of this method in class java.lang.Object returns whether
* this object and the given object are the same object.
*/
public boolean equals(Object other) { return other == this; }

// ...

Methods equals, hashCode, and toString are often overridden by immutable classes.
For example:
public class Point {

private final int x;


private final int y;

public Point(int x, int y) {


this.x = x;
this.y = y;
}

public int getX() {


return x;
}

public int getY() {


return y;
}

@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + x;
result = prime * result + y;
return result;
}

@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
90 INHERITANCE

if (getClass() != obj.getClass())
return false;
Point other = (Point) obj;
if (x != other.x)
return false;
if (y != other.y)
return false;
return true;
}

@Override
public String toString() {
return "Point [x=" + x + ", y=" + y + "]";
}

The implementations above were generated using Eclipse’s Generate hashCode()


and equals() and Generate toString() commands, which you can find in the
Source menu after right-clicking on the class.
The @Override annotations cause Java’s static type checker to check that the
methods do indeed override a method from the superclass. Without the annota-
tion, it is easy to accidentally not override a superclass method. For example, if
we accidentally declared the parameter type of equals as Point instead of Object,
it would not override the equals method from class Object and we would not get
the behavior shown below. Thanks to the @Override annotation, the Java static
type checker would flag this as an error.
As a result of overriding these methods from class Object, we get the following
behavior:
assertEquals("This is Point [x=10, y=20].","This is " + new Point(10, 20) + ".");
assertTrue(new Point(10, 20).equals(new Point(10, 20)));
assertEquals(new Point(10, 20), new Point(10, 20));
assertTrue(
Arrays.asList(new Point(3, 9), new Point(7, 20)).contains(new Point(3, 9)));
assertTrue(Set.of(new Point(3, 9), new Point(7, 20)).contains(new Point(7, 20)));

If we had not overridden these methods, the behavior would be as follows:


assertEquals("This is Point@12345678.", "This is " + new Point(10, 20) + ".");
assertFalse(new Point(10, 20).equals(new Point(10, 20)));
assertFalse(
Arrays.asList(new Point(3, 9), new Point(7, 20)).contains(new Point(3, 9)));
assertFalse(Set.of(new Point(3, 9), new Point(7, 20)).contains(new Point(7, 20)));
Behavioral subtyping:
modular reasoning about
programs that use dynamic
binding

Modular reasoning about programs


Non-modular reasoning
In order to be able to deliver programs that exhibit correct behavior, we need to
reason about them, so as to convince ourselves that the program will exhibit the
correct behavior in all circumstances.
For example, consider the following program:
class Company {

String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}

class Program {

static void printLocations(Company company) {


String[] locations = company.getLocations();
for (int i = 0; i < 3; i++)
System.out.println(locations[i]);
}

public static void main(String[] args) {


printLocations(new Company());
}

91
92 MODULAR REASONING ABOUT DYNAMIC BINDING

As part of reviewing this program to convince ourselves that it behaves correctly,


we need to convince ourselves that the array accesses in method printLocations
will never be out of bounds.
A straightforward approach to this reasoning task is to perform non-modular
reasoning: to determine the behavior of a method call, we look inside the
called method’s implementation. In the example, to determine the behavior of
company.getLocations(), we look at the body of method getLocations() and see that
it returns an array of length 3. From this observation, we can deduce that the
array accesses in method printLocation will not fail.
This non-modular reasoning approach is brittle: if we convince ourselves of a
program’s correctness through non-modular reasoning, and then change the
implementation of any method anywhere in the program, we do not simply have
to re-check the modified method; we also have to find all of the method’s callers,
and re-check them as well. Indeed, the reasoning we used to convince ourselves
of their correctness may have been invalidated by the modification. But it does
not stop there: by the same argument, we have to re-check the callers’ callers
as well, and so on. In fact, it’s even worse: since we may have taken context
information into account when reviewing a method call, and the context may
have changed due to the modification, we have to review not just the direct and
indirect callers of the modified method, but all direct and indirect callees of the
direct and indirect callers as well; that is, we effectively have to re-review the
entire program.
For example, if we change method getLocations to return an array of size two,
the reasoning we used to establish the correctness of method printLocations is
invalidated. Indeed, executing method printLocations will now cause an out-of-
bounds array access.
Does this mean that method printLocations is incorrect? Or does it mean
that method getLocations is incorrect? The answer is: neither. With non-
modular reasoning, there is no notion of correctness of methods; there is only
the correctness of the program as a whole.

Modular reasoning
The solution, of course, is to perform modular reasoning instead. In the modular
reasoning approach, we assign a specification to each method, which specifies
the correct behaviors of the method. This way, we define a notion of correctness
of a method: a method M is correct if and only if all of its behaviors are allowed
by its specification, assuming that the method calls performed by M behave in
accordance with the called methods’ specifications.
To convince ourselves of the correctness of a program, we simply need to check
that each method complies with its specification, under the assumption that
called methods comply with theirs. If each of a program’s methods is correct in
MODULAR REASONING 93

this way, and the main method’s specification expresses the allowed behaviors of
the program as a whole, then the correctness of the program as a whole follows
as a corollary.
If we have verified a program modularly, and then we modify one of its methods,
we only need to re-check that that one method still complies with its original
specification. If so, we can immediately conclude that the correctness of the
program as a whole is preserved.
If the changed method no longer complies with its original specification, we
need to update its specification and re-verify its callers as well. This "change
propagation" stops when we reach a method whose original specification is
preserved.
In the example, there are at least two possible specifications that we can assign
to method getLocations:
• We can assign a strong specification:
/**
* @post | result != null
* @post | result.length == 3
* @post | Arrays.stream(result).allMatch(e -> e != null)
*/
String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}

Given this specification for method getLocations, methods getLocations and


printLocations are both correct. If we then change method getLocations so
that it returns an array of size two, the method no longer complies with its
specification so we need to update its specification and re-verify callers as
well. We will then discover that method printLocations needs to be updated
as well.
• We can also assign a weaker specification:
/**
* @post | result != null
* @post | Arrays.stream(result).allMatch(e -> e != null)
*/
String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}

Given this specification for method getLocations, method getLocations itself


is correct but method printLocations is not correct. Indeed, from the
specification of getLocations we cannot conclude that the array accesses
performed by printLocations will not be out of bounds.
Suppose we update method printLocations as follows:
94 MODULAR REASONING ABOUT DYNAMIC BINDING

/** @pre | company != null */


static void printLocations(Company company) {
String[] locations = company.getLocations();
for (String location : locations)
System.out.println(location);
}

This modified version of printLocations is correct. Furthermore, suppose


we now change method getLocations to return an array of size two. This
modified version of getLocations still complies with its original specification,
so we can conclude immediately that the correctness of the program as
a whole is preserved. In particular, since during our review of method
printLocations, we assumed only the specification of method getLocations
and did not look at its implementation, the reasoning we used to conclude
the correctness of method printLocations still holds, so we do not have to
review method printLocations again.

Dynamic binding
Before we discuss modular reasoning about programs that use dynamic binding,
we briefly review the concept of dynamic binding.
For each method call in a Java program, the Java compiler checks, before program
execution starts, that there is a corresponding method in the program. It looks
for this method by considering the static type of the target expression of the call
(the expression before the dot) and the argument expressions. This process is
known as method call resolution; we call the method found this way the resolved
method.
Consider the following example program:
abstract class Company {

abstract String[] getLocations();

class CompanyA extends Company {

String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}

class Program {

static void printLocations(Company company) {


String[] locations = company.getLocations();
for (int i = 0; i < 3; i++)
System.out.println(locations[i]);
MODULAR REASONING ABOUT DYNAMIC BINDING 95

public static void main(String[] args) {


printLocations(new CompanyA());
}

In this program, the resolved method of the call printLocations(new CompanyA())


in method main is the method printLocations in class Program. The static type
of argument expression new CompanyA() is CompanyA, which is a subtype of the
parameter type Company of the method. If the argument expression was "Hello"
instead, method call resolution would fail, because String is not a subtype of
Company.

Similarly, the resolved method of the call company.getLocations() in method


printLocations is method getLocations in class Company, because the static type
of target expression company is Company.
These two calls illustrate the two different types of method calls in Java:
• Call printLocations(new CompanyA()) is a statically bound call: when the call
expression is evaluated at run time, the method called is exactly the
resolved method.
• Call company.getLocations() is a dynamically bound call. When evaluating a
dynamically bound method call expression, the method called is either the
resolved method or some method that overrides it. More specifically, to
determine which method is called, the computer looks at the target object of
the call, obtained by evaluating the target expression: if the class of the tar-
get object declares or inherits a method that overrides the resolved method,
then this method is called instead of the resolved method. In the example
program, when evaluating call expression company.getLocations(), the target
object is an object of class CompanyA (because main calls printLocations with
an instance of class CompanyA as an argument). Class CompanyA declares a
method getLocations that overrides the one from class Company, so this is the
one that is called.

Modular reasoning about programs that use dy-


namic binding
Applying basic modular reasoning
To reason about programs that use dynamic binding, such as the one shown above,
we can simply apply the principle we introduced above: assign a specification to
each method of the program, and check that each method’s behavior complies
with its specification, assuming that the behavior of method calls complies with
the called methods’ specifications. Suppose we assign the strong specification to
96 MODULAR REASONING ABOUT DYNAMIC BINDING

method getLocations in class CompanyA:

/**
* @post | result != null
* @post | result.length == 3
* @post | Arrays.stream(result).allMatch(e -> e != null)
*/
String[] getLocations() {
return new String[] {"Brussels", "Paris", "Berlin"};
}

When checking method printLocations, we need to determine which method is


called by call company.getLocations(). Since the precondition of printLocations does
not specify the precise class of argument company, we need to consider all possible
callees. In this program, since the only method that overrides abstract method
getLocations in class Company is the one in class CompanyA, we can assume the call
complies with the specification of getLocations in class CompanyA. This way, we can
conclude the correctness of printLocations.

This approach is modular, in the sense that if we modify a method, and the
modified method still complies with its original specification, we can immediately
conclude that the correctness of the program as a whole is preserved.

However, this approach does not deal optimally with another type of program
modification: adding a new class that extends an existing class.

Indeed, suppose we extend the example program with the following class:

class CompanyB extends Company {


/**
* @post | result != null
* @post | result.length == 2
* @post | Arrays.stream(result).allMatch(e -> e != null)
*/
String[] getLocations() {
return new String[] {"Vienna", "Prague"};
}
}

By adding this class, we have enlarged the set of possible callees of call
company.getLocations() in method printLocations. As a result, we need to re-verify
method printLocations. In this case, we discover that it is not correct as-is.

In general, when applying the simple modular reasoning approach defined above
to a program with dynamically bound calls, after adding a class to the program
we need to re-check all methods that perform dynamically bound calls.

We conclude that the basic modular reasoning approach defined above is not
adequate for reasoning about programs that use dynamic binding.
MODULAR REASONING ABOUT DYNAMIC BINDING 97

Modular reasoning about dynamic binding: basic principle


To solve this issue, we look, when checking a method, not at the specifications
of the called methods of the call expressions that appear in the method, but
at the specifications of the resolved methods. For example, when checking
method printLocations, we only look at the specification of method getLocations
in class Company. Furthermore, we check, when checking a method, not just
that it complies with its own specification, but also that it complies with the
specifications of all methods it overrides. In the example, we check that method
getLocations in class CompanyA complies both with its own specification, and with
the specification of method getLocations in class Company.
When adding a new class, we only need to check that its methods comply with the
specifications of all methods they override. If so, we can immediately conclude
that the correctness of the program as a whole is preserved.
In the example, there are two cases:
• If we assign the strong specification (stating that the method returns an
array of length three) to method getLocations in class Company, then method
printLocations is correct, but adding class CompanyB is incorrect because its
getLocations method does not comply with the specification of method
getLocations in class Company which it overrides.
• If we assign the weak specification (which leaves the length of the array
unspecified) to method getLocations in class Company, then we need to update
method printLocations. After fixing method printLocations, the program is
correct. When adding class CompanyB, we only need to check that its method
getLocations complies with the specification of getLocations in Company. Since
it does, we can immediately conclude that adding this class preserves the
correctness of the program as a whole.
We summarize the basic principle of effective modular reasoning about programs
with dynamic binding as follows:
• A method is correct if and only if its behaviors are allowed by its own
specification and by the specifications of all methods it overrides, assuming
that the behavior of each call it performs complies with the specification
of the resolved method of the call.
• If all of a program’s methods are correct, and the specification of the
program’s main method expresses the allowed behaviors of the program as
a whole, then correctness of the program as a whole follows as a corollary.

Derived principle: strengthening of specifications


If we apply this basic principe directly, we potentially have to verify a single
method implementation against many different specifications. To avoid this,
we can instead use a derived principle, that requires us to only check that 1)
each method complies with its own specification, and 2) that each method’s
specification strengthens the specifications of all methods it overrides. We say
98 MODULAR REASONING ABOUT DYNAMIC BINDING

that a specification S strengthens another specification S’ if and only if each


imaginable method that complies with S also complies with S’.

If a specification consists of a precondition and a postcondition, then we have


the following property: specification S strengthens specification S’ if 1) the
precondition of S weakens the precondition of S’, and 2) the postcondition of S
strengthens the postcondition of S’.

For example, in the following sequence of specifications for a method abs, each
next specification strengthens the preceding one:
/**
* @pre | false
* @post | true
*/
public static int abs(int x)

/**
* @pre | 0 <= x
* @post | true
*/
public static int abs(int x)

/**
* @pre | 0 <= x
* @post | 0 <= result
*/
public static int abs(int x)

/**
* @pre | true
* @post | 0 <= result
*/
public static int abs(int x)

/**
* @pre | true
* @post | false
*/
public static int abs(int x)

The first specification is the weakest possible one: it does not allow any calls of
the method, so the method is free to crash or exhibit any behavior whatsoever.
The last one is the strongest possible one, because it is unimplementable: since
there exists no method implementation that satisfies postcondition false, it is
vacuously true that every such implementation has the desired behavior (for any
definition of "desired behavior" whatsoever).

Derived principle: behavioral subtyping


If we assign a specification to each method of a class C, then in doing so, we
define a behavioral type. We say an object O is of behavioral type C, if, for every
INHERITANCE OF SPECIFICATIONS 99

method M of C, the behavior of a call of M on O complies with the specification


of M in C.
(Notice that the behavioral type defined by class is defined entirely by its
documentation; the implementation of a class is completely irrelevant to the
behavioral type it defines. (But the implementation of class C is relevant to the
question of whether the instances of class C are of the behavioral type C.))
We say a behavioral type D is a behavioral subtype of a behavioral type C if each
object that is of behavioral type D is also of behavioral type C.
If the specifications of the methods of D that override methods of C strengthen
the specifications of the overridden methods, then it follows that behavioral type
D is a behavioral subtype of type C.
We say that a program respects behavioral subtyping if, for every class D of the
program that extends a class C, it is the case that D is a behavioral subtype of
C.
Using these definitions, we can rephrase the principle of modular reasoning about
programs with dynamic binding as follows: if A) each method of a program
complies with its specification, assuming that each object it interacts with is
of the behavioral type given by its static type, and B) the program respects
behavioral subtyping, and C) the specification of the program’s main method
expresses the correctness of the program as a whole, then the program as a whole
exhibits only correct behaviors.
Or, to phrase it as a slogan: Java’s static type checker ensures that a subclass D
of a class C is a syntactic subtype of C; to achieve correct programs, we must
ensure that D is a behavioral subtype of C as well.

Inheritance of specifications
In the literature on behavioral subtyping, some authors propose that the effective
specification of a method should be the conjunction of the declared specification
and any inherited specifications. (The conjunction of a specification with precon-
dition P1 and postcondition Q1 and a specification with precondition P2 and
postcondition Q2 is the specification with precondition "P1 or P2" and postcon-
dition "(if old(P1) then Q1) and (if old(P2) then Q2)".) In this approach, called
specification inheritance, if the implementation of each method of a program
complies with its effective specification, the program automatically complies with
behavioral subtyping.
In this course, however, we do not apply specification inheritance; a method’s
effective specification is exactly its declared specification. (Therefore, we can
speak simply of "a method’s specification" without introducing ambiguity.)
We make one exception to this rule: if the Javadoc comment associated with a
method M does not contain any specification clauses at all (i.e. no @pre, @post,
100 MODULAR REASONING ABOUT DYNAMIC BINDING

@throws, @may_throw, @inspects, @mutates, @mutates_properties, @creates, @immutable,


@peerObject, @peerObjects,or @basic clauses at all), and M overrides some method
M’ that itself overrides all other methods that M overrides, then we define the
specification of M as being identical to the specification of M’.

Further reading
Note: the material in this section is outside the scope of the course and is
provided for the information of interested students only.
For a much more detailed, formal, academic treatment of the approach to
modular reasoning about programs that involve dynamically-bound method
calls and the notion of behavioral subtyping described in this note, we refer
interested students to Matthew J. Parkinson. Local reasoning for Java. PhD
Thesis, Cambridge University, 2005.
Work on behavioral notions of subtyping in object-oriented programming goes
back to the early nineties with work by America, Liskov and Wing, Leavens,
and others; see the bibliography in Parkinson’s thesis. Some of this early work,
however, confusingly discusses the problem in terms of substituting subtype
objects for supertype objects. This is not a good way to think of the problem of
modular reasoning about object-oriented programs. Most importantly, it invites
confusing behavioral types with implementations. We repeat that a behavioral
type is defined entirely by the type’s documentation, not by any implementations.
The correct image is that of a single object that is of two behavioral types
(one of which is a behavioral subtype of the other). One example of where the
substitution narrative breaks down is if the supertype is an abstract class (or an
interface): in that case, no object can exist whose class is the supertype.
Interfaces

Suppose we are implementing a programming language. Our language supports


the datatypes boolean, int, and string, amongst others:
public abstract class Type {}

public class BooleanType extends Type {


private BooleanType() {}
public static final BooleanType INSTANCE = new BooleanType();
}

public class IntType extends Type {


private IntType() {}
public static final IntType INSTANCE = new IntType();
}

public class StringType extends Type {


private StringType() {}
public static final StringType INSTANCE = new StringType();
}

public abstract class Value {


public abstract Type getType();
}

public class BooleanValue extends Value {


public final boolean value;
public Type getType() { return BooleanType.INSTANCE; }
private BooleanValue(boolean value) { this.value = value; }
public final static BooleanValue TRUE = new BooleanValue(true);
public final static BooleanValue FALSE = new BooleanValue(false);
public static BooleanValue of(boolean value) { return value ? TRUE : FALSE; }
}

public class IntValue extends Value {


public final int value;
public Type getType() { return IntType.INSTANCE; }
private IntValue(int value) { this.value = value; }
public final static IntValue ZERO = new IntValue(0);
public static IntValue of(int value) {
return value == 0 ? ZERO : new IntValue(value);
}
}

101
102 INTERFACES

public class StringValue extends Value {


public final String value;
public Type getType() { return StringType.INSTANCE; }
private StringValue(String value) { this.value = value; }
public final static StringValue EMPTY = new StringValue("");
public static StringValue of(String value) {
return value.equals("") ? EMPTY : new StringValue(value);
}
}

First of all, notice the following:


• Classes BooleanType, IntType, and StringType are examples of the Singleton
Pattern: it does not make sense to create more than one instance of class
BooleanType, so, instead of offering to clients a way to create new instances
of class BooleanType, the class offers only a static field INSTANCE that refers
to the only instance of class BooleanType that will ever exist. A static field
differs from a regular field in that it is a property of the class itself, rather
than a property of each instance of the class.
• Classes BooleanValue, IntValue, and StringValue are examples of immutable
value classes. Instead of exposing a constructor to clients directly, class
BooleanValue offers a static method of that clients can use to obtain a
BooleanValue instance corresponding to a particular boolean value. Instead
of creating a new instance of BooleanValue at each call, method of reuses a
BooleanValue instance stored in a static field of the class. Similarly, method
of of class IntValue reuses an IntValue instance stored in static field ZERO if
an IntValue instance corresponding to value 0 is requested; if some other
value is requested, a new IntValue instance is created. Class StringValue
implements this pattern as well.
• Names of static final fields are often written in all-uppercase.
Now suppose that in our programming language, like in Java, we can use the
+ operator to add int values and to concatenate string values, but we cannot
use it on boolean values. Furthermore, suppose that, again like in Java, we can
use the logical AND operator & to compute the bitwise AND of two int values
and to compute the logical AND of two boolean values, but we cannot use it on
string values. To implement this, it would make sense to make classes IntType
and StringType extend an abstract class AddableType with an abstract method add,
and to make classes BooleanType and IntType extend an abstract class AndableType
with an abstract method and. We could then implement a method evaluate as
follows:
public abstract class AddableType {
public abstract Value add(Value leftOperand, Value rightOperand);
}
public abstract class AndableType {
public abstract Value and(Value leftOperand, Value rightOperand);
}
public class BooleanType extends Type, AndableType { // ERROR
103

// ...
public Value and(Value leftOperand, Value rightOperand) {
return BooleanValue.of(((BooleanValue)leftOperand).value
& ((BooleanValue)rightOperand).value);
}
}
public class IntType extends Type, AddableType, AndableType { // ERROR
// ...
public Value add(Value leftOperand, Value rightOperand) {
return IntValue.of(((IntValue)leftOperand).value
+ ((IntValue)rightOperand).value);
}
public Value and(Value leftOperand, Value rightOperand) {
return IntValue.of((IntValue)leftOperand).value
& ((IntValue)rightOperand).value);
}
}
public class StringType extends Type, AddableType { // ERROR
// ...
public Value add(Value leftOperand, Value rightOperand) {
return StringValue.of(((StringValue)leftOperand).value
+ ((StringValue)rightOperand).value);
}
}
public class Interpreter {
public static Value evaluate(Value value1, char operator, Value value2) {
Type type = value1.getType();
if (type != value2.getType())
throw new UnsupportedOperationException(
"The operand types do not match");
switch (operator) {
case '+':
if (!(type instanceof AddableType))
throw new UnsupportedOperationException(
"Type " + type + " does not support the + operator");
return ((AddableType)type).add(value, value2);
case '&':
if (!(type instanceof AndableType))
throw new UnsupportedOperationException(
"Type " + type + " does not support the & operator");
return ((AndableType)type).and(value1, value2);
// ...
}
}
}

Unfortunately, this is not valid Java code: Java does not allow a class to extend
multiple superclasses. (Some other programming languages, such as C++, do
allow such multiple inheritance.) However, Java does allow a restricted form of
multiple inheritance: it allows a class to extend from one superclass and zero
or more interfaces. An interface, declared using the interface keyword, is in
most ways just like a class, except that it is not allowed to declare any instance
(i.e. non-static) fields and it is not allowed to declare any constructors. We can
therefore turn the incorrect program above into the correct program below:
104 INTERFACES

public interface AddableType {


Value add(Value leftOperand, Value rightOperand);
}
public interface AndableType {
Value and(Value leftOperand, Value rightOperand);
}
public class BooleanType extends Type implements AndableType {
// ...
public Value and(Value leftOperand, Value rightOperand) {
return BooleanValue.of(((BooleanValue)leftOperand).value
& ((BooleanValue)rightOperand).value);
}
}
public class IntType extends Type implements AddableType, AndableType {
// ...
public Value add(Value leftOperand, Value rightOperand) {
return IntValue.of(((IntValue)leftOperand).value
+ ((IntValue)rightOperand).value);
}
public Value and(Value leftOperand, Value rightOperand) {
return IntValue.of((IntValue)leftOperand).value
& ((IntValue)rightOperand).value);
}
}
public class StringType extends Type implements AddableType {
// ...
public Value add(Value leftOperand, Value rightOperand) {
return StringValue.of(((StringValue)leftOperand).value
+ ((StringValue)rightOperand).value);
}
}
public class Interpreter {
public static Value evaluate(Value value1, char operator, Value value2) {
Type type = leftOperand.getType();
if (type != rightOperand.getType())
throw new UnsupportedOperationException(
"The operand types do not match");
switch (operator) {
case '+':
if (!(type instanceof AddableType))
throw new UnsupportedOperationException(
"Type " + type + " does not support the + operator");
return ((AddableType)type).add(value1, value2);
case '&':
if (!(type instanceof AndableType))
throw new UnsupportedOperationException(
"Type " + type + " does not support the & operator");
return ((AndableType)type).and(value1, value2);
// ...
}
}
}

Notice the following:


• Interfaces are like abstract classes in that you cannot directly instantiate
an interface (although you can instantiate a class that implements the
105

interface). Interfaces are always abstract; it is not necessary to specify the


abstract keyword.
• Methods declared by interfaces are public and abstract by default; these
keywords need not be specified explicitly.
• A class can extend at most one superclass, but it can additionally implement
any number of interfaces. The interfaces are specified after the implements
keyword. A class that implements an interface must implement each of
the interface’s methods (i.e. declare, for each of the interface’s methods, a
non-abstract method that overrides it), unless the class is declared abstract
itself.
• You can use the instanceof operator to test if an object implements an
interface, in exactly the same way that you can test if it is an instance of
some class. Furthermore, you can use a typecast to cast an object to an
interface type, and then call the interface’s methods on it. Just like when
casting to a class type, a run-time check will be performed to check that the
object’s class indeed implements the interface; if not, a ClassCastException
is thrown.
106 INTERFACES
Implementation inheritance

Consider the following class Color, whose instances represent colors defined by
their red, green, and blue components:
public class Color {
public final int red, green, blue;
public Color(int red, int green, int blue) {
this.red = red;
this.green = green;
this.blue = blue;
}
public int getHue() { /* ... */ }
public int getSaturation() { /* ... */ }
public int getValue() { /* ... */ }
@Override
public String toString() {
return "rgb(" + red + ", " + green + ", " + blue + ")";
}
@Override
public boolean equals(Object other) {
return
other.getClass() == getClass() &&
((Color)other).red == red &&
((Color)other).green == green &&
((Color)other).blue == blue;
}
@Override
public int hashCode() {
return Objects.hash(red, green, blue);
}
}

Often, this class is sufficient. But sometimes, we need to additionally store a


transparency value. Instead of duplicating the existing functionality of Color, we
can declare class TransparentColor as a subclass of Color:
public class TransparentColor extends Color {
public final int transparency;
public TransparentColor(int red, int green, int blue, int transparency) {
super(red, green, blue);
this.transparency = transparency;
}

107
108 IMPLEMENTATION INHERITANCE

@Override
public String toString() {
return
"rgba(" + red + ", " + green + ", " + blue + ", " + transparency + ")";
}
@Override
public boolean equals(Object other) {
return
super.equals(other) &&
((TransparentColor)other).transparency == transparency;
}
@Override
public int hashCode() {
return Objects.hash(super.hashCode(), transparency);
}
}

Notice the following:


• Class TransparentColor inherits the instance (i.e. non-static) fields red, green,
and blue from class Color. This means that for each instance of class
TransparentColor, the computer stores four values: the values for fields red,
green, blue, and transparency.
• Java requires that a constructor first initialize the superclass fields. Typi-
cally, this means calling a superclass constructor, using the special syntax
super(...). If no explicit super(...) call is written at the start of a construc-
tor, an implicit zero-argument super() call is inserted by the compiler. If
the superclass has no zero-argument constructor, the compiler reports an
error.
• The implementation of method equals in class TransparentColor reuses the
implementation of equals in class Color by calling it using the special
super.M(...) syntax. In contrast to regular instance method calls, super
calls are statically bound: the super.equals call in class TransparentColor is
always bound to method equals in class Color.
• Since method equals in class Color checks that this.getClass() equals
other.getClass(), we know that after the call of this method in method
equals of class TransparentColor returns true, other is an instance of of class
TransparentColor so we can safely cast it to type TransparentColor.
• Methods getHue, getSaturation, and getValue are simply inherited by class
TransparentColor from class Color. This means that calling getHue on a
TransparentColor object O executes method getHue from class Color on O.
This, in turn, means that inside such an execution of method getHue, this
will refer to an instance of class TransparentColor.
Lists, sets, and maps

In this chapter, we consider three important examples of the use of inheritance


to generalize over different implementations of an abstract datatype (ADT). In
the first example, the List interface generalizes over the ArrayList and LinkedList
implementations; in the second example, the Set interface generalizes over the
ArraySet and HashSet implementations; and in the third example, the Map interface
generalizes over the ArrayMap and HashMap implementations. Besides serving as
illustrations of inheritance and behavioral subtyping, they also emphasize the
difference between API and implementation by showing how in each case, exactly
the same API is implemented in two very different ways. Furthermore, these
examples are important in their own right; they are some of the most useful and
most widely used data structures and should be known by every programmer.
The examples are simplified versions of the corresponding interfaces and classes
in the java.util package of the Java Platform API; see the discussion at the end
of the chapter.

Lists
If a Java program needs to store a sequence of values, an array can be used.
However, the number of operations available for working with arrays is limited.
For example, it is not possible to add or remove elements. To solve this problem,
we can define a list abstraction, with operations for adding and removing elements
in arbitrary positions.
There are multiple ways to implement such an abstraction, that each have differ-
ent performance characteristics. For example, implementing the list abstraction
by storing the elements in an array supports looking up an element at a given
index and removing the last element in constant time, and adding an element
to the end in amortized constant time, but adding an element to the front
or removing the first element takes time proportional to the size of the list.
Implementing the list abstraction by storing the elements in a doubly-linked list
data structure, on the other hand, supports adding an element to the front or
the back and removing an element from the front or the back in constant time,
but looking up the element at a given index takes time proportional to the size of

109
110 LISTS, SETS, AND MAPS

the list in the worst case. As a result, different implementations are appropriate
for different applications. Therefore, we will implement both an ArrayList class
and a LinkedList class. At the same time, code that manipulates instances of the
abstraction should be able to work with either implementation. Therefore, we
will introduce an interface List that generalizes over the implementations. Code
that accepts objects of type List can be used with ArrayList objects, LinkedList
objects, or any other implementation of the List interface we may build in the
future.
We define interface List as follows:
package collections;

import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;

/**
* @invar | Arrays.stream(toArray()).allMatch(e -> e != null)
*/
public interface List {

/**
* @inspects | this
* @creates | result
* @post | result != null
*/
Object[] toArray();

default Stream<Object> stream() { return Arrays.stream(toArray()); }

/**
* @inspects | this
* @post | result == toArray().length
*/
int size();

/**
* @pre | 0 <= index
* @pre | index < size()
* @post | result == toArray()[index]
*/
Object get(int index);

/**
* @pre | value != null
* @inspects | this
* @post | result == Arrays.stream(toArray()).anyMatch(e -> e.equals(value))
*/
boolean contains(Object value);

/**
* @pre | 0 <= index
* @pre | index <= size()
* @pre | value != null
* @mutates | this
LISTS 111

* @post | size() == old(size()) + 1


* @post | Arrays.equals(toArray(), 0, index, old(toArray()), 0, index)
* @post | Arrays.equals(toArray(), index + 1, size(),
* | old(toArray()), index, old(size()))
* @post | get(index).equals(value)
*/
void add(int index, Object value);

/**
* @pre | value != null
* @mutates | this
* @post | size() == old(size()) + 1
* @post | Arrays.equals(toArray(), 0, old(size()),
* | old(toArray()), 0, old(size()))
* @post | get(old(size())).equals(value)
*/
default void add(Object value) { add(size(), value); }

/**
* @pre | 0 <= index
* @pre | index < size()
* @mutates | this
* @post | Arrays.equals(toArray(), 0, index, old(toArray()), 0, index)
* @post | Arrays.equals(toArray(), index, size(),
* | old(toArray()), index + 1, old(size()))
*/
void remove(int index);

/**
* @pre | value != null
* @mutates | this
* @post | Arrays.equals(toArray(),
* | IntStream.range(0, old(size()))
* | .filter(i -> i != old(IntStream.range(0, size())
* | .filter(i -> get(i).equals(value))
* | .findFirst().orElse(-1)))
* | .mapToObj(i -> old(toArray())[i]).toArray())
*/
void remove(Object value);

We can implement this API by storing the elements in an array, as follows:


package collections;

import java.util.Arrays;

public class ArrayList implements List {

/**
* @invar | elements != null
* @invar | 0 <= size
* @invar | size <= elements.length
* @invar | Arrays.stream(elements, 0, size).allMatch(e -> e != null)
* @invar | Arrays.stream(elements, size, elements.length)
* | .allMatch(e -> e == null)
112 LISTS, SETS, AND MAPS

*
* @representationObject
*/
private Object[] elements = new Object[10];
private int size;

/**
* @post | size() == 0
*/
public ArrayList() {}

public Object[] toArray() {


return Arrays.copyOf(elements, size);
}

public int size() {


return size;
}

public Object get(int index) {


return elements[index];
}

private int indexOf(Object value) {


for (int i = 0; i < size; i++)
if (elements[i].equals(value))
return i;
return -1;
}

public boolean contains(Object value) {


return indexOf(value) != -1;
}

public void add(int index, Object value) {


if (elements.length == size) {
Object[] newElements = new Object[elements.length * 2];
System.arraycopy(elements, 0, newElements, 0, size);
elements = newElements;
}
System.arraycopy(elements, index, elements, index + 1, size - index);
elements[index] = value;
size++;
}

public void remove(int index) {


size--;
System.arraycopy(elements, index + 1, elements, index, size - index);
elements[size] = null;
}

public void remove(Object value) {


int index = indexOf(value);
if (index != -1)
remove(index);
}
LISTS 113

We can also implement the same API by storing the elements in a doubly-linked
chain of Node objects:
package collections;

public class LinkedList implements List {

private class Node {


/**
* @invar | (element == null) == (this == sentinel)
* @invar | previous != null
* @invar | next != null
* @invar | next.previous == this
* @invar | previous.next == this
*
* @peerObject
*/
private Node previous;
private Object element;
/** @peerObject */
private Node next;

private int getLength() { return this == sentinel ? 0 : 1 + next.getLength(); }


}

/**
* @invar | sentinel != null
* @invar | size == sentinel.next.getLength()
*/
private int size;
/** @representationObject */
private Node sentinel;

private Node getNode(int index) {


Node node = sentinel;
if (index <= size / 2)
for (; index >= 0; index--)
node = node.next;
else
for (; index < size; index++)
node = node.previous;
return node;
}

/**
* @post | size() == 0
*/
public LinkedList() {
sentinel = new Node();
sentinel.previous = sentinel;
sentinel.next = sentinel;
}

public Object[] toArray() {


Object[] result = new Object[size];
114 LISTS, SETS, AND MAPS

int i = 0;
for (Node node = sentinel.next; node != sentinel; node = node.next)
result[i++] = node.element;
return result;
}

public int size() {


return size;
}

public Object get(int index) {


return getNode(index).element;
}

public boolean contains(Object value) {


for (Node node = sentinel.next; node != sentinel; node = node.next)
if (node.element.equals(value))
return true;
return false;
}

public void add(int index, Object value) {


Node next = getNode(index);
Node node = new Node();
node.element = value;
node.next = next;
node.previous = next.previous;
node.next.previous = node;
node.previous.next = node;
size++;
}

public void add(Object value) {


add(size, value);
}

public void remove(int index) {


Node node = getNode(index);
node.next.previous = node.previous;
node.previous.next = node.next;
size--;
}

public void remove(Object value) {


Node node = sentinel.next;
for (;;) {
if (node == sentinel)
return;
if (node.element.equals(value)) {
node.next.previous = node.previous;
node.previous.next = node.next;
size--;
return;
}
node = node.next;
}
}
SETS 115

Sets
Checking whether an ArrayList or LinkedList object contains a given value and
removing a given value from an ArrayList or LinkedList take time proportional to
the size of the list. If those operations are important in a given application, a
hash table may be more appropriate. Assuming a good hash function is used,
checking for the presence of an element, adding an element if it is not yet present,
and removing an element take (amortized) constant expected time. Therefore,
hash tables are ideal for implementing a set abstract data type.
We first define interface Set as follows:
package collections;

import java.util.Arrays;
import java.util.stream.Stream;

/**
* @invar The set has no null elements.
* | Arrays.stream(toArray()).allMatch(e -> e != null)
* @invar {@code toArray()} does not contain duplicates.
* | Arrays.stream(toArray()).distinct().count() == size()
*/
public interface Set {

/**
* @inspects | this
* @creates | result
* @post | result != null
*/
Object[] toArray();

default Stream<Object> stream() { return Arrays.stream(toArray()); }

/**
* @inspects | this
* @post | result == toArray().length
*/
int size();

/**
* @pre | value != null
* @inspects | this
* @post | result == Arrays.stream(toArray()).anyMatch(e -> e.equals(value))
*/
boolean contains(Object value);

/**
* @pre | value != null
* @mutates | this
* @post The given value is in the set.
116 LISTS, SETS, AND MAPS

* | Arrays.stream(toArray()).anyMatch(e -> e.equals(value))


* @post No elements have disappeared from the set.
* | Arrays.stream(old(toArray())).allMatch(eo ->
* | Arrays.stream(toArray()).anyMatch(e -> e.equals(eo)))
* @post No elements, other than the given value, have been added.
* | Arrays.stream(toArray()).allMatch(e -> e.equals(value) ||
* | Arrays.stream(old(toArray())).anyMatch(eo -> e.equals(eo)))
*/
void add(Object value);

/**
* @pre | value != null
* @mutates | this
* @post The given value is no longer in the set.
* | Arrays.stream(toArray()).noneMatch(e -> e.equals(value))
* @post No elements, other than the given value, have disappeared
* from the set.
* | Arrays.stream(old(toArray())).allMatch(eo -> eo.equals(value) ||
* | Arrays.stream(toArray()).anyMatch(e -> e.equals(eo)))
* @post No elements have been added to the set.
* | Arrays.stream(toArray()).allMatch(e ->
* | Arrays.stream(old(toArray())).anyMatch(eo -> e.equals(eo)))
*/
void remove(Object value);

A Set object implemented using a hash table with capacity K distributes its
elements among K buckets. For each bucket, we need some other Set implemen-
tation to store the elements that belong to that bucket. For this purpose, we
first define a simple ArrayList-based Set implementation:
package collections;

public class ArraySet implements Set {

/**
* @invar | elements != null
* @invar | elements.stream().distinct().count() == elements.size()
*
* @representationObject
*/
private ArrayList elements = new ArrayList();

/** @post | size() == 0 */


public ArraySet() {}

public Object[] toArray() { return elements.toArray(); }

public int size() { return elements.size(); }

public boolean contains(Object value) { return elements.contains(value); }

public void add(Object value) {


if (elements.contains(value))
return;
SETS 117

elements.add(value);
}

public void remove(Object value) { elements.remove(value); }

Notice that the contains, add, and remove operations of an ArraySet take time
proportional to the size of the set. This will not matter, because assuming a
good hash function is used, and assuming the hash table is not overloaded, the
expected number of elements in each bucket is independent of the total number
of elements in the hash table.
We can now define the hash table-based Set implementation:
package collections;

import java.util.Arrays;
import java.util.stream.IntStream;

public class HashSet implements Set {

/**
* @invar | buckets != null
* @invar | Arrays.stream(buckets).allMatch(b -> b != null)
* @invar | IntStream.range(0, buckets.length).allMatch(i ->
* | buckets[i].stream().allMatch(e ->
* | Math.floorMod(e.hashCode(), buckets.length) == i))
*
* @representationObject
* @representationObjects Each bucket is a representation object
*/
private Set[] buckets;

private Set getBucket(Object value) {


return buckets[Math.floorMod(value.hashCode(), buckets.length)];
}

/**
* @pre | 0 < capacity
* @post | size() == 0
*/
public HashSet(int capacity) {
buckets = new Set[capacity];
for (int i = 0; i < buckets.length; i++)
buckets[i] = new ArraySet();
}

public Object[] toArray() {


Object[] result = new Object[size()];
int offset = 0;
for (int i = 0; i < buckets.length; i++) {
Object[] bucketElements = buckets[i].toArray();
System.arraycopy(bucketElements, 0,
result, offset, bucketElements.length);
offset += bucketElements.length;
118 LISTS, SETS, AND MAPS

}
return result;
}

public int size() {


return Arrays.stream(buckets).mapToInt(b -> b.size()).sum();
}

public boolean contains(Object value) {


return getBucket(value).contains(value);
}

public void add(Object value) {


getBucket(value).add(value);
}

public void remove(Object value) {


getBucket(value).remove(value);
}

Notice that the HashSet object simply delegates the contains, add, and remove
operations to the element’s bucket, which is determined by retrieving its hash
code using the hashCode() method and deriving the bucket index by taking the
remainder of dividing the hash code by the number of buckets. If method
hashCode() implements a good hash function, which means that the hash codes
of the elements are uniformly distributed, then the expected number of elements
in each bucket is proportional to the load factor, which is the total number of
elements divided by the number of buckets. This means that if the load factor is
bounded, then so is the expected number of elements in each bucket, and as a
result, the expected time taken by contains, add, and remove is independent of the
number of elements in the HashSet object. The HashSet implementation shown
above has a fixed capacity; this works well if the maximum load (i.e. number
of elements) of the HashSet object is known beforehand. If not, code should be
added to increase the capacity and rehash (i.e. copy the elements from the old
buckets to the new buckets) when the load factor exceeds some fixed threshold
value.

Maps
Applications often need to store a set of key-value pairs (where distinct pairs
have distinct keys) and efficiently add a pair and retrieve the value associated
with a given key. In Java, such sets are known as maps and a key-value pair is
called a map entry.
We first define the Map interface:
package collections;

import java.util.Objects;
MAPS 119

public interface Map {

/** @immutable */
class Entry {

/**
* @invar | key != null
* @invar | value != null
*/
private final Object key;
private final Object value;

/** @post | result != null */


public Object getKey() { return key; }
/** @post | result != null */
public Object getValue() { return value; }

/**
* @pre | key != null
* @pre | value != null
* @post | getKey() == key
* @post | getValue() == value
*/
public Entry(Object key, Object value) {
this.key = key;
this.value = value;
}

@Override
public boolean equals(Object other) {
return other instanceof Entry
&& key.equals(((Entry)other).getKey())
&& value.equals(((Entry)other).getValue());
}

@Override
public int hashCode() {
return Objects.hash(key, value);
}

/**
* @inspects | this
* @creates | result
* @post | result != null
* @post | result.stream().allMatch(e -> e instanceof Entry)
* @post No key appears twice.
* | result.stream().map(e -> ((Entry)e).getKey()).distinct().count()
* | == result.size()
*/
Set entrySet();

/**
* @post | result == entrySet().stream()
* | .filter(e -> ((Entry)e).getKey().equals(key))
120 LISTS, SETS, AND MAPS

* | .map(e -> ((Entry)e).getValue())


* | .findFirst().orElse(null)
*/
Object get(Object key);

/**
* @pre | key != null
* @pre | value != null
* @mutates | this
* @post The given entry is in the entry set.
* | entrySet().contains(new Entry(key, value))
* @post No entries, except for the updated one, have disappeared from the
* entry set.
* | old(entrySet()).stream().allMatch(e ->
* | ((Entry)e).getKey().equals(key) || entrySet().contains(e))
* @post No entries, except for the updated one, have been added to the entry
* set.
* | entrySet().stream().allMatch(e ->
* | ((Entry)e).getKey().equals(key) || old(entrySet()).contains(e))
*/
void put(Object key, Object value);

/**
* @pre | key != null
* @mutates | this
* @post All entries in the entry set were already in the entry set and
* have a key that is different from the given key.
* | entrySet().stream().allMatch(e -> !((Entry)e).getKey().equals(key)
* | && old(entrySet()).contains(e))
* @post All entries that were in the entry set, except for the specified one,
* are still in the entry set.
* | old(entrySet()).stream().allMatch(e ->
* | ((Entry)e).getKey().equals(key) || entrySet().contains(e))
*/
void remove(Object key);

We can implement this interface efficiently using a hash table. Again, we first need
a separate Map implementation for storing the entries that belong to a particular
bucket. For this purpose, we define a simple ArrayList-based implementation:
package collections;

public class ArrayMap implements Map {

/**
* @invar | entries != null
* @invar | entries.stream().allMatch(e -> e instanceof Entry)
* @invar | entries.stream().map(e -> ((Entry)e).getKey()).distinct().count()
* | == entries.size()
*
* @representationObject
*/
private ArrayList entries = new ArrayList();

private int indexOf(Object key) {


MAPS 121

for (int i = 0; i < entries.size(); i++) {


Entry entry = (Entry)entries.get(i);
if (entry.getKey().equals(key))
return i;
}
return -1;
}

public Set entrySet() {


Set result = new ArraySet();
for (int i = 0; i < entries.size(); i++)
result.add(entries.get(i));
return result;
}

public Object get(Object key) {


int index = indexOf(key);
return index == -1 ? null : ((Entry)entries.get(index)).getValue();
}

/**
* @post | entrySet().size() == 0
*/
public ArrayMap() {}

public void put(Object key, Object value) {


int index = indexOf(key);
if (index != -1)
entries.remove(index);
entries.add(new Entry(key, value));
}

public void remove(Object key) {


int index = indexOf(key);
if (index != -1)
entries.remove(index);
}

We can now define the efficient hash table-based implementation:


package collections;

import java.util.Arrays;
import java.util.stream.IntStream;

public class HashMap implements Map {

/**
* @invar | buckets != null
* @invar | Arrays.stream(buckets).allMatch(b -> b != null)
* @invar | IntStream.range(0, buckets.length).allMatch(i ->
* | buckets[i].entrySet().stream().allMatch(e ->
* | Math.floorMod(((Entry)e).getKey().hashCode(),
* | buckets.length) == i))
122 LISTS, SETS, AND MAPS

*
* @representationObject
* @representationObjects
*/
private Map[] buckets;

private Map getBucket(Object key) {


return buckets[Math.floorMod(key.hashCode(), buckets.length)];
}

public Set entrySet() {


ArraySet result = new ArraySet();
for (int i = 0; i < buckets.length; i++)
for (Object entry : buckets[i].entrySet().toArray())
result.add(entry);
return result;
}

public Object get(Object key) {


return getBucket(key).get(key);
}

/**
* @pre | 0 < capacity
* @post | entrySet().size() == 0
*/
public HashMap(int capacity) {
buckets = new Map[capacity];
for (int i = 0; i < capacity; i++)
buckets[i] = new ArrayMap();
}

public void put(Object key, Object value) {


getBucket(key).put(key, value);
}

public void remove(Object key) {


getBucket(key).remove(key);
}

The Java Collections Framework


The Java Platform API is a library of classes and interfaces that are available
to every Java program. It is divided into modules, each of which is divided
into packages. The java.util package of the java.base module contains the Java
Collections Framework, a library of interfaces representing abstract datatypes
and classes that implement them. ADT interfaces include List, Set, and Map, and
the classes include ArrayList, LinkedList, HashSet, and HashMap, and many others.
The interfaces and classes we defined above are simplified versions of the ones in
the Java Collections Framework.
THE JAVA COLLECTIONS FRAMEWORK 123

Another notable interface is interface Collection, which generalizes over lists,


sets, and other collections. Map objects are not Collection objects, but given a Map
object myMap, its set of keys myMap.keySet(), its set of entries myMap.entrySet(), and
its collection of values myMap.values() are Collection objects.
Like we did above, the classes from the Java Collections Framework use the
equals(Object) method to compare a given value to the elements of the collection
or the keys of the map. For example, if we call myCollection.contains(myValue),
the result is true if and only if there is an element e in the collection such that
myValue.equals(e). Furthermore, again like we did above, the hash table-based
implementations use the hashCode() method of the elements/keys to distribute
their elements/entries among the buckets of the hash table. For this to work
correctly, it is crucial that equal elements have equal hash codes.
The interfaces and classes from the Java Collections Framework are generic:
instead of just using Object as the element type, they take the element type as
a type argument. For example, ArrayList<Student> has element type Student, and
Map<String, Course> stores entries whose keys are String objects and whose values
are Course objects.
In Java, type arguments have to be classes or interfaces; you cannot use the
primitive types boolean, byte, short, int, long, char, float, or double as a type
argument. However, you can use the corresponding wrapper class Boolean, Byte,
Short, Integer, Long, Character, Float, or Double. Java automatically converts
between a primitive type and its wrapper class as necessary. For example:
ArrayList<Integer> myInts = new ArrayList<Integer>();
myInts.add(10); // Java automatically wraps value 10 into an `Integer` object.
int x = myInts.get(0); // Java automatically retrieves the wrapped value from the wrapper object.

(Note: the object allocation involved in wrapping a primitive value into an object
may cause a significant overhead in terms of time and space.)
You can iterate over any collection object using the enhanced for loop:
Collection<Student> myStudents = ...;
for (Student student : myStudents)
System.out.println(student.getName());

Some important methods and constructors:


• List.of(), Set.of(),
and Map.of() create an empty immutable list, set, or
map, respectively.
• List.of(42), Set.of(42), and Map.of(42, "forty-two") create a singleton im-
mutable list or a singleton immutable set containing the element 42, or
a singleton immutable map containing an entry with key 42 and value
"forty-two", respectively.
• List.of(10, 20), Set.of(10, 20), Map.of(10, "ten", 20, "twenty") create an im-
mutable list or set containing the two elements 10 and 20, or an immutable
map mapping 10 to "ten" and 20 to "twenty", respectively.
124 LISTS, SETS, AND MAPS

• List.copyOf(myList) returns an immutable copy of List object myList;


Set.copyOf(mySet) returns an immutable copy of Set object mySet; and
Map.copyOf(myMap) returns an immutable copy of Map object myMap. These
methods are useful to ensure encapsulation of representation objects. (In
fact, List.copyOf and Set.copyOf accept arbitrary collections, not just lists
or sets.)
• An alternative approach for ensuring encapsulation when returning a col-
lection from an API is to use methods Collections.unmodifiableList(myList),
Collections.unmodifiableSet(mySet), or Collections.unmodifiableMap(myMap),
which return an unmodifiable view of the given List, Set, or Map object. The
mutator methods of the view object throw an UnsupportedOperationException,
but changes made to the myList, mySet, or myMap objects are visible through
the view.
• new ArrayList<>(myCollection), new HashSet<>(myCollection), and new HashMap<>(myMap)
create a new ArrayList, HashSet, or HashMap object initialized with the
elements of the given Collection or Map object, respectively.
• Arrays.asList(myArray) returns a List view of array myArray; changes to the ar-
ray are visible through the List object, and vice versa. Calling a mutator on
the List object that changes its size throws an UnsupportedOperationException.
• myCollection.addAll(yourCollection) adds all elements of yourCollection to
myCollection.
• Collections.sort(myList) sorts myList.
• Objects.equals(o1, o2) returns true if and only if either o1 and o2 are both
null, or o1 is not null and o1.equals(o2) returns true.
• myList.equals(myObject) returns true if and only if myObject is a List object,
has the same size as myList, and for element e1 of myList and corresponding
element e2 of myObject, Objects.equals(e1, e2) returns true.
• Similarly, mySet.equals(myObject) returns true if and only if myObject is a Set
object, has the same size as mySet, and for every element e1 of mySet, there
is an element e2 in myObject such that Objects.equals(e1, e2) returns true.
• Similarly, myMap.equals(myObject) returns true if and only if myObject is a Map
object and their entry sets are equal as defined above.

Other notable data structures


• ArrayDeque implements ADTs Queue and Deque (double-ended queue) using
the ring buffer data structure. It supports adding to and removing from
the front and the back in constant time, with better performance than
LinkedList.
• When iterating over HashSet or HashMap objects, the order in which elements
are returned is unspecified and may be different every time. LinkedHashSet
and LinkedHashMap are like HashSet and HashMap except that the iteration order
is well-defined: the elements are returned in the order in which they were
added.

The following data structures require that the elements be comparable. Specifi-
THE JAVA COLLECTIONS FRAMEWORK 125

cally, either the elements have to implement interface Comparable or a Comparator


must be specified. Java Platform API classes that implement interface Comparable
include Byte, Short, Integer, Long, Char, Float, Double, and String.
• implements interface SortedSet and TreeMap implements interface
TreeSet
using a red-black tree data structure (a type of balanced search
SortedMap
tree). Lookups and mutations of elements/entries take time logarithmic in
the number of elements/entries. When iterating over the elements/entries,
they are returned in ascending order.
• PriorityQueue implements interface Queue using a priority heap data structure.
Adding an element and removing the least element take time logarithmic
in the number of elements; looking up the least element takes constant
time. It performs better in space and time than a search tree for these
operations, but in contrast to a search tree, iteration over the elements
does not return them in any particular order.
126 LISTS, SETS, AND MAPS
Entity-relationship
abstractions

In this course, we consider the question of how to split up complex software


development tasks into simpler subtasks. The main tool we teach is abstraction:
developing modules that extend the programming language with new operations
(procedural abstraction) and new datatypes (data abstraction) so that client
modules can be written in a more powerful programming language.
In this course, we consider two types of data abstractions:
• single-object abstractions, where each instance of some class represents
(immutable value abstractions) or stores (mutable value abstractions) some
abstract value. See here for an introduction to value abstraction.
• multi-object abstractions, where groups of objects of a single class (single-
class multi-object abstractions) or groups of objects of multiple classes
(multi-class multi-object abstractions) are used to together store special
kinds of abstract values called entity graphs.
Multi-object abstractions, also called entity-relationship abstractions, are the
topic of this document.

OOP Teams
Suppose I wanted to write a Java program to manage the Object-Oriented
Programming course. In particular, I would want to track team compositions
for the project. Students are encouraged to work in teams of two students, but
they can work alone if they really want to.
The information I want the program to keep track of, then, is, mathematically
speaking, the set S of students, along with the relation is-teammate-of on this
set. (One can think of this relation as a subset of the set of pairs of students S
× S.)
For example, suppose the course has five students: S = {Alice, Bob, Carol, Dan,
Eve}. Suppose Alice and Bob constitute a team, and so do Carol and Dan, but

127
128 ENTITY-RELATIONSHIP ABSTRACTIONS

Eve works alone. Then we have is-teammate-of = {(Alice, Bob), (Bob, Alice),
(Carol, Dan), (Dan, Carol)}. (is-teammate-of must always be a symmetric
relation: if X is a teammate of Y, then Y is necessarily a teammate of X as
well.)

Generalizing from this example, it is often the case that we want to use Java
programs to store information of this form, which we call entity graphs. In the
general case, an entity graph consists of some number of sets of entities, and some
number of relations between particular sets of entities. Furthermore, the graph
may associate particular values with each element of some set of entities. For
example, we may want to store each student’s study programme; mathematically,
this corresponds to a function that maps each element of S to some element of
the set P of study programmes. We call each such function, that associates a
value with each entity from some entity set, an attribute. In general, then, an
entity graph consists of some sets of entities, some relations between the sets of
entities, and some attributes.

Writing a program for managing the OOP course in Java includes the complexity
of correctly storing and manipulating OOP team composition graphs. Writing
such a program would be easier if it could be written in a programming language,
let’s call it Java++, that has a built-in way to store these graphs, that is, that
has a built-in type whose values represent OOP students, and built-in operations
for setting and getting a student’s teammate and study programme.

We can achieve this complexity reduction by splitting the task of writing a Java
program for managing the OOP course into two subtasks:

• Writing a client module for managing the OOP course, written in the
language Java++ that has built-in types and operations for storing OOP
team composition graphs
• Writing an OOP team composition graphs module that implements the
extra datatypes and operations of Java++ (or, in other words, the OOP
team composition graph abstraction) in terms of the regular Java datatypes
and constructs such as classes, fields, and methods.

By composing the OOP team composition graphs module and the client module,
we obtain a program for managing the OOP course written in Java.

A natural way to write a module for storing a particular type of entity graph
is to introduce a Java class for each type of entity, a getter in class E for each
attribute of entity type E, and getters in classes E and F for each relation
between entity types E and F.

To store a particular entity graph, then, we would create a separate instance of


class E for each entity of type E in the entity graph, and whenever a relation
from the entity graph links entities e1 and e2, we would store a reference to e2
in object e1 and a reference to e1 in object e2. Obviously, we would also store
in object e the attribute values assigned to entity e by the entity graph.
OOP TEAMS 129

For example, here is one way to implement the OOP team composition graph
abstraction in Java:

package teams;

/**
* Each instance of this class represents an Object-Oriented Programming student
* in an OOP team composition graph.
*
* @invar | getStudyProgramme() != null
* @invar | getTeammate() == null || getTeammate().getTeammate() == this
*/
public class OOPStudent {

/**
* @invar | studyProgramme != null
* @invar | teammate == null || teammate.teammate == this
*/
private final StudyProgramme studyProgramme;
/** @peerObject */
private OOPStudent teammate;

/** @immutable */
public StudyProgramme getStudyProgramme() {
return studyProgramme;
}

/**
* Returns this student's teammate, or {@code null} if this student has no
* teammate.
*
* @peerObject
*/
public OOPStudent getTeammate() {
return teammate;
}

/**
* Initializes this object to represent a student with the given study
* programme and no teammate.
*
* @throws IllegalArgumentException if {@code studyProgramme} is null
* | studyProgramme == null
* @post This student's study programme equals the given study programme
* | getStudyProgramme() == studyProgramme
* @post This student has no teammate
* | getTeammate() == null
*/
public OOPStudent(StudyProgramme studyProgramme) {
if (studyProgramme == null)
throw new IllegalArgumentException("`studyProgramme` is null");
this.studyProgramme = studyProgramme;
}

/**
* Registers the fact that this student is teaming up with the given student.
*
130 ENTITY-RELATIONSHIP ABSTRACTIONS

* @throws IllegalArgumentException if {@code teammate} is null


* | teammate == null
* @throws IllegalStateException if this student already has a teammate
* | getTeammate() != null
* @throws IllegalArgumentException if the given student already has a teammate
* | teammate.getTeammate() != null
* @mutates | this, teammate
* @post This student's teammate equals the given teammate
* | getTeammate() == teammate
* @post The given student's teammate equals this student
* (Note: this postcondition is redundant because it follows from the
* public class invariant.)
* | teammate.getTeammate() == this
*/
public void setTeammate(OOPStudent teammate) {
if (teammate == null)
throw new IllegalArgumentException("`teammate` is null");
if (this.teammate != null)
throw new IllegalStateException("This student already has a teammate");
if (teammate.teammate != null)
throw new IllegalArgumentException(
"The given teammate already has a teammate");
this.teammate = teammate;
teammate.teammate = this;
}

/**
* Registers the fact that this student is splitting up with their teammate.
*
* @throws IllegalStateException if this student has no teammate
* | getTeammate() == null
* @mutates | this
* @post This student has no teammate
* | getTeammate() == null
* @post This student's old teammate has no teammate
* | old(getTeammate()).getTeammate() == null
*/
public void clearTeammate() {
if (teammate == null)
throw new IllegalStateException(
"This student does not have a teammate");
this.teammate.teammate = null;
this.teammate = null;
}
}

The abstraction exposes the is-teammate-of relation as a getter getTeammate() on


class OOPStudent. The call s.getTeammate() returns either student s’s teammate, or
null if student s has no teammate. This reflects the fact that in OOP, a student
can have at most one teammate.
CONSISTENCY OF BIDIRECTIONAL ASSOCIATIONS 131

Consistency of bidirectional associations


This example exhibits a typical characteristic of entity-relationship abstractions:
the states of the objects that together constitute the entity-relationship abstrac-
tion are not independent: if s1.getTeammate() returns s2, then s2.getTeammate()
must return s1. We call this the consistency of the bidirectional asso-
ciation. It is characteristic of entity-relationship abstractions that there are
bidirectional associations between the objects; that is, these objects allow clients
to navigate along the association in both directions. It is a crucial responsibility
of a module that implements an entity-relationship abstraction that it maintain
the consistency of the bidirectional associations at all times. Notice that methods
setTeammate and clearTeammate do so carefully. Each check is necessary; for example,
if we left out the this.teammate != null check in setTeammate, the following client
code would break the consistency of the bidirectional association:
alice.setTeammate(bob);
alice.setTeammate(carol); // Should fail!
assertEquals(alice, bob.getTeammate()); // Inconsistent!

The second setTeammate call is incorrect and should cause an exception. Otherwise,
after this call bob.getTeammate() returns alice while alice.getTeammate() returns
carol, which is inconsistent.

Peer groups
This interdependency between the states of the objects that constitute the
entity-relationship abstraction is reflected in the representation invariants: the
representation invariant for an OOPStudent object s1 where s1.teammate == s2 talks
not just about the fields of s1 but about the fields of s2 as well: in particular,
it specifies that s2.teammate == s1. This means that changing the state of object
s2 can break the representation invariant of s1. This, in turn, means that when
implementing the methods of class OOPStudent, it is crucial to remember not just
to preserve the validity of the representation of the objects directly involved in
the call, but also to remember to preserve the validity of the representation of the
other objects whose representation invariants involve the objects directly involved
in the call. For example, in method clearTeammate, to preserve the representation
invariants of this, it is sufficient to set this.teammate = null. The extra assignment
this.teammate.teammate = null is necessary to preserve the representation invariant
of the receiver’s old teammate.
Generalizing from this example, in order to avoid inadvertently breaking the
representation invariants of objects, when implementing methods we need to
know which objects’ representations invariants we may break when mutating
particular objects. For this reason, we introduce the following rules:
• The representation invariants for an object X may mention a non-final
field or non-immutable property of an object Y only if:
132 ENTITY-RELATIONSHIP ABSTRACTIONS

– X = Y, or
– Y is a representation object of X, or
– Y is a peer object of X.
• If X is a peer object of Y, then Y must be a peer object of X.
We define the set of representation objects of an object X as the set of objects
reachable via the fields marked @representationObject of X. For example, the
array object used by an IntList object to store the elements of the list is a
representation object of the IntList object. Representation objects of an object
X must be encapsulated inside object X; that is, it must not be possible to access
the representation objects of X outside the class of X.
We define the set of peer objects of an object X as the set of objects reachable via
the fields marked @peerObject of X. For example, the teammate of an OOPStudent
object is a peer object of the OOPStudent object. This allows the representation
invariant to mention the teammate’s teammate field.
We refer to a set of objects, each of which is a direct or indirect peer object of
the others, as a peer group. The objects of a peer group together constitute an
entity-relationship abstraction. It does not make sense to consider each member
of a peer group as a separate abstraction; the notions of representation validity
and abstract state apply meaningfully only to a peer group as a whole.
The @peerObject tags on fields are internal documentation for internal use by a
module author to reason about the correctness of the module. However, it is
also important for client module authors to know which objects constitute a peer
group. Therefore, we also define an object’s peer group publicly by marking
appropriate public getters as @peerObject as well, as in the example above.
Since the objects of a peer group do not have independent representations
or abstract states, we interpret an object mentioned in a @mutates clause as
representing that object’s entire peer group. This is why, in the documentation
for method clearTeammate(), it is sufficient to write @mutates | this: the method
mutates object this.teammate as well, but since that object is a peer object of
this, it need not be mentioned separately.
Multi-class
entity-relationship
abstractions

In Entity-relationship abstractions, we introduced the concept of entity-


relationship abstractions. The examples we discussed there involved entity
graphs consisting of only a single type of entity: OOPStudent. Such abstractions are
built from groups of instances of class OOPStudent. We refer to such abstractions
as single-class entity-relationship abstractions.

In this note, we consider the task of designing and implementing abstractions


that store entity graphs involving multiple entity types. We will declare a
separate class for each entity type. Such abstractions are therefore multi-class
entity-relationship abstractions.

To illustrate the concepts, we will consider a generalization of the OOP team


composition graphs example: we will consider graphs of project course students
and the teams they belong to. For simplicity, a team may have arbitrarily many
members, but each student may be a member of only one team.

To implement an abstraction for storing such graphs, we will introduce a class


ProjectCourseStudent to represent students and a class Team to represent teams:

133
134 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS

Here is an initial attempt at implementing these classes:


package bigteams;

public class ProjectCourseStudent {

private Team team;

public Team getTeam() { return team; }

public ProjectCourseStudent() {}

public void join(Team team) {


if (this.team != null)
throw new IllegalStateException("this student is already in a team");

this.team = team;
team.members.add(this);
}

public void leaveTeam() {


if (this.team == null)
throw new IllegalStateException("this student is not in a team");

team.members.remove(this);
team = null;
}
}

package bigteams;

public class Team {


135

private HashSet<ProjectCourseStudent> members = new HashSet<>();

public Set<ProjectCourseStudent> getMembers() { return Set.copyOf(members); }

public Team() {}

package bigteams;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class BigTeamsTest {

@Test
void test() {
ProjectCourseStudent student1 = new ProjectCourseStudent();
ProjectCourseStudent student2 = new ProjectCourseStudent();
Team team = new Team();

student1.join(team);
assertEquals(team, student1.getTeam());
assertEquals(Set.of(student1), team.getMembers());

student2.join(team);
assertEquals(team, student2.getTeam());
assertEquals(Set.of(student1, student2), team.getMembers());

student1.leaveTeam();
assertEquals(Set.of(student2), team.getMembers());

student2.leaveTeam();
assertEquals(Set.of(), team.getMembers());
}

Notice that we use a HashSet to store the members of a team. Class HashSet is
a generic class; it takes the type of its elements as a type parameter. In the
parameterized type HashSet<ProjectCourseStudent>, ProjectCourseStudent is the type
argument for generic class HashSet.
Type HashSet<T> is a subtype of type Set<T>. This means that every instance of
HashSet<T> is also an instance of Set<T>.
Notice also that we implement a bidirectional association between classes
ProjectCourseStudent and class Team, and we make sure that its consistency is
preserved at all times: if a ProjectCourseStudent instance S refers to a Team in-
stance T, then T also refers to S, and vice versa.
Unfortunately, the code above is rejected by Java’s static type checker. Indeed,
class ProjectCourseStudent accesses class Team’s private field members. The private
136 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS

fields of a class may be accessed by the class itself only. We could make field
members public but this would break the abstraction’s encapsulation: clients would
see the abstraction’s internal implementation details and could modify the field
without respecting the abstraction’s representation invariants. For example, a
client could set the members field of some Team instance T to null or could add a
ProjectCourseStudent instance S to its HashSet without updating S’s team field to
point to T, thus breaking the consistency of the bidirectional association and
the validity of the abstraction’s representation.

What we need is some way to make field members accessible to class


ProjectCourseStudent without making it accessible to clients. In Java, this can
be done by putting classes ProjectCourseStudent and Team into a package by
themselves, and making field members package-accessible. A Java program element
is package-accessible by default; that is, if you do not specify any accessibility
modifier (such as private or public) for a program element, the program element
is package-accessible. This means it may be accessed by any code in the same
package.

We conclude that to implement multi-class abstractions, we generally need to


implement them using a package as the unit of encapsulation rather than
a class. We apply this principle to the example by moving the example
client (class BigTeamsTest) out of package bigteams and by making the fields
of class ProjectCourseStudent and Team package-accessible. Analogously to class-
encapsulated abstractions, we specify package representation invariants (using
@invar clauses in the fields’ Javadoc comments) and package abstract state
invariants (using @invar clauses in the classes’ Javadoc comments):
package bigteams;

import logicalcollections.LogicalSet;

/**
* Each instance of this class represents a student in a project course,
* as part of a student-team graph.
*
* @invar If a student is in a team, it is among its members.
* | getTeam() == null || getTeam().getMembers().contains(this)
*/
public class ProjectCourseStudent {

/**
* @invar | true // Phase 1 representation invariant
* @invar | team == null || team.members.contains(this) // Phase 2 rep. inv.
*
* @peerObject
*/
Team team;

/**
* Returns this student's team, or {@code null} if they are not in a team.
*
* @peerObject
137

* @basic
*/
public Team getTeam() { return team; }

/**
* Initializes this object as representing a student who is not in a team.
*/
public ProjectCourseStudent() {}

/**
* Make this student a member of the given team.
*
* @throws IllegalArgumentException if {@code team} is null.
* | team == null
* @throws IllegalStateException if this student is already in a team.
* | getTeam() != null
*
* @mutates_properties | this.getTeam(), team.getMembers()
*
* @post The given team's members equal its old members plus this student.
* | team.getMembers().equals(LogicalSet.plus(old(team.getMembers()), this))
*/
public void join(Team team) {
if (team == null)
throw new IllegalArgumentException("team is null");
if (this.team != null)
throw new IllegalStateException("this student is already in a team");

this.team = team;
team.members.add(this);
}

/**
* Make this student no longer be a member of their team.
*
* @throws IllegalStateException if this student is not in a team.
* | getTeam() == null
*
* @mutates_properties | this.getTeam(), this.getTeam().getMembers()
*
* @post This student is not in a team.
* | getTeam() == null
* @post This student's old team's members are its old members minus this
* student.
* | old(getTeam()).getMembers().equals(
* | LogicalSet.minus(old(getTeam().getMembers()), this))
*/
public void leaveTeam() {
if (this.team == null)
throw new IllegalStateException("this student is not in a team");

team.members.remove(this);
team = null;
}
}

package bigteams;
138 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS

import java.util.HashSet;
import java.util.Set;

/**
* Each instance of this class represents a team in a student-team graph.
*
* @invar Each of this team's members has this team as its team.
* | getMembers().stream().allMatch(s -> s != null && s.getTeam() == this)
*/
public class Team {

/**
* @invar | members != null // Phase 1 representation invariant
* @invar | members.stream().allMatch(s ->
* | s != null && s.team == this) // Phase 2 rep. inv.
*
* @representationObject
* @peerObjects
*/
HashSet<ProjectCourseStudent> members = new HashSet<>();

/**
* Returns this team's set of members.
*
* @post | result != null
* @creates | result
* @peerObjects
* @basic
*/
public Set<ProjectCourseStudent> getMembers() { return Set.copyOf(members); }

/**
* Initializes this object as representing an empty team.
*
* @mutates | this
* @post This team has no members.
* | getMembers().isEmpty()
*/
public Team() {}

package bigteams.tests;

import static org.junit.jupiter.api.Assertions.*;

import java.util.Set;

import org.junit.jupiter.api.Test;

import bigteams.ProjectCourseStudent;
import bigteams.Team;

class BigTeamsTest {

@Test
139

void test() {
ProjectCourseStudent student1 = new ProjectCourseStudent();
ProjectCourseStudent student2 = new ProjectCourseStudent();
Team team = new Team();

student1.join(team);
assertEquals(team, student1.getTeam());
assertEquals(Set.of(student1), team.getMembers());

student2.join(team);
assertEquals(team, student2.getTeam());
assertEquals(Set.of(student1, student2), team.getMembers());

student1.leaveTeam();
assertEquals(Set.of(student2), team.getMembers());

student2.leaveTeam();
assertEquals(Set.of(), team.getMembers());
}

Notice the following:


• We define each object’s peer group by putting a @peerObject tag in the
Javadoc comment for field team and a @peerObjects tag in the Javadoc
comment for field members. The latter means that for any Team object T,
each element of T.members is a peer object of T.
• We also let our clients know each object’s peer group by also putting
a @peerObject tag in the Javadoc comment for getter getTeam() and a
@peerObjects tag in the Javadoc comment for getter getMembers().
• The representation invariant team == null || team.members.contains(this) de-
clared in class ProjectCourseStudent is not well-defined if team.members is null.
This invariant should be guarded by another invariant that says that
team.members is not null, and that is checked first. There is a representa-
tion invariant members != null in class Team. We can use it to guard the
invariant in class ProjectCourseStudent by defining the order of checking the
representation invariants of a peer group as follows: first, the Phase 1
representation invariant of each peer object is checked; then, the Phase 2
representation invariant of each peer object is checked; and so on, until all
representation invariants have been checked. We simply define the Phase 1
representation invariant of an object to be the first representation invariant
defined in the object’s class, and so on. This is why we insert a dummy
representation invariant true into class ProjectCourseStudent; this ensures
that the invariant that relies on team.members being non-null is a Phase 2
invariant.
• We use methods LogicalSet.plus and LogicalSet.minus from logicalcollections
to concisely specify the effect of join and leaveTeam on the team’s set of
members.
• We use a @mutates_properties | this.getTeam(), team.getMembers() clause to
140 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS

conveniently specify that joinTeam(team) mutates the peer group of this and
the peer group of team, and that for any object O in either of these peer
groups, and for any basic getter M of O, we have either that (O, M) is in the
set {(this, getTeam), (team, getMembers)} or Objects.equals(O.M(), old(O.M())).
That is, from the client’s point of view, joinTeam(team) mutates only
this.getTeam() and team.getMembers() and leaves all other properties of both
peer groups involved (as given by the peer group objects’ basic getters)
unchanged. (A getter is basic if it is marked using the @basic tag.)

Nesting class-encapsulated and package-encapsulated


abstractions
Making the fields of a module package-accessible is probably fine for small
modules such as the example module, but for larger modules this quickly becomes
error-prone. Therefore, it is preferrable to always declare fields as private, and
to encapsulate them into a class-level abstraction, even if the class is part of a
multi-class abstraction as well. We call this a case of nested abstractions. In
such cases of class-level abstractions nested within a package-level abstraction,
each class of the package is a client of the class-level abstractions implemented
by the other classes of the package.
When doing so, we can take the opportunity to move as many of the package’s
representation invariants as possible into the individual class-level abstractions.
In the example, we can move the invariants stating that members is not null and
that the elements of members are not null into the Team class-level abstraction:
package bigteams;

import logicalcollections.LogicalSet;

/**
* Each instance of this class represents a student in a project course,
* as part of a student-team graph.
*
* @invar If a student is in a team, it is among its members.
* | getTeam() == null || getTeam().getMembers().contains(this)
*/
public class ProjectCourseStudent {

private Team team;

/**
* @invar | getTeamInternal() == null
* | || getTeamInternal().getMembersInternal().contains(this)
*
* @peerObject
*/
Team getTeamInternal() { return team; }

/**
* Returns this student's team, or {@code null} if they are not in a team.
NESTING CLASS-LEVEL AND PACKAGE-LEVEL ABSTRACTIONS 141

*
* @peerObject
* @basic
*/
public Team getTeam() { return team; }

/**
* Initializes this object as representing a student who is not in a team.
*/
public ProjectCourseStudent() {}

/**
* Make this student a member of the given team.
*
* @throws IllegalArgumentException if {@code team} is null.
* | team == null
* @throws IllegalStateException if this student is already in a team.
* | getTeam() != null
*
* @mutates_properties | this.getTeam(), team.getMembers()
*
* @post The given team's members equal its old members plus this student.
* | team.getMembers().equals(LogicalSet.plus(old(team.getMembers()), this))
*/
public void join(Team team) {
if (team == null)
throw new IllegalArgumentException("team is null");
if (this.team != null)
throw new IllegalStateException("this student is already in a team");

this.team = team;
team.addMember(this);
}

/**
* Make this student no longer be a member of their team.
*
* @throws IllegalStateException if this student is not in a team.
* | getTeam() == null
*
* @mutates_properties | this.getTeam(), this.getTeam().getMembers()
*
* @post This student is not in a team.
* | getTeam() == null
* @post This student's old team's members are its old members minus this
* student.
* | old(getTeam()).getMembers().equals(
* | LogicalSet.minus(old(getTeam().getMembers()), this))
*/
public void leaveTeam() {
if (this.team == null)
throw new IllegalStateException("this student is not in a team");

team.removeMember(this);
team = null;
}
}
142 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS

package bigteams;

import java.util.HashSet;
import java.util.Set;

import logicalcollections.LogicalSet;

/**
* Each instance of this class represents a team in a student-team graph.
*
* @invar Each of this team's members has this team as its team.
* | getMembers().stream().allMatch(s -> s != null && s.getTeam() == this)
*/
public class Team {

/**
* @invar | members != null
* @invar | members.stream().allMatch(s -> s != null)
*
* @representationObject
*/
private HashSet<ProjectCourseStudent> members = new HashSet<>();

/**
* Returns this team's set of members.
*
* @invar | getMembersInternal().stream().allMatch(s ->
* | s.getTeamInternal() == this)
*
* @post | result != null && result.stream().allMatch(s -> s != null)
* @peerObjects
*/
Set<ProjectCourseStudent> getMembersInternal() { return Set.copyOf(members); }

/**
* Returns this team's set of members.
*
* @post | result != null
* @creates | result
* @peerObjects
* @basic
*/
public Set<ProjectCourseStudent> getMembers() { return Set.copyOf(members); }

/**
* Initializes this object as representing an empty team.
*
* @mutates | this
* @post This team has no members.
* | getMembers().isEmpty()
*/
public Team() {}

/**
* Adds the given student to this team's set of students.
*
* @throws IllegalArgumentException if {@code student} is null
NESTING CLASS-LEVEL AND PACKAGE-LEVEL ABSTRACTIONS 143

* | student == null
* @mutates | this
* @post This team's set of members equals its old set of members plus the
* given student.
* | getMembersInternal().equals(
* | LogicalSet.plus(old(getMembersInternal()), student))
*/
void addMember(ProjectCourseStudent student) {
if (student == null)
throw new IllegalArgumentException("student is null");

members.add(student);
}

/**
* Removes the given student from this team's set of students.
*
* @throws IllegalArgumentException if {@code student} is null
* | student == null
* @mutates | this
* @post This team's set of members equals its old set of members minus the
* given student.
* | getMembersInternal().equals(
* | LogicalSet.minus(old(getMembersInternal()), student))
*/
void removeMember(ProjectCourseStudent student) {
if (student == null)
throw new IllegalArgumentException("student is null");

members.remove(student);
}

(We do not show class BigTeamsTest again because it is unchanged.)


Notice the following:
• An instance of a class that is part of a class-encapsulated abstraction nested
within a package-encapsulated abstraction may have both a class-level
peer group and a package-level peer group. The class-level peer group
is always a subset of the package-level peer group. The class-level peer
group is defined by the @peerObject and @peerObjects tags in the Javadoc
comments for the class’s private fields; the package-level peer group is
defined by the @peerObject and @peerObjects tags in the Javadoc comments
for the package’s package-accessible getters. In the example, the objects
have no class-level peer groups (or, equivalently, the class-level peer group
of O is just the singleton {O}).
• We specify a class-level abstraction’s representation invariants using @invar
tags in the Javadoc comments for the class’ private fields.
• In the case of a class-encapsulated abstraction nested within a package-
encapsulated abstraction, we specify the class-level abstraction’s abstract
state invariants using @post tags in the Javadoc comments for the class’
144 MULTI-CLASS ENTITY-RELATIONSHIP ABSTRACTIONS

nonprivate getters.
• We specify a package-level abstraction’s representation invariants using
@invar tags in the Javadoc comments for the package’s package-accessible
fields or getters.
• We specify a package-level abstraction’s abstract state invariants using
@invar tags in the Javadoc comments for the package’s classes and/or using
@post tags in the Javadoc comments for the package’s public getters.
• A public constructor’s author must ensure that when the constructor
returns, all of the constructed object’s class-level and package-level repre-
sentation invariants hold. (Those of its peer objects must hold as well, but
usually a newly constructed object does not yet have any peer objects.)
• At every call of a package-accessible method, it is the caller’s responsibility
to ensure that for each member O of the class-level peer group of each object
inspected or mutated by the method, all of O’s class-level representation
invariants hold. It is the method author’s responsibility to ensure that all
of these invariants hold again when the method returns.
• At every call of a public method, it is the caller’s responsibility to ensure
that for each member O of the package-level peer group of each object
inspected or mutated by the method, all of O’s class-level and package-level
representation invariants hold. It is the method author’s responsibility to
ensure that all of these invariants hold again when the method returns.
• This means that it is not correct to call package-accessible getters inside a
class-level representation invariant or public getters inside any representa-
tion invariant, because this would introduce a circularity: the getter would
be allowed to assume the invariants in whose definition it is involved. It
is, however, allowed (and generally necessary) to call package-accessible
getters in package-level representation invariants, because these getters
can assume only the class-level representation invariants.
• We introduce package-accessible mutators addMember and removeMember into
class Team to allow methods join and leaveTeam in class ProjectCourseStudent
to restore the consistency of the package-level bidirectional association.
How to properly document
multi-object abstractions

These instructions complement the instructions on How to properly document


single-object abstractions.

Fields documentation
If an @invar clause for an object O mentions a field of an object O’, then O’ must
be a representation object of O or a peer object of O.
An @invar clause must be well-defined (i.e. must not crash or loop forever) for
arbitrary (concrete) states of the objects involved, with the following exceptions:
• An @invar clause is evaluated only if preceding @invar clauses of the same
object evaluated to true. So, if, for example, the first clause says that some
field is non-null, the second clause is allowed to call a method on the object
referenced by that field.
• The N’th @invar clause of an object O that is a member of a peer group is
evaluated only if, for each member O’ of the same peer group, and each I
< N, the I’th @invar clause of O’ evaluated to true. So, for example, if the
second @invar clause of some class calls a method on the foo field of a peer
object, this is fine provided that the first @invar clause of the peer object’s
class says that foo is non-null.

Advanced topics
You need not read or apply this section if you are not going for a top score.
To achieve a complete specification of the behavior of a method or constructor,
the @peerObject and @peerObjects clauses should be used to define an object’s peer
group, and the @mutates, @mutates_properties, and @inspects clauses should be used
to indicate which pre-existing peer groups are mutated and inspected by the
method or constructor, and the @creates clause should be used to indicate which

145
146 HOW TO DOCUMENT MULTI-OBJECT ABSTRACTIONS

peer groups that are visible to the client after the method or constructor finishes
were created by the method or constructor.
For more information about peer groups, see the course notes on Entity-
relationship abstractions.

Mutates clauses
A method must not mutate any pre-existing peer groups not mentioned in its
@mutates clause and not used to represent the state of any peer groups mentioned
in its @mutates clause. That is, an execution of a method may mutate an object
O if and only if either O was newly created during the method execution (i.e.
it did not exist when the method execution started), or a member of O’s peer
group is mentioned in the method’s @mutates clause, or O is a member of the
peer group of a representation object of a member of the peer group of an object
mentioned in the method’s @mutates clause, or O is a member of the peer group
of a representation object of a member of the peer group of a representation
object of a member of the peer group of an object mentioned in the method’s
@mutates clause, and so forth.

@mutates_properties and @basic clauses


If a peer group is mentioned in a @mutates clause, then the new state of the
peer group must be specified using postconditions. In general, the constructor
or method’s postconditions must (explicitly or by implication) specify the new
return value of each getter of each member of the peer group. If most objects’
getters’ return values remain unchanged, and for the objects whose getters’
return values do not all remain unchanged, most getters’ return values do remain
unchanged, one can use a @mutates_properties clause.
The clause @mutates_properties | O1.M1(), O2.M2() is equivalent to @mutates | O1, O2
plus a postcondition that states that for each object O in the union of the peer
group of O1 and O2, and for each basic inspector M of O, either (O, M) is in the
set {(O1, M1), (O2, M2)} or Objects.equals(O.M(), old(O.M())). A basic inspector
is a getter whose Javadoc comment contains a @basic tag.

Inspects clauses
Similarly, a method must not inspect the state of any pre-existing mutable
objects that are not in the peer group of any object mentioned in its @inspects
or @mutates clause and not used to represent the state of any objects mentioned
in its @inspects or @mutates clause.

Defaults
If no @mutates, @mutates_properties, or @inspects clause is specified for a given
method or constructor, the default is that it may inspect and mutate any object
ADVANCED TOPICS 147

that is not an instance of an immutable class. Exception: if an instance method’s


name starts with get or is, the default is that it may mutate no object and that
it may inspect this.

Creates clauses
By specifying an object in a @creates clause, you indicate that every member of
the object’s peer group was created during the execution of the method, and
furthermore, that the peer group is disjoint from that of any direct or indirect
representation object of any peer group mentioned in any of the method’s
@inspects or @mutates clauses.

Specifying collections of objects


One can specify a collection of objects in a @mutates, @mutates_properties, @inspects,
or @creates clause using the ...collection syntax:
/**
* @inspects | lists, strings
* @mutates | ...lists
*/
static void allAddAll(StringList[] lists, String[] strings) {
for (StringList list : lists)
list.addAll(strings);
}

/**
* @inspects | lists, ...lists
*/
static boolean anyIsEmpty(StringList[] lists) {
for (StringList list : lists)
if (list.getElements().length == 0)
return true;
return false;
}

class Rectangle {

private int width;


private int height;

/** @basic */
public int getWidth() { return width; }

/** @basic */
public int getHeight() { return height; }

/** @post | result == getWidth() * getHeight() */


public int getArea() { return width * height; }

/**
* @mutates_properties | getWidth()
* @post | getWidth() == newWidth
*/
148 HOW TO DOCUMENT MULTI-OBJECT ABSTRACTIONS

public void setWidth(int newWidth) { width = newWidth; }

/**
* @inspects rectangles
* @mutates_properties | (...rectangles).getWidth()
* @post | Arrays.stream(rectangles).allMatch(r -> r.getWidth() == newWidth)
*/
public static void allSetWidth(Rectangle[] rectangles, int newWidth) {
for (Rectangle rectangle : rectangles)
rectangle.setWidth(newWidth);
}

}
Iterators

Iterators are a solution to an API design problem: how to introduce an abstract


API between a collection implementation and its clients, that hides the data
structure used to implement the collection, without increasing the time or space
(memory) usage of clients that iterate over the elements of the collection?

The problem
We illustrate the problem by means of the example of two data structures for
storing a collection of objects, and a client that iterates over both of them:
public class ArrayList {

public Object[] elements;

public class LinkedList {

public static class Node {


public Object value;
public Node next;
}

public Node firstNode;

public class ClientProgram {

public void printAll(ArrayList arrayList) {


for (int i = 0; i < arrayList.elements.length; i++)
System.out.println(arrayList.elements[i]);
}

public void printAll(LinkedList linkedList) {


for (LinkedList.Node node = linkedList.firstNode;
node != null;
node = node.next)

149
150 ITERATORS

System.out.println(node.value);
}

public void printBoth(ArrayList arrayList, LinkedList linkedList) {


printAll(arrayList);
printAll(linkedList);
}

Notice the following:

• No abstraction or encapsulation is applied to the two data structures; clients


have direct access to the representation. Of course, it is recommended to
apply abstraction and encapsulation as much as possible. However, in this
example we do not yet apply it because it is the point of this text to figure
out how such an abstraction should be designed in the first place.
• The class of linked list nodes is defined as a static nested class within the
class of linked lists. This is essentially equivalent to defining it in the usual
way outside of another class, except that in order to refer to it from outside
of the enclosing class, you need to qualify its name by the name of the
enclosing class. For example, in the client program the class is referred to
as LinkedList.Node. More details about static and non-static nested classes
will follow later in this text.
• When passing an object as an argument, System.out.println calls toString()
on it and prints the resulting string to the screen.

Notice also that the client program has to duplicate the logic of printing all
elements for the two data structures. The question we consider in this text is:
how can we introduce an API that generalizes the two data structures, and
allows the client to write the logic for printing all elements only once and apply
it unchanged to both data structures, without hurting performance?

(We could introduce a method into both data structures that returns the elements
as an array; the client could call it and then iterate over the array to print all
elements. However, this would increase the memory usage of the client: it
would need an amount of extra memory linear in the number of elements in the
collection.)

To see the solution, carefully consider both printAll methods and notice that they
follow a similar pattern: they use some piece of data to track where they are in
the data structure; they perform a test on the piece of data to see whether they
have reached the end of the data structure; they retrieve the current element
pointed to by the piece of data; and they update the piece of data to point to
the next element.
ITERATORS 151

Iterators
The solution, then, is to encapsulate this piece of data into an object, which
we will call an iterator, and provide methods to allow clients to test whether
the iterator has reached the end of the data structure, to retrieve the current
element, and to mutate the iterator so that it points to the next element.
Such an iterator API can take a few different forms, depending on whether and
how some of these functionalities are combined into a single method. Whereas
most programming languages define some kind of standard iterator API, they
do so in different ways. However, in the end all of these forms are essentially
equivalent.
We show the styles used by some of the most popular programming languages
(translated into Java syntax) below:
// Java
public interface Iterator {
boolean hasNext();
/**
* Mutates the iterator to point to the next element and returns the current
* element.
*/
Object next();
}

// C#
public interface Enumerator {
Object getCurrent();
/**
* Mutates the enumerator to point to the next element, or returns `false` if
* the end has been reached.
*/
boolean moveNext();
}

// C++
public interface Iterator {
Object getCurrent(); // syntax in C++: *iterator
void moveNext(); // syntax in C++: iterator++
// in C++, to tell whether you have reached the end of the data structure, you
// have to test equality with a special "one-past-the-end" iterator
boolean equals(Iterator other);
}

// Python
public interface Iterator {
/**
* Throws a StopIteration exception if the end of the data structure has been
* reached.
*/
Object next();
}

// JavaScript
public interface Iterator {
152 ITERATORS

// Nested classes inside interfaces are implicitly public and static


class NextResult { public Object value; public boolean done; }
/**
* If result.done is true, result.value is not an element but an "iterator
* return value".
*/
NextResult next();
}

Note: we also omitted some optional additional functionalities provided by some


of the APIs. For example, Java’s iterator API optionally supports removing
the current element, and C#’s API optionally supports resetting the iterator
so that it again points to the first element of the collection. In each case, the
extra method throws an exception if the iterator object does not support the
operation.
In the remainder of this text, we use the Java-style Iterator API.
We can implement iterators for ArrayList and LinkedList as follows:
public class ArrayListIterator implements Iterator {

public ArrayList arrayList;


public int index;

public ArrayListIterator(ArrayList arrayList) {


this.arrayList = arrayList;
}

public boolean hasNext() { return index < arrayList.elements.length; }

public Object next() { return arrayList.elements[index++]; }

public class LinkedListIterator implements Iterator {

public LinkedList.Node node;

public LinkedListIterator(LinkedList linkedList) {


node = linkedList.firstNode;
}

public boolean hasNext() { return node != null; }

public Object next() {


Object result = node.value;
node = node.next;
return result;
}

Again, for now we do not apply any encapsulation.


ITERABLES 153

This way, the client can now reuse the logic for printing all elements:
public class ClientProgram {

public void printAll(Iterator iterator) {


while (iterator.hasNext())
System.out.println(iterator.next());
}

public void printBoth(ArrayList arrayList, LinkedList linkedList) {


printAll(new ArrayListIterator(arrayList));
printAll(new LinkedListIterator(linkedList));
}

Iterables
We can perform a further step of generalization, and further simplify the client
code, by introducing an interface to be implemented by any collection that
supports iteration:
public interface Iterable {
/** Returns a new iterator that points to the start of the data structure. */
Iterator iterator();
}

We can easily update classes ArrayList and LinkedList to implement this interface.
For example, we can update class ArrayList as follows:
public class ArrayList implements Iterable {

public Object[] elements;

public Iterator iterator() { return new ArrayListIterator(this); }

This allows us to simplify the client code as follows:


public class ClientProgram {

public void printAll(Iterable iterable) {


for (Iterator iterator = iterable.iterator(); iterator.hasNext(); )
System.out.println(iterator.next());
}

public void printBoth(Iterable collection1, Iterable collection2) {


printAll(collection1);
printAll(collection2);
}

}
154 ITERATORS

Notice that the client no longer even needs to know that it is dealing with an
array-based list and a linked list.

Applying nested classes


Static nested classes
Classes ArrayList and ArrayListIterator are not really independent classes; rather,
they together implement the Iterable abstraction. To make this explicit, it makes
sense to instead define ArrayListIterator as a nested class inside of class ArrayList:
public class ArrayList implements Iterable {

private Object[] elements;

private static class IteratorImpl implements Iterator {

private ArrayList arrayList;


private int index;

private IteratorImpl(ArrayList arrayList) { this.arrayList = arrayList; }

public boolean hasNext() { return index < arrayList.elements.length; }

public Object next() { return arrayList.elements[index++]; }

public Iterator iterator() { return new IteratorImpl(this); }

// Constructors and mutators not shown

One major advantage of defining the iterator API implementation as a nested


class is that since nested classes have access to the private members of their
enclosing class, we can now encapsulate the elements field. Another major
advantage is that the nested class itself can be made private; it is now hidden
from other toplevel classes, even if they are in the same package.

Inner classes
We can simplify the implementation of class ArrayList by turning class IteratorImpl
into a nonstatic nested class, more commonly referred to as an inner class. In
the same way that a nonstatic method takes an implicit this argument that
points to an instance of the enclosing class, an inner class has an implicit
EnclosingClass.this field that points to an instance of the enclosing class. Fur-
thermore, each constructor takes an implicit EnclosingClass.this argument and
uses it to initialize the implicit field:
public class ArrayList implements Iterable {
APPLYING NESTED CLASSES 155

private Object[] elements;

private class IteratorImpl implements Iterator {

private int index;

private IteratorImpl() {}

public boolean hasNext() { return index < ArrayList.this.elements.length; }

public Object next() { return ArrayList.this.elements[index++]; }

public Iterator iterator() { return new IteratorImpl(); }

// Constructors and mutators not shown

This new version of IteratorImpl is entirely equivalent to the previous one; it is


in fact identical, except that the explicit field arrayList has been replaced by an
implicit field ArrayList.this and the explicit constructor parameter arrayList has
been replaced by an implicit one.

In fact, inside an inner class, we can refer to members of the enclosing class using
their simple names; in case of nonstatic members, they are implicitly prefixed by
EnclosingClass.this. This means we can further simplify the implementation of
class ArrayList as follows:

public class ArrayList implements Iterable {

private Object[] elements;

private class IteratorImpl implements Iterator {

private int index;

public boolean hasNext() { return index < elements.length; }

public Object next() { return elements[index++]; }

public Iterator iterator() { return new IteratorImpl(); }

// Constructors and mutators not shown

We also left out the constructor, which is now generated implicitly.


156 ITERATORS

Local classes
In fact, since class IteratorImpl is only referred to inside method iterator(), it is
cleaner to define it as a local class inside method iterator():
public class ArrayList implements Iterable {

private Object[] elements;

public Iterator iterator() {

class IteratorImpl implements Iterator {

private int index;

public boolean hasNext() { return index < elements.length; }

public Object next() { return elements[index++]; }

return new IteratorImpl();


}

// Constructors and mutators not shown

Note: local classes are not visible outside the enclosing method, so there would
be no point in adding the private keyword, and in fact this is not allowed.
As we will see later, an additional advantage of local classes is that they can
(under certain conditions) refer to the parameters and local variables of the
enclosing method.

Anonymous classes
We can apply one more simplification: since in our example class IteratorImpl is
referred to in only one place, to create an instance of it, we can replace it by an
anonymous class. This relieves us from the need to invent a name for it:
public class ArrayList implements Iterable {

private Object[] elements;

public Iterator iterator() {

return new Iterator() {

private int index;

public boolean hasNext() { return index < elements.length; }

public Object next() { return elements[index++]; }


ENHANCED FOR LOOPS 157

};

// Constructors and mutators not shown

As you can see, an anonymous class instance creation expression consists of:
• the keyword new
• the name of a class or interface to be implemented by the anonymous class
• a constructor argument list
• a class body

Enhanced for loops


The interfaces Iterator and Iterable defined above exist in the Java Platform
API in packages java.util and java.lang, respectively. When using those, client
code like
for (Iterator iterator = iterable.iterator(); iterator.hasNext(); ) {
Object element = iterator.next();
System.out.println(element);
}

can be written more concisely using an enhanced for loop, as follows:


for (Object element : iterable) {
System.out.println(element);
}

All collection classes in package java.util, such as java.util.ArrayList and


java.util.HashSet, implement interface java.lang.Iterable and can therefore be
iterated over using an enhanced for loop.

Internal iterators: forEach methods


The kind of iterators we have seen so far are referred to as external iterators:
the client requests an iterator object and then interacts with the iterator object
to retrieve the successive elements of the collection. Another common type of
iterator API is known as internal iterators. This is the primary type of iterators
in the Ruby programming language. In the case of internal iterators, the client
passes a consumer object to the collection; the collection then repeatedly calls
the consumer object’s accept method, passing each element of the collection in
turn as an argument.
This leads to an alternative definition of Iterable:
158 ITERATORS

public interface Iterable {


void forEach(Consumer consumer);
}

where interface Consumer is defined as follows:


public interface Consumer {
void accept(Object value);
}

We can trivially make class ArrayList implement this interface as follows:


public class ArrayList implements Iterable {

public Object[] elements;

public void forEach(Consumer consumer) {


for (int i = 0; i < elements.length; i++)
consumer.accept(elements[i]);
}

We can adapt the client program to use this API as follows:


public class ClientProgram {

public void printAll(Iterable iterable) {


iterable.forEach(new Consumer() {

public void accept(Object value) {


System.out.println(value);
}

});
}

public void printBoth(Iterable collection1, Iterable collection2) {


printAll(collection1);
printAll(collection2);
}

Notice that the client program implements interface Consumer with an anonymous
class whose accept method prints the element to the screen.

Lambda expressions
We can write this client program even more concisely using a lambda expression:
public class ClientProgram {
INTERNAL ITERATORS: FOREACH METHODS 159

public void printAll(Iterable iterable) {


iterable.forEach((Object value) -> { System.out.println(value); });
}

public void printBoth(Iterable collection1, Iterable collection2) {


printAll(collection1);
printAll(collection2);
}

This lambda expression is exactly equivalent to the anonymous class instance


creation expression it replaces; it is a very concise way to implement a functional
interface, an interface with a single abstract method. Interface Consumer is a
functional interface. The lambda expression in the example implements interface
Consumer because this is the type of object that method forEach expects to receive
as an argument; this is known as target typing.
In general, a lambda expression consists of a parameter list, an arrow, and a
method body. We can, however, leave out the parameter types; furthermore, if
the method body is of the form { E; } or { return E; }, where E is an expression,
we can instead write just E, and if there is only a single parameter, we can leave
out the parentheses:
iterable.forEach(value -> System.out.println(value));

Capturing outer variables


Local classes, anonymous classes, and lambda expressions can refer to parameters
and local variables from the enclosing method. Their value is copied into an
implicit field of the resulting object; this is known as capturing.
For example, suppose we want to print only the elements that satisfy some
condition, given by a Predicate object:
public class ClientProgram {

public void printAll(Predicate condition, Iterable iterable) {


iterable.forEach(value -> {
if (condition.test(value))
System.out.println(value);
});
}

public void printBoth(Predicate condition,


Iterable collection1, Iterable collection2) {
printAll(condition, collection1);
printAll(condition, collection2);
}

}
160 ITERATORS

The Consumer object that results from evaluating the lambda expression has an
implicit field that holds a copy of parameter condition.
Java allows only effectively final variables to be captured this way. A variable is
effectively final if it is not mutated after initialization.
Generics

Basic concept
In this note, we introduce Java’s support for generic classes, interfaces, and
methods. We motivate and illustrate the concepts by means of the example task
of implementing a class University that has the following methods:
class University {

void addStudent(Student student) { ... }

boolean hasStudent(Student student) { ... }

/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() { ... }

void addStaff(Staff staff) { ... }

boolean hasStaff(Staff staff) { ... }

/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() { ... }

(In this note, we mostly ignore issues of encapsulation and documentation, to


focus on the topic of generics.)
Class University uses classes Student and Staff:

class Student { int nbCredits; }


class Staff { int nbPubs; }

Approach 1: Duplicate collection classes


To implement class University, we need to implement a data structure for storing
the current set of students, and a data structure for storing the current set of

161
162 GENERICS

staff members. A data structure for storing a set of students could look like this:
class LinkedListOfStudent implements IterableOfStudent {

static class Node {


Student element;
Node next;

Node(Student element, Node next) {


this.element = element;
this.next = next;
}
}

Node firstNode;

boolean contains(Student student) {


for (Node node = firstNode; node != null; node = node.next)
if (node.element == student)
return true;
return false;
}

public IteratorOfStudent iterator() {


return new IteratorOfStudent() {
Node node = firstNode;
public boolean hasNext() { return node != null; }
public Student next() {
Student result = node.element;
node = node.next;
return result;
}
};
}

void addFirst(Student student) {


firstNode = new Node(student, firstNode);
}
}

This class uses the interfaces IterableOfStudent and IteratorOfStudent defined as


follows:
interface IterableOfStudent {

IteratorOfStudent iterator();

interface IteratorOfStudent {

boolean hasNext();

Student next();

}
BASIC CONCEPT 163

Assuming we also implement analogous types LinkedListOfStaff, IterableOfStaff


and IteratorOfStaff, we can implement class University as follows:
class University {

private LinkedListOfStudent students = new LinkedListOfStudent();


private LinkedListOfStaff staffMembers = new LinkedListOfStaff();

void addStudent(Student student) {


students.addFirst(student);
}

boolean hasStudent(Student student) {


return students.contains(student);
}

/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() {
int result = 0;
for (IteratorOfStudent iterator = students.iterator();
iterator.hasNext(); ) {
Student student = iterator.next();
if (student.nbCredits >= 120)
result++;
}
return result;
}

void addStaff(Staff staff) {


staffMembers.addFirst(staff);
}

boolean hasStaff(Staff staff) {


return staffMembers.contains(staff);
}

/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() {
int nbStaff = 0;
int totalNbPubs = 0;
for (IteratorOfStaff iterator = staffMembers.iterator();
iterator.hasNext(); ) {
Staff staff = iterator.next();
nbStaff++;
totalNbPubs += staff.nbPubs;
}
return totalNbPubs / nbStaff;
}

Obviously, we would like to avoid having to introduce a separate linked list


implementation, and separate types for iterables and iterators, for each element
164 GENERICS

type.

Approach 2: Using subtype polymorphism


One way to achieve reuse of collection classes is by exploiting the fact that all
objects are of type Object, so we can use a data structure with element type
Object to store any collection:

interface Iterable {

Iterator iterator();

interface Iterator {

boolean hasNext();

Object next();

class LinkedList implements Iterable {

static class Node {


Object element;
Node next;

Node(Object element, Node next) {


this.element = element;
this.next = next;
}
}

Node firstNode;

boolean contains(Object element) {


for (Node node = firstNode; node != null; node = node.next)
if (node.element == element)
return true;
return false;
}

public Iterator iterator() {


return new Iterator() {
Node node = firstNode;
public boolean hasNext() { return node != null; }
public Object next() {
Object result = node.element;
node = node.next;
return result;
}
};
}

void addFirst(Object element) {


BASIC CONCEPT 165

firstNode = new Node(element, firstNode);


}
}

We can use class LinkedList to implement class University. Note, however, that
we do have to insert a typecast in methods getNbFinishers() and getAvgNbPubs():
class University {

private LinkedList students = new LinkedList();


private LinkedList staffMembers = new LinkedList();

void addStudent(Student student) {


students.addFirst(student);
}

boolean hasStudent(Student student) {


return students.contains(student);
}

/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() {
int result = 0;
for (Iterator iterator = students.iterator(); iterator.hasNext(); ) {
Student student = (Student)iterator.next(); // Typecast!
if (student.nbCredits >= 120)
result++;
}
return result;
}

void addStaff(Staff staff) {


staffMembers.addFirst(staff);
}

boolean hasStaff(Staff staff) {


return staffMembers.contains(staff);
}

/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() {
int nbStaff = 0;
int totalNbPubs = 0;
for (Iterator iterator = staffMembers.iterator(); iterator.hasNext(); ) {
Staff staff = (Staff)iterator.next(); // Typecast!
nbStaff++;
totalNbPubs += staff.nbPubs;
}
return totalNbPubs / nbStaff;
}

}
166 GENERICS

Note also that with this approach, we lose much of the benefit of Java’s static
type checker: many programming errors that would be caught by the static type
checker before we run the program when using Approach 1 are only detected, if
at all, during execution of the program when using Approach 2. This includes
the following errors:
• Adding the student to staffMembers instead of students in method addStudent.
This would lead to a ClassCastException in getAvgNbPubs() if and when that
method gets called.
• Calling contains on staffMembers instead of students in method hasStudent.
This would not cause an exception; instead, it would silently produce wrong
results.
• Iterating over staffMembers instead of students in method getNbFinishers. This
would lead to a ClassCastException in getNbFinishers(), except if staffMembers
is empty. In the latter case, it would silently produce wrong results.
• The corresponding errors in addStaff, hasStaff, and getAvgNbPubs.

Aproach 3: Generics
We can achieve reuse without sacrificing static type checking by defining types
Iterator, Iterable, and LinkedList as generic types with a type parameter T:

interface Iterable<T> {

Iterator<T> iterator();

interface Iterator<T> {

boolean hasNext();

T next();

class LinkedList<T> implements Iterable<T> {

static class Node<T> {


T element;
Node<T> next;

Node(T element, Node<T> next) {


this.element = element;
this.next = next;
}
}

Node<T> firstNode;

boolean contains(T element) {


for (Node<T> node = firstNode; node != null; node = node.next)
if (node.element == element)
BASIC CONCEPT 167

return true;
return false;
}

public Iterator<T> iterator() {


return new Iterator<T>() {
Node<T> node = firstNode;
public boolean hasNext() { return node != null; }
public T next() {
T result = node.element;
node = node.next;
return result;
}
};
}

void addFirst(T element) {


firstNode = new Node<T>(element, firstNode);
}
}

We can obtain classes equivalent to classes LinkedListOfStudent and


LinkedListOfStaff above, simply by instantiating generic class LinkedList
with type argument Student and Staff, respectively, to obtain parameterized types
LinkedList<Student> and LinkedList<Staff>:

class University {

private LinkedList<Student> students = new LinkedList<Student>();


private LinkedList<Staff> staffMembers = new LinkedList<Staff>();

void addStudent(Student student) {


students.addFirst(student);
}

boolean hasStudent(Student student) {


return students.contains(student);
}

/** Returns the number of students that have obtained at least 120 credits. */
int getNbFinishers() {
int result = 0;
for (Iterator<Student> iterator = students.iterator();
iterator.hasNext(); ) {
Student student = iterator.next();
if (student.nbCredits >= 120)
result++;
}
return result;
}

void addStaff(Staff staff) {


staffMembers.addFirst(staff);
}

boolean hasStaff(Staff staff) {


168 GENERICS

return staffMembers.contains(staff);
}

/**
* Returns the average number of scientific publications authored by this
* university's staff members.
*/
int getAvgNbPubs() {
int nbStaff = 0;
int totalNbPubs = 0;
for (Iterator<Staff> iterator = staffMembers.iterator();
iterator.hasNext(); ) {
Staff staff = iterator.next();
nbStaff++;
totalNbPubs += staff.nbPubs;
}
return totalNbPubs / nbStaff;
}

Note: if the type arguments for an instance creation expression can be de-
rived from the context, they can be omitted: in the example above, instead of
new LinkedList<Student>(), we can simply write new LinkedList<>(); this is known
as diamond notation.

Bounded type parameters


Suppose we want to store the university’s students sorted by number of cred-
its obtained, and the staff members sorted by number of publications. We
want to develop a class SortedLinkedList<T> as a subclass of LinkedList<T>. For
class SortedLinkedList<T> to be able to compare its elements, its elements should
implement interface Comparable<T>, defined as follows:
interface Comparable<T> {

/**
* Returns a negative number, zero, or a positive number if this object
* compares as less than, equal to, or greater than {@code other}.
*/
int compareTo(T other);

We can let the static type checker enforce this by declaring Comparable<T> as an
upper bound of the type parameter of SortedLinkedList:
class SortedLinkedList<T extends Comparable<T>> extends LinkedList<T> {

@Override
void addFirst(T element) {
Node<T> sentinel = new Node<T>(null, firstNode);
Node<T> node = sentinel;
INVARIANCE 169

while (node.next != null && node.next.element.compareTo(element) < 0)


node = node.next;
node.next = new Node<T>(element, node.next);
firstNode = sentinel.next;
}

Notice that thanks to the upper bound, the static type checker allows us to call
the compareTo method on expressions of type T.
We can update class University to use class SortedLinkedList by updating the field
declarations as follows:
private LinkedList<Student> = new SortedLinkedList<>();
private LinkedList<Staff> = new SortedLinkedList<>();

The static type checker will allow this only after we extend classes Student and
Staff to implement interface Comparable:

class Student implements Comparable<Student> {

int nbCredits;

public int compareTo(Student other) { return nbCredits - other.nbCredits; }

class Staff implements Comparable<Staff> {

int nbPubs;

public int compareTo(Staff other) { return nbPubs - other.nbPubs; }

(The term "upper bound" refers to the image of superclasses being "above"
subclasses.)

Invariance
Suppose, now, that we want to extend class University with a method getMembers()
that returns a collection containing all members of the university. First, we
introduce class Member as a superclass of Student and Staff:
class Member {}

class Student extends Member implements Comparable<Student> { ... }

class Staff extends Member implements Comparable<Staff> { ... }


170 GENERICS

For the purpose of illustrating various aspects of generics, we implement method


getMembers() as follows:

class University {

// ...

LinkedList<Member> getMembers() {
LinkedList<Member> members = new LinkedList<>();
members.addAll(students);
staffMembers.copyInto(members);
return members;
}

We make a first attempt at defining methods addAll and copyInto in class LinkedList
as follows:
class LinkedList<T> implements Iterable<T> {

// ...

void addAll(LinkedList<T> other) {


for (Iterator<T> iterator = other.iterator(); iterator.hasNext(); )
addFirst(iterator.next());
}

void copyInto(LinkedList<T> other) {


for (Iterator<T> iterator = this.iterator(); iterator.hasNext(); )
other.addFirst(iterator.next());
}

This, however, does not work, for two reasons.


• Firstly, the static type checker complains about the call of addAll in
method getMembers(): its argument is of type LinkedList<Student>, which
is not assignable to parameter other of type LinkedList<Member>. Indeed,
LinkedList<Student> is not a subtype of LinkedList<Member>, even though
Student is a subtype of Member. In other words, LinkedList<T> is not co-
variant in T.
There is a good reason for this: if LinkedList<T> were covariant, the static
type checker would not complain if we incorrectly tried to add the univer-
sity’s staff members to its collection of students, as follows:
LinkedList<Member> studentsAsMembers = university.students;
studentsAsMembers.addAll(university.staffMembers);

Indeed, an object of type LinkedList<Member> must accept the addition of


arbitrary new elements of type Member. An object of type LinkedList<Student>
does not satisfy that condition: it accepts only Student objects.
WILDCARDS 171

• Secondly, the static type checker also complains about the call of copyInto
in method getMembers(): its argument is of type LinkedList<Member>, which
is not assignable to parameter other of type LinkedList<Staff>. Indeed,
LinkedList<Member> is not a subtype of LinkedList<Staff>, even though Member
is a supertype of Staff. In other words, LinkedList<T> is not contravariant
in T.
The reason for this is obvious: a LinkedList<Staff> must contain only Staff
objects, whereas a LinkedList<Member> object may additionally contain other
Member objects, such as Student objects.

In summary, generic types are neither covariant nor contravariant in their type
parameter. In other words, they are invariant.

Wildcards
Upper-bounded wildcards
Even though LinkedList<Student> is not a subtype of LinkedList<Member>, it is in
fact safe for the call of members.addAll to pass students as an argument: indeed,
addAll only retrieves elements from its argument; it does not add new elements
to it. For that reason, it is safe for addAll to take as an argument a LinkedList<U>
object, for any subtype U of T. We can express this by using an upper-bounded
wildcard:
void addAll(LinkedList<? extends T> other) {
for (Iterator<? extends T> iterator = other.iterator(); iterator.hasNext(); )
addFirst(iterator.next());
}

Wildcard type LinkedList<? extends T> generalizes over all types LinkedList<U>, for
all subtypes U of T, as well as LinkedList<T> itself. It could be proncounced "linked
list of some type that extends T".

Lower-bounded wildcards
Even though LinkedList<Member> is not a subtype of LinkedList<Staff>, it is in
fact safe for the call of staffMembers.copyInto to pass members as an argument:
indeed, copyInto only puts elements into its argument; it does not retrieve any
elements from it. For that reason, it is safe for copyInto to take as an argument
a LinkedList<U> object, for any supertype U of T. We can express this by using a
lower-bounded wildcard:
void copyInto(LinkedList<? super T> other) {
for (Iterator<T> iterator = this.iterator(); iterator.hasNext(); )
other.addFirst(iterator.next());
}
172 GENERICS

Wildcard type LinkedList<? super T> generalizes over all types LinkedList<U>, for
all supertypes U of T, as well as LinkedList<T> itself. It could be pronounced "linked
list of some supertype of T".

Generic methods
Suppose we add the following static method to class LinkedList:

static void copyInto(LinkedList<...> from, LinkedList<...> to) {


from.copyInto(to); // equivalently: to.addAll(from);
}

What type arguments should we write here?


Here, we need to link the two parameter types. To do so, we need to turn method
copyInto into a generic method by declaring a method-level type parameter T.
Then, there are three equivalent ways to complete the method declaration:
static <T> void copyInto(LinkedList<T> from, LinkedList<? super T> to) {
static <T> void copyInto2(LinkedList<? extends T> from, LinkedList<T> to) {
static <T> void copyInto3(LinkedList<? extends T> from, LinkedList<? super T> to) {

Assuming we pick the first declaration, we can rewrite method getMembers() in


class University to use this method as follows:
LinkedList<Member> getMembers() {
LinkedList<Member> members = new LinkedList<>();
LinkedList.<Student>copyInto(students, members);
LinkedList.<Staff>copyInto(staffMembers, members);
return members;
}

Notice that we can specify the type argument for the type parameter of the
method being called by putting it between angle brackets in front of the method
name.
In most cases, however, the type argument can be inferred from the argument
types or from the context of the call; in these cases, we can omit the type
argument:
LinkedList<Member> getMembers() {
LinkedList<Member> members = new LinkedList<>();
LinkedList.copyInto(students, members);
LinkedList.copyInto(staffMembers, members);
return members;
}

We can also use a method-level type parameter to link the return type to a
parameter type:
LIMITATIONS 173

static <T> LinkedList<T> copy(LinkedList<T> list) {


LinkedList<T> result = new LinkedList<>();
result.addAll(list);
return result;
}

Limitations
After the static type checker finishes checking a Java program, but before it is
executed, all generics are erased from it. That is, in generic type declarations,
each type parameter is replaced by its upper bound (or Object if it has no explicit
upper bound), and type arguments are simply removed. Typecasts are inserted
as necessary to preserve well-typedness. For example, after erasing the example
program from Approach 3, we obtain the example program from Approach 2.
This involves inserting typecasts that cast the result of iterator.next() to the
expected type.
This approach, called erasure, has the following implications:
• Type arguments must be subtypes of Object; using a primitive type (like
int) as a type argument is not allowed. To store primitive values in a
generic collection, you must first box them. For example, ArrayList<int> is
not allowed, but ArrayList<Integer> is allowed. Java will convert int values
to Integer objects and vice versa automatically. (However, the memory
allocation involved in autoboxing may have a significant performance
impact.)
• Since type arguments are not available at run time, they cannot be used
in ways that affect the run-time type of an object or otherwise affect the
program’s run-time behavior. For example, if T is a type parameter, the
expressions new T(), new T[], T.class, or ... instanceof T are not allowed.
However, new LinkedList<T>() is allowed, since the type argument <T> is
removed during erasure anyway.
• Casts to types that are not run-time types (i.e. types that are different from
their erasure, such as T (whose erasure is Object) or LinkedList<T> (whose
erasure is LinkedList)) are allowed, but it may not be possible to fully check
them at run time. In those cases, the static type checker generates an
unchecked cast warning. For example, suppose object is a variable of type
Object. The cast (LinkedList<Student>)object generates an unchecked cast
warning, since it cannot be determined at run time whether the LinkedList
object was created as a LinkedList<Student>, a LinkedList<Staff>, or with yet
some other type argument. If object points to a LinkedList object that
contains a Staff object, then ((LinkedList<Student>)object).iterator().next()
will throw a ClassCastException when the next() call returns, because this
expression’s erasure is (Student)((LinkedList)object).iterator().next() and
the return value of the next() call is a Staff object.
174 GENERICS

You might also like