Java Collections Framework – Comprehensive Guide
Java Collections Framework – Comprehensive Guide
Guide
The Java Collections Framework (JCF), introduced in JDK 1.2, provides a unified architecture for storing
and manipulating groups of objects 1 . It defines core interfaces (such as Collection , List , Set ,
Queue , Deque , Map , etc.) and concrete implementations ( ArrayList , LinkedList , HashSet ,
TreeSet , HashMap , etc.) 2 3 . The framework also includes utility classes and algorithms for sorting,
searching, and more. Below we explore the hierarchy, code examples, comparisons, use-cases,
performance, and interview tips for the JCF.
These relationships are illustrated in the diagram above 4 . In practice, a List maintains element order
and allows duplicates, while a Set prohibits duplicates (with TreeSet keeping elements in sort order)
16 17 . A Map stores key→value pairs; for sorted key maps TreeMap is used, while HashMap is the
general-purpose unordered map 3 .
1
Core Interfaces and Implementations
Collection<E> is the root of the framework (except Map ). It defines common methods like add ,
remove , size , iterator , etc. All concrete collections (e.g. lists, sets, queues) implement it 5 .
• ArrayList<E> – A resizable-array list (unsynchronized). Oracle docs call it “the best all-around
implementation of List” 7 . It supports fast random access (O(1) get) and amortized O(1) adds at the
end, but O(n) for insertions/deletions in the middle (due to shifting) 18 .
• LinkedList<E> – A doubly-linked list implementing both List and Deque . It provides O(1) inserts/
removes at the ends, but O(n) for indexed access 18 . As Oracle notes, it “provides better
performance than ArrayList if elements are frequently inserted or deleted” 8 . It can also function
as a FIFO queue (using offer/poll ) or a stack (using push/pop ).
• Vector<E> – A legacy synchronized array list. It has similar behavior to ArrayList but with all
methods synchronized (thread-safe) 9 . Its iteration is fail-fast like other collections (modifications
during iteration throw ConcurrentModificationException ).
• Stack<E> – A subclass of Vector representing a LIFO stack (push, pop, peek). It is considered
legacy; for stack behavior, Deque (e.g. ArrayDeque ) is usually preferred now.
2
System.out.println("Stack pop: " + stack.pop()); // banana
System.out.println("Stack peek: " + stack.peek()); // apple
• Performance: As the Big-O table below shows, ArrayList has O(1) get, O(1) add(at end,
amortized), but O(n) for insert/remove at arbitrary positions 18 . LinkedList has O(n) for
get(index) and contains() , but O(1) for adding/removing at head or tail 18 .
• HashSet<E> – Hash table implementation of Set . Unordered, allows one null element. Operations
add , remove , contains are O(1) on average 20 . It is the typical all-purpose set.
• LinkedHashSet<E> – Like HashSet , but maintains a doubly-linked list of entries to preserve
insertion order. Performance is similar to HashSet (all basic ops O(1) amortized) 21 . Use it when
predictable iteration order is needed.
3
• TreeSet<E> – Sorted set based on a red-black tree ( NavigableSet ). Elements are ordered by their
natural ordering or a supplied Comparator . Operations add , remove , contains are O(log n)
21 . Does not allow null elements (for natural order).
• When to use: HashSet for fastest lookup when order doesn’t matter. LinkedHashSet if you
need to iterate in insertion order. TreeSet when you need a sorted set (for example, maintaining a
sorted list of items) 22 21 . As GFG notes: use HashSet for unique objects (no order),
LinkedHashSet to preserve insertion order, and TreeSet when sorting is needed 22 .
• HashMap<K,V> – Hash table, unsynchronized, allows one null key and multiple null values. It’s
generally the “best all-around” map 12 . Average get/put are O(1) 23 (worst-case O(n) in rare
collision scenarios).
• LinkedHashMap<K,V> – Extends HashMap with a linked list of entries. Maintains insertion order
(or access order if configured) 13 . Performance is slightly slower than HashMap due to ordering
overhead, but still O(1) for basic ops. Often used for LRU caches by overriding
removeEldestEntry .
• TreeMap<K,V> – Red-black tree implementation of NavigableMap . Keys are sorted by their natural
order or a Comparator . Guarantees O(log n) for get , put , etc. 14 . Does not allow null keys
(unless a custom comparator handles them).
• Hashtable<K,V> – Legacy synchronized hash table (no null keys or values) 9 . Generally replaced by
ConcurrentHashMap or a synchronized HashMap .
4
hashMap.put("two", 2);
System.out.println("HashMap get one: " + hashMap.get("one"));
• Properties: A special subclass of Hashtable for string-based key/value pairs (often configuration).
As the Oracle docs state, Properties is a persistent set of properties, with both key and value as
String 15 . You typically use setProperty / getProperty and can load/store from streams.
• A class can implement Comparable<T> and override compareTo(T o) to define its natural
ordering. For example, many Java classes ( String , Integer , etc.) implement Comparable ,
which allows methods like Collections.sort(list) to sort them automatically 24 . Oracle
notes: “Comparable implementations provide a natural ordering for a class”, enabling automatic sorting
24 . If you call Collections.sort(list) on a list of objects not implementing Comparable ,
5
public String toString() { return name+":"+age; }
}
ArrayList vs LinkedList
• Underlying structure: ArrayList uses a resizable array; LinkedList uses a doubly-linked list.
• Access: ArrayList has O(1) random access (via index) 18 , whereas LinkedList must traverse
nodes (O(n)).
• Insertion/Removal: Inserting or removing in the middle of an ArrayList is O(n) (elements must
be shifted) 18 . In a LinkedList , adding/removing at the ends is O(1) (just pointer updates), but
finding the insertion point is O(n). If insertions/deletions are frequent and not always at the end,
LinkedList can be faster overall 8 .
• Memory overhead: ArrayList stores a contiguous array (waste only an occasional extra array
capacity). LinkedList stores each element in a node with two additional references (previous/
next), so it has more per-element overhead.
• Use case: Typically, ArrayList is preferred for most use-cases (better cache locality and random
access). LinkedList is chosen if you need constant-time insert/delete at the ends or iterating in
both directions.
Performance O(1) average 23 O(log n) guaranteed O(1) average (slightly slower than
(get/put) (worst-case O(n)) 23 HashMap) 23
6
Note: LinkedHashMap can be used to implement LRU caches by overriding
removeEldestEntry .
Backed by a
Underlying Backed by a HashMap Backed by a LinkedHashMap
TreeMap
7
• Avoid nulls if possible: Although some collections allow null , using null as a map key or inserting
null into a set can complicate logic.
Real-world example: To remove duplicates from a List while preserving order, one idiom is:
The above converts the List to a LinkedHashSet (which drops duplicates but keeps insertion order)
and back to a List .
HashMap/ HashSet/
Type Operation ArrayList LinkedList TreeMap TreeSet Priorit
LinkedHashMap LinkedHashSet
O(log n) O(log
Search contains(o) O(n) 18 O(n) 18 O(1) avg 23 O(1) avg 20 O(n)
23 n) 21
O(1)
O(1) (at O(log n) O(log
Insert add / put amortized O(1) avg 23 O(1) avg 20 O(log
ends) 18 23 n) 21
18
Key points: ArrayList provides constant-time positional access, but linear-time insert/remove except at the
end 18 . LinkedList has faster adds/removes at the ends, but slow random access 18 . HashMap/HashSet
have average O(1) for basic ops 20 23 (worst-case O(n) in case of many collisions). TreeMap/TreeSet
guarantee O(log n) for gets/puts 23 21 . PriorityQueue enqueues/dequeues in O(log n). The actual
constants and cache effects can also matter; for example, ArrayList iterations are very fast due to
contiguous memory.
8
• Thread-safety: Vector and Hashtable are synchronized (legacy); most modern code uses
unsynchronized collections or the java.util.concurrent package. Discuss when to use
Collections.synchronizedX() or ConcurrentHashMap .
• Iterator vs Enumeration: Iterator replaces legacy Enumeration . Only Iterator has
remove() and is used by all collections (fail-fast).
• Removing items during iteration: Use Iterator.remove() , or use Java 8 streams or
removeIf to avoid ConcurrentModificationException .
• Comparable/Comparator: Understand how to implement ordering. You may be asked to sort a list
of custom objects – recall that the class should implement Comparable , or you must supply a
Comparator .
• Collisions and HashCodes: For maps/sets, understand that poorly implemented hashCode() can
degrade performance.
• Use-cases: Be prepared to suggest which collection fits a scenario. For example, “If you need a FIFO
buffer, use LinkedList or ArrayDeque ; for LIFO, use Deque ; for priority-based processing, use
PriorityQueue ; for lookup by key, use HashMap or TreeMap depending on ordering needs,”
etc.
• Utilities: Mention knowledge of Collections utility class (sorting, reversing, empty collections,
singleton collections) and the fact that Arrays.asList() provides a fixed-size list view.
• Example concept – LRU cache: A common example is using LinkedHashMap with
removeEldestEntry to implement an LRU cache.
• Performance discussion: Be ready to articulate the Big-O for add/get/remove for common
collections. The JCF interview guide suggests candidates must “understand and articulate” these
complexities 26 .
In summary, the Java Collections Framework offers a rich set of data structures. Choosing the right one
involves understanding ordering, nullability, synchronization, and performance characteristics. By
combining the concepts above – the hierarchy, usage patterns, complexity, and interview considerations –
one can effectively use and explain Java’s collections in both code and discussion.
9
16 21 22 25 Difference and similarities between HashSet, LinkedHashSet and TreeSet in Java |
GeeksforGeeks
https://www.geeksforgeeks.org/difference-and-similarities-between-hashset-linkedhashset-and-treeset-in-java/
10