
This chapter shows how Java technology can help you accomplish the traditional data structuring needed for serious programming. In college computer science programs, there is a course called Data Structures that usually takes a s emester to complete, so there are many, many books devoted to this important topic. Exhaustively covering all the data structures that may be useful is not our goal in this chapter; instead, we cover the fundamental ones that the standard Java library supplies. We hope that, after you finish this chapter, you will find it easy to translate any of your data structures to the Java programming language.
Collection Interfaces
Before the release of the Java 2 platform, the standard library supplied only a small set of classes for the most useful data structures:
Vector, Stack, Hashtable, BitSet
, and the Enumeration
interface that provides an abstract mechanism for visiting elements in an arbitrary container. That was certainly a wise choice--it takes time and skill to come up with a comprehensive collection class library.
With the advent of the Java 2 platform, the designers felt that the time had come to roll out a full-fledged set of data structures. They faced a number of conflicting design decisions. They wanted the library to be small and easy to learn. They did not want the complexity of the Standard Template Library (or STL) of C++, but they wanted the benefit of generic algorithms that STL pioneered. They wanted the legacy classes to fit into the new framework. As all designers of collection libraries do, they had to make some hard choices, and they came up with a number of idiosyncratic design decisions along the way. In this section, we will explore the basic design of the Java collections framework, show you how to put it to work, and explain the reasoning behind some of the more controversial features.
Separating Collection Interfaces and Implementation
As is common for modern data structure libraries, the Java collection library separates interfaces and implementations. Let us look at that separation with a familiar data structure, the queue. The Java library does not supply a queue, but it is nevertheless a good example to introduce the basic concepts.
NOTE: If you need a queue, you can simply use the LinkedList
class that we discuss later in this chapter.
A queue interface specifies that you can add elements at the tail end of the queue, remove them at the head, and find out how many elements are in the queue. You use a queue when you need to collect objects and retrieve them in a "first in, first out" fashion (see Figure 2-1).
Figure 2-1: A queue


If there was a queue interface in the collections library, it might look like this:
interface Queue { void add(Object obj); Object remove(); int size(); } |
The interface tells you nothing about how the queue is implemented. There are two common implementations of a queue, one that uses a "circular array" and one that uses a linked list (see Figure 2-2).

Figure 2-2: Queue implementations
Each implementation can be expressed by a class that realizes the Queue interface:
class CircularArrayQueue implements Queue { CircularArrayQueue( int capacity) { . . . } public void add(Object obj) { . . . } public Object remove() { . . . } public int size() { . . . } private Object[] elements; private int head; private int tail; } class LinkedListQueue implements Queue { LinkedListQueue() { . . . } public void add(Object obj) { . . . } public Object remove() { . . . } public int size() { . . . } private Link head; private Link tail; } |
When you use a queue in your program, you don't need to know which implementation is actually used once the collection has been constructed. Therefore, it makes sense to use the concrete class (such as
CircularArrayQueue
) only when you construct the collection object. Use the interface type to hold the collection reference.Queue expressLane = new CircularArrayQueue(100); expressLane.add(new Customer("Harry"));
This approach makes it easy to change your mind and use a different implementation. You only need to change your program in one place--the constructor. If you decide that a
LinkedListQueue
is a better choice after all, your code becomesQueue expressLane = new LinkedListQueue(); expressLane.add( new Customer("Harry"));
Why would you choose one implementation over another? The interface says nothing about the efficiency of the implementation. A circular array is somewhat more efficient than a linked list, so it is generally preferable. However, as usual, there is a price to pay. The circular array is a bounded collection--it has a finite capacity. If you don't have an upper limit on the number of objects that your program will collect, you may be better off with a linked list implementation after all.
This example illustrates another issue for the designer of a collection class library. Strictly speaking, in a bounded collection, the interface of the add method should indicate that the method can fail:
class CircularArrayQueue { public void add(Object obj) throws CollectionFullException . . . } |
That's a problem--now the
CircularArrayQueue
class can't implement the Queue
interface since you can't add exception specifiers when overriding a method. Should one have two interfaces, BoundedQueue
and Queue
? Or should the add
method throw an unchecked exception? There are advantages and disadvantages to both approaches. It is these kinds of issues that make it genuinely hard to design a logically coherent collection class library.
As we already mentioned, the Java library has no separate class for queues. We just used this example to illustrate the difference between interface and implementation since a queue has a simple interface and two well-known distinct implementations. In the next section, you will see how the Java library classifies the collections that it supports.
The fundamental interface for collection classes in the Java library is the
Collection
interface. The interface has two fundamental methods:boolean add(Object obj) Iterator iterator()
There are several methods in addition to these two; we will discuss them later. The
add
method adds an object to the collection. The add
method returns true
if adding the object actually changed the collection; false
, if the collection is unchanged. For example, if you try to add an object to a set and the object is already present, then the add
request is rejected since sets reject duplicates.
The
iterator
method returns an object that implements the Iterator
interface-- we will describe that interface in a moment. You can use the iterator
object to visit the elements in the container one by one.
The
Iterator
interface has three fundamental methods:Object next() boolean hasNext() void remove()
By repeatedly calling the
next
method, you can visit the elements from the collection one by one. However, if you reach the end of the collection, the next method throws aNoSuchElementException
. Therefore, you need to call the hasNext
method before calling next. That method returns true if the iterator
object still has more elements to visit. If you want to inspect all elements in a container, you request an iterator
and then keep calling the next
method while hasNext
returns true.Iterator iter = c.iterator(); while (iter.hasNext()) { Object obj = iter.next(); do something with obj } |
![]() |
NOTE: Old-timers will notice that the next and
hasNext
methods of the Iterator
interface serve the same purpose as the nextElement
and hasMoreElements
methods of an Enumeration. The designers of the Java collection library could have chosen to extend the Enumeration
interface. But they disliked the cumbersome method names and chose to introduce a new interface with shorter method names instead.![]() |
Finally, the
remove
method removes the element that was returned by the last call to next
.
You may well wonder why the remove method is a part of the
Iterator
interface. It is more efficient to remove an element if you know where it is. The iterator
knows aboutpositions in the collection. Therefore, the remove
method was added to the Iterator
interface. If you visited an element that you didn't like, you can efficiently remove it.
There is an important conceptual difference between iterators in the Java collection library and iterators in other libraries. In traditional collection libraries such as the Standard Template Library of C++, iterators are modeled after array indexes. Given such an iterator, you can look up the element that is stored at that position, much like you can look up an array element
a[i]
if you have an array index i
. Independently of the lookup, you can advance the iterator to the next position, just like you can advance an array index with the i++
operation without performing a lookup. However, the Java iterators
do not work like that. The lookup and position change are tightly coupled. The only way to look up an element is to call next, and that lookup advances the position.
Instead, you should think of Java iterators as being between elements. When you call
next
, the iterator
jumps over the next element, and it returns a reference to the element that it just passed (see Figure 2-3).
Figure 2-3: Advancing an iterator
![]() |
NOTE: Here is another useful analogy. You can think of
Iterator.next
as the equivalent of InputStream.read
. Reading a byte from a stream automatically consumes the byte. The next call to read
consumes and returns the next byte from the input. Similarly, repeated calls to next let you read all elements in a collection.
You must be careful when using the
remove
method. Calling remove
removes the element that was returned by the last call to next
. That makes sense if you want to remove a particular value--you need to see the element before you can decide that it is the one that should be removed. But if you want to remove an element by position, you first need to skip past the element. For example, here is how you remove the first element in a collection.Iterator it = c.iterator(); it.next(); // skip over the //first element it.remove(); // now remove it
More importantly, there is a dependency between calls to the
next
and remove
methods. It is illegal to call remove if it wasn't preceded by a call to next. If you try, anIllegalStateException
is thrown.
If you want to remove two adjacent elements, you cannot simply call
it.remove(); it.remove(); // Error!
Instead, you must first call
next
to jump over the element to be removed.it.remove(); it.next(); it.remove(); // Ok
Because the collection and iterator interfaces are generic, you can write utility methods that operate on any kind of collection. For example, here is a generic
print
method that prints all elements in a collection.public static void print(Collection c) { System.out.print("[ "); Iterator iter = c.iterator(); while (iter.hasNext()) System.out.print( iter.next() + " "); System.out.println("]"); } |
![]() |
NOTE: We give this example to illustrate how to write a generic method. If you want to print the elements in a collection, you can just call
System.out.println(c)
. This works because each collection class has a toString
method that returns a string containing all elements in the collection.![]() |
Here is a method that adds all objects from one collection to another:
public static boolean addAll( Collection to, Collection from) { Iterator iter = from.iterator(); boolean modified = false; while (iter.hasNext()) if (to.add(iter.next())) modified = true; return modified; } |
Recall that the
add
method returns true
if adding the element modified the collection. You can implement these utility methods for arbitrary collections because the Collection
and Iterator
interfaces supply fundamental methods such as add and next.
The designers of the Java library decided that some of these utility methods are so useful that the library should make them available. That way, users don't have to keep reinventing the wheel. The
addAll
method is one such method.
Had
Collection
been an abstract class instead of an interface, then it would have been an easy matter to supply this functionality in the class. However, you cannot supply methods in interfaces of the Java programming language. Therefore, the collection library takes a slightly different approach. The Collection
interface declares quite a few useful methods that all implementing classes must supply. Among them are:int size() boolean isEmpty() boolean contains(Object obj) boolean containsAll(Collection c) boolean equals(Object other) boolean addAll(Collection from) boolean remove(Object obj) boolean removeAll(Collection c) void clear() boolean retainAll(Collection c) Object[] toArray() |
Many of these methods are self-explanatory; you will find full documentation in the API notes at the end of this section.
Of course, it is a bother if every class that implements the
Collection
interface has to supply so many routine methods. To make life easier for implementors, the classAbstractCollection
leaves the fundamental methods (such as add
and iterator
) abstract but implements the routine methods in terms of them. For example,public class AbstractCollection implements Collection { . . . public abstract boolean add( Object obj); public boolean addAll( Collection from) { Iterator iter = iterator(); boolean modified = false; while (iter.hasNext()) if (add(iter.next())) modified = true; return modified } . . . } |
A concrete collection class can now extend the
AbstractCollection
class.
It is now up to the concrete collection class to supply an add method, but the
addAll
method has been taken care of by the AbstractCollection
superclass. However, if the subclass has a more efficient way of implementing addAll
, it is free to do so.
This is a good design for a class framework. The users of the collection classes have a richer set of methods available in the generic interface, but the implementors of the actual data structures do not have the burden of implementing all the routine methods.
java.util.Collection
Iterator iterator()
returns aniterator
that can be used to visit the elements in the collection.int size()
returns the number of elements currently stored in the collection.boolean isEmpty()
returns true if this collection contains no elements.boolean contains(Object obj)
returnstrue
if this collection contains an object equal toobj
.
Parameters:obj
the object to match in the collectionboolean containsAll(Collection other)
returns true if this collection contains all elements in the other collection.
Parameters:other
the collection holding the elements to matchboolean add(Object element)
adds an element to the collection. Returns true if the collection changed as a result of this call.
Parameters:element
the element to addboolean addAll(Collection other)
adds all elements from the other collection to this collection. Returnstrue
if the collection changed as a result of this call.
Parameters:other
the collection holding the elements to addboolean remove(Object obj)
removes an object equal to obj from this collection. Returnstrue
if a matching object was removed.
Parameters:obj
an object that equals the element to removeboolean removeAll(Collection other)
removes all elements from the other collection from this collection. Returnstrue
if the collection changed as a result of this call.
Parameters:other
the collection holding the elements to addvoid clear()
removes all elements from this collection.boolean retainAll(Collection other)
removes all elements from this collection that do not equal one of the elements in the other collection. Returns true if the collection changed as a result of this call.
Parameters:other
the collection holding the elements to be keptObject[] toArray()
returns an array of the objects in the collection.
java.util.Iterator
boolean hasNext()
returns true if there is another element to visit.Object next()
returns the next object to visit. Throws aNoSuchElementException
if the end of the collection has been reached.Object remove()
removes and returns the last visited object. This method must immediately follow an element visit. If the collection has been modified since the last element visit, then the method throws anIllegalStateException
.