Collections API


Object-oriented programming encapsulates data inside classes, but this doesn't make the organization of data inside the classes any less important than in traditional programming languages. Of course, how you choose to structure the data depends on the problem you are trying to solve. Does your class need a way to easily search through thousands (or even millions) of items quickly? Does it need an ordered sequence of elements and the ability to rapidly insert and remove elements in the middle of the sequence? Does it need an arraylike structure with random-access ability that can grow at run time? The way you structure your data inside your classes can make a big difference when it comes to implementing methods in a natural style, as well as for performance.
This chapter shows how Java technology can help you accomplish the traditional data structuring needed for serious programming. In college computer science programs, there is a course called Data Structures that usually takes a s emester to complete, so there are many, many books devoted to this important topic. Exhaustively covering all the data structures that may be useful is not our goal in this chapter; instead, we cover the fundamental ones that the standard Java library supplies. We hope that, after you finish this chapter, you will find it easy to translate any of your data structures to the Java programming language.

Collection Interfaces

Before the release of the Java 2 platform, the standard library supplied only a small set of classes for the most useful data structures: Vector, Stack, Hashtable, BitSet, and the Enumeration interface that provides an abstract mechanism for visiting elements in an arbitrary container. That was certainly a wise choice--it takes time and skill to come up with a comprehensive collection class library.
With the advent of the Java 2 platform, the designers felt that the time had come to roll out a full-fledged set of data structures. They faced a number of conflicting design decisions. They wanted the library to be small and easy to learn. They did not want the complexity of the Standard Template Library (or STL) of C++, but they wanted the benefit of generic algorithms that STL pioneered. They wanted the legacy classes to fit into the new framework. As all designers of collection libraries do, they had to make some hard choices, and they came up with a number of idiosyncratic design decisions along the way. In this section, we will explore the basic design of the Java collections framework, show you how to put it to work, and explain the reasoning behind some of the more controversial features.
Separating Collection Interfaces and Implementation
As is common for modern data structure libraries, the Java collection library separates interfaces and implementations. Let us look at that separation with a familiar data structure, the queue. The Java library does not supply a queue, but it is nevertheless a good example to introduce the basic concepts.
NOTE: If you need a queue, you can simply use the LinkedList class that we discuss later in this chapter.

queue interface specifies that you can add elements at the tail end of the queue, remove them at the head, and find out how many elements are in the queue. You use a queue when you need to collect objects and retrieve them in a "first in, first out" fashion (see Figure 2-1).
Figure 2-1: A queue
If there was a queue interface in the collections library, it might look like this:
interface Queue
{  void add(Object obj);
   Object remove();
   int size();
}


The interface tells you nothing about how the queue is implemented. There are two common implementations of a queue, one that uses a "circular array" and one that uses a linked list (see Figure 2-2).

Figure 2-2: Queue implementations
Each implementation can be expressed by a class that realizes the Queue interface:
class CircularArrayQueue 
             implements Queue
{  CircularArrayQueue(
                 int capacity) { . . . }
   public void add(Object obj) { . . . }
   public Object remove() { . . . }
   public int size() { . . . }
   private Object[] elements;
   private int head;
   private int tail;
}

class LinkedListQueue implements Queue
{  LinkedListQueue() { . . . }
   public void add(Object obj) { . . . }
   public Object remove() { . . . }
   public int size() { . . . }

   private Link head;
   private Link tail;
} 

When you use a queue in your program, you don't need to know which implementation is actually used once the collection has been constructed. Therefore, it makes sense to use the concrete class (such as CircularArrayQueueonly when you construct the collection object. Use the interface type to hold the collection reference.
Queue expressLane = 
        new CircularArrayQueue(100);
expressLane.add(new Customer("Harry"));
This approach makes it easy to change your mind and use a different implementation. You only need to change your program in one place--the constructor. If you decide that aLinkedListQueue is a better choice after all, your code becomes
Queue expressLane = 
   new LinkedListQueue();
expressLane.add(
  new Customer("Harry"));
Why would you choose one implementation over another? The interface says nothing about the efficiency of the implementation. A circular array is somewhat more efficient than a linked list, so it is generally preferable. However, as usual, there is a price to pay. The circular array is a bounded collection--it has a finite capacity. If you don't have an upper limit on the number of objects that your program will collect, you may be better off with a linked list implementation after all.
This example illustrates another issue for the designer of a collection class library. Strictly speaking, in a bounded collection, the interface of the add method should indicate that the method can fail:
class CircularArrayQueue
{  public void add(Object obj)
  throws CollectionFullException
   . . .
}  

That's a problem--now the CircularArrayQueue class can't implement the Queue interface since you can't add exception specifiers when overriding a method. Should one have two interfaces, BoundedQueue and Queue? Or should the add method throw an unchecked exception? There are advantages and disadvantages to both approaches. It is these kinds of issues that make it genuinely hard to design a logically coherent collection class library.
As we already mentioned, the Java library has no separate class for queues. We just used this example to illustrate the difference between interface and implementation since a queue has a simple interface and two well-known distinct implementations. In the next section, you will see how the Java library classifies the collections that it supports.
The fundamental interface for collection classes in the Java library is the Collection interface. The interface has two fundamental methods:
boolean add(Object obj)
Iterator iterator()
There are several methods in addition to these two; we will discuss them later. The add method adds an object to the collection. The add method returns true if adding the object actually changed the collection; false, if the collection is unchanged. For example, if you try to add an object to a set and the object is already present, then the add request is rejected since sets reject duplicates.
The iterator method returns an object that implements the Iterator interface-- we will describe that interface in a moment. You can use the iterator object to visit the elements in the container one by one.
The Iterator interface has three fundamental methods:
Object next()
boolean hasNext()
void remove()
By repeatedly calling the next method, you can visit the elements from the collection one by one. However, if you reach the end of the collection, the next method throws aNoSuchElementException. Therefore, you need to call the hasNext method before calling next. That method returns true if the iterator object still has more elements to visit. If you want to inspect all elements in a container, you request an iterator and then keep calling the next method while hasNext returns true.
Iterator iter = c.iterator();
while (iter.hasNext())
{  Object obj = iter.next();
   do something with obj
}

NOTE: Old-timers will notice that the next and hasNext methods of the Iterator interface serve the same purpose as the nextElement and hasMoreElements methods of an Enumeration. The designers of the Java collection library could have chosen to extend the Enumeration interface. But they disliked the cumbersome method names and chose to introduce a new interface with shorter method names instead.

Finally, the remove method removes the element that was returned by the last call to next.
You may well wonder why the remove method is a part of the Iterator interface. It is more efficient to remove an element if you know where it is. The iterator knows aboutpositions in the collection. Therefore, the remove method was added to the Iterator interface. If you visited an element that you didn't like, you can efficiently remove it.
There is an important conceptual difference between iterators in the Java collection library and iterators in other libraries. In traditional collection libraries such as the Standard Template Library of C++, iterators are modeled after array indexes. Given such an iterator, you can look up the element that is stored at that position, much like you can look up an array element a[i] if you have an array index i. Independently of the lookup, you can advance the iterator to the next position, just like you can advance an array index with the i++operation without performing a lookup. However, the Java iterators do not work like that. The lookup and position change are tightly coupled. The only way to look up an element is to call next, and that lookup advances the position.
Instead, you should think of Java iterators as being between elements. When you call next, the iterator jumps over the next element, and it returns a reference to the element that it just passed (see Figure 2-3).

Figure 2-3: Advancing an iterator

NOTE: Here is another useful analogy. You can think of Iterator.next as the equivalent of InputStream.read. Reading a byte from a stream automatically consumes the byte. The next call to read consumes and returns the next byte from the input. Similarly, repeated calls to next let you read all elements in a collection.
You must be careful when using the remove method. Calling remove removes the element that was returned by the last call to next. That makes sense if you want to remove a particular value--you need to see the element before you can decide that it is the one that should be removed. But if you want to remove an element by position, you first need to skip past the element. For example, here is how you remove the first element in a collection.
Iterator it = c.iterator();
it.next(); // skip over the 
          //first element
it.remove(); // now remove it
More importantly, there is a dependency between calls to the next and remove methods. It is illegal to call remove if it wasn't preceded by a call to next. If you try, anIllegalStateException is thrown.
If you want to remove two adjacent elements, you cannot simply call
it.remove();  
it.remove(); // Error!
Instead, you must first call next to jump over the element to be removed.
it.remove();  
it.next();
it.remove(); // Ok
Because the collection and iterator interfaces are generic, you can write utility methods that operate on any kind of collection. For example, here is a generic print method that prints all elements in a collection.
public static void print(Collection c)
{  System.out.print("[ ");
   Iterator iter = c.iterator();
   while (iter.hasNext())
      System.out.print(
          iter.next() + " ");
   System.out.println("]");
}

NOTE: We give this example to illustrate how to write a generic method. If you want to print the elements in a collection, you can just call System.out.println(c). This works because each collection class has a toString method that returns a string containing all elements in the collection.
Here is a method that adds all objects from one collection to another:
public static boolean addAll(
    Collection to, Collection from)
{  Iterator iter = from.iterator();
   boolean modified = false;
   while (iter.hasNext())
      if (to.add(iter.next()))
         modified = true;
   return modified;
}

Recall that the add method returns true if adding the element modified the collection. You can implement these utility methods for arbitrary collections because the Collectionand Iterator interfaces supply fundamental methods such as add and next.
The designers of the Java library decided that some of these utility methods are so useful that the library should make them available. That way, users don't have to keep reinventing the wheel. The addAll method is one such method.
Had Collection been an abstract class instead of an interface, then it would have been an easy matter to supply this functionality in the class. However, you cannot supply methods in interfaces of the Java programming language. Therefore, the collection library takes a slightly different approach. The Collection interface declares quite a few useful methods that all implementing classes must supply. Among them are:
int size()
boolean isEmpty()
boolean contains(Object obj)
boolean containsAll(Collection c)
boolean equals(Object other)
boolean addAll(Collection from)
boolean remove(Object obj)
boolean removeAll(Collection c)
void clear()
boolean retainAll(Collection c)
Object[] toArray()

Many of these methods are self-explanatory; you will find full documentation in the API notes at the end of this section.
Of course, it is a bother if every class that implements the Collection interface has to supply so many routine methods. To make life easier for implementors, the classAbstractCollection leaves the fundamental methods (such as add and iterator) abstract but implements the routine methods in terms of them. For example,
public class AbstractCollection
   implements Collection
{  . . .
   public abstract boolean add(
                     Object obj);

   public boolean addAll(
                 Collection from)
   {  Iterator iter = iterator();
      boolean modified = false;
      while (iter.hasNext())
         if (add(iter.next()))
            modified = true;
      return modified
   }
   . . .
}

A concrete collection class can now extend the AbstractCollection class.
It is now up to the concrete collection class to supply an add method, but the addAll method has been taken care of by the AbstractCollection superclass. However, if the subclass has a more efficient way of implementing addAll, it is free to do so.
This is a good design for a class framework. The users of the collection classes have a richer set of methods available in the generic interface, but the implementors of the actual data structures do not have the burden of implementing all the routine methods.
java.util.Collection
  • Iterator iterator()
    returns an iterator that can be used to visit the elements in the collection.
  • int size()
    returns the number of elements currently stored in the collection.
  • boolean isEmpty()
    returns true if this collection contains no elements.
  • boolean contains(Object obj)
    returns true if this collection contains an object equal to obj.
    Parameters: obj the object to match in the collection
  • boolean containsAll(Collection other)
    returns true if this collection contains all elements in the other collection.
    Parameters: other the collection holding the elements to match
  • boolean add(Object element)
    adds an element to the collection. Returns true if the collection changed as a result of this call.
    Parameters: element the element to add
  • boolean addAll(Collection other)
    adds all elements from the other collection to this collection. Returns true if the collection changed as a result of this call.
    Parameters: other the collection holding the elements to add
  • boolean remove(Object obj)
    removes an object equal to obj from this collection. Returns true if a matching object was removed.
    Parameters: obj an object that equals the element to remove
  • boolean removeAll(Collection other)
    removes all elements from the other collection from this collection. Returns true if the collection changed as a result of this call.
    Parameters: other the collection holding the elements to add
  • void clear()
    removes all elements from this collection.
  • boolean retainAll(Collection other)
    removes all elements from this collection that do not equal one of the elements in the other collection. Returns true if the collection changed as a result of this call.
    Parameters: other the collection holding the elements to be kept
  • Object[] toArray()
    returns an array of the objects in the collection.
java.util.Iterator
  • boolean hasNext()
    returns true if there is another element to visit.
  • Object next()
    returns the next object to visit. Throws a NoSuchElementException if the end of the collection has been reached.
  • Object remove()
    removes and returns the last visited object. This method must immediately follow an element visit. If the collection has been modified since the last element visit, then the method throws an IllegalStateException.


Enter your email address to get our daily JOBS & INTERVIEW FAQ's Straight to your Inbox.

Make sure to activate your subscription by clicking on the activation link sent to your email


Enter your email address to get our daily JOBS & INTERVIEW FAQ's Straight to your Inbox.

Make sure to activate your subscription by clicking on the activation link sent to your email