Introduction

JDK provides a powerful, high quality and high performance reusable data structures and algorithms in a so-called Java Collections Framework. This is a quick reference post to get to know about or using Java Collections interfaces and their implementations.

Looking for what collection class to use?

Follow the flowchart and summary table to select the best collection class for your requirement.

Pairs
Values Only
No
Yes
Yes
No
Yes
No
Insertion Order
Key Sorted
Yes
No
No
Yes
No
Yes
Insertion Order
Value Sorted
Yes
No
Yes
Yes
Yes
No
Yes
No
No
Yes
No
Yes
Start
Contains
key/value pair?
Is Order
Important
Will it contain
duplicate values?
Should it be used as cache
or store construction
expensive objects?
Insertion order or
sorted by order?
WeakHashMap
Unique key to be
determined using
equals() instead '==' oper?
HashMap
IdentityHashMap
LinkedHashMap
Should it be
synchronized?
HashTable
TreeMap
Is element order
Important
Should it be
synchronized?
HashSet
Insertion order or
sorted by values?
LinkedHasSet
TreeSet
Should it act like
LIFO stack?
Fast random
access required?
Stack
Vector
ArrayList
Fast sequential access,
addition/deletion
required?
LinkedList
Should it be used as
as a queue or
double-ended queue?
Does it have to be
fast permorning queue?
ArrayDeque
Collection classesDuplicate elements allowedElements are OrderedElements are SortedIs Thread-safe
ArrayListYesYesNoNo
LinkedListYesYesNoNo
VectorYesYesNoYes
HashSetNoNoNoNo
LinkedHashSetNoYesNoNo
TreeSetNoYesYesNo
HashMapNoNoNoNo
LinkedHashMapNoYesNoNo
HashtableNoNoNoYes
TreeMapNoYesYesNo

In Short, The following characteristics of the main collections in Java Collection Frameworks:

Implementation Classes grouped by their Interfaces

An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.

List list = Collections.synchronizedList(new ArrayList(...));

An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.

Some insights on hashCode()

int index = (hash & 0x7FFFFFFF) % tab.length;

Value of index is a reminder of the division hash by the array size

HashMap is Hash table based implementation of the Map interface. HashMap makes no guarantees as to the iteration order of the map, it does not guarantee that the order will remain constant over time and HashMap permits null key and null values.

HashMap class is equivalent to Hashtable, except that it is `unsynchronized and permits nulls. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. To prevent accidental unsynchronized access to the list, create the instance of list as below

Map map = Collections.synchronizedMap(new HashMap(...));

LinkedHashMap is just like HashMap with an additional feature of maintaining an order of elements inserted into it. HashMap does quick insertion, search and deletion but it does not maintain the order of insertion. This is achieved by LinkedHashMap which ensures elements are inserted in the order they are inserted and can be accessed in their insertion order.

WeakHashMap is a hash table based implementation of the Map interface, with keys that are of a WeakReference type. An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use i.e, that there is no single Reference that point to that key. When the garbage collection (GC) process discards a key, its entry is effectively removed from the map, so this class behaves somewhat differently from other Map implementations.

Bit of info on Strong, Soft and WeakReference

String description = "This variable has Strong Reference to value";
String description = "This variable has Strong Reference to value";
SoftReference<String> soft = new SoftReference<String>(description); 
description = null;
String description = "This variable has Strong Reference to value";
WeakReference<String> soft = new WeakReference<String>(description); 
description = null;

EnumMap is specialized implementation of Map interface for enumeration types, i.e Enum as its keys.

TreeMap class is a

red-black
tree based implementation. It provides an efficient means of storing key-value pairs in sorted order.

ConcurrentHashMap is a hash table supporting full concurrency of retrievals and high expected concurrency for updates.

ConcurrentSkipListMap is scalable concurrent version of TreeMap. When ordering of keys is required, we can use ConcurrentSkipListMap.

A collection that contains no duplicate elements. Sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. That being said as Set is a well-defined collection of distinct objects and in other words, the same object cannot occur more than once.

Java Set and List interfaces are quite similar to each other, but has two major differences.

The first difference is that the same element cannot occur more than once in a Set. This is different from a List where each element can occur more than once.

The second difference is that the elements in a Set has no guaranteed internal order. The elements in a List has an internal order, and the elements can be iterated in that order.

Queue represents a data structure designed to have elements inserted at the end of the queue, and elements removed from the beginning of the queue.

Besides basic Collection operations, queues provide additional insertion, extraction, and inspection operations. Each of these methods exists in two forms: one throws an exception if the operation fails, the other returns a special value (either null or false, depending on the operation). The latter form of the insert operation is designed specifically for use with capacity-restricted Queue implementations; in most implementations, insert operations cannot fail.

Queue Operations:

BlockingQueue Operations:

Dequeue represents a double ended queue, meaning a queue where you can add and remove elements from both ends of the queue. The name Deque is an abbreviation of Double Ended Queue.

Frequently asked questions

What is the difference between an ordered and a sorted collection?

What is fail safe and fail fast Iterator ?

Java Collections supports two types of Iterator, fail safe and fail fast.

The main distinction between a fail-fast and fail-safe Iterator is whether or not the underlying collection can be modified while its begin iterated. If you have used Collection like ArrayList then you know that when you iterate over them, no other thread should modify the collection. If Iterator detects any structural change after iteration has begun e.g adding or removing a new element then it throws ConcurrentModificationException, this is known as fail-fast behavior and these iterators are called fail-fast iterator because they fail as soon as they detect any modification . Though it’s not necessary that iterator will throw this exception when multiple threads modified it simultaneously. it can happen even with the single thread when you try to remove elements by using ArrayList’s remove() method instead of Iterator’s remove method. Most of the Collection classes e.g. Vector, ArrayList, HashMap, HashSet has fail-fast iterators.

The iterators used by concurrent collection classes e.g. ConcurrentHashMap, CopyOnWriteArrayList and CopyOnWriteArraySet uses a view of original collection for doing iteration and that’s why they doesn’t throw ConcurrentModificationException even when original collection was modified after iteration has begun. This means you could iterate and work with stale value, but this is the cost you need to pay for fail-safe iterator.

What is load factor?

The load factor is a measure of how full the collection(hash table) is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

What is Natural Ordering?

Natural ordering is the default ordering of objects of a specific type when they are sorted in an array or a collection. The Java language provides the Comparable interface that allows us define the natural ordering of a class. This interface is declared as follows:

public interface Comparable<T> {
    public int compareTo(T object);
}

As observed, this interface is parameterized (generics) and it has a single method compareTo() that allows two objects of a same type to be compared with each other. The important point here is the value returned by this method: an integer number indicates the comparison result of two objects.

Rules to remember: compare value = 0: two objects are equal. compare value > 0: the first object (the current object) is greater than the second one. compare value < 0: the first object is less than the second one.

Collections are sorted by natural ordering of its elements:

How to Override equals() and hashCode()

It is recommended practice to override both equals and hashCode methods when comparing two Java objects.

Simple approach

Below is simple approacg to override equals & hashcode methods in a pojo class

public class Employee {

    private int id;
    private String name;
    private String address;

	//getters and setters, constructor

    @Override
    public boolean equals(Object o) {

        if (o == this) return true;
        if (!(o instanceof Employee)) {
            return false;
        }

        Employee employee = (Employee) o;

        return employee.name.equals(name) &&
                employee.id == id &&
                employee.address.equals(address);
    }

    @Override
    public int hashCode() {
        // 15 * 28 are some random integers to derive the hashcode value
        int hashCode = 15;
        hashCode = 28 * hashCode + name.hashCode();
        hashCode = 28 * hashCode + id;
        hashCode = 28 * hashCode + address.hashCode();
        return hashCode;
    }
}

Using Objects.hash()

Objects class includes utility method equals() && hash() to generate the equals and hash code values.

public class Employee {

    private int id;
    private String name;
    private String address;

	//getters and setters, constructor

    @Override
    public boolean equals(Object o) {

        if (o == this) return true;
        if (!(o instanceof Employee)) {
            return false;
        }
        Employee employee = (Employee) o;
        return id == employee.id &&
                Objects.equals(name, employee.name) &&
                Objects.equals(address, employee.address);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name, address);
    }
}

Using Apache Commons Lang

Apache Commons Lang library has got utility classes EqualsBuilder and HashCodeBuilder to perform equal operation & hashCode generations

@Override
public boolean equals(Object o) {

    if (o == this) return true;
    if (!(o instanceof Employee)) {
        return false;
    }

    Employee employee = (Employee) o;

    return new EqualsBuilder()
            .append(id, employee.id)
            .append(name, employee.name)
            .append(address, employee.address)
            .isEquals();
}

@Override
public int hashCode() {
    return new HashCodeBuilder(15, 28)
            .append(id)
            .append(name)
            .append(address)
            .toHashCode();
}

Using Lombok

Lombok
provides an efficient implementation by automatically generating the overridden methods.
@EqualsAndHashCode
public class Employee {
    private int id;
    private String name;
    private String address;

	//getters and setters, constructor
}

How Set interface implemented classes like HashSet, LinkedHashSet, TreeSet etc. achieve this uniqueness?

To add an element to Set we need to use add() method. add() method returns false when we try to add a duplicate element which indicates element is not added to set.

Difference between ArrayList and Vector?

java.util.Vector came along with the first version of java development kit (JDK). java.util.ArrayList was introduced in java version1.2, as part of java collections framework.

All the methods of Vector is synchronized. But, the methods of ArrayList is not synchronized. All the new implementations of java collection framework is not synchronized.

Vector and ArrayList both uses Array internally as data structure. They are dynamically resizable. Difference is in the way they are internally resized. By default, Vector doubles the size of its array when its size is increased. But, ArrayList increases by half of its size when its size is increased.

Therefore as per Java API the only main difference is, Vector’s methods are synchronized and ArrayList’s methods are not synchronized.

Difference between HashMap and HashTable?

Similarities in both the classes:

Differences between both the classes:

Difference between HashMap and EnumMap?

Below is the brief difference between HashMap & EnumMap

HashMapEnumMap
HashMap is general purpose map implementation, internally uses HashTable. HashMap can use any object as key.EnumMap is optimized for Enum type keys to store in map. EnumMap internally using as arrays, this representation is extremely compact and efficient, provide constant time performance for common methods like get, and put.
HashMap performance is not better as EnumMap.Due to specialized optimization done in EnumMap, its performance is better than HashMap.
All type of object can use as keys in HashMap.Only Enum type can use as keys in EnumMap.
HashMap using hashCode() to store the keys and values so there is probability of collision.Since EnumMap internally maintain an array and they are stored in their natural order using ordinal(), so there is no probability of collision.
HashMap is not ordered, HashMap order can change over the time.EnumMap stores keys in the natural order of their keys, the order in which the enum constants are declared.

ordinal() - The ordinal() method returns the order of an enum instance. It represents the sequence in the enum declaration, where the initial constant is assigned an ordinal of ‘0’. It is very much like array indexes.It is designed for use by sophisticated enum-based data structures, such as EnumSet and EnumMap.

public enum STATUS 
{
   ACTIVE, INACTIVE, DELETED
}

STATUS.ACTIVE.ordinal();     //0
 
STATUS.DELETED.ordinal();    //2

Difference between Enumeration and Iterator?

The functionality of Enumeration and the Iterator are same. Using Enumeration you can only traverse and fetch the objects, where as using Iterator we can also add and remove the objects. Iterator can be useful if you want to manipulate the list and Enumeration is for read-only access.

IteratorEnumeration
Iterator is applicable for all the collection classes.Enumeration applies only to legacy classes.
Iterator has the remove() method.Enumeration does not have the remove() method.
Iterator can do modifications. Calling remove() during collection traversal will remove the element from the Collection.Enumeration acts as a read only interface, one can not do any modifications to Collection while traversing the elements.
Iterator is not a legacy interface. Iterator can be used for the traversal of HashMap, LinkedList, ArrayList, HashSet, TreeMap, TreeSet.Enumeration is a legacy interface which is used for traversing Vector, Hashtable.

References