JDK provides a powerful, high quality and high performance reusable data structures and algorithms in a so-called Java Collections Framework. This is a quick reference post to get to know about or using Java Collections interfaces and their implementations.
Follow the flowchart and summary table to select the best collection class for your requirement.
Collection classes | Duplicate elements allowed | Elements are Ordered | Elements are Sorted | Is Thread-safe |
---|---|---|---|---|
ArrayList | Yes | Yes | No | No |
LinkedList | Yes | Yes | No | No |
Vector | Yes | Yes | No | Yes |
HashSet | No | No | No | No |
LinkedHashSet | No | Yes | No | No |
TreeSet | No | Yes | Yes | No |
HashMap | No | No | No | No |
LinkedHashMap | No | Yes | No | No |
Hashtable | No | No | No | Yes |
TreeMap | No | Yes | Yes | No |
In Short, The following characteristics of the main collections in Java Collection Frameworks:
lists
allow duplicate elements which are ordered by index.sets
and maps
do not allow duplicate elements.list
elements are not sorted.sets
and maps
do not sort its elements, except TreeSet
and TreeMap
– which sort elements by natural order or by a comparator.sets
and maps
are not ordered, except for:
** LinkedHashSet
and LinkedHashMap
have elements ordered by insertion order.TreeSet
and TreeMap
have elements ordered by natural order or by a comparator.Vector
and Hastable
. The rest are not thread-safe.An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
List list = Collections.synchronizedList(new ArrayList(...));
ArrayList
in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array.ConcurrentModificationException
. The iterator will not reflect additions, removals, or changes to the list since the iterator was created. Element-changing operations on iterators themselves (remove, set, and add) are not supported. These methods throw UnsupportedOperationException.An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
hashCode()
, ==
and equals()
for entry lookup. The lookup sequence for a given key k is as follows:k.hashCode()
to determine which bucket the entry is stored, if anyk == k1
|| k.equals(k1)
, then return k1’s entrySome insights on hashCode()
hashCode()
contract i.e, equal objects must return the same code.hashCode()
is inherited by all Java objects and this method returns int
value. Value returned from hashCode() method is calculated from the internal memory address of the object. By default hashCode() returns distinct integers for distinct objects.get()
, put()
, remove()
contains the below piece of codeint index = (hash & 0x7FFFFFFF) % tab.length;
Value of index is a reminder of the division hash by the array size
0x7FFFFFFF
- is a 32-bit integer in hexadecimal with all but the highest bit settab.length
- Array sizehash
- number returned by the key’s hashCode() methodThis class implements a hash table, which maps keys to values. Any non-null object can be used as a key or as a value.
To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode
method and the equals
method.
Any non-null object can be used as a key or as a value.
To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.
Hashtables can be iterated in few ways
Fail Fast: Iteration - ConcurrentModificationException
will be thrown after Hashtable is modified after its Iterator is created
Hashtable<String, String> hashTable = new Hashtable<>();
hashTable.put("key-1", "value-1");
hashTable.put("key-2", "value-2");
Iterator<String> it = hashTable.keySet().iterator();
hashTable.remove("key-2");
while (it.hasNext()) {
String key = it.next();
}
java.util.ConcurrentModificationException
at java.util.Hashtable$Enumerator.next(Hashtable.java:1378)
Not Fail Fast: Enumeration makes Hashtable not fail-fast
Hashtable<String, String> hashTable = new Hashtable<>();
hashTable.put("key-1", "value-1");
hashTable.put("key-2", "value-2");
Enumeration<String> it = hashTable.keys;
hashTable.remove("key-2");
while (enumKey.hasMoreElements()) {
String key = enumKey.nextElement();
}
Iteration order in a Hashtable is unpredictable and does not match the order in which the entries were added.
If a thread-safe implementation is not needed, it is recommended to use HashMap
in place of Hashtable.
If a thread-safe highly-concurrent implementation is desired, then it is recommended to use ConcurrentHashMap
in place of Hashtable
HashMap
is Hash table based implementation of the Map
interface. HashMap
makes no guarantees as to the iteration order of the map, it does not guarantee that the order will remain constant over time and HashMap permits null key
and null values
.
HashMap
class is equivalent to Hashtable
, except that it is `unsynchronized and permits nulls. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. To prevent accidental unsynchronized access to the list, create the instance of list as below
Map map = Collections.synchronizedMap(new HashMap(...));
LinkedHashMap
is just like HashMap
with an additional feature of maintaining an order of elements inserted into it. HashMap does quick insertion, search and deletion but it does not maintain the order of insertion. This is achieved by LinkedHashMap which ensures elements are inserted in the order they are inserted and can be accessed in their insertion order.
WeakHashMap
is a hash table based implementation of the Map
interface, with keys that are of a WeakReference
type. An entry in a WeakHashMap
will automatically be removed when its key is no longer in ordinary use i.e, that there is no single Reference that point to that key. When the garbage collection (GC) process discards a key, its entry is effectively removed from the map, so this class behaves somewhat differently from other Map implementations.
Bit of info on Strong, Soft and WeakReference
description
has a strong reference to an String object with its assigned value.String description = "This variable has Strong Reference to value";
String description = "This variable has Strong Reference to value";
SoftReference<String> soft = new SoftReference<String>(description);
description = null;
String description = "This variable has Strong Reference to value";
WeakReference<String> soft = new WeakReference<String>(description);
description = null;
EnumMap
is specialized implementation of Map interface for enumeration types, i.e Enum
as its keys.
HashMap
.TreeMap
class is a
NavigableMap
interface and extends AbstractMap
class.ConcurrentHashMap
is a hash table supporting full concurrency of retrievals and high expected concurrency for updates.
HashMap
is not a thread-safe implementation, while Hashtable does provide thread-safety by synchronizing operations, but it is not very efficient. Collections.synchronizedMap, does not exhibit great efficiency.ConcurrentMap
is introduced for high throughput under high concurrency.ConcurrentHashMap
is thread-safe i.e. multiple thread can operate on a single object without any complications.null
as key or value.ConcurrentSkipListMap
is scalable concurrent version of TreeMap
. When ordering of keys is required, we can use ConcurrentSkipListMap.
A collection that contains no duplicate elements. Sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. That being said as Set
is a well-defined collection of distinct objects and in other words, the same object cannot occur more than once.
Java Set
and List
interfaces are quite similar to each other, but has two major differences.
The first difference is that the same element cannot occur more than once in a Set
. This is different from a List
where each element can occur more than once.
The second difference is that the elements in a Set
has no guaranteed internal order. The elements in a List
has an internal order, and the elements can be iterated in that order.
LinkedHashSet
class is a Hashtable
and LinkedList
implementation of the set
interface. It inherits HashSet
class and implements Set interface.TreeSet
implements the TreeMap
to store elements. The elements in a TreeSet are sorted according to their natural ordering or based on a custom Comparator that is supplied at the time of creation of the TreeSetHashSet
which uses a underlying CopyOnWriteArrayList
for all of its operations.NavigableSet
implementation based on a ConcurrentSkipListMap
.Queue represents a data structure designed to have elements inserted at the end of the queue, and elements removed from the beginning of the queue.
Besides basic Collection operations, queues provide additional insertion, extraction, and inspection operations. Each of these methods exists in two forms: one throws an exception if the operation fails, the other returns a special value (either null or false, depending on the operation). The latter form of the insert operation is designed specifically for use with capacity-restricted Queue implementations; in most implementations, insert operations cannot fail.
Queue Operations:
BlockingQueue Operations:
PriorityQueue
is used when the objects are supposed to be processed based on the priority heap.First-In-First-Out
algorithm, but sometimes the elements of the queue are needed to be processed according to the priority.LinkedBlockingQueue
class implements the BlockingQueue interface.ArrayBlockingQueue
class is a bounded blocking queue backed by an array. By bounded it means that the size of the Queue is fixed. Once created, the capacity cannot be changed.PriorityBlockingQueue
is an unbounded blocking queue that uses the same ordering rules as class PriorityQueue
and supplies blocking retrieval operations.DelayQueue
is an unbounded blocking queue of delayed elements, in which the consumer wants to take an element from the queue, they can take it only when the delay for that particular element has expired.Delayed
interface and it’s getDelay()
method return a zero or negative value which indicate that the delay has already elapsed.poll()
will return null.take()
or poll()
, they are otherwise treated as normal elements in the queue. size()
method returns the count of both expired and unexpired elements.SynchronousQueue
is a type of Blocking Queue which implements BlockingQueue
.take()
and put()
, and both of them are blocking.LinkedTransferQueue
class is an implementation of TransferQueue
. It is a concurrent blocking queue implementation in which producers may wait for receipt of messages by consumers.Dequeue represents a double ended queue, meaning a queue where you can add and remove elements from both ends of the queue. The name Deque is an abbreviation of Double Ended Queue.
LinkedList
class is a pretty standard Deque and Queue implementation. It uses a linked list internally to model a queue or a deque.ArrayDeque
provides resizable-array and implements the Deque
interface. It is also known as Array Double Ended Queue or Array Deck.List
is an example.SortedSet
is an example.Set
is an example.Java Collections supports two types of Iterator, fail safe and fail fast.
The main distinction between a fail-fast and fail-safe Iterator is whether or not the underlying collection can be modified while its begin iterated. If you have used Collection like ArrayList
then you know that when you iterate over them, no other thread should modify the collection. If Iterator
detects any structural change after iteration has begun e.g adding
or removing
a new element then it throws ConcurrentModificationException
, this is known as fail-fast behavior
and these iterators are called fail-fast iterator
because they fail as soon as they detect any modification . Though it’s not necessary that iterator will throw this exception when multiple threads modified it simultaneously. it can happen even with the single thread when you try to remove elements by using ArrayList’s remove() method instead of Iterator’s remove method. Most of the Collection classes e.g. Vector
, ArrayList
, HashMap
, HashSet
has fail-fast iterators.
The iterators used by concurrent collection classes e.g. ConcurrentHashMap
, CopyOnWriteArrayList
and CopyOnWriteArraySet
uses a view of original collection for doing iteration and that’s why they doesn’t throw ConcurrentModificationException
even when original collection was modified after iteration has begun. This means you could iterate and work with stale value, but this is the cost you need to pay for fail-safe iterator
.
The load factor is a measure of how full the collection(hash table) is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
Natural ordering is the default ordering of objects of a specific type when they are sorted in an array or a collection. The Java language provides the Comparable
interface that allows us define the natural ordering of a class. This interface is declared as follows:
public interface Comparable<T> {
public int compareTo(T object);
}
As observed, this interface is parameterized (generics) and it has a single method compareTo()
that allows two objects of a same type to be compared with each other. The important point here is the value returned by this method: an integer number indicates the comparison result of two objects.
Rules to remember: compare value = 0: two objects are equal. compare value > 0: the first object (the current object) is greater than the second one. compare value < 0: the first object is less than the second one.
Collections are sorted by natural ordering of its elements:
It is recommended practice to override both equals
and hashCode
methods when comparing two Java objects.
Simple approach
Below is simple approacg to override equals & hashcode methods in a pojo class
public class Employee {
private int id;
private String name;
private String address;
//getters and setters, constructor
@Override
public boolean equals(Object o) {
if (o == this) return true;
if (!(o instanceof Employee)) {
return false;
}
Employee employee = (Employee) o;
return employee.name.equals(name) &&
employee.id == id &&
employee.address.equals(address);
}
@Override
public int hashCode() {
// 15 * 28 are some random integers to derive the hashcode value
int hashCode = 15;
hashCode = 28 * hashCode + name.hashCode();
hashCode = 28 * hashCode + id;
hashCode = 28 * hashCode + address.hashCode();
return hashCode;
}
}
Using Objects.hash()
Objects
class includes utility method equals()
&& hash()
to generate the equals and hash code values.
public class Employee {
private int id;
private String name;
private String address;
//getters and setters, constructor
@Override
public boolean equals(Object o) {
if (o == this) return true;
if (!(o instanceof Employee)) {
return false;
}
Employee employee = (Employee) o;
return id == employee.id &&
Objects.equals(name, employee.name) &&
Objects.equals(address, employee.address);
}
@Override
public int hashCode() {
return Objects.hash(id, name, address);
}
}
Using Apache Commons Lang
Apache Commons Lang library has got utility classes EqualsBuilder
and HashCodeBuilder
to perform equal operation & hashCode generations
@Override
public boolean equals(Object o) {
if (o == this) return true;
if (!(o instanceof Employee)) {
return false;
}
Employee employee = (Employee) o;
return new EqualsBuilder()
.append(id, employee.id)
.append(name, employee.name)
.append(address, employee.address)
.isEquals();
}
@Override
public int hashCode() {
return new HashCodeBuilder(15, 28)
.append(id)
.append(name)
.append(address)
.toHashCode();
}
Using Lombok
@EqualsAndHashCode
public class Employee {
private int id;
private String name;
private String address;
//getters and setters, constructor
}
To add an element to Set
we need to use add()
method. add() method returns false when we try to add a duplicate element which indicates element is not added to set.
java.util.Vector
came along with the first version of java development kit (JDK). java.util.ArrayList
was introduced in java version1.2, as part of java collections framework.
All the methods of Vector is synchronized. But, the methods of ArrayList is not synchronized. All the new implementations of java collection framework is not synchronized.
Vector
and ArrayList
both uses Array internally as data structure. They are dynamically resizable. Difference is in the way they are internally resized. By default, Vector
doubles the size of its array when its size is increased. But, ArrayList
increases by half of its size when its size is increased.
Therefore as per Java API the only main difference is, Vector’s methods are synchronized and ArrayList’s methods are not synchronized.
Similarities in both the classes:
Differences between both the classes:
HashMap
doesn’t provide any Enumeration
, while Hashtable
provides not fail-fast Enumeration.Hashtable
doesn’t allow null keys and null values, while HashMap
do allow one null key and any number of null valuesHashtable
’s methods are synchronized while HashMaps
’s methods are not synchronized.HashMap
extends AbstractMap and implement Map interface, EnumMap
extends AbstractMap class which implements Map interface.HashMap
internally uses HashTable
, on the other hand EnumMap
internally uses arrays
and thus representation is extremely compact and efficient.Below is the brief difference between HashMap & EnumMap
HashMap | EnumMap |
---|---|
HashMap is general purpose map implementation, internally uses HashTable . HashMap can use any object as key. | EnumMap is optimized for Enum type keys to store in map. EnumMap internally using as arrays, this representation is extremely compact and efficient, provide constant time performance for common methods like get , and put . |
HashMap performance is not better as EnumMap. | Due to specialized optimization done in EnumMap, its performance is better than HashMap. |
All type of object can use as keys in HashMap. | Only Enum type can use as keys in EnumMap. |
HashMap using hashCode() to store the keys and values so there is probability of collision. | Since EnumMap internally maintain an array and they are stored in their natural order using ordinal() , so there is no probability of collision. |
HashMap is not ordered, HashMap order can change over the time. | EnumMap stores keys in the natural order of their keys, the order in which the enum constants are declared. |
ordinal() - The ordinal() method returns the order of an enum instance. It represents the sequence in the enum declaration, where the initial constant is assigned an ordinal of ‘0’. It is very much like array indexes.It is designed for use by sophisticated enum-based data structures, such as EnumSet and EnumMap.
public enum STATUS
{
ACTIVE, INACTIVE, DELETED
}
STATUS.ACTIVE.ordinal(); //0
STATUS.DELETED.ordinal(); //2
The functionality of Enumeration and the Iterator are same. Using Enumeration
you can only traverse and fetch the objects, where as using Iterator
we can also add and remove the objects. Iterator can be useful if you want to manipulate the list and Enumeration is for read-only access.
Iterator | Enumeration |
---|---|
Iterator is applicable for all the collection classes. | Enumeration applies only to legacy classes. |
Iterator has the remove() method. | Enumeration does not have the remove() method. |
Iterator can do modifications. Calling remove() during collection traversal will remove the element from the Collection. | Enumeration acts as a read only interface, one can not do any modifications to Collection while traversing the elements. |
Iterator is not a legacy interface. Iterator can be used for the traversal of HashMap, LinkedList, ArrayList, HashSet, TreeMap, TreeSet. | Enumeration is a legacy interface which is used for traversing Vector, Hashtable. |