JDK provides a powerful, high quality and high performance reusable data structures and algorithms in a so-called Java Collections Framework. This is a quick reference post to get to know about or using Java Collections interfaces and their implementations.
Follow the flowchart and summary table to select the best collection class for your requirement.
| Collection classes | Duplicate elements allowed | Elements are Ordered | Elements are Sorted | Is Thread-safe |
|---|---|---|---|---|
| ArrayList | Yes | Yes | No | No |
| LinkedList | Yes | Yes | No | No |
| Vector | Yes | Yes | No | Yes |
| HashSet | No | No | No | No |
| LinkedHashSet | No | Yes | No | No |
| TreeSet | No | Yes | Yes | No |
| HashMap | No | No | No | No |
| LinkedHashMap | No | Yes | No | No |
| Hashtable | No | No | No | Yes |
| TreeMap | No | Yes | Yes | No |
In Short, The following characteristics of the main collections in Java Collection Frameworks:
lists allow duplicate elements which are ordered by index.sets and maps do not allow duplicate elements.list elements are not sorted.sets and maps do not sort its elements, except TreeSet and TreeMap – which sort elements by natural order or by a comparator.sets and maps are not ordered, except for:
** LinkedHashSet and LinkedHashMap have elements ordered by insertion order.TreeSet and TreeMap have elements ordered by natural order or by a comparator.Vector and Hastable. The rest are not thread-safe.An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
List list = Collections.synchronizedList(new ArrayList(...));ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array.ConcurrentModificationException. The iterator will not reflect additions, removals, or changes to the list since the iterator was created. Element-changing operations on iterators themselves (remove, set, and add) are not supported. These methods throw UnsupportedOperationException.An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
hashCode(), == and equals() for entry lookup. The lookup sequence for a given key k is as follows:k.hashCode() to determine which bucket the entry is stored, if anyk == k1 || k.equals(k1), then return k1’s entrySome insights on hashCode()
hashCode() contract i.e, equal objects must return the same code.hashCode() is inherited by all Java objects and this method returns int value. Value returned from hashCode() method is calculated from the internal memory address of the object. By default hashCode() returns distinct integers for distinct objects.get(), put(), remove() contains the below piece of codeint index = (hash & 0x7FFFFFFF) % tab.length;Value of index is a reminder of the division hash by the array size
0x7FFFFFFF - is a 32-bit integer in hexadecimal with all but the highest bit settab.length - Array sizehash - number returned by the key’s hashCode() methodThis class implements a hash table, which maps keys to values. Any non-null object can be used as a key or as a value.
To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.
Any non-null object can be used as a key or as a value.
To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.
Hashtables can be iterated in few ways
Fail Fast: Iteration - ConcurrentModificationException will be thrown after Hashtable is modified after its Iterator is created
Hashtable<String, String> hashTable = new Hashtable<>();
hashTable.put("key-1", "value-1");
hashTable.put("key-2", "value-2");
Iterator<String> it = hashTable.keySet().iterator();
hashTable.remove("key-2");
while (it.hasNext()) {
String key = it.next();
}
java.util.ConcurrentModificationException
at java.util.Hashtable$Enumerator.next(Hashtable.java:1378)Not Fail Fast: Enumeration makes Hashtable not fail-fast
Hashtable<String, String> hashTable = new Hashtable<>();
hashTable.put("key-1", "value-1");
hashTable.put("key-2", "value-2");
Enumeration<String> it = hashTable.keys;
hashTable.remove("key-2");
while (enumKey.hasMoreElements()) {
String key = enumKey.nextElement();
}Iteration order in a Hashtable is unpredictable and does not match the order in which the entries were added.
If a thread-safe implementation is not needed, it is recommended to use HashMap in place of Hashtable.
If a thread-safe highly-concurrent implementation is desired, then it is recommended to use ConcurrentHashMap in place of Hashtable
HashMap is Hash table based implementation of the Map interface. HashMap makes no guarantees as to the iteration order of the map, it does not guarantee that the order will remain constant over time and HashMap permits null key and null values.
HashMap class is equivalent to Hashtable, except that it is `unsynchronized and permits nulls. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. To prevent accidental unsynchronized access to the list, create the instance of list as below
Map map = Collections.synchronizedMap(new HashMap(...));LinkedHashMap is just like HashMap with an additional feature of maintaining an order of elements inserted into it. HashMap does quick insertion, search and deletion but it does not maintain the order of insertion. This is achieved by LinkedHashMap which ensures elements are inserted in the order they are inserted and can be accessed in their insertion order.
WeakHashMap is a hash table based implementation of the Map interface, with keys that are of a WeakReference type. An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use i.e, that there is no single Reference that point to that key. When the garbage collection (GC) process discards a key, its entry is effectively removed from the map, so this class behaves somewhat differently from other Map implementations.
Bit of info on Strong, Soft and WeakReference
description has a strong reference to an String object with its assigned value.String description = "This variable has Strong Reference to value";String description = "This variable has Strong Reference to value";
SoftReference<String> soft = new SoftReference<String>(description);
description = null;String description = "This variable has Strong Reference to value";
WeakReference<String> soft = new WeakReference<String>(description);
description = null;EnumMap is specialized implementation of Map interface for enumeration types, i.e Enum as its keys.
HashMap.TreeMap class is a
NavigableMap interface and extends AbstractMap class.ConcurrentHashMap is a hash table supporting full concurrency of retrievals and high expected concurrency for updates.
HashMap is not a thread-safe implementation, while Hashtable does provide thread-safety by synchronizing operations, but it is not very efficient. Collections.synchronizedMap, does not exhibit great efficiency.ConcurrentMap is introduced for high throughput under high concurrency.ConcurrentHashMap is thread-safe i.e. multiple thread can operate on a single object without any complications.null as key or value.ConcurrentSkipListMap is scalable concurrent version of TreeMap. When ordering of keys is required, we can use ConcurrentSkipListMap.
A collection that contains no duplicate elements. Sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. That being said as Set is a well-defined collection of distinct objects and in other words, the same object cannot occur more than once.
Java Set and List interfaces are quite similar to each other, but has two major differences.
The first difference is that the same element cannot occur more than once in a Set. This is different from a List where each element can occur more than once.
The second difference is that the elements in a Set has no guaranteed internal order. The elements in a List has an internal order, and the elements can be iterated in that order.
LinkedHashSet class is a Hashtable and LinkedList implementation of the set interface. It inherits HashSet class and implements Set interface.TreeSet implements the TreeMap to store elements. The elements in a TreeSet are sorted according to their natural ordering or based on a custom Comparator that is supplied at the time of creation of the TreeSetHashSet which uses a underlying CopyOnWriteArrayList for all of its operations.NavigableSet implementation based on a ConcurrentSkipListMap.Queue represents a data structure designed to have elements inserted at the end of the queue, and elements removed from the beginning of the queue.
Besides basic Collection operations, queues provide additional insertion, extraction, and inspection operations. Each of these methods exists in two forms: one throws an exception if the operation fails, the other returns a special value (either null or false, depending on the operation). The latter form of the insert operation is designed specifically for use with capacity-restricted Queue implementations; in most implementations, insert operations cannot fail.
Queue Operations:
BlockingQueue Operations:
PriorityQueue is used when the objects are supposed to be processed based on the priority heap.First-In-First-Out algorithm, but sometimes the elements of the queue are needed to be processed according to the priority.LinkedBlockingQueue class implements the BlockingQueue interface.ArrayBlockingQueue class is a bounded blocking queue backed by an array. By bounded it means that the size of the Queue is fixed. Once created, the capacity cannot be changed.PriorityBlockingQueue is an unbounded blocking queue that uses the same ordering rules as class PriorityQueue and supplies blocking retrieval operations.DelayQueue is an unbounded blocking queue of delayed elements, in which the consumer wants to take an element from the queue, they can take it only when the delay for that particular element has expired.Delayed interface and it’s getDelay() method return a zero or negative value which indicate that the delay has already elapsed.poll() will return null.take() or poll(), they are otherwise treated as normal elements in the queue. size() method returns the count of both expired and unexpired elements.SynchronousQueue is a type of Blocking Queue which implements BlockingQueue.take() and put(), and both of them are blocking.LinkedTransferQueue class is an implementation of TransferQueue. It is a concurrent blocking queue implementation in which producers may wait for receipt of messages by consumers.Dequeue represents a double ended queue, meaning a queue where you can add and remove elements from both ends of the queue. The name Deque is an abbreviation of Double Ended Queue.
LinkedList class is a pretty standard Deque and Queue implementation. It uses a linked list internally to model a queue or a deque.ArrayDeque provides resizable-array and implements the Deque interface. It is also known as Array Double Ended Queue or Array Deck.List is an example.SortedSet is an example.Set is an example.Java Collections supports two types of Iterator, fail safe and fail fast.
The main distinction between a fail-fast and fail-safe Iterator is whether or not the underlying collection can be modified while its begin iterated. If you have used Collection like ArrayList then you know that when you iterate over them, no other thread should modify the collection. If Iterator detects any structural change after iteration has begun e.g adding or removing a new element then it throws ConcurrentModificationException, this is known as fail-fast behavior and these iterators are called fail-fast iterator because they fail as soon as they detect any modification . Though it’s not necessary that iterator will throw this exception when multiple threads modified it simultaneously. it can happen even with the single thread when you try to remove elements by using ArrayList’s remove() method instead of Iterator’s remove method. Most of the Collection classes e.g. Vector, ArrayList, HashMap, HashSet has fail-fast iterators.
The iterators used by concurrent collection classes e.g. ConcurrentHashMap, CopyOnWriteArrayList and CopyOnWriteArraySet uses a view of original collection for doing iteration and that’s why they doesn’t throw ConcurrentModificationException even when original collection was modified after iteration has begun. This means you could iterate and work with stale value, but this is the cost you need to pay for fail-safe iterator.
The load factor is a measure of how full the collection(hash table) is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
Natural ordering is the default ordering of objects of a specific type when they are sorted in an array or a collection. The Java language provides the Comparable interface that allows us define the natural ordering of a class. This interface is declared as follows:
public interface Comparable<T> {
public int compareTo(T object);
}As observed, this interface is parameterized (generics) and it has a single method compareTo() that allows two objects of a same type to be compared with each other. The important point here is the value returned by this method: an integer number indicates the comparison result of two objects.
Rules to remember: compare value = 0: two objects are equal. compare value > 0: the first object (the current object) is greater than the second one. compare value < 0: the first object is less than the second one.
Collections are sorted by natural ordering of its elements:
It is recommended practice to override both equals and hashCode methods when comparing two Java objects.
Simple approach
Below is simple approacg to override equals & hashcode methods in a pojo class
public class Employee {
private int id;
private String name;
private String address;
//getters and setters, constructor
@Override
public boolean equals(Object o) {
if (o == this) return true;
if (!(o instanceof Employee)) {
return false;
}
Employee employee = (Employee) o;
return employee.name.equals(name) &&
employee.id == id &&
employee.address.equals(address);
}
@Override
public int hashCode() {
// 15 * 28 are some random integers to derive the hashcode value
int hashCode = 15;
hashCode = 28 * hashCode + name.hashCode();
hashCode = 28 * hashCode + id;
hashCode = 28 * hashCode + address.hashCode();
return hashCode;
}
}Using Objects.hash()
Objects class includes utility method equals() && hash() to generate the equals and hash code values.
public class Employee {
private int id;
private String name;
private String address;
//getters and setters, constructor
@Override
public boolean equals(Object o) {
if (o == this) return true;
if (!(o instanceof Employee)) {
return false;
}
Employee employee = (Employee) o;
return id == employee.id &&
Objects.equals(name, employee.name) &&
Objects.equals(address, employee.address);
}
@Override
public int hashCode() {
return Objects.hash(id, name, address);
}
}Using Apache Commons Lang
Apache Commons Lang library has got utility classes EqualsBuilder and HashCodeBuilder to perform equal operation & hashCode generations
@Override
public boolean equals(Object o) {
if (o == this) return true;
if (!(o instanceof Employee)) {
return false;
}
Employee employee = (Employee) o;
return new EqualsBuilder()
.append(id, employee.id)
.append(name, employee.name)
.append(address, employee.address)
.isEquals();
}
@Override
public int hashCode() {
return new HashCodeBuilder(15, 28)
.append(id)
.append(name)
.append(address)
.toHashCode();
}
Using Lombok
@EqualsAndHashCode
public class Employee {
private int id;
private String name;
private String address;
//getters and setters, constructor
}To add an element to Set we need to use add() method. add() method returns false when we try to add a duplicate element which indicates element is not added to set.
java.util.Vector came along with the first version of java development kit (JDK). java.util.ArrayList was introduced in java version1.2, as part of java collections framework.
All the methods of Vector is synchronized. But, the methods of ArrayList is not synchronized. All the new implementations of java collection framework is not synchronized.
Vector and ArrayList both uses Array internally as data structure. They are dynamically resizable. Difference is in the way they are internally resized. By default, Vector doubles the size of its array when its size is increased. But, ArrayList increases by half of its size when its size is increased.
Therefore as per Java API the only main difference is, Vector’s methods are synchronized and ArrayList’s methods are not synchronized.
Similarities in both the classes:
Differences between both the classes:
HashMap doesn’t provide any Enumeration, while Hashtable provides not fail-fast Enumeration.Hashtable doesn’t allow null keys and null values, while HashMap do allow one null key and any number of null valuesHashtable’s methods are synchronized while HashMaps’s methods are not synchronized.HashMap extends AbstractMap and implement Map interface, EnumMap extends AbstractMap class which implements Map interface.HashMap internally uses HashTable, on the other hand EnumMap internally uses arrays and thus representation is extremely compact and efficient.Below is the brief difference between HashMap & EnumMap
| HashMap | EnumMap |
|---|---|
HashMap is general purpose map implementation, internally uses HashTable. HashMap can use any object as key. | EnumMap is optimized for Enum type keys to store in map. EnumMap internally using as arrays, this representation is extremely compact and efficient, provide constant time performance for common methods like get, and put. |
| HashMap performance is not better as EnumMap. | Due to specialized optimization done in EnumMap, its performance is better than HashMap. |
| All type of object can use as keys in HashMap. | Only Enum type can use as keys in EnumMap. |
HashMap using hashCode() to store the keys and values so there is probability of collision. | Since EnumMap internally maintain an array and they are stored in their natural order using ordinal(), so there is no probability of collision. |
| HashMap is not ordered, HashMap order can change over the time. | EnumMap stores keys in the natural order of their keys, the order in which the enum constants are declared. |
ordinal() - The ordinal() method returns the order of an enum instance. It represents the sequence in the enum declaration, where the initial constant is assigned an ordinal of ‘0’. It is very much like array indexes.It is designed for use by sophisticated enum-based data structures, such as EnumSet and EnumMap.
public enum STATUS
{
ACTIVE, INACTIVE, DELETED
}
STATUS.ACTIVE.ordinal(); //0
STATUS.DELETED.ordinal(); //2The functionality of Enumeration and the Iterator are same. Using Enumeration you can only traverse and fetch the objects, where as using Iterator we can also add and remove the objects. Iterator can be useful if you want to manipulate the list and Enumeration is for read-only access.
| Iterator | Enumeration |
|---|---|
| Iterator is applicable for all the collection classes. | Enumeration applies only to legacy classes. |
Iterator has the remove() method. | Enumeration does not have the remove() method. |
| Iterator can do modifications. Calling remove() during collection traversal will remove the element from the Collection. | Enumeration acts as a read only interface, one can not do any modifications to Collection while traversing the elements. |
| Iterator is not a legacy interface. Iterator can be used for the traversal of HashMap, LinkedList, ArrayList, HashSet, TreeMap, TreeSet. | Enumeration is a legacy interface which is used for traversing Vector, Hashtable. |