An Introduction to Java Collections: Lists, Sets, and Maps

Java Collections give every program a reliable toolbox for grouping, searching, and transforming data. They hide low-level arrays behind clean interfaces and ready-made algorithms, letting you focus on business logic instead of memory juggling.

The framework centers on three core interfaces: List keeps order, Set enforces uniqueness, and Map pairs keys with values. Choosing the right one early prevents hidden performance traps and keeps code readable as requirements shift.

Lists: Ordered Sequences That Embrace Duplicates

A List remembers every element’s position and gladly accepts the same object twice. This predictability makes it the default choice for menus, playlists, or any data where “next” and “previous” matter.

ArrayList wraps a resizeable array underneath, giving fast positional reads and writes at the cost of slower middle insertions. LinkedList stores nodes with forward and backward pointers, so adding or removing in the middle is cheap but jumping to index n requires a walk from either end.

Pick ArrayList when you mostly read or append; switch to LinkedList only if frequent add/remove in the middle dominates your workload.

Common List Operations and Their Complexity

get(index) on ArrayList is constant time because it offsets straight into the internal array. add(E) at the end amortizes to constant time until the buffer must grow, then it briefly spikes while a larger array is allocated and copied.

LinkedList offers constant-time addFirst, addLast, removeFirst, removeLast, making it ideal for queues or deques. Avoid indexed loops on LinkedList; use iterators to skip the costly node traversal.

Choosing Between ArrayList and LinkedList

Measure your dominant operation, not the occasional edge case. A loop that calls get(i) millions of times will feel sluggish on LinkedList even if insertions are rare.

Memory footprint also differs: ArrayList pays a single object header plus array slots, while LinkedList pays per node, so large lists can bloat heap unexpectedly. When in doubt, start with ArrayList and switch only after profiling proves LinkedList wins.

Sets: Uniqueness Guards Without Positional Promises

A Set refuses duplicates by definition, making it perfect for tags, user roles, or any collection where “is it already there?” is the key question. It does not promise element order unless you pick a specific implementation.

HashSet stores objects in a hash table, offering near-constant add, remove, and contains time. TreeSet keeps entries in a red-black tree, so every element remains sorted by natural order or a custom comparator at the price of logarithmic time.

LinkedHashSet marries both worlds: hash table speed plus insertion-order iteration, useful for caches that must expire oldest first.

HashSet Internals and Load Factor

HashSet delegates to HashMap internally, using each element as a key and a dummy value. The load factor (default 0.75) triggers resizing once buckets fill beyond that ratio, preventing chains from growing too long.

You can raise the load factor to delay resizing and save memory, but expect more collisions and slower contains. Lowering it burns extra memory yet keeps operations snappy.

TreeSet Sorted Features

TreeSet exposes first(), last(), headSet(to), tailSet(from), letting you treat the set like a live, sorted slice. These views update automatically when the original set changes, so you can build range queries without manual filtering.

Remember that elements must be mutually comparable; attempting to add incomparable types throws ClassCastException immediately, failing fast instead of corrupting order silently.

Maps: Key-Value Pairs as the Backbone of Lookup

Map is not a Collection in the strict sense, yet it sits at the heart of most real-world Java code. Every cache, index, or dictionary eventually becomes a Map where keys drive lightning-fast lookups.

HashMap scatters keys by hash code into buckets, giving average constant-time performance for get and put. TreeMap keeps entries sorted by key, enabling binary-search operations like floorKey and ceilingKey that HashMap cannot offer.

LinkedHashMap preserves either insertion or access order, powering LRU caches with a simple constructor flag.

HashMap Collision Handling

When two keys land in the same bucket, Java first compares with equals; if still equal, the new value overwrites. From Java 8 onward, buckets switch to a balanced tree once a threshold of colliding keys is reached, guarding against hash-flooding denial-of-service attacks.

Good key hygiene—immutable fields, consistent equals/hashCode—keeps collisions rare and buckets lean. Never use a mutable object as a key if its hash code can change after insertion; the map will lose track of it immediately.

TreeMap Range Queries

subMap(from, to) returns a live view that shrinks or grows as you add or remove entries. Combine it with descendingMap() to walk results in reverse order without extra sorts.

This makes TreeMap ideal for time-based keys where you often ask “what happened between 10:00 and 11:00?” The map answers in log time and returns a navigable subset you can iterate once.

Thread Safety and Concurrent Alternatives

Standard collections are not synchronized; concurrent access can corrupt internal structure or throw ConcurrentModificationException. Wrap with Collections.synchronized* factories for simple cases, but understand that the wrapper locks the entire structure on every call.

For high concurrency, prefer ConcurrentHashMap, CopyOnWriteArrayList, or ConcurrentSkipListSet. These use segmented locks or copy-on-write snapshots to let many threads read and update simultaneously with minimal blocking.

Fail-Fast vs Fail-Safe Iterators

Iterator on ArrayList checks for structural modification and throws ConcurrentModificationException if another thread adds or removes during traversal. This fail-fast behavior surfaces bugs early but demands external locking or single-thread usage.

Concurrent collections return fail-safe iterators that clone or traverse over a snapshot, so updates continue in parallel without exceptions. The trade-off is that the snapshot may miss very recent changes, yet reads remain consistent.

Generic Type Safety and the Raw Type Trap

Collections without generics default to Object, forcing casts and risking ClassCastException at runtime. Always declare the element type: List names = new ArrayList<>(); keeps the contract explicit and lets the compiler reject add(123) immediately.

Diamond operator <> on the right side repeats no type information, keeping declarations concise. Wildcards extend or super let you accept or produce subtypes safely: List can be read but not written with arbitrary Numbers.

P-E-C-S Principle in Wildcards

Producer Extends, Consumer Super: if a method only reads Ts from a collection, declare it List; if it only writes, use List. This maximizes flexibility while preserving type correctness.

Following PECS prevents the need for multiple overloaded methods and keeps APIs future-proof when subclasses emerge later.

Utility Classes and Algorithm Shortcuts

Collections utility class packs static helpers: sort, reverse, shuffle, binarySearch, and synchronized wrappers. These methods operate on raw lists, so you can retrofit order or thread safety without rewriting code.

Arrays utility bridges arrays and lists: Arrays.asList returns a fixed-size list view backed by the original array, useful for quick literal initialization. Change either the array or the list and both reflect the update, but adding or removing throws UnsupportedOperationException.

Immutable Factory Methods

List.of, Set.of, Map.of create compact, immutable instances in Java 9+. They reject null elements and throw UnsupportedOperationException on any modification attempt, making them perfect for constant lookup tables.

Because these factories skip generic type arguments through target typing, you can write Map days = Map.of(“Mon”,1,”Tue”,2); with no extra boilerplate.

Performance Footprint and Memory Layout

ArrayList header plus array costs about 12 bytes plus 4 bytes per slot even when empty, doubling on each growth. LinkedList node wraps element, prev, next pointers, ballooning to roughly 24 bytes per entry regardless of payload size.

HashMap entry node holds key, value, hash, and next reference, totaling ~32 bytes each before key/value objects. TreeMap entry adds color bit and parent/left/right links, pushing overhead higher but still logarithmic in operations.

Right-Sizing on Construction

Supply an initial capacity to ArrayList or HashMap when the final size is predictable: new ArrayList<>(10000) avoids four incremental resizes and array copies. The same trick applies to HashMap—size / loadFactor gives the minimal initial buckets.

Over-estimating by a small margin is cheaper than under-estimating and paying for repeated rehashing or copying as data piles up.

Iteration Styles and Hidden Costs

Traditional for-loop with get(index) on LinkedList turns linear operation into quadratic nightmare as index walks the chain each time. Prefer enhanced for-each or explicit iterator when the implementation might vary.

Java 8 forEach method accepts a lambda and can be parallelized via collection.parallelStream(), but only if the Spliterator splits evenly and the workload justifies overhead.

Iterator remove vs Collection remove

Iterator.remove avoids ConcurrentModificationException by updating the iterator’s internal expected-modification count. Calling list.remove(Object) during iteration invalidates the iterator immediately and throws on the next next().

When filtering while traversing, always use the iterator’s remove or collect items into a separate list and call removeAll afterwards.

Comparable vs Comparator Strategies

Comparable is baked into the element class, defining natural order via compareTo. Comparator lives outside the class, letting you plug in multiple orderings without touching the original source.

TreeSet and TreeMap demand one or the other; supplying none with non-Comparable elements triggers runtime exception. Keep compareTo consistent with equals to avoid odd behavior in sorted collections where equality and ordering differ.

Comparator Factory Methods

Comparator.comparing(Function) creates a key-based comparator in one line: Comparator.comparing(Employee::getSalary).reversed() sorts highest salary first. ThenComparing lets you chain ties without writing verbose multi-field logic.

Static methods naturalOrder() and reverseOrder() reuse built-in Comparable logic, saving you from reimplementing the obvious.

Real-World Selection Cheat Sheet

Need fast random access? ArrayList. Need uniqueness without order? HashSet. Need key lookup? HashMap. Need sorting? TreeSet or TreeMap. Need insertion order? LinkedHashSet or LinkedHashMap. Need concurrency? ConcurrentHashMap or CopyOnWriteArrayList.

Start with the simplest fit, measure under realistic load, and upgrade only when profiling proves the need. Collections are interchangeable at the interface level, so swapping later is usually a one-line change once you hide concrete types behind List, Set, or Map declarations.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *