Web Analytics

Java HashSet

Intermediate ~15 min read

A HashSet is the most commonly used Set implementation. It stores unique elements using a hash table, providing O(1) average time for add, remove, and contains operations.

Java HashSet Internal Structure with Buckets

How HashSet Works

HashSet is backed by a HashMap internally. When you add an element:

  1. The element's hashCode() is calculated
  2. The hash determines which "bucket" to use
  3. equals() checks if element already exists in that bucket
  4. If unique, element is added; if duplicate, it's ignored
Important: For custom objects in HashSet, you MUST override both hashCode() and equals() methods for correct behavior.

Key Characteristics

Feature HashSet
Duplicates Not allowed (silently ignored)
Null elements One null allowed
Order No guaranteed order
Thread-safe No (use Collections.synchronizedSet())
Performance O(1) average for add, remove, contains

Creating a HashSet

import java.util.HashSet;
import java.util.Set;
import java.util.Arrays;

// Standard way
Set<String> fruits = new HashSet<>();

// With initial capacity (default is 16)
Set<String> bigSet = new HashSet<>(1000);

// From existing collection
List<String> list = Arrays.asList("A", "B", "A", "C");
Set<String> unique = new HashSet<>(list); // {A, B, C}

// Java 9+ factory method (immutable)
Set<String> immutable = Set.of("A", "B", "C");

Essential Operations

Set<String> colors = new HashSet<>();

// Adding elements
colors.add("Red");      // true (added)
colors.add("Green");    // true (added)
colors.add("Red");      // false (duplicate ignored)

// Checking membership - O(1)
boolean hasRed = colors.contains("Red");   // true
boolean hasBlue = colors.contains("Blue"); // false

// Removing elements
colors.remove("Green"); // true (removed)
colors.remove("Pink");  // false (not found)

// Size and empty check
int size = colors.size();      // 1
boolean empty = colors.isEmpty(); // false

// Clear all
colors.clear();
Output
Click Run to execute your code

Set Operations (Union, Intersection, Difference)

Set<Integer> setA = new HashSet<>(Arrays.asList(1, 2, 3, 4));
Set<Integer> setB = new HashSet<>(Arrays.asList(3, 4, 5, 6));

// Union (A ∪ B) - all elements from both sets
Set<Integer> union = new HashSet<>(setA);
union.addAll(setB);  // {1, 2, 3, 4, 5, 6}

// Intersection (A ∩ B) - common elements
Set<Integer> intersection = new HashSet<>(setA);
intersection.retainAll(setB);  // {3, 4}

// Difference (A - B) - elements in A but not in B
Set<Integer> difference = new HashSet<>(setA);
difference.removeAll(setB);  // {1, 2}

// Subset check
boolean isSubset = setA.containsAll(setB); // false

Using Custom Objects

class Person {
    String name;
    int age;

    // MUST override equals and hashCode!
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person p = (Person) o;
        return age == p.age && Objects.equals(name, p.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }
}

// Now HashSet works correctly with Person objects
Set<Person> people = new HashSet<>();

When to Use Which Set?

Set Type Order Performance Use When
HashSet None O(1) Best general-purpose choice
LinkedHashSet Insertion order O(1) Need predictable iteration
TreeSet Sorted O(log n) Need sorted elements

Summary

  • HashSet provides O(1) performance for most operations
  • No duplicates - uses equals() and hashCode()
  • No guaranteed iteration order
  • Perfect for: removing duplicates, membership testing, set operations
  • Custom objects need proper equals() and hashCode()