
🔢 Counting Sort

Intermediate ~12 min read

You're grading 100 test papers, each scored 0-100. Instead of comparing scores, you have a smarter strategy: Count how many students got each score, then reconstruct the sorted list. No comparisons needed! That's Counting Sort - O(n) linear time!

The Tally Counter

Imagine a tally sheet with boxes for each possible score. Go through the papers once, making a tally mark for each score. Then read off the tallies in order!

The Strategy

  1. Find the range: Minimum to maximum values
  2. Count occurrences: How many times each value appears
  3. Reconstruct: Build sorted array from counts
  4. Done in O(n) time! No comparisons!

Key insight: When values are in a small range, counting beats comparing!
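Traced on a small list (the values here are just illustrative), the four steps look like this:

```python
# Trace Counting Sort on a small list of test scores.
scores = [4, 2, 9, 2, 4, 4, 7]

# Step 1: find the range.
lo, hi = min(scores), max(scores)          # lo=2, hi=9

# Step 2: count occurrences - one tally box per value in [lo, hi].
counts = [0] * (hi - lo + 1)
for s in scores:
    counts[s - lo] += 1
print(counts)  # [2, 0, 3, 0, 0, 1, 0, 1] -> two 2s, three 4s, one 7, one 9

# Step 3: reconstruct by reading the tallies off in order.
sorted_scores = []
for value, c in enumerate(counts):
    sorted_scores.extend([value + lo] * c)
print(sorted_scores)  # [2, 2, 4, 4, 4, 7, 9]
```

Not a single comparison between elements - just counting and reading off.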

Breaking the O(n log n) Barrier!

The Breakthrough

All comparison-based sorts (Merge, Quick, Heap) are subject to a proven lower bound: any algorithm that sorts by comparing elements must make Ω(n log n) comparisons in the worst case.

Counting Sort breaks this barrier!

  • Comparison sorts: O(n log n) minimum
  • Counting Sort: O(n + k) where k = range
  • When k is small: Basically O(n)! 🚀

How? It doesn't compare elements - it counts them!

See It in Action

Watch how Counting Sort counts occurrences and reconstructs the sorted array:

Understanding the Process

  • Phase 1: Count occurrences - O(n)
  • Phase 2: Reconstruct sorted array - O(n + k)
  • Total: O(n + k) linear time!
  • Tally marks: Visual count of each value
  • No comparisons: Just counting!

The Code

Two Versions: The basic version is simpler but not stable. The stable version uses cumulative counts and builds right-to-left to preserve the order of equal elements. This is crucial for Radix Sort!
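A sketch of both versions in Python (function names are our own, not from any particular library):

```python
def counting_sort(arr):
    """Basic version: simple, but doesn't preserve the order of equal keys."""
    if not arr:
        return []
    lo, hi = min(arr), max(arr)
    counts = [0] * (hi - lo + 1)          # one tally box per value in the range
    for x in arr:                          # count occurrences - O(n)
        counts[x - lo] += 1
    result = []
    for value, c in enumerate(counts):     # read tallies off in order - O(n + k)
        result.extend([value + lo] * c)
    return result

def counting_sort_stable(arr, key=lambda x: x):
    """Stable version: cumulative counts, output built right to left."""
    if not arr:
        return []
    keys = [key(x) for x in arr]
    lo, hi = min(keys), max(keys)
    counts = [0] * (hi - lo + 1)
    for k in keys:                         # count occurrences
        counts[k - lo] += 1
    for i in range(1, len(counts)):        # cumulative: counts[i] = # of keys <= i + lo
        counts[i] += counts[i - 1]
    output = [None] * len(arr)
    for x in reversed(arr):                # right to left keeps equal keys in order
        counts[key(x) - lo] -= 1
        output[counts[key(x) - lo]] = x
    return output

print(counting_sort([5, 2, 8, 2, 5, 1]))  # [1, 2, 2, 5, 5, 8]
print(counting_sort_stable([(2, "a"), (1, "b"), (2, "c")], key=lambda p: p[0]))
# [(1, 'b'), (2, 'a'), (2, 'c')] - 'a' stays before 'c'
```

The `key` parameter lets the stable version sort records by an integer field, which is exactly what Radix Sort needs when it sorts digit by digit.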

Quick Quiz

  • Why can Counting Sort be O(n) when comparison sorts are O(n log n)?
    Answer: Counting Sort doesn't compare elements! It counts occurrences instead. The O(n log n) lower bound only applies to comparison-based sorting. By using a different approach (counting), we can beat this barrier for specific cases (small integer ranges).
  • When would Counting Sort be terrible?
    Answer: When the range is huge! If you're sorting 100 numbers with values from 1 to 1,000,000,000, you'd need a billion-element count array. That's impractical! Counting Sort only works well when the range (k) is small compared to the number of elements (n).

Why O(n + k)?

Let's break down the complexity:

Step              | Time     | Why
Find min/max      | O(n)     | Scan array once
Count occurrences | O(n)     | Scan array once, increment counts
Reconstruct array | O(n + k) | Iterate through counts (k), place elements (n)
Total             | O(n + k) | Linear when k is small!
Space             | O(k)     | Count array of size k

✅ When k is Small

If k (range) is O(n) or smaller:

  • Time: O(n + k) = O(n) ✓
  • Example: Sorting 1 million ages (k=120) → O(1,000,000) linear!
  • Faster than: O(n log n) = O(20,000,000) for comparison sorts!
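That back-of-the-envelope comparison can be checked directly (a sketch of the operation counts; real running times also depend on constant factors):

```python
import math

n, k = 1_000_000, 120              # one million ages, range 0-120

counting_ops = n + k               # O(n + k) work for Counting Sort
comparison_ops = n * math.log2(n)  # O(n log n) work for a comparison sort

print(counting_ops)                # 1000120 - barely more than n
print(round(comparison_ops))       # roughly 20 million
```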

⚠️ When k is Large

If k (range) is much larger than n:

  • Time: O(n + k) ≈ O(k) - dominated by range!
  • Space: O(k) - huge count array!
  • Example: Sorting 100 numbers (n=100) with range 1-1,000,000,000 (k=1B)
  • Result: Terrible! Use comparison sort instead!

When to Use Counting Sort

✅ Perfect For:

  • Ages: 0-120 (k=120)
  • Grades: 0-100 or A-F (k=100 or 5)
  • Pixel intensities: 0-255 (k=255)
  • Small integers: Range is reasonable
  • Foundation for Radix Sort: Digit-by-digit sorting

Rule of thumb: If k ≤ n, Counting Sort wins!

❌ Terrible For:

  • Arbitrary large integers: Range too big
  • Floating-point numbers: Not integers
  • Strings: Use Radix Sort instead
  • Sparse ranges: Few values in huge range

Example: Sorting [1, 1000000, 5, 999999] needs 1M count array for 4 numbers!

Counting Sort vs Others

Feature    | Quick Sort      | Heap Sort             | Counting Sort
Type       | Comparison      | Comparison            | Non-comparison ✓
Time (avg) | O(n log n)      | O(n log n)            | O(n + k) ✓
Space      | O(log n) ✓      | O(1) ✓                | O(k)
Stable     | No              | No                    | Yes (stable version) ✓
Use When   | General purpose | Guaranteed + in-place | Small integer range ✓

The Verdict

Counting Sort is a specialist:

  • ✅ Unbeatable for small ranges - O(n) linear!
  • ✅ Foundation for Radix Sort
  • ✅ Stable version available
  • ⚠️ Only works for integers in small ranges
  • ⚠️ Space usage depends on range, not array size

Know your data, choose your algorithm!

The Stable Version

Why do we need a stable version?

Stability Matters for Radix Sort

When sorting by multiple keys (like sorting by last name, then first name), we need stability:

  • Example: Sorting students by grade, then by name
  • First: Sort by name (stable)
  • Then: Sort by grade (must be stable!)
  • Result: Students with same grade are still sorted by name

Radix Sort uses Counting Sort repeatedly for each digit - stability is crucial!
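Python's built-in `sorted` is stable, so the grade-then-name idea can be demonstrated with it directly (the student names here are made up; the same principle is what the stable Counting Sort provides to Radix Sort):

```python
students = [("Cara", "B"), ("Alan", "A"), ("Bea", "B"), ("Drew", "A")]

# First sort by name, then by grade. Because the second sort is stable,
# students with the same grade stay in name order.
by_name = sorted(students, key=lambda s: s[0])
by_grade_then_name = sorted(by_name, key=lambda s: s[1])

print(by_grade_then_name)
# [('Alan', 'A'), ('Drew', 'A'), ('Bea', 'B'), ('Cara', 'B')]
```

If the second sort were not stable, Alan/Drew and Bea/Cara could come out in any order within their grade groups.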

The stable version uses cumulative counts:

How Cumulative Counts Work

  1. Count occurrences: count[i] = how many elements equal i
  2. Make cumulative: count[i] += count[i-1]
  3. Now count[i] = how many elements ≤ i
  4. Build output: Scan the input right to left; place each element at index count[value] - 1, then decrement count[value]
  5. Result: Stable! Equal elements maintain original order
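Here is that recipe traced in Python on small keyed records, so the preserved order of equal keys is visible (the record values are illustrative):

```python
# Records tagged with letters so we can see the original order of equal keys.
items = [(3, "a"), (1, "b"), (3, "c"), (2, "d")]
keys = [k for k, _ in items]
kmax = max(keys)

# Steps 1-2: count occurrences, then make the counts cumulative.
count = [0] * (kmax + 1)
for k in keys:
    count[k] += 1               # count[i] = how many elements equal i
for i in range(1, kmax + 1):
    count[i] += count[i - 1]    # now count[i] = how many elements <= i

# Step 4: build the output right to left, decrementing as we place.
output = [None] * len(items)
for item in reversed(items):
    count[item[0]] -= 1
    output[count[item[0]]] = item

print(output)  # [(1, 'b'), (2, 'd'), (3, 'a'), (3, 'c')] - 'a' still before 'c'
```

Both elements with key 3 end up in their original relative order, which is exactly the guarantee Radix Sort relies on.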

Common Misconceptions

❌ Misconception 1: "Counting Sort works for any data"

Reality: Only for integers in a small range!

  • Integers: Yes ✓
  • Floats: No (can convert to integers)
  • Strings: No (use Radix Sort)
  • Objects: No (unless you extract integer keys)
  • Large range: No (space explosion!)

It's a specialist tool, not a general-purpose sort!

❌ Misconception 2: "Counting Sort is always faster"

Reality: Only when k (range) is small!

  • Small range (k ≤ n): Counting Sort wins! O(n) vs O(n log n)
  • Large range (k >> n): Comparison sorts win! O(n log n) vs O(k)
  • Example 1: 1M ages (n=1M, k=120) → Counting wins!
  • Example 2: 100 numbers, range 1-1B (n=100, k=1B) → Quick Sort wins!

Analyze your data before choosing!

❌ Misconception 3: "Counting Sort is in-place"

Reality: Needs O(k) extra space!

  • Count array: Size k (range)
  • Output array: Size n (stable version)
  • Total space: O(n + k)
  • Not in-place: Unlike Heap Sort or Quick Sort

Trade-off: Speed (O(n)) for space (O(k))

❌ Misconception 4: "Counting Sort is the fastest sort"

Reality: Fastest for its specific use case!

  • For small ranges: Yes, unbeatable!
  • For general data: No, use comparison sorts
  • For strings: No, use Radix Sort
  • For large ranges: No, terrible!

Every algorithm has its niche!

💡 Key Takeaway

Counting Sort is a game-changer for small ranges:

  • ✅ O(n) linear time - breaks O(n log n) barrier!
  • ✅ Non-comparison based - counts instead of compares
  • ✅ Stable version available - crucial for Radix Sort
  • ⚠️ Only for integers in small ranges
  • ⚠️ Space usage depends on range

Understanding Counting Sort teaches you that sometimes, a specialized tool beats a general-purpose one!

Summary

The 3-Point Counting Sort Guide

  1. Strategy: Count occurrences, reconstruct - no comparisons needed!
  2. Complexity: O(n + k) linear when k is small - breaks O(n log n) barrier!
  3. Trade-off: Speed (O(n)) vs limited use case (small integer ranges)

Interview tip: Mention Counting Sort for sorting integers in small ranges. Explain why it's O(n), when to use it, and why it needs stability for Radix Sort!

What's Next?

You've learned the first non-comparison sort! Next: Radix Sort - extending Counting Sort to handle larger ranges by sorting digit-by-digit!

Practice: Implement Counting Sort for sorting students by age. Then try the stable version for sorting by grade, then by name!