Count Of Subarrays With At Most K Distinct Elements

Oct 03, 2024

Counting Subarrays with at Most K Distinct Elements: A Guide to Efficient Solutions

The problem of counting subarrays with at most k distinct elements is a classic problem in computer science and is often encountered in coding interviews and algorithm challenges. It involves analyzing an array and finding the total number of contiguous subarrays that satisfy the condition of having no more than a specified number (k) of unique elements.

This article walks through an efficient solution based on the sliding window technique, explains why it works, and notes a few related variations.

Understanding the Problem

Let's first break down the problem statement:

  • Subarray: A contiguous segment of an array. For example, in the array [1, 2, 3, 4, 5], [2, 3, 4] is a subarray.
  • Distinct Elements: The different values that occur in a subarray. A value that appears several times is counted once.
  • At Most K Distinct Elements: The subarray should contain no more than k unique elements.

Example:

Consider the array [1, 2, 1, 3, 2, 1] and k = 2.

  • The subarray [1, 2] is valid because it has 2 distinct elements (1 and 2).
  • The subarray [1, 2, 1] is also valid because it has 2 distinct elements (1 and 2), even though the element '1' appears twice.
  • The subarray [1, 2, 1, 3] is invalid because it has 3 distinct elements (1, 2, and 3).
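
Before looking at an efficient method, it can help to have a slow but obvious baseline to check answers against. The sketch below is an assumption of this write-up rather than part of the approach described later: the function name count_at_most_k_brute_force is hypothetical, and it simply tries every start index and extends the subarray until it holds more than k distinct values.

```python
def count_at_most_k_brute_force(arr, k):
    """Count subarrays with at most k distinct values by trying every start index."""
    total = 0
    for start in range(len(arr)):
        seen = set()  # distinct values in arr[start..end]
        for end in range(start, len(arr)):
            seen.add(arr[end])
            if len(seen) > k:
                # Extending further can never reduce the distinct count.
                break
            total += 1
    return total


print(count_at_most_k_brute_force([1, 2, 1, 3, 2, 1], 2))  # 12
```

This runs in O(N^2) time in the worst case, which is fine for checking small inputs by hand but too slow for large arrays.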

Approach: Sliding Window Technique

The sliding window technique is an elegant and efficient approach to solving this problem. Here's how it works:

  1. Initialization:

    • We use two pointers: left and right. Both pointers are initialized to 0, marking the beginning of the sliding window.
    • We also use a dictionary (or hash map) to store the frequency of each element encountered in the current window.
  2. Expanding the Window:

    • We move the right pointer forward, adding each element to the window.
    • For each new element, we update its frequency in the dictionary.
  3. Shrinking the Window:

    • If the number of distinct elements in the window exceeds k, we start shrinking the window by moving the left pointer forward.
    • As we move left, we decrement the frequency of the element it points to in the dictionary. If the frequency of an element becomes 0, we remove it from the dictionary.
  4. Counting Subarrays:

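    • After shrinking, the window from left to right is the longest valid window that ends at right. Every subarray that ends at right and starts anywhere from left to right is also valid, so we add the window size (right - left + 1) to the total count. Each subarray is counted exactly once, at the step where right is its last index.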
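
The counting step is easier to trust once the subarrays behind the number are listed out. The following is a small illustrative sketch (the helper name subarrays_ending_at is hypothetical): for a valid window given by left and right, it lists the subarrays that end at right, and there are exactly right - left + 1 of them.

```python
def subarrays_ending_at(arr, left, right):
    """List every subarray arr[i..right] for i from left to right (inclusive)."""
    return [arr[i:right + 1] for i in range(left, right + 1)]


arr = [1, 2, 1, 3, 2, 1]
# Window [1, 2, 1] (left = 0, right = 2) contributes three subarrays.
print(subarrays_ending_at(arr, 0, 2))  # [[1, 2, 1], [2, 1], [1]]
```

Removing elements from the front of a valid window cannot add new distinct values, which is why all of these shorter subarrays are valid too.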

Algorithm:

def count_subarrays(arr, k):
  """Counts subarrays with at most k distinct elements.

  Args:
    arr: The input array.
    k: The maximum number of distinct elements allowed in a subarray.

  Returns:
    The total count of subarrays meeting the criteria.
  """
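  left = 0
  right = 0
  count = 0
  distinct_elements = {}  # value -> number of occurrences in the current window

  while right < len(arr):
    # Expand: bring arr[right] into the window.
    distinct_elements[arr[right]] = distinct_elements.get(arr[right], 0) + 1

    # Shrink from the left until at most k distinct values remain.
    while len(distinct_elements) > k:
      distinct_elements[arr[left]] -= 1
      if distinct_elements[arr[left]] == 0:
        del distinct_elements[arr[left]]
      left += 1

    # Every subarray that ends at right and starts in [left, right] is valid.
    count += right - left + 1
    right += 1

  return count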

Explanation:

  • The outer while loop moves the right pointer across the array, adding one element to the window per iteration.
  • The inner while loop runs only while the window holds more than k distinct elements, and moves the left pointer forward until the window is valid again.
  • Once the window is valid, count is increased by the window size, right - left + 1.

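Time Complexity: O(N) – The right pointer visits each element once, and the left pointer only ever moves forward, so each element is added to the window once and removed at most once. With average constant-time dictionary operations, the total work is linear even though the loops are nested.

Space Complexity: O(K) – The dictionary holds at most k + 1 keys at any moment (the extra key exists only until the inner loop shrinks the window), and never more than the number of distinct values in the array.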
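
The linear bound can be checked empirically by counting how often the left pointer moves. The sketch below is an instrumented variant written for this purpose (the name count_with_pointer_moves and the left_moves counter are additions for illustration, not part of the algorithm above). Since left never moves backward and never passes right + 1, the number of moves cannot exceed the array length.

```python
def count_with_pointer_moves(arr, k):
    """Return (count, left_moves) for the at-most-k sliding window."""
    freq = {}       # value -> occurrences inside the current window
    left = 0
    count = 0
    left_moves = 0  # how many times the left pointer advanced in total
    for right, value in enumerate(arr):
        freq[value] = freq.get(value, 0) + 1
        while len(freq) > k:
            freq[arr[left]] -= 1
            if freq[arr[left]] == 0:
                del freq[arr[left]]
            left += 1
            left_moves += 1
        count += right - left + 1
    return count, left_moves


print(count_with_pointer_moves([1, 2, 1, 3, 2, 1], 2))  # (12, 4)
```

Here the array has six elements and the left pointer moves four times in total, so the inner loop contributes at most N steps across the whole run.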

Example:

Let's demonstrate the algorithm with the array [1, 2, 1, 3, 2, 1] and k = 2.

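Each row shows the state for one value of right, after the window has been shrunk (if needed) and count has been updated.

left  right  distinct_elements  count  Window
0     0      {1: 1}             1      [1]
0     1      {1: 1, 2: 1}       3      [1, 2]
0     2      {1: 2, 2: 1}       6      [1, 2, 1]
2     3      {1: 1, 3: 1}       8      [1, 3]
3     4      {3: 1, 2: 1}       10     [3, 2]
4     5      {2: 1, 1: 1}       12     [2, 1]

At right = 3 the window briefly holds three distinct elements, so left advances twice (dropping 1, then 2) before the count is updated. The function returns 12.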
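
A trace of this kind is easy to get wrong by hand, so it can be worth generating it mechanically. The sketch below is one way to do that, assuming a hypothetical generator named trace_at_most_k that follows the same sliding window logic and yields the state once per value of right, after any shrinking has finished.

```python
def trace_at_most_k(arr, k):
    """Yield (left, right, frequency snapshot, running count) for each right."""
    freq = {}
    left = 0
    count = 0
    for right, value in enumerate(arr):
        freq[value] = freq.get(value, 0) + 1
        while len(freq) > k:
            freq[arr[left]] -= 1
            if freq[arr[left]] == 0:
                del freq[arr[left]]
            left += 1
        count += right - left + 1
        # Copy the dictionary so later steps do not alter earlier snapshots.
        yield left, right, dict(freq), count


arr = [1, 2, 1, 3, 2, 1]
for left, right, freq, count in trace_at_most_k(arr, 2):
    print(left, right, freq, count, arr[left:right + 1])
```

For this input the final running count is 12, which matches a direct enumeration of the valid subarrays.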

Key Takeaways:

  • The sliding window technique provides an efficient way to solve this problem, achieving linear time complexity.
  • The use of a dictionary for element frequency tracking helps manage the window's distinct element count.

Beyond the Basics

The fundamental concept of counting subarrays with specific constraints can be extended to more complex scenarios. Here are some possible variations:

  • Counting Subarrays with at Least K Distinct Elements: An array of length n has n(n + 1)/2 subarrays in total, so this count equals that total minus the number of subarrays with at most k - 1 distinct elements.
  • Counting Subarrays with Exactly K Distinct Elements: This is the count for at most k distinct elements minus the count for at most k - 1, so it can be computed with two runs of the same sliding window.
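
One common way to handle both variations is to reuse an at-most-k counter rather than redesign the window. The sketch below takes that route; the function names at_most, exactly, and at_least are hypothetical, and at_most is assumed to return 0 for a negative k so that the subtractions work at the boundaries.

```python
def at_most(arr, k):
    """Count subarrays with at most k distinct values (0 if k is negative)."""
    if k < 0:
        return 0
    freq = {}
    left = 0
    count = 0
    for right, value in enumerate(arr):
        freq[value] = freq.get(value, 0) + 1
        while len(freq) > k:
            freq[arr[left]] -= 1
            if freq[arr[left]] == 0:
                del freq[arr[left]]
            left += 1
        count += right - left + 1
    return count


def exactly(arr, k):
    """Subarrays with exactly k distinct values: at_most(k) - at_most(k - 1)."""
    return at_most(arr, k) - at_most(arr, k - 1)


def at_least(arr, k):
    """Subarrays with at least k distinct values: all subarrays minus at_most(k - 1)."""
    n = len(arr)
    return n * (n + 1) // 2 - at_most(arr, k - 1)


arr = [1, 2, 1, 3, 2, 1]
print(exactly(arr, 2))   # 6
print(at_least(arr, 2))  # 15
```

Both variations stay linear in the array length, since each one calls the at-most counter at most twice.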

Conclusion

The problem of counting subarrays with at most k distinct elements is a valuable exercise in understanding the power of the sliding window technique. This approach offers an efficient solution with linear time complexity, making it a practical and widely applicable method. By grasping the fundamentals and their extensions, you'll be well-equipped to tackle similar problems involving subarray analysis and element constraints.