37. Group Anagrams
Problem Statement
Given an array of strings strs
, group the anagrams together. You can return the answer in any order.
An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Example 1:
Input: strs = ["eat","tea","tan","ate","nat","bat"] Output: [["bat"],["nat","tan"],["ate","eat","tea"]]
Example 2:
Input: strs = [""] Output: [[""]]
Example 3:
Input: strs = ["a"] Output: [["a"]]
Solution
from collections import defaultdict
class Solution:
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
# Use a defaultdict where keys are sorted tuples of characters
# and values are lists of anagrams.
anagram_groups = defaultdict(list)
for s in strs:
# Sort the string to create a canonical key for anagrams
# e.g., "eat" -> "aet", "tea" -> "aet", "ate" -> "aet"
sorted_s = ''.join(sorted(s))
anagram_groups[sorted_s].append(s)
# Return the values (lists of anagrams) from the dictionary
return list(anagram_groups.values())
Explanation
The core idea to group anagrams is to find a canonical representation for each group of anagrams. Since anagrams contain the same characters with the same frequencies, sorting the characters of an anagram will always result in the same string.
-
Canonical Key: For each string in the input list
strs
, we sort its characters alphabetically. For example, "eat", "tea", and "ate" all become "aet" after sorting. This sorted string acts as a unique key for all its anagrams. -
Hash Map (DefaultDict): We use a hash map (specifically,
collections.defaultdict(list)
in Python) where:- The keys are the sorted strings (our canonical representation).
- The values are lists of the original strings that produce that sorted key.
-
Grouping: We iterate through the input
strs
. For each strings
, we sort it to get itssorted_s
key. We then append the original strings
to the list associated withsorted_s
in ouranagram_groups
dictionary. -
Result: Finally, we extract all the values from the
anagram_groups
dictionary. Each value is a list of strings that are anagrams of each other.
The time complexity is O(N * K log K), where N is the number of strings in strs
and K is the maximum length of a string. This is because for each of the N strings, we sort it, which takes K log K time. The space complexity is O(N * K) in the worst case, where all strings are unique and stored in the dictionary.