Tries
December 16, 2024 | Data Structures
A trie (pronounced “try”), also known as a prefix tree, is a tree-like data structure that stores strings in a way that facilitates fast retrieval. Tries are widely used in autocomplete systems, dictionaries, and search engines.
What is a Trie?
A trie organizes strings so that common prefixes are stored once, saving space and allowing for efficient operations. Each node in a trie represents a single character, and the path from the root to a node forms a prefix of the strings stored in the trie.
Example Structure
Consider storing the words “cat,” “cap,” and “dog”:
(root)
/ \
c d
/ \ \
a o o
/ \ \
t g g
/ \
(p) (end)
Properties of a Trie
- Root: Represents the starting point (empty string).
- Edges: Represent characters in the string.
- Nodes: Store characters and indicate if they form a complete word.
Operations on a Trie
1. Insertion
To insert a word, start at the root and create new nodes for characters not already in the trie.
class TrieNode:
def __init__(self):
self.children = {}
self.is_end_of_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
current = self.root
for char in word:
if char not in current.children:
current.children[char] = TrieNode()
current = current.children[char]
current.is_end_of_word = True
# Example usage:
trie = Trie()
trie.insert("cat")
trie.insert("cap")
2. Search
To search for a word, traverse the trie following the characters of the word. If the path exists and ends at a complete word, the search is successful.
def search(self, word):
current = self.root
for char in word:
if char not in current.children:
return False
current = current.children[char]
return current.is_end_of_word
# Example usage:
print(trie.search("cat")) # Output: True
print(trie.search("cap")) # Output: True
print(trie.search("can")) # Output: False
3. Starts With
To check if a prefix exists, traverse the trie following the characters of the prefix.
def starts_with(self, prefix):
current = self.root
for char in prefix:
if char not in current.children:
return False
current = current.children[char]
return True
# Example usage:
print(trie.starts_with("ca")) # Output: True
print(trie.starts_with("do")) # Output: True
print(trie.starts_with("de")) # Output: False
Applications of Tries
- Autocomplete: Suggesting words based on a prefix.
- Spell Checkers: Verifying if a word exists in a dictionary.
- IP Routing: Storing routing tables in networking.
- Genome Sequencing: Searching for DNA subsequences efficiently.
- Data Compression: Using tries to represent strings compactly.
Advantages and Disadvantages
Advantages
- Fast Search: Time complexity for search and insertion is (O(L)), where (L) is the length of the word.
- Efficient Prefix Matching: Ideal for operations that involve prefixes.
- Space Optimization: Common prefixes are stored once.
Disadvantages
- Memory Usage: Tries can use significant memory for sparse datasets.
- Complexity: Implementation is more intricate compared to simpler data structures like arrays or hash maps.