Blog Con X

Tries

December 16, 2024 | Data Structures

A trie (pronounced “try”), also known as a prefix tree, is a tree-like data structure that stores strings in a way that facilitates fast retrieval. Tries are widely used in autocomplete systems, dictionaries, and search engines.

What is a Trie?

A trie organizes strings so that common prefixes are stored once, saving space and allowing for efficient operations. Each node in a trie represents a single character, and the path from the root to a node forms a prefix of the strings stored in the trie.

Example Structure

Consider storing the words “cat,” “cap,” and “dog”:

       (root)
       /   \
      c     d
     / \      \
    a   o      o
   /     \       \
  t       g       g
 / \
(p) (end)

Properties of a Trie

  • Root: Represents the starting point (empty string).
  • Edges: Represent characters in the string.
  • Nodes: Store characters and indicate if they form a complete word.

Operations on a Trie

1. Insertion

To insert a word, start at the root and create new nodes for characters not already in the trie.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end_of_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        current = self.root
        for char in word:
            if char not in current.children:
                current.children[char] = TrieNode()
            current = current.children[char]
        current.is_end_of_word = True

# Example usage:
trie = Trie()
trie.insert("cat")
trie.insert("cap")

To search for a word, traverse the trie following the characters of the word. If the path exists and ends at a complete word, the search is successful.

def search(self, word):
    current = self.root
    for char in word:
        if char not in current.children:
            return False
        current = current.children[char]
    return current.is_end_of_word

# Example usage:
print(trie.search("cat"))  # Output: True
print(trie.search("cap"))  # Output: True
print(trie.search("can"))  # Output: False

3. Starts With

To check if a prefix exists, traverse the trie following the characters of the prefix.

def starts_with(self, prefix):
    current = self.root
    for char in prefix:
        if char not in current.children:
            return False
        current = current.children[char]
    return True

# Example usage:
print(trie.starts_with("ca"))  # Output: True
print(trie.starts_with("do"))  # Output: True
print(trie.starts_with("de"))  # Output: False

Applications of Tries

  1. Autocomplete: Suggesting words based on a prefix.
  2. Spell Checkers: Verifying if a word exists in a dictionary.
  3. IP Routing: Storing routing tables in networking.
  4. Genome Sequencing: Searching for DNA subsequences efficiently.
  5. Data Compression: Using tries to represent strings compactly.

Advantages and Disadvantages

Advantages

  • Fast Search: Time complexity for search and insertion is (O(L)), where (L) is the length of the word.
  • Efficient Prefix Matching: Ideal for operations that involve prefixes.
  • Space Optimization: Common prefixes are stored once.

Disadvantages

  • Memory Usage: Tries can use significant memory for sparse datasets.
  • Complexity: Implementation is more intricate compared to simpler data structures like arrays or hash maps.