Tries - n-ary Prefix Tree
Trie (pronounced as "try", derived from "retrieval") is a data-structure based on n-ary tree and is a specialized form of prefix trees. In this article, we will first see what is a trie and then we will discuss about the dictionary applications of tries.
* An Example
The best way to understand tries is to work out with an example. Lets us take few records and work out a way to store them in the prefix tree:
Record Data | Key
Mango | 123456
Banyan | 234568
Pine | 239568
Bamboo | 456789
Palm | 123789
The basic idea behind tries is that of digitizing the given keys and creating tree nodes based on a particular radix. Unlike normal trees, we dont store any data in the inner nodes. Instead, only the leaf node stores the whole string. The value for the inner nodes is derived from the ordering of the respective child node.
Let us choose a radix of 10 for the given example. Based on this radix, we have to deal with normal decimal numbers and thus we have 10 different digits 0-9. In other words, we will create a tree where each node can have upto 10 children. We will examine the digits from left to right and create different child nodes and edges based on the numerical key. Below is the generated n-ary tree :
* searching a Trie
To search a key in a trie, we start from the root and traverse the tree using the individual of numerical key as the path to navigate. For-example to search for the key 234568, we start from the root and the digit '2', we take the child node with index 2 and move forward to that child node. Now, from this node we will traverse the child node with index '3' as the next digit in the key is 3 and keep on moving to next child node till we reach the destination node which is a leaf '234568'. Go to that node and do a full comparison of the data of the leaft node with the input key.
Complexity of Search : We can always assume that we can get the next digit of the key in O(1), and thus the complexity to search a d-digit word, the complexity comes out to be O(d)
Note: We can also create nodes with variable length child nodes by adding a special character in the given array of child nodes.
* comparison with other data-structures
Comparison with BST(binary search Tree):
- Faster look-up for tries O(d) V/s O(log n) for BST
- tries are more space efficient if there are large number of short keys with common prefix.
- there is no need to take care of balancing in tries as we already have a limit on the number of child nodes
Comparison with Hash Tables
- tries are faster as they dont have to calculate complex hash functions
- Tries are faster on average while inserting data because at times when hash table gets full, they have to rebuilt the indexes
- Tries have the feature of "longest prefix match", which acts like the "closest match" for a given key, which is not possible with hash tables
* Applications of Tries
- Dictionary :
most common usage of tries is in the implementation of "mobile dictionary". Tries are used because of its ability to quickly search, add, delete words. It can be used to provide features like spell checks in dictionary.
- Radix Sort :
tries can be used for lexico-graphic sorting of keys. Simply insert all the keys in the tree and do a pre-order traversal to get the list of sorted words.
- Longest Prefix
tries can be used to implement longest preix match algorithms. By design, tries have the structure that facilitates to find the longest prefix that matches the given key. This algorithm forms the basis of LZW Compression algorithm.
- AutoCompletion Feature
Some times while typing the word "te" in a text editor, you would have seen that the editor supplies with you all the list of possible words that can be completed out of the given initial letters e.g. "tea", "ten" and "ted". This feature is called auto-completion. Let us see how tries can help us in this situation:
Suppose we have to autoComplete the initials "te", we can use the above given prefix tree to help us out. Traversing through the trie for "te", we will reach the node with three different leaf nodes having all the possible values which we can display to the user as an output for autocompletion.
Tries Implementation is fairly involved and can easily be found in standard text materials. Hereby, I have limited my self to just providing an over-view of the tries. If you need detailed explanation of any of the terms / topic realted to Tries, Please Leave a comment.