trying to make fast tries

One of the word games I really enjoy playing is Boggle. I enjoy it so much that I decided to build myself a training program for it. Boggle is a word game where you try to locate words on a 4x4 board of letters. You are then scored based on how many words you were able to find that the other players couldn’t. The longer the word, the more points you get.

For my boggle training program, the purpose is to simulate a boggle board and practice finding words using a boggle board. Once the timer is up, all the words on the board should be displayed so that one can see which words are missed.

Finding a single word on a boggle board is easy enough: from each possible square, check if you can make the potential word.

def traverse(square, word, visited):
    if square in visited:
        return

    if word == "":
        return True

    x, y = square

    # special handling for Q to act as a Qu
    if layout[x][y].upper() == 'Q':
        if word.upper().find('QU') != 0:
            return
        word = word[1:]
    elif layout[x][y].upper() != word[0].upper():
        return

    visited[square] = 1

    for i in [-1, 0, 1]:
        for j in [-1, 0, 1]:
            if i == 0 and j == 0:
                continue
            if x + i < 0 or x + i >= BOARD_SIZE:
                continue
            if y + j < 0 or y + j >= BOARD_SIZE:
                continue

            if traverse((x + i, y + j), word[1:], visited):
                return True


    del visited[square]

def find_word(line):
    for i in xrange(BOARD_SIZE):
        for j in xrange(BOARD_SIZE):
            if traverse((i, j), line, {}):
                return (i, j)

Finding all possible words on a board is more difficult. One option is to try and locate each word from the dictionary, but that can be a little slow.

def find_all_words_slow():
    found = []
    for word in words.OSPD:
        if find_word(word):
            found.append(word)

    return found

Using this method, it took about 3 seconds to find all words on the board. This is mostly because we had to do 16 DFS traversals for each word in the scrabble dictionary. This is pretty unacceptably slow for use in a program, though - who has time to wait 3 seconds after finishing their game?

The more traditional solution in Boggle is to use a trie and do a DFS from each starting square, stopping if the current DFS doesn’t lead to any words in the trie. To that end, I spent time implementing a trie in python and optimizing its construction.

class TrieNode:
    def __init__(self, v):
        self.value = v
        self.children = {}
        self.isword = False

    def insert_word(self, word):
        root = self        
        for c in word:
            if not c in root.children:
                root.children[c] = TrieNode(c)
            root = root.children[c]

        root.isword = True

def make_trie(words):
    root = TrieNode("")
    for word in words:
        root.insert_word(word)

    return root

The first attempt used classes to represent the trie and took about 1.9 seconds to build. Once the trie is built, it takes 50 - 100ms to find all the words on the board.

1.9 seconds is still pretty slow, though. My first thought for how to speed this up was: what about serializing the trie? Using pickle, it turns out that this trie takes about 30MB to store all the words in the scrabble dictionary. For context, the dictionary itself only takes 1MB. Loading the 30MB pickle file and restoring it back into memory actually took longer than building the trie in the first place.

The next idea was: what if the instantiation of the TrieNodes is taking all the time? I then switched to using plain dictionaries to represent the trie and the build time went from 1.9 seconds to 1.2 seconds:

class DictTrie(dict):
    def insert(self, words):
        for word in words:
            if not word:
                continue

            root = self
            for c in word:
                if not c in root:
                    root[c] = {}

                root = root[c]
                
            root["__word__"] = True

def make_trie(words):
    root = DictTrie()
    print "BUILDING TRIE", len(words)
    root.insert(words)

    print "BUILT TRIE TOOK", time.time() - start, "SECONDS"
    return root

Thinking I hit the limits of dictionaries, I decided to try out a python library that implemented tries in C++ called datrie. After some experimentation, I realized it was slower than using python dictionaries and gave up on that idea.

After thinking for a little while, another optimization I made was the following: sort the words in the dictionary, then when inserting a new word, if it was a parent of the previous word, we would use the previous word’s last node and start from there. For example, if we insert “cat” and then “category”, we can start “category” from the end of “cat” and just insert “egory”. This actually ended up working really well and further brought the time down from 1.2 seconds to about 700ms.

class DictTrie(dict):
    def insert(self, words):
        words.sort()
        prev = "_"
        for word in words:
            if word.find(prev) == 0:
                i = len(prev)
            else:
                i = 0
                root = self

            for x in xrange(i, len(word)):
                c = word[x]
                if not c in root:
                    root[c] = {}

                root = root[c]
                
            root["__word__"] = True
            prev = word

I also noticed that the longest word on a boggle board could only be 16 letters, which allows us to trim the word list to words that are 16 letters or less. I did some analysis and ran thousands of games to see how often 10 - 16 letter words come up and after experimenting a little, I was able to shave another 300ms off the trie build time by only inserting words at 12 letters or less. I decided that the likelihood I would actually catch any of those 12 letter words is microscopic.

okay