I haven't done any real work on learning Javascript and D3.js since my last attempt a couple months back. To keep at it, I thought I'd try using D3.js to visualize a simple algorithm: finding the largest couple of items in a list.

This problem comes up all the time when doing search and recommendation type tasks. Every time you query a search engine, it has to find the couple best scored results in all matching items. For example, Google finds 15 million results when querying for 'D3.js', but only shows you the 10 best scored of these. A naive solution for finding these 10 items would be to sort everything by score, but that ends up wasting a ton of time sorting results that will be discarded.

A better solution is to use a min-heap - a tree data structure where each node in the tree has a value smaller than all of its children. Its a fantastically useful data structure, that can be used to efficiently solve this problem. By comparing each item with the root element of an appropriately sized min-heap, and pushing onto the heap when its bigger - it picks out the just the largest items:

Note: you need Javascript enabled and a SVG compatible browser to view the diagrams here!

Its looking like you're missing javascript, or are running an old web browser that doesn't support SVG. If no diagrams show up then you might want to consider loading this up in a different browser. I've tested this with Chrome, Safari and Firefox only.

The items here are scrolling across the top, and one by one get compared to the root element of the heap. If the current number is bigger, it gets recursively swapped with each smaller element in the heap. After running through all the items, the heap will contain just the largest numbers.

The Algorithm

Heap operations are included in any language that has even a half assed standard library. In Python theres the heapq module, Java has java.util.PriorityQueue class, even C++ has heap operations in the algorithm header.

Unfortunately Javascript basically doesn't have a standard library, so we have to roll our own here. Its a pretty easy though, all thats needed is a function to restore the heap order on a heap that has had its root element modified:

/// restores the heap property on a min-heap that has had the root element
/// modified
function reheap(heap) {
    var parent = 0, end = heap.length, child = 1;
    // recurse down the minheap, swapping entries as needed to
    // keep the heap property
    while (child < end) {
        // get the index of the smallest child node
        var right = child + 1;
        if ((right < end) && (heap[right] < heap[child])) {
            child = right;
        }
    
        // if the parent is less than both children, we're done here
        if (heap[parent] < heap[child]) break;
           
        // otherwise swap, and restore heap property on child node
        var temp = heap[child];
        heap[child] = heap[parent];
        heap[parent] = temp;

        parent = child;
        child = 2 * parent + 1;
    }
}

To find the biggest items, just compare each item to the root of the heap, and swap and call reheap if its bigger.

Displaying in D3

Setting up the visualization in D3 was relatively straightforward, its just a simple state machine - with D3 handling animating the transitions between states. Because of this approach its a little wacky in that hitting 'pause' won't actually do anything until the next state is hit - but at decent speeds that delay isn't too long.

I've put the code up on github for people to take a look at if they want. I'm still pretty terrible at Javascript, but I'm having fun playing around with it anyways.


Published on 10 October 2013