This page was put together very well. It has interactive illustrations when needed (not excessive), and the explanations were informative yet concise. I also like how it brings up other uses of quadtrees, such as for images. This encouraged me to think about how they might be used elsewhere.
I used quadtrees in a janky fractal compression scheme: start big and only move small if a feature couldn't be represented in the larger space. It kinda worked. And then you add motion into the mix with octrees. Love this stuff.
I was thinking of building an interactive visualization of mountain prominence, by progressing down the contour lines until the current contour line encircles a peak taller than the one I started with.
I think quadtrees will come in handy in this visualization, if a precomputed list of all the peaks is available.
Why would someone select "quad" trees in particular, instead of binary splitting at each level (in alternating dimensions; something like a K-D tree)? I.e., what are the tradeoffs? The article briefly mentions K-D trees at the very end, but doesn't elaborate on differences:
> The quadtree is the two-dimensional case of a broader family of space-partitioning data structures. Octrees extend the same idea to three dimensions (splitting cubes into eight children), KD-trees use alternating axis-aligned splits (splitting along x, then y, then x again), and R-trees group nearby objects into bounding rectangles. Each variant makes different tradeoffs between construction time, query speed, and update cost.
Finally, I'll add: the presentation is very high quality and served as a great introduction to the concept.
Not commenting on the quadtree specifically, but I know a lot of the reasoning that goes into non-binary tree structures comes down to performance gains from exploiting cache line size. When each node lookup in the tree triggers a 64-byte load from L1, it's often much faster to have 4 or 8 child pointers in that cache line rather than two, if it means you do on average half or a quarter as many traversals.
KD-trees select their splits according to the contained points. That tends to make them better for static sets of points, but updates become expensive. Quadtrees are often used in e.g. physics engines.
A very famous application of QuadTrees was Bill Gosper's HashLife algorithm for computing Conway's Game of Life. The Life universe is implemented as a quadtree, taking advantage of precomputed smaller squares to compute larger squares.
Nice and concise description of quadtrees implemented with classical pointer chasing data structures.
A faster and (arguably) simpler way to construct quad/octrees is using Morton codes, sorting, and searching with flat arrays [0]. It's also probably easier to implement.
The gist of it is that you quantize the coordinates, interleave the bits to construct the Morton code, and then sort. The sort uses numerical keys, so you can use a radix sort for O(n) complexity, which is much faster on a GPU (but on a single CPU core a comparison-based sort will probably win if the array isn't huge).
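For concreteness, here's a minimal Python sketch of the quantize-interleave-sort step (the function names and the 16-bit-per-axis quantization are my own choices, not from the post):

```python
def interleave2(x: int) -> int:
    """Spread the low 16 bits of x into the even bit positions."""
    x &= 0xFFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x

def morton2(x: int, y: int) -> int:
    """2D Morton code with yxyx... bit order: y bits land in the
    odd positions, x bits in the even positions."""
    return (interleave2(y) << 1) | interleave2(x)

# Already-quantized grid coordinates; encode, then sort by code.
points = [(3, 5), (10, 2), (1, 1)]
codes = sorted(morton2(x, y) for x, y in points)
```

Python's `sorted` stands in for the radix sort here; on a GPU or for large arrays you'd swap in a radix sort keyed on the integer codes.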
Now everything in the top half of the space is at the beginning of the array and the bottom half is at the end (assuming yxyxyxyx Morton code bits). Each of those halves is then split left/right, and each level below that alternates between vertical and horizontal splits.
To find the split point in the array, take the first and last entries of the (sub)array you're working on and find the first differing bit with (first ^ last).leading_zeros(). Then binary search for the first entry where that bit is high.
To traverse the quad/octree, repeat this process with the two halves you found. This can be done without recursion in O(1) memory using a fixed size stack because you know the depth of the "recursion" is at most half the number of bits in the morton code.
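The split-finding and stack-based traversal might look like this in Python (a sketch under my own naming; `bit_length` plays the role of `leading_zeros` here):

```python
def split_point(codes: list, lo: int, hi: int) -> int:
    """codes[lo:hi] is a sorted run of Morton codes. Return the index
    of the first entry whose highest differing bit (vs. codes[lo]) is
    set, or hi if all codes in the run are identical."""
    first, last = codes[lo], codes[hi - 1]
    if first == last:
        return hi
    # Highest bit where first and last differ.
    mask = 1 << ((first ^ last).bit_length() - 1)
    a, b = lo, hi
    while a < b:  # binary search for the first entry with that bit high
        mid = (a + b) // 2
        if codes[mid] & mask:
            b = mid
        else:
            a = mid + 1
    return a

def traverse(codes: list) -> list:
    """Visit every (lo, hi) node range iteratively, no recursion."""
    stack = [(0, len(codes))]  # fixed-size stack would do: depth is bounded
    nodes = []
    while stack:
        lo, hi = stack.pop()
        nodes.append((lo, hi))
        if hi - lo > 1:
            mid = split_point(codes, lo, hi)
            if lo < mid < hi:
                stack.append((mid, hi))
                stack.append((lo, mid))
    return nodes
```

Since each level consumes at least one bit of the code, the stack depth is bounded by the code width, which is what makes the O(1)-memory claim work.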
If you used radix sorting for building the array, you can avoid the binary search if you store the histograms from the counting phase of the sorting. Storing the whole histogram may be too much, but just a few highest order bits can already help.
Although I've found experimentally that for small inputs (less than 10,000 objects) just sorting and searching is faster if the whole array fits in the L2 cache. On my 2015 laptop a single core can sort 10k objects with 64-bit keys in 1 millisecond. Traversing the tree for frustum culling is about 5x faster than just going through the entire array, because a lot of geometry can be discarded very quickly.
With a good comparison based sort rebuilding the array after some changes is mighty fast because modern sorting algorithms are much faster for "almost sorted" inputs.
For range queries ("Find everything in the region" in the article), you can probably get better performance by using the BIGMIN/LITMAX method [1].
Now here's a brain teaser to delight your Friday: the article and the method I describe above are for storing points/centroids, but often you need to store axis-aligned boxes (AABBs) instead. There is a very clever trick for this that I discovered independently but later found in some research papers. Can you come up with a (very) small change in the algorithm above to extend it to AABBs instead of centroids?
Discretize the AABB with a sample length proportional to the Morton code grid size so that each sample point lands in a distinct but contiguous sequence of Morton codes enclosing the AABB, compute the code of each of those points, dump all of those Morton codes into the sort, and reverse-look-up the other AABBs with the same Morton code :)
The problem comes at scale, when you have many AABBs with different sizes. Memory pressure can be punishing and you are bottlenecked by the network for distributed solutions. Is there a faster way with fewer points? I would love to know!
For each AABB, you take the morton code of the minimum and maximum and store them in the array.
But for sorting and searching, construct a sort key from the pair of morton codes so that the most significant bits are the common prefix of the two morton codes and the least significant bits are the "level" (length of common prefix in bits).
E.g. for a 3d octree with 64-bit keys, you'd have 1 unused bit, 57 = 19 * 3 bits of Morton code prefix and 6 bits of level. A 2d quadtree with 32-bit keys would have 1 unused bit, 26 = 13 * 2 bits of prefix and 5 bits of level.
You could also store just the sort key, but I've stored both AABB morton codes so that I can reconstruct the quantized AABBs for culling.
When you sort by this key, you'll get a linear octree where the beginning of the array is all the AABBs that overlap the subdivision plane, then everything on the left side of the plane and the rest is right side of the plane. When traversing the tree, you need two searches per node to find the entries belonging to this node and to find the split point between the left and the right side.
Having the "level" in the least significant bits works as a tiebreaker in the sorting to ensure that entries closer to the root of the tree are in the beginning of the array.
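A small Python sketch of that sort-key construction (the function name and the toy 4-bit/3-bit layout in the example are mine; the real layout would be the 26+5-bit split described above):

```python
def aabb_key(code_min: int, code_max: int,
             code_bits: int = 26, level_bits: int = 5) -> int:
    """Sort key for an AABB stored as the Morton codes of its min and
    max corners: the common prefix of the two codes goes in the high
    bits, the prefix length ("level") in the low bits."""
    diff = code_min ^ code_max
    # Length of the common prefix, in bits.
    level = code_bits if diff == 0 else code_bits - diff.bit_length()
    # Zero out everything below the common prefix.
    prefix = code_min & ~((1 << (code_bits - level)) - 1)
    return (prefix << level_bits) | level
```

An AABB straddling the root split plane shares no prefix bits, so its prefix is all zeros and it sorts to the very front of the array, exactly the "closer to the root comes first" ordering described above.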
I wrote about this solution vaguely the last time this link came up (wasn't that long ago either, I think?) and you've filled in all the details and the essential Karras paper link; great post :)
Consider also looking into R-trees [1], which are like a balanced quadtree and are commonly used for indexing more complex spatial data (e.g. polygons/areas).
This looks rather interesting. I implemented a quadtree as part of writing a radiative transfer code during my master's, using numpy/numba. It wasn't fun at all, but I learnt a lot. Seeing someone try quadtrees in Python refreshed those memories.
I guess I shouldn't be snarky; I appreciate that the browser vendors claim they are working on getting all major browsers to behave the same. That said, I feel like they might be concentrating on the wrong areas, when there are some super core areas that still need to be covered.
Half a year ago I made a tiny "draw with pointer" demo. I tested it on desktop Chrome/Safari/Firefox and on iOS. It turned out it didn't work on Android, and I don't have an Android device to test on. But it was literally the most basic pointer-event code; I wasn't doing anything out of the ordinary or near any edge cases.