Neat idea! It's using the same generalized strategy as a Splay Tree. Not exactly the same of course, but there's an analogy here. The algorithm achieves fast amortized (and average) time by continually shortening the "search paths." (In the case of the RH hash, it's not actually a path, but a sequence of slots.)
Interesting observation - I haven't thought about splay trees for a while, but I had a similar initial thought: that this was effectively an MRU optimization. On reading further, though, that's not the point of the RH hash, which is reducing variance in the probe length. The splay operation comes with a lot more overhead than a swap, and any MRU-specific benefits with RH hashing are going to be averaged out over time.
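For anyone who hasn't seen the swap rule: here's a minimal sketch of Robin Hood insertion (my own toy layout - linear probing, fixed capacity, no deletion or resizing; all names are made up). The one RH-specific line is the "poorer entry steals the slot" comparison, which is what caps probe-length variance:

```python
EMPTY = object()  # sentinel for an unused slot

def probe_distance(table, key, slot):
    """How far `key` currently sits from its ideal slot (its 'wealth')."""
    return (slot - hash(key)) % len(table)

def rh_insert(table, key, value):
    slot = hash(key) % len(table)
    dist = 0  # how far the incoming entry has probed so far
    while True:
        if table[slot] is EMPTY:
            table[slot] = (key, value)
            return
        resident_key, resident_val = table[slot]
        if resident_key == key:
            table[slot] = (key, value)  # update existing key in place
            return
        # Robin Hood rule: if the incoming entry has probed farther than
        # the resident, swap them and keep probing with the displaced
        # resident. A plain swap, not a splay - much cheaper.
        if dist > probe_distance(table, resident_key, slot):
            table[slot], (key, value) = (key, value), (resident_key, resident_val)
            dist = probe_distance(table, key, slot)
        slot = (slot + 1) % len(table)
        dist += 1

def rh_get(table, key):
    slot = hash(key) % len(table)
    for _ in range(len(table)):
        entry = table[slot]
        if entry is EMPTY:  # valid early exit only because we never delete
            return None
        if entry[0] == key:
            return entry[1]
        slot = (slot + 1) % len(table)
    return None
```

Note that nothing here remembers recency - the swap is driven purely by probe distance, which is why the MRU framing doesn't quite fit.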
Found the CMU grad! I think this is very different from a splay tree, though. Open-address tables take advantage of the cache by removing linked lists from the data structure, which is why I think they tend to have good real-world performance.
Splay trees, on the other hand, can only really be implemented using pointers and a linked structure (splaying a tree that's been implemented with an array sounds absolutely miserable). They don't have good cache usage, which is why they generally perform poorly outside of big-O notation.
Also, splay trees turn every read into several memory writes. Maybe not a good idea if the data structure is read by multiple threads (cacheline ownership bounces back and forth all the time, rather than each core keeping its own cached copy).
Binary Trees are a bad idea from the POV of cache use, and by your analysis, Splay Trees are much worse. However, that has nothing to do with the generalized meta-strategy of continually reducing the amount of work it takes to fetch items; it's an argument about why Binary Trees specifically have poor cache behavior.
> I think this is very different from a splay tree, though. Open-address tables take advantage of the cache
Well, duh! That's why I said it's analogous. Of course something built on an array is likely going to have better cache behavior! My point was about the generalized meta-strategy: both structures spend amortized work to shorten the average amount of work it takes to retrieve your data. Lose a point for poor reading comprehension because you wanted so badly to correct someone.