OK, let's roll with that: you have a better idea for an in-memory data structure...

Retric · on May 2, 2012

A Hybrid data structure. How many DRAM cycles does it save? Depends on your data and access patterns, 10-20 is reasonable, but it also saves processing time etc.

EX: Let's say you want to store 1/2 to 100 million GUID's. They should have a random distribution so you don't have to go crazy just start with a lookup table for the first 16 bits (sized to fit into L2 cache) and a radix tree on the leaf nodes. http://en.wikipedia.org/wiki/Radix_tree or a AVL-tree or whatever you want it's just not that important because these small trees are unlikely be cached. You may want to use a mid range of lockup tables to cut down on DRAM round trips, but again that's the type of micro optimization that starts to get really specific to your usage patterns and the amount of memory on your system etc.

But wait, what if your using time-stamps? The first N digits are going to be the same right? Again, knowing the DRAM timings is pointless verify its in the range you want, mask some middle bits, use that as an index... etc. Or, if it's a pure lookup and it's ok to give up ordering you can flip the bit order to have a fairly random distribution. (Again a micro optimization based on data usage patterns but it may help out.)

And, here is the thing if your storing the data on a HDD the same idea is still useful, just use a larger index in RAM. The important things is relative latency's and storage space not specific timings.

PS: Still, this assumes speed is really important. Generally speaking there are far more important things to worry about like keeping all your data in RAM in the first place, space efficiency can be more important than access patterns. And you might be better creating a batch of query's and ordering them to be more efficient vs changing how your data is stored.

timtadh · on May 2, 2012

"you have a better idea for an in-memory data structure"

The problem with this question is (and your original example) is it entirely contrived. A better in-memory data structure for what? String indexings? Integers keys? Complex objects? Metric data? Region Data? What is your query load? Range queries? Exact queries? Substring queries? K-edits queries? Nearest Neighbors? etc....

Before worrying about DRAM performance on the particular machine you may be running your code on for a year or two in production you need to answer those questions. The performance of your system will depend far more on your understanding of the work load (read vs write), object type and query types than on the data structures understanding of the DRAM. That is the last optimisation you make, after you have done everything else because it is machine specific.

ajross · on May 2, 2012

That logic makes no sense to me. By that definition any focused question is "entirely contrived". So I'll throw it out to you: pick one. Come up with any large in-memory data store (any structure, any stored data, any indexing) and give me an answer about its performance that doesn't require knowledge of how DRAM works.

(Edit: signing off this ridiculous flame war. We're now in full circular logic mode: examples showing the utility of DRAM knowledge are contrived because DRAM knowlege is never useful. Sigh. I'll just point out that, because I know how this stuff works, you are wrong. You can do your jobs without being a good hacker, but you'll never be a great hacker if you don't know how your machinery works.)

barrkel · on May 2, 2012

To a first approximation, all focused questions that rely on knowing DRAM timing details are entirely contrived. That's because almost no programmers need to focus on performance on hardware that specific, outside of people targeting mass produced identical hardware, like consoles.

Trying to predict performance from first principles is a mug's game in any case; there will be things you didn't expect. It's far better to measure.

shabble · on May 3, 2012

Wouldn't you rather have at least an approximate map showing the more swampy areas on your random walk up Performance Mountain (or is it down into Performance Valley? The sight of all these top-rooted trees is confusing me)?

barrkel · on May 3, 2012

I'm criticizing basing your architectural decisions on things as low-level as specific DRAM timings, unless perhaps you're targeting a very focused, extremely mass market niche, like a console.

I never said that one must be ignorant of factors that can affect performance when measuring. Rather, don't try and predict from first principles without measuring.

cbsmith · on May 3, 2012

So, by this logic, when writing code gor a variety of CPU/memory/storage/GPU platforms, there really isn't much point in worrying about performance at all?

Good luck with that interview at Oracle/Google/Facebook/Twitter/Activision/Adobe/Microsoft/AnyoneWhoKnowsAnythingAboutSoftware....

barrkel · on May 3, 2012

I never said there's not much point worrying about performance. What I said is that working from first principles at the level of detail of DRAM refresh timings is likely to be less productive than measuring different approaches based on higher-level understanding, because there will likely be more low-level details than you expected. But with higher level stuff, like the basics of CPU caching / associativity / risks of cache sloshing etc., you can make some reasonable design decisions - as long as you measure and experiment, continuously, throughout the process.

The window within which caching effects occur varies from machine to machine - specifically, certain access patterns fall off a cliff once you go beyond the window - but with appropriate design, you can take reliable advantage of caching behaviour without needing to target a specific cache size.

cbsmith · on May 3, 2012

I don't think the question was looking for a precise answer, but rather an approximate one. As he said, most programmers don't even get in the ballpark, and you'll note his question didn't mention cache or cache size. In fact, the way the question is phrased, you could answer for a system that had no cache whatsoever if you wanted to, though I'd go with some common cache architecture.