> Relatude.DB is an open-source, C#-native database engine designed to provide a unified storage solution with everything you need to build the backend for your web applications.
Seems ambitious...
> The underlying storage is in its default configuration a binary append-only log file with an in-memory index system.
Wow that won't scale.
> The file format is binary and values are casted as best possible to the current schema on read.
That will lead to unpredictable behavior after a lot of schema changes.
---
I'd like to know more about the people making this database: Specifically, databases are hard to implement. What kind of experience do the people making this database have that demonstrates their experience in this kind of area?
Well, we are a Norwegian company that has been working on CMS systems for 25 years. This is the data layer for our CMS, which is also a general-purpose framework for building web applications.
We have several live products using this database, including mobile apps.
This is, of course, not a competitor to established databases like Postgres, MS SQL Server, etc. It is an attempt at an "all-in-one system" for most .NET web applications: a simpler alternative to the conventional stack, where you often have to combine several systems to build the data layer you need. It has a built-in ORM, image scaler, vector index, BM25 text search, etc.
It is all you need for "most" projects, in one NuGet package...
(We are still developing it every day, and do not consider it released yet.)
Regarding scale, it will currently not scale to "Big Data" territory; that was never the intention. It is about saving you time as a developer on the "normal" projects where a limit of, let's say, 10 million objects is OK. Having said that, it is still early days, and performance has so far been very good, comparable to other more dedicated storage solutions.
For me, for day-to-day usage, I sit at a desk and use an external keyboard. The ergonomics of working directly on a laptop aren't very good for extended periods of time.
Case in point: I don't put a lot of stress on my laptop keyboards.
The purchasing power of $3.05 in 1914 equals that of $100 today: a bag of "stuff" worth $100 today could have been purchased for $3.05 in 1914.
Technically, it does mean that $3.05 from 1914 is worth $100 today, but that's not a useful way of thinking about this. I.e., if your great-grandfather put $3.05 in an envelope in 1914 and you opened it today, it's still $3.05 worth of money (ignoring wheat pennies being collector's items and whatnot).
I'm pretty sure you are right. Or to emphasize the devaluation, "$100 today would be worth only $3.05 in 1914."
I think it is astonishing that we accept that in a best case scenario of sustained 2% inflation, we are literally planning for the value of the dollar to be cut in half every 36 years.
>I think it is astonishing that we accept that in a best case scenario of sustained 2% inflation, we are literally planning for the value of the dollar to be cut in half every 36 years.
Our system is designed to encourage asset ownership, not cash saving. If you stuff it under a mattress for 36 years, yeah you'll get fleeced. But buying assets is the way to keep up; an investment of $100 in the S&P500 in 1990 and never touched would be worth $4,120.93 today.
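The 36-year figure above is just the rule of 72 (72 / 2 ≈ 36); the exact halving time at a steady 2% inflation rate is a one-liner:

```java
public class Halving {
    public static void main(String[] args) {
        double rate = 0.02; // assumed steady 2% annual inflation
        // Purchasing power halves when (1 + rate)^years == 2.
        double years = Math.log(2) / Math.log(1 + rate);
        System.out.printf("%.1f%n", years); // ≈ 35.0 years; rule of 72 estimates 36
    }
}
```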
"worth" can have two meanings in this context. $100 from 1914 can be worth exactly $100 today. Or it can be worth what you can buy with it.
Some folks will see a $100 bill from the era and see an old $100 bill. Some folks will imagine what that $100 took to save back then, and what it bought.
FWIW my brain automatically went with "the goods that can be bought with $100" - such as what I could buy in a grocery store today with $100 would be about what I could buy with $3 back then.
I never considered the other reading until this thread. It was obvious to me the author meant "you can buy 97% less stuff today with the same $100".
I think it's used to convey that the buying power has been reduced. If you have a $100 basket of goods (as measured in 1914 dollars), $100 in 1914 allows you to buy 1 basket of goods. Due to the devaluation, today spending $100 would only give you a $3.05 basket of goods (as measured in 1914 dollars).
It's a bit of an odd comparison, since you're measuring the basket and the purchasing dollars in two different units. The clearer way to say it is that today's $100 basket of goods is equivalent to a $3.05 basket of goods in 1914.
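Taking the thread's figures at face value ($3.05 in 1914 buying what $100 buys now, and assuming "today" means roughly 2025), the implied cumulative and annualized inflation is a quick calculation:

```java
public class Basket {
    public static void main(String[] args) {
        // Thread's figures: $3.05 in 1914 buys what $100 buys today.
        double factor = 100.0 / 3.05;          // ≈ 32.8x increase in the price level
        double years = 2025 - 1914;            // horizon; "today" assumed to be 2025
        // Annualized rate r satisfying (1 + r)^years == factor.
        double annual = Math.pow(factor, 1.0 / years) - 1;
        System.out.printf("%.1fx, %.1f%% per year%n", factor, annual * 100);
        // → 32.8x, 3.2% per year
    }
}
```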
A lot of frameworks that use variants of "mark and sweep" garbage collection instead of automatic reference counting are built with the assumption that RAM is cheap and CPU cycles aren't, so they are highly optimized CPU-wise, but otherwise are RAM inefficient.
I wonder if runtimes like .NET or the JVM will introduce reference counting as a way to lower the RAM footprint?
Reference counting in multithreaded systems is much more expensive than it sounds because of the synchronization overhead. I don't see it coming back. I don't think it saves massive amounts of memory, either, especially given my observation with vmmap upthread that in many cases the code itself is a dominant part of the (virtual) memory usage.
If you use an ownership/lifetime system under the hood you only pay that synchronization overhead when ownership truly changes, i.e. when a reference is added or removed that might actually impact the object's lifecycle. That's a rare case with most uses of reference counting; most of the time you're creating a "sub"-reference and its lifetime is strictly bounded by some existing owning reference.
There are two unavoidable atomic updates for RC: one at allocation and one at the free event. That alone will significantly increase the amount of traffic per thread back to main memory.
A lifetime system could possibly eliminate those, but it'd be hard to add to the JVM at this point. The JVM sort of has it in terms of escape analysis, but that's notoriously easy to defeat with pretty typical java code.
> Why would an allocation require an atomic write for a reference count?
It won't always require it, but it usually will because you have to ensure the memory containing the reference count is correctly set before handing off a pointer to the item. This has to be done almost first thing in the construction of the item.
It's not impossible that a smart compiler could see and remove that initialization and destruction if it can determine that the item never escapes the current scope. But if it does escape it by, for example, being added to a list or returned from a function, then those two atomic writes are required.
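To make the "two unavoidable atomic updates" concrete, here is a minimal hand-rolled refcount in Java. This is purely an illustration of the cost being discussed, not anything the JVM actually does; the `RefCounted` class is hypothetical:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative hand-rolled reference count. The two unavoidable atomic
// touches are the store of 1 at allocation and the decrement that
// observes zero at free; every retain/release in between is an atomic RMW.
class RefCounted {
    private final AtomicInteger refs = new AtomicInteger(1); // allocation-time store

    void retain() {
        refs.incrementAndGet();             // atomic RMW: a new owner appeared
    }

    /** Returns true when this call dropped the last reference. */
    boolean release() {
        return refs.decrementAndGet() == 0; // atomic RMW: free when it hits zero
    }
}
```

An escape-analysis-style optimization would elide the `retain`/`release` pairs for references that never outlive their scope, which is the rare-ownership-change point made above.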
Incrementing or decrementing a shared counter is done with an atomic instruction, not with a locked critical section.
This has negligible overhead in most cases. For instance, if the shared counter is already in some cache memory the overhead is smaller than a normal non-atomic access to the main memory. The intrinsic overhead of an atomic instruction is typically about the same as that of a simple memory access to data that is stored in the L3 cache memory, e.g. of the order of 10 nanoseconds at most.
Moreover, many memory allocators use separate per-core memory heaps, so they avoid any accesses to shared memory that need atomic instructions or locking, except in the rare occasions when they interact with the operating system.
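`java.util.concurrent` ships a ready-made example of that same trick: `LongAdder` stripes increments across per-thread cells instead of contending on one shared word, much like per-core heaps avoid a single shared free list. A small sketch:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class Counters {
    public static void main(String[] args) throws InterruptedException {
        AtomicLong shared = new AtomicLong(); // every increment contends on one word
        LongAdder striped = new LongAdder();  // increments spread across cells

        Runnable work = () -> {
            for (int i = 0; i < 1_000_000; i++) {
                shared.incrementAndGet();
                striped.increment(); // usually touches only this thread's cell
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start(); a.join(); b.join();

        System.out.println(shared.get());  // 2000000
        System.out.println(striped.sum()); // 2000000; sum() folds the cells on read
    }
}
```

Under write contention the striped counter trades a cheap read (a single load) for a more expensive one (summing the cells), which is exactly the bargain per-core allocators make.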
Atomic operations, especially RMW operations, are very expensive, though. Not as expensive as a syscall, of course, but still a lot more expensive than non-atomic ones, exactly because they break things like caches.
Not only that, they write back to main memory. There's limited bandwidth between the CPU and main memory and with multithreading you are looking at pretty significantly increasing the amount of data transferred between the CPU and memory.
This is such a problem that the JVM gives threads their own allocation pools to write to before flushing back to the main heap. All to reduce the number of atomic writes to the pointer tracking memory in the heap.
Unlikely. Maybe I'm overly optimistic, but I think it's fairly likely that the RAM situation will have sorted itself out in a few years. Adding reference counting to the JVM and .NET would also take considerable time.
It makes more sense for application developers to think about the unnecessary complexity that they add to software.
That's not strictly true. Mark and sweep is tunable in ways ARC is not. You can increase frequency, reducing memory at the cost of increased compute, for example.
M&S also doesn't necessitate having a moving and compacting GC. That's the thing that actually makes the JVM's heap greedy.
Go also does M&S and yet uses less memory. Why? Because Go isn't compacting; it instead calls malloc and free based on the results of each GC. This means Go has slower allocation and a bigger risk of memory fragmentation, but it also keeps Go's memory usage lower than the JVM's.
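HotSpot does expose knobs along exactly this memory-vs-GC-time tradeoff. A sketch (values are illustrative, exact behavior varies by collector, and `MyApp` is a placeholder):

```shell
# A small initial heap plus low free-ratio bounds makes the collector
# shrink the heap and return memory to the OS sooner, at the price of
# more frequent GC cycles.
java -Xms64m -Xmx512m \
     -XX:MinHeapFreeRatio=10 \
     -XX:MaxHeapFreeRatio=30 \
     MyApp
```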
Compacting reduces memory usage - that's why it's called compacting.
The JVM uses a lot of memory a) because it's tuned for servers and not for low memory usage and b) because Java is a poorly designed language without value types.
No, it reduces memory fragmentation, which is why it's called compacting and not compression.
I do agree that the lack of value types is a big contributor to why Java uses so much memory. But it's not server tuning that makes the JVM so memory-heavy.
The JVM uses moving collectors, and that is the big reason it prefers having so much memory available. Requesting and freeing memory blocks from the OS is an expensive operation, which the JVM avoids by grabbing very large blocks of memory all at once. If you have a JVM with 75% old gen and 25% new gen, half that new gen will always be empty, because during collection the JVM moves live data from one side of the new gen to the other. And while it does that, it slowly fills up the old gen with data.
Even more modern collectors like G1 prefer a large set of empty space because it's moving portions of old gen to empty regions while it does young collection.
As I mentioned, the difference here between the JVM and Python or Go is that Python and Go do no moving. They rely heavily on the malloc implementation to handle grabbing right-sized blocks from the OS and combating memory fragmentation. But, because they aren't doing any sort of moving, they can get away with having more "right-sized" heaps.
Basically, the short answer is that most memory managers allocate more memory than a process needs, and then reuse it.
I.e., in a JVM (Java) or .NET (C#) process, the garbage collector allocates some memory from the operating system and keeps reusing it as it finds free memory the program needs.
These systems are built with the assumption that RAM is cheap and CPU cycles aren't, so they are highly optimized CPU-wise, but otherwise are RAM inefficient.
I know of one person who was born physically a woman, but has XY chromosomes. It is only due to modern medicine that we know that there is anything "unusual" with her gender. Otherwise, she is physically a woman with no observable clues to her condition.
(I.e., in the past, she would have been infertile, and probably died young due to her situation.)
I'm not comfortable with saying that people like her need to compete with men.
Complete androgen insensitivity syndrome, I imagine? That's probably the one category of XY people who have undergone no hormonal masculinization throughout their lives, and the one case where I'd agree with them competing with women. Wikipedia says it's estimated to be "1 in 20,400 to 1 in 99,000".
Assuming you live in a "large" western home, it's impractical. Remember, Edison's first power grid operated at 110/220 V DC to the home. If there were lower voltage (e.g., 12 volts) going from the street to your walls, the line loss would be significant. It only works in RVs and shacks because the wires are short.
Thus, even if you had DC in the walls, it would be 100+ volts, and you'd still have conversion down to the lower voltages that electronics use. If you look at the comments in this thread from people who work in telco, they talk about how voltage enters equipment at -48V and is then further lowered.
Most of those faux pas come across as neurotic, like complaining about people who park too close to the lines, or complaining about someone farting in church.
Seems like someone could make a sketch comedy skit where someone does these faux pas, and most people don't notice except the one person who has a perpetual wedgie.