> Relational databases are the right choice 95% of the time, non-relational stor...

simonpantzare · on Jan 18, 2017

Things such as joins, transactions, and means of enforcing data integrity are useful when solving a whole slew of problems. Not to mention the tooling and community you benefit from when you use a common RDBMS.

I never found data translation/serialization to be a big pain (just rely on a framework/lib that does it for you). It's a bigger pain to hand-roll joins that would be a one-liner in SQL or deal with issues that arise from having your data (unnecessarily) reside in many systems.
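To make the "one-liner in SQL" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` and `orders` tables and their contents are made up for illustration; they are not from the thread.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER NOT NULL REFERENCES users(id),
                         total INTEGER NOT NULL);
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 30), (11, 1, 12), (12, 2, 5);
""")

# The join expressed as a single SQL statement.
sql_rows = conn.execute(
    "SELECT u.name, o.total FROM users u "
    "JOIN orders o ON o.user_id = u.id ORDER BY o.id"
).fetchall()

# The same join hand-rolled: fetch both sides, build a lookup, stitch rows.
users = {uid: name for uid, name in conn.execute("SELECT id, name FROM users")}
orders = conn.execute(
    "SELECT id, user_id, total FROM orders ORDER BY id"
).fetchall()
hand_rows = [(users[user_id], total)
             for _, user_id, total in orders
             if user_id in users]
```

Both produce the same rows here, but the hand-rolled version has to be rewritten for every new combination of tables, and it pulls both tables into the application first.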

EdHominem · on Jan 19, 2017

I hear this data-integrity thing a lot but I don't run into these problems myself. I think it might be a functional-programming thing. It's much easier and safer to declare your constraints rather than trying to enforce them. If you aren't in a functional language I can see why you'd want to reach out to one but SQL in a separate process is just one of the options.

> Things such as joins, transactions, and means of enforcing data integrity are useful

If the domain needs transactions I'm already speccing them regardless of general usefulness. Yes all that stuff is useful, but all abstractions have a cost which isn't free just because it's hidden in the DB.

For a problem that didn't need transactions, but for which they were useful, why would you automatically want to couple the solution with your storage layer? If you're looking for the ability to express business logic clearly without cluttering it with error handling, for instance, software transactional memory would probably be a better level to work at.

zeptomu · on Jan 19, 2017

> If the domain needs transactions [...]

Considering we're talking about daemon software (services exposing some API to readers and writers) that provides CRUD behavior to the user (end user or other developer), isn't that nearly always the case? You need transactions to give concurrent writers access without the risk of corrupting your data.

Furthermore I am not sure how this relates to FP.

> It's much easier and safer to declare your constraints rather than trying to enforce them.

But that is a strong point of RDBMSs implementing SQL. You have some kind of schema (think type) and use selected functions (select, update, delete, create, etc.) to transform the data.
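As a rough illustration of a schema acting like a declared type, the sketch below uses Python's `sqlite3` with a `CHECK` constraint. The `accounts` table and the `try_withdraw` helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint is declared once, in the schema, rather than
# re-checked in every code path that touches the balance.
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        owner   TEXT    NOT NULL,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('ann', 100)")
conn.commit()

def try_withdraw(amount):
    """Return True if the update satisfied the declared constraint."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = 'ann'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected the write; the balance is unchanged.
        return False
```

An overdraft attempt is refused by the database itself, whichever part of the application issued the update.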

EdHominem · on Jan 19, 2017

> > It's much easier and safer to declare your constraints rather than trying to enforce them.

> Furthermore I am not sure how this relates to FP.

I'm saying that FP is great for ensuring correctness.

> But that is a strong point of RDBMSs implementing SQL.

Right. And if I didn't have other functional languages available that might be a bigger issue.

eklavya · on Jan 19, 2017

STM is awesome but a single node solution. Distributed transactions are a hard problem.

EdHominem · on Jan 19, 2017

I understand, but you can't just throw a DB at it and walk away. For instance, which DB? Set up how? Running on what type of hosts? What topological requirements does this have? How much of a multiplier does it place on your data load?

I'd definitely use a trusted DB for storing bank accounts. Consistency is pretty much the first requirement and the data maps perfectly to tables. And 7B checking accounts isn't that big compared to some problems, so it'd probably scale pretty well even worst-case.

But I probably wouldn't for an MMO. Or at least, it wouldn't be where I stored every little thing going on around the players - just the events (xp and gold earned) that they'd freak out about if we lost. But even just a log-structured DB would work well for that.

If there's no contention for a resource (in the bank case - the value in the account) there's much less reason for a transaction. I want the system to make its best effort but I don't want to wait around for the message if there isn't anything I can do on a failure anyways.

flukus · on Jan 19, 2017

Are there domains that don't need transactions?

worik · on Jan 18, 2017

Exactly.

I start with text files and for most purposes I do not bother with anything else.

Next is a key/value store. Simple.

Relational databases carry large overhead in translating data (as above) and also in design and maintenance of the structure and getting data into and out of them. I spent many years with them, like them a lot, but they are too much complication for most purposes.

Even with relational data, RDBMSs are only good if you are not certain how you will be accessing the data. In most cases you are sure.

I am constantly stunned how people reach straight for MySQL or Postgres when flat text files with grep would work just as well and be much quicker to implement.

pbreit · on Jan 19, 2017

I'm stunned that you're stunned that people generally don't use text files as data stores.

zeptomu · on Jan 19, 2017

How do you deal with concurrent write access, do you lock the file?

crucini · on Jan 19, 2017

He probably uses flock(2)
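For reference, a minimal sketch of what `flock(2)`-based locking could look like from Python, via the standard `fcntl` module (Unix only). The `append_record` helper and the file name are hypothetical. The lock is advisory, so it only helps if every writer takes it.

```python
import fcntl
import os
import tempfile

def append_record(path, line):
    """Append one line while holding an exclusive advisory lock."""
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            f.write(line.rstrip("\n") + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

# Example usage against a throwaway file.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
append_record(path, "k1=v1")
append_record(path, "k2=v2")
```

This serializes cooperating writers, but it does nothing about a writer that dies partway through a write.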

zeptomu · on Jan 19, 2017

But isn't that hard to get right? At least if you're using something like SQLite you get consistency guarantees.

Consider your process writing to the file and dying during write() - do you recover and repair the file after you reschedule?
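One common way to sidestep the half-written-file problem, when the whole file is rewritten rather than appended to, is to write a temporary file and rename it into place. The sketch below assumes a POSIX filesystem, where `os.replace` is an atomic rename. `atomic_write` is a hypothetical helper, not something anyone in the thread describes using.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace the contents of path so readers see old or new, never half."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # On any failure, leave the original file untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

# Example usage against a throwaway file.
target = os.path.join(tempfile.mkdtemp(), "store.txt")
atomic_write(target, "version 1\n")
atomic_write(target, "version 2\n")
```

If the process dies before the rename, the old file is still intact and only a stray temp file is left behind. This still does not coordinate concurrent writers, which is the part SQLite handles for you.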

aalbertson · on Jan 19, 2017

Ramdisk maybe?

egportal2002 · on Jan 18, 2017

seriously curious -- what do you do when someone or some other "function" wants to query your data, wants to update it, etc.?

StillBored · on Jan 19, 2017

You can implement key/value stores in RDBMSs too. It only takes a few minutes to create a key/value table in most databases, combined with a few minutes in your favorite language to map it to an appropriate get/set routine. I find this particularly useful for variable attributes against another table, especially when it's really a "foreign index, key, value" table. That way it's still possible to join the values to other parts of the database. This paradigm really lends itself to multiple FK/key/value tables, where each one extends another particular table.

All that said, doing this requires careful thought, and DB normalization when it's discovered that there is a 1:1 relationship between rows in a table and a particular key/value table. So, it's not something that should be taken to an extreme, but I find it aids in quick development, as every time you discover you need to store another piece of data for some edge condition it doesn't require lots of DB normalization. Also, I wouldn't really consider making the "value" field a blob, rather a very limited int or string.
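A minimal sketch of the FK/key/value table described above, using Python's `sqlite3`. The table names (`users`, `user_attrs`) and the `get_attr`/`set_attr` helpers are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per (user, key); the value stays a short string, not a blob.
    CREATE TABLE user_attrs (
        user_id INTEGER NOT NULL REFERENCES users(id),
        key     TEXT    NOT NULL,
        value   TEXT    NOT NULL,
        PRIMARY KEY (user_id, key)
    );
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
""")

def set_attr(user_id, key, value):
    """Insert or overwrite one attribute for a user."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO user_attrs (user_id, key, value) "
            "VALUES (?, ?, ?)",
            (user_id, key, value),
        )

def get_attr(user_id, key, default=None):
    """Return the stored value, or default if the key is absent."""
    row = conn.execute(
        "SELECT value FROM user_attrs WHERE user_id = ? AND key = ?",
        (user_id, key),
    ).fetchone()
    return row[0] if row else default

set_attr(1, "theme", "dark")
set_attr(1, "theme", "light")  # overwrites the previous value
set_attr(2, "theme", "dark")

# Because the attributes live in the database, they can still be joined.
joined = conn.execute(
    "SELECT u.name, a.value FROM users u "
    "JOIN user_attrs a ON a.user_id = u.id AND a.key = 'theme' "
    "ORDER BY u.id"
).fetchall()
```

If an attribute such as `theme` turns out to exist for every user, that is the 1:1 case where it probably belongs as a real column on `users`.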