
Kafka-oriented streaming folks talk about stream-table duality: the idea that either form can be expressed as the other. They usually pay a little lip service to this idea before dropping heavy hints that, actually, the stream is the true reality.
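To make the duality concrete, here is a minimal sketch in plain Python, not Kafka's API. It assumes a stream is a list of (key, value) change events where a value of None means "delete", and a table is a dict. The names `stream_to_table` and `table_to_stream` are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a change event is (key, new_value); None means the key was deleted.
Event = Tuple[str, Optional[int]]
Table = Dict[str, int]


def stream_to_table(events: List[Event]) -> Table:
    """Fold a stream of changes into the latest value per key."""
    table: Table = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table


def table_to_stream(before: Table, after: Table) -> List[Event]:
    """Express the difference between two table states as change events."""
    events: List[Event] = []
    # Keys that disappeared become delete events (sorted for a stable order).
    for key in sorted(before.keys() - after.keys()):
        events.append((key, None))
    # Keys that are new or changed become upsert events.
    for key, value in after.items():
        if before.get(key) != value:
            events.append((key, value))
    return events
```

Folding the events produced by `table_to_stream({}, table)` gives back `table`, which is the round trip people mean by "duality". Note that the trip the other way loses the intermediate history, so the two forms are interconvertible only up to what each one chooses to remember.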

My own view is that any data of interest has dimensions along which its evolution can be shown. Frequently that dimension is time, or can be mapped onto time.

But neither the stream nor the table is the truest representation. The truest representation is whatever representation makes sense for the problem. Sometimes, I want to clone a git repo. Sometimes I want to see a diff. Sometimes I want to query a table. Sometimes I want a change data capture stream. Sometimes I want to upload a file. Sometimes I want a websocket sending clicks. Sometimes you need a freight truck. Sometimes you need a conveyor belt. Sometimes a photo. Sometimes a movie.

Sometimes I talk about space vs time using the equations for acceleration, or for velocity, or for distance. These are all reachable from each other via the "duality" of calculus, but none of them is the One Truest Formula.
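For reference, these are the relations I have in mind, written for one-dimensional motion. The closed forms on the last line assume constant acceleration $a$, with $x_0$ and $v_0$ as the initial position and velocity; those symbols are my own labels.

```latex
\begin{align}
  a(t) &= \frac{dv}{dt} = \frac{d^2 x}{dt^2}, \\
  v(t) &= v_0 + \int_0^t a(s)\,ds, \\
  x(t) &= x_0 + \int_0^t v(s)\,ds, \\
  \text{constant } a:\quad v(t) &= v_0 + a t, \qquad x(t) = x_0 + v_0 t + \tfrac{1}{2} a t^2 .
\end{align}
```

Differentiate to go up the list, integrate to go down. The integration constants $x_0$ and $v_0$ are the part the derivative form does not carry with it.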

And so it is for data. The representation that makes the most sense for that domain under those constraints is the one that is "truest".



I believe this is where CQRS plays nicely with “event sourcing”. You write all your events in one model but can read them in several, provided you can tolerate some read latency, and most systems are OK with that.
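A rough sketch of that shape, in plain Python with no framework. Everything here is hypothetical (the `Event` fields, `EventLog`, `BalanceView`, `ActivityView`); the point is only that there is one write model and several read models, each of which catches up at its own pace.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    account: str
    kind: str      # "deposited" or "withdrew" (hypothetical event kinds)
    amount: int


class EventLog:
    """Write side: an append-only list of events."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def read_from(self, offset: int) -> List[Event]:
        return self._events[offset:]


class BalanceView:
    """Read model 1: current balance per account."""

    def __init__(self) -> None:
        self.offset = 0
        self.balances: Dict[str, int] = {}

    def catch_up(self, log: EventLog) -> None:
        for event in log.read_from(self.offset):
            sign = 1 if event.kind == "deposited" else -1
            current = self.balances.get(event.account, 0)
            self.balances[event.account] = current + sign * event.amount
            self.offset += 1


class ActivityView:
    """Read model 2: number of events seen per account."""

    def __init__(self) -> None:
        self.offset = 0
        self.event_counts: Dict[str, int] = {}

    def catch_up(self, log: EventLog) -> None:
        for event in log.read_from(self.offset):
            count = self.event_counts.get(event.account, 0)
            self.event_counts[event.account] = count + 1
            self.offset += 1
```

The read latency is the gap between `append` and the next `catch_up`: until a view catches up, it answers from whatever it last saw.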


CQRS also alleviates a lot of the pain people experience around event sourcing and distributed systems with event evolution and manipulation. There are many cases where it's inappropriate for consuming services to be aware of the internal representation of events. Giving them a 'snapshot-centric' view of the data can be a simplification in both directions.
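As a small illustration of hiding event evolution behind a snapshot, here is a sketch assuming a hypothetical user service whose email-change event changed shape over time. The event names and fields are invented for the example; consumers only ever see the dict returned by `snapshot`.

```python
from typing import Any, Dict, List

InternalEvent = Dict[str, Any]
Snapshot = Dict[str, str]


def apply(state: Snapshot, event: InternalEvent) -> Snapshot:
    """Fold one internal event into the state, whatever its schema version."""
    kind = event["type"]
    if kind == "Registered":
        return {**state, "name": event["name"]}
    if kind == "EmailChanged":        # older shape of the event
        return {**state, "email": event["email"]}
    if kind == "ContactUpdated":      # newer shape that replaced it
        return {**state, "email": event["contact"]["email"]}
    return state                      # unknown events leave the state untouched


def snapshot(events: List[InternalEvent]) -> Snapshot:
    """The only thing consuming services get to see."""
    state: Snapshot = {}
    for event in events:
        state = apply(state, event)
    return state
```

The owning service can keep adding or reshaping event types, and only `apply` has to change. That is the simplification in both directions: consumers do not track event versions, and the producer is free to evolve them.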



