
FWIW we went through a very similar process to the one documented here by GitHub (~3 months ago). It was entirely for operational reasons and nothing to do with shortcomings in Redis itself. MySQL was the master record for 99% of our data while Redis was the master record for the other 1% (as it happens, it was also a kind of activity stream). Having a single 'master' reference for our data reduced complexity to a degree that made it worth running a less computationally efficient setup. We also have nowhere near GitHub's volume, so we did not have to do such significant re-architecting to make unification possible.

Now we still use Redis for reading the activity streams and as an LRU cache for all sorts of data, but it is populated, like all of our specialised slave-read systems (Elasticsearch, etc.), by replicating from the MySQL log.

Hope that helps!



Yes, this helps and totally makes sense to me, thanks. I would do the same. In this case, however, it looks like there were certain high-volume writes that could be handled in a simpler manner with Redis. But it is totally possible that, while this looks like an important use case, it accounted for a small percentage of all the data, so we are back to consolidation: moving everything to a single system, which is in general a good idea.


What method are you using to replicate from the MySQL binlog to the various other systems?


FWIW, I've used github.com/siddontang/go-mysql to successfully replicate from MySQL to DynamoDB. We're currently not using GTIDs and are looking into that next.
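For anyone curious, here's a minimal sketch of what tailing the binlog with go-mysql's replication package looks like, roughly following the project's readme. The host, credentials, and starting position are placeholders, and the actual DynamoDB writes are elided:

    package main

    import (
        "context"
        "fmt"
        "os"

        "github.com/siddontang/go-mysql/mysql"
        "github.com/siddontang/go-mysql/replication"
    )

    func main() {
        // Placeholder connection details; point these at the MySQL master.
        cfg := replication.BinlogSyncerConfig{
            ServerID: 100, // must be unique among all replicas/consumers
            Flavor:   "mysql",
            Host:     "127.0.0.1",
            Port:     3306,
            User:     "repl",
            Password: "secret",
        }
        syncer := replication.NewBinlogSyncer(cfg)

        // The file/position would normally come from a saved checkpoint
        // (or a GTID set, once GTID-based replication is in place).
        streamer, err := syncer.StartSync(mysql.Position{Name: "mysql-bin.000001", Pos: 4})
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }

        for {
            ev, err := streamer.GetEvent(context.Background())
            if err != nil {
                fmt.Fprintln(os.Stderr, err)
                os.Exit(1)
            }
            // A real consumer would translate row events into writes
            // against the downstream store (DynamoDB in our case) and
            // checkpoint the position after each applied event.
            ev.Dump(os.Stdout)
        }
    }

In practice you'd persist the binlog position (or GTID set) after each applied event so the process can resume where it left off.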


Just asking for some info, but how do you make sure that your multiple DB systems stay in sync (specifically interested in MySQL and Elasticsearch)?

Hope it's alright to ask you that.


In the case of ES, the short answer is: we don't. We have fault tolerance in our replication system to guarantee eventual consistency instead. I would say using ES as a consistent source of data isn't really playing to its strengths, so we don't use it that way. The consistency you want is determined at read time: if you need consistency, hit MySQL, but for our use case that almost never happens, as eventual consistency is usually instantaneous enough.

Our other tool is to decouple lookup (which objects to fetch) from population (what data to return for each object). You can mix and match, e.g. do a lookup against an inconsistent ES but still get consistent objects by populating from MySQL (or vice versa). As others have alluded to, it depends entirely on the requirements for the result set.
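To make that concrete, here's a rough sketch of the lookup/population split in Go. Everything here (index name, field, table, DSN) is made up for illustration; ES is queried with _source disabled so it only ever returns IDs, and the objects themselves come from MySQL:

    package main

    import (
        "database/sql"
        "encoding/json"
        "fmt"
        "net/http"
        "strings"

        _ "github.com/go-sql-driver/mysql"
    )

    // esLookup asks Elasticsearch only for matching document IDs
    // ("_source": false), treating it purely as an index.
    func esLookup(term string) ([]string, error) {
        body := fmt.Sprintf(`{"_source": false, "query": {"match": {"text": %q}}}`, term)
        resp, err := http.Post("http://localhost:9200/activities/_search",
            "application/json", strings.NewReader(body))
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()

        var result struct {
            Hits struct {
                Hits []struct {
                    ID string `json:"_id"`
                } `json:"hits"`
            } `json:"hits"`
        }
        if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
            return nil, err
        }
        ids := make([]string, 0, len(result.Hits.Hits))
        for _, h := range result.Hits.Hits {
            ids = append(ids, h.ID)
        }
        return ids, nil
    }

    // populate fetches the consistent copy of each object from MySQL.
    func populate(db *sql.DB, ids []string) (*sql.Rows, error) {
        if len(ids) == 0 {
            return nil, nil // nothing matched in ES
        }
        placeholders := strings.TrimRight(strings.Repeat("?,", len(ids)), ",")
        args := make([]interface{}, len(ids))
        for i, id := range ids {
            args[i] = id
        }
        return db.Query("SELECT id, body FROM activities WHERE id IN ("+placeholders+")", args...)
    }

    func main() {
        db, err := sql.Open("mysql", "app:secret@tcp(mysql-host:3306)/app")
        if err != nil {
            panic(err)
        }
        ids, err := esLookup("deployed")
        if err != nil {
            panic(err)
        }
        rows, err := populate(db, ids)
        if err != nil || rows == nil {
            return
        }
        defer rows.Close()
    }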


Where I work, we use several different MySQL replicas in production and don't expect them to be in sync.

So long as the source of truth (the master MySQL node) is up to date, it's okay.

For example, if we show a user how much money is in their account on every page, we can run that query on a replica, since it's fine if the value is a few seconds delayed. However, immediately after an action that changed their balance, on a confirmation screen, we'd want to show the value from the master.
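A sketch of what that routing looks like in code (Go with database/sql; the DSNs, table, and helper name are all invented):

    package main

    import (
        "database/sql"

        _ "github.com/go-sql-driver/mysql"
    )

    // One handle for the source of truth, one for a replica that may lag
    // by a few seconds. DSNs are placeholders.
    var master, replica *sql.DB

    // balanceDB returns the master only when the caller has just written
    // (e.g. a confirmation screen right after a transfer); otherwise a
    // slightly stale read from a replica is acceptable.
    func balanceDB(justWrote bool) *sql.DB {
        if justWrote {
            return master
        }
        return replica
    }

    func main() {
        var err error
        if master, err = sql.Open("mysql", "app:secret@tcp(master-host:3306)/bank"); err != nil {
            panic(err)
        }
        if replica, err = sql.Open("mysql", "app:secret@tcp(replica-host:3306)/bank"); err != nil {
            panic(err)
        }

        var balance int64
        // Ordinary page render: the replica is fine.
        _ = balanceDB(false).QueryRow(
            "SELECT balance FROM accounts WHERE id = ?", 42).Scan(&balance)
        // Right after the user changed their balance: read your own write.
        _ = balanceDB(true).QueryRow(
            "SELECT balance FROM accounts WHERE id = ?", 42).Scan(&balance)
    }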

It's entirely possible that any place Elasticsearch is being used just doesn't need consistency.


There are actually a few strong solutions out there for MySQL, most starting with change data capture, e.g. https://github.com/shyiko/mysql-binlog-connector-java (I link that one in particular because it links to alternatives right in its readme!)

Postgres is a bit harder, but if I needed to start somewhere, it would be with:

https://github.com/debezium/debezium

or https://github.com/confluentinc/bottledwater-pg

These are the starting points for pretty sophisticated solutions, for when you need near-real-time Elasticsearch indexes and can bring up infrastructure like Kafka.

For many applications, queueing an update whenever something hits your ORM, combined with an hourly/daily full refresh, is pretty satisfactory.
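A minimal sketch of that simpler pattern, with an in-process channel standing in for whatever queue you actually use, and the ORM hook and ES calls stubbed out:

    package main

    import (
        "fmt"
        "time"
    )

    // pending is an in-process stand-in for whatever queue you actually
    // use (Sidekiq, SQS, a plain DB table...). Row IDs needing a reindex
    // land here.
    var pending = make(chan int64, 1024)

    // afterSave is the hypothetical ORM hook: enqueue, don't index inline.
    func afterSave(id int64) {
        select {
        case pending <- id:
        default:
            // Queue full: drop and let the periodic refresh catch it.
        }
    }

    // indexWorker drains the queue and pushes documents to Elasticsearch.
    func indexWorker() {
        for id := range pending {
            fmt.Println("reindexing row", id) // ES index call would go here
        }
    }

    // fullRefresh is the hourly/daily safety net that re-walks the table
    // and reindexes everything, catching whatever the queue missed.
    func fullRefresh() {
        for range time.Tick(time.Hour) {
            fmt.Println("full reindex sweep") // SELECT ... and bulk index
        }
    }

    func main() {
        go indexWorker()
        go fullRefresh()
        afterSave(42) // simulate a model save
        time.Sleep(time.Second)
    }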


If you need any kind of consistency guarantee, I think you would have to use some form of distributed transaction.

If you don't, you could tail the MySQL log and have a process make the same changes to Elasticsearch. Elasticsearch may lag behind if there are problems.
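For the tailing approach, go-mysql (mentioned upthread) also ships a higher-level canal package that decodes row events for you. A sketch, with the master address as a placeholder and the Elasticsearch write stubbed out:

    package main

    import (
        "fmt"

        "github.com/siddontang/go-mysql/canal"
    )

    // esApplier receives row events decoded from the binlog and would
    // mirror them into Elasticsearch (the HTTP call is stubbed out).
    type esApplier struct {
        canal.DummyEventHandler // no-op implementations for other events
    }

    func (h *esApplier) OnRow(e *canal.RowsEvent) error {
        // e.Action is insert/update/delete; e.Rows holds column values.
        // A real applier would turn this into an ES index/delete request
        // and retry on failure, which is exactly where lag creeps in
        // when there are problems downstream.
        fmt.Println(e.Action, e.Table.Name, e.Rows)
        return nil
    }

    func main() {
        cfg := canal.NewDefaultConfig()
        cfg.Addr = "127.0.0.1:3306" // placeholder master address
        cfg.User = "repl"
        cfg.Password = "secret"

        c, err := canal.NewCanal(cfg)
        if err != nil {
            panic(err)
        }
        c.SetEventHandler(&esApplier{})

        // Run() streams from the current master position; a real process
        // would checkpoint and resume instead.
        if err := c.Run(); err != nil {
            panic(err)
        }
    }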


I'm facing a similar challenge, although at a much (MUCH!) smaller scale.

We have nearly everything in Postgres, and Redis serves both as a caching layer (non-persistent) and as the store for Rails sessions and Sidekiq (persistent).

Having one source of truth can make things like failover much easier. I can handle PG failover, and Redis failover too, but I'd rather not have to deal with both. Especially if you consider the potential for things going slightly out of sync (think of a job in Sidekiq that relies on an ID in PG, where one of the two loses a few microseconds of data during replication; just speculating a scenario here).

Did anybody face similar challenges and care to share their thoughts?



