
My experience with the AWS RDS database product has been excellent.


We looked at RDS and had a call with some of their engineers, but our EC2 + RAID'd EBS setup was almost the same as theirs, with all the best practices already in place.

Since RDS really is EC2 + EBS, they couldn't provide any real assurances it performed better than our own installation.

We ended up moving off of AWS as a whole. After several discussions about how we can continue to scale, the ultimate answer was without AWS.

EC2 is great for distributed stuff, but when you need something IO-heavy, for instance, it is a big problem. Working around AWS's performance problems to scale ends up costing more than going elsewhere.


Did you guys find a better cloud service, or did you roll your own in a datacenter somewhere?


We went with a managed hosting provider who built us a private cluster. Basically a private cloud. But that way we could get a dedicated SAN and move our DB servers out to dedicated boxes with whatever disk configuration we desired.


Which provider? Can you email me? jedberg@reddit.com.


Interesting, I wonder if this becomes a trend as other startups and cloud customers discover these limitations and look for more custom solutions.


Yeah, they have a few products (e.g. EMR, RDS) where they charge by the instance anyway, so you're just paying them by the hour for the five minutes it would take you to set up the server once.


Hmm. I think you underestimate the effort that is spent on those two. RDS has really good replication which is really hard to configure and set up yourself. And having configured Hadoop I know it takes more than 5 minutes :) Perhaps Whirr makes that easier. Also, EMR's Hadoop is tuned to work really well with S3, which you don't get with stock Hadoop (or even with Cloudera's).


The biggest issue I have with RDS is that I can't do a multi-master deployment to scale up writes. I've got a very write-heavy workload in my systems (roughly one write for every two reads).


You can't do multi-master with MySQL anyways, which until very recently has been the only "engine" RDS supports. Even if you could do multi-master, replication is still single threaded. You have to come up with your own sharding scheme. This is a limitation of MySQL not RDS.


You can't do multi-master with MySQL? News to me - we've been using circular replication between two servers, each a master and slave, for quite some time now.

Not possible with RDS, unfortunately, but works fine on two EC2 instances.


You can do it, but it won't scale your write load. In most cases it actually makes it worse, because each node has to apply all the writes taken on the other master serially.
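A rough back-of-the-envelope sketch of that point, in Python. The function name and the numbers are made up for illustration, and the model is deliberately simplified: it only counts writes and ignores the serial apply cost.

```python
def writes_applied_per_node(client_writes_per_node):
    """Simplified model: every master applies its own client writes
    plus every write replicated from the other masters."""
    total = sum(client_writes_per_node)
    # Each node ends up applying the full write load, however it is split.
    return [total for _ in client_writes_per_node]

# One master taking 1000 writes/s (hypothetical number).
single = writes_applied_per_node([1000])

# The same load split across two masters in circular replication.
split = writes_applied_per_node([500, 500])

print(single)  # load applied by the single master
print(split)   # load applied by each of the two masters
```

In this model, splitting the client traffic does not reduce what any one server has to apply.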


There is a difference between multi-master and circular replication. To me, multi-master means I can write to both masters at the same time, which implies there is a way to resolve conflicts. Databases like Cassandra (timestamps) and Riak (vector clocks) have this; MySQL does not. If you write to the same record on both masters, bad shit happens and it's very hard to sort out.


You can write to both masters at the same time in a MySQL multi-master circular replication setup. It's done via auto_increment_increment and auto_increment_offset configuration settings in my.cnf - each server generates autoincrement keys that are unique to that server.
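A minimal sketch of the idea in Python, assuming two servers with auto_increment_increment = 2 and auto_increment_offset set to 1 and 2 respectively. The function name is made up for illustration; this only mimics the key sequence, it is not how MySQL implements it.

```python
def auto_increment_keys(offset, increment, count):
    """Keys a server would hand out: offset, offset + increment, ..."""
    return [offset + i * increment for i in range(count)]

# Server A: auto_increment_offset = 1, auto_increment_increment = 2
server_a = auto_increment_keys(offset=1, increment=2, count=5)

# Server B: auto_increment_offset = 2, auto_increment_increment = 2
server_b = auto_increment_keys(offset=2, increment=2, count=5)

print(server_a)  # odd keys
print(server_b)  # even keys

# The two key sets never overlap, so concurrent INSERTs cannot collide
# on the autoincrement column.
print(set(server_a) & set(server_b))
```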


This still doesn't protect you from UPDATE statements. I suppose you could pull it off if you could be absolutely certain that you were only INSERTing. You still have the problem that replication is single threaded, so this doesn't scale your writes beyond one thread.
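A toy illustration of the UPDATE problem in Python, assuming asynchronous replication where each master applies its own write first and the other master's write afterwards. The dicts, the row key, and the apply helper are all hypothetical stand-ins, not real MySQL behavior in detail.

```python
def apply(db, statement):
    """Apply a (key, value) update to a dict standing in for a table."""
    key, value = statement
    db[key] = value

# Both masters start with the same row.
master_a = {"user:1": "old"}
master_b = {"user:1": "old"}

update_a = ("user:1", "from A")  # client writes to master A
update_b = ("user:1", "from B")  # another client writes to master B

# Each master applies its local write first...
apply(master_a, update_a)
apply(master_b, update_b)

# ...then the write replicated from the other master arrives.
apply(master_a, update_b)
apply(master_b, update_a)

print(master_a)  # A ends with B's value
print(master_b)  # B ends with A's value
```

With nothing to resolve the conflict, the two masters end up disagreeing about the same row.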


Anything more you could share about it? We got scared off (at least for now) by the inability to replicate in from a self-hosted MySQL instance (for migration purposes), but would still love to hear more about your experiences.



