Ok, as far as I can tell, the real story is that Engine Yard relies on a GFS/SAN setup that doesn't scale in the unique way that Github needs.
If you think about it, Github is one of the few sites that actually directly uses the filesystem heavily. Everyone else hits scaling issues on the DB first.
The sad thing about all of this is it's not really a matter
of scaling, and it never has been. Our bottleneck has
always been the file system. GFS just... sucks. I'm sorry,
but I have to say it. Case in point, your graph. The first
rebuild I ran timed out because of GFS. The second one ran
fine, took maybe a minute to process, if that. GFS impacts
everything... gem build failures due to cloning... GFS.
Network graphs taking a long time to build... GFS. Caching
jobs not completing... GFS. I think you see where I'm
going here. There are no plans to deploy the new code to the
live servers, and I think the reason is that we're afraid
it'll make GFS performance worse, not better. But on the
new servers where we don't have to fight GFS, it's
amazing.
Funny thing is that we told GitHub over a year ago that GFS would not scale for them, and we also outlined how to move to a shared-nothing chunk server architecture. They didn't take our advice, so it's mostly their own architecture decisions that were holding them back with regard to GFS.
Anyway, there seems to be plenty of armchair quarterbacking on this one. The real story is that we can't afford to host them for free anymore.