Ran into this old story on TC; it just reinforced the point that disaster recovery is not just about careful planning but also about experience. Make sure you periodically test your backup and disaster recovery plan too.
Today, at over half a million members with CS2.0, they're more stable, and doing things like using AWS to serve static pictures. And there are many more people working on keeping it up and running these days :)
MySQL does have built-in measures to prevent this kind of disaster! They should have enabled binary logging and used that to make low-impact backups. They also could have used replication (also built in) to avoid all these problems and have a hot backup.
The only reason they retained the 'Gold Support Service' is because their own database administrators obviously had no idea what they were doing or how to use the software correctly.
There are always companies that will make use of commercial support for FOSS but MySQL will probably never make any money from me.
Besides the tragic (or incompetent?) tech failures, what do folks here think about the founder initially just calling it quits on the community?
Having watched this particular crash-recovery saga unfold and being involved with web 2.0 otherwise, it would be interesting to hear a few thoughts on this: what's the responsibility of the platform provider towards their possibly very engaged community, or is there any?
I think it's a difficult dilemma, but ultimately the founders don't owe the community anything unless people were paying for a service. Firstly, without the founder there would be no community; secondly, their work created the community and made it possible. They have to be responsible to themselves, and that might mean taking a break.
If the community wants to remain a community then normally someone else will step up. That's the power of community: it's bigger than just a web site. In the case of CouchSurfing, v2 seems to be going strong.
I would hope the founder of a company with a successful community would at least try to pass on the name and membership to community members or other entrepreneurs who would try to continue the community. I'm sure there are plenty of entrepreneurs and businesses who would have been happy to get nothing but the domain name and member list to try to start a new couchsurfing (though it appears that the service came back just fine on its own).
CouchSurfing International Inc was registered as a non-profit and is now a charity.
Knowing a bit about it (having been a volunteer for the org) I think the crash was extremely badly handled. For better or worse, many people were relying on the service to find places to stay, connect with people in foreign cities, and so on. Leaving those people high and dry is bad karma. Likewise, users should be encouraged to make "backups" in the form of phone numbers, email addresses, etc. if the reliability of the site can't be guaranteed.
I think the problem is a classic one in technology. The geeks at the helm are overconfident. Their self-belief clouds their judgement with regard to backups, failure planning, and so on.
Sometimes, it just seems like the universe is trying to send us a message. Take that in a spiritual way if you want, or take it in the sense that a cataclysmic (relatively speaking) event forces us to reevaluate our priorities.
Certainly, if this happened to me and I came to that decision, I'd put up a forum for everyone so that they could engage and figure out post-startup plans. But that's me.
I think you make a great point, ibsulon. If you did take the decision to close down the site, the least you could do is signpost users to somewhere else where they could continue to communicate.
As far as I'm aware, during the CouchSurfing crash, some volunteers set up a separate forum, but it wasn't sanctioned or initiated by the founder; I believe he simply walked out. :(
I can't find any prepackaged thing on the web. I'd like to address two needs: disaster recovery and point-in-time recovery.
For now I only dump the database from time to time, but it's far from a real solution. If I lose my whole server it would take quite some time to get its configuration back on track.
If it's a *nix system, why not set up a cron job to do it for you? I have a Python script in /etc/cron.daily that performs a mysqldump, tars it, and uploads it to S3.
For total completeness and peace-of-mind, you could have the script add anything else you need backed up, like your projects or public_html directories, configuration files, etc.
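A minimal sketch of what a cron.daily script along those lines might look like. This is not the actual script mentioned above; the function names, file naming scheme, and the `upload` hook are all made up for illustration, and it assumes mysqldump reads its credentials from ~/.my.cnf. The S3 step is left as a callable you supply yourself, since it depends on whichever client library you use.

```python
import datetime
import subprocess
import tarfile
from pathlib import Path


def dump_database(database, dump_path):
    """Write a SQL dump of `database` to `dump_path` via the mysqldump CLI."""
    # Assumption: credentials come from ~/.my.cnf, so none are passed here.
    with open(dump_path, "wb") as out:
        subprocess.run(["mysqldump", database], stdout=out, check=True)


def make_archive(paths, archive_path):
    """Bundle files and directories into one gzip-compressed tarball."""
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for path in paths:
            path = Path(path)
            # Directories are added recursively; arcname drops the parent path.
            tar.add(str(path), arcname=path.name)
    return Path(archive_path)


def nightly_backup(database, extra_paths, work_dir, upload):
    """Dump, archive, then pass the archive to `upload` (hypothetical hook)."""
    work_dir = Path(work_dir)
    stamp = datetime.date.today().isoformat()
    dump_path = work_dir / f"{database}-{stamp}.sql"
    dump_database(database, dump_path)
    archive = make_archive(
        [dump_path, *extra_paths], work_dir / f"backup-{stamp}.tar.gz"
    )
    upload(archive)  # e.g. a function that pushes the file to S3
    return archive
```

Something like `nightly_backup("mydb", ["/etc/apache2", "/home/me/public_html"], "/var/backups", my_s3_upload)` would then cover the database plus config and site files in one go; those paths and `my_s3_upload` are placeholders.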
Running mysqldump on a database the size of CouchSurfing's is a real problem. It's a resource hog. To properly back up a consistent data image you need to use locking, which prevents writes to your database while the backup is running. That causes downtime on every backup, which is not ideal.
Likewise, when backing up hundreds of GiB of data, a simple tar gz is not suitable. More sophisticated strategies need to be used when the data becomes that big.
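One caveat on the locking point: if every table uses a transactional engine such as InnoDB, mysqldump's --single-transaction option takes the dump from a consistent snapshot without blocking writers. It does nothing for the resource cost on a very large database, though. A rough sketch of picking the flags; the helper name and the `all_innodb` parameter are invented for illustration:

```python
def mysqldump_command(database, all_innodb):
    """Build a mysqldump argument list aimed at a consistent dump."""
    cmd = ["mysqldump"]
    if all_innodb:
        # Dump inside a single transaction: consistent snapshot for
        # transactional tables, and writes are not blocked meanwhile.
        cmd.append("--single-transaction")
    else:
        # Non-transactional tables (e.g. MyISAM) still need table locks
        # for consistency, which blocks writes during the dump.
        cmd.append("--lock-tables")
    cmd.append(database)
    return cmd
```

The list can be handed to `subprocess.run(..., check=True)` with stdout redirected to a file.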
I've found http://HighScalability.com to be a great source of information on building scalable applications.
For a smaller database, dmpayton's suggestion is spot on. Simply schedule a mysqldump with cron and then push that offsite. Here are a few example code snippets I use:
I'm using --skip-lock-tables because on a ZenCart site, the lock tables option was adversely affecting site performance.
Then you could add something to automatically email or scp the file to another host.
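As a rough illustration of the dump-then-push approach (not the snippets referred to above), here is a sketch in Python. The function names and the remote target are hypothetical. Keep in mind that --skip-lock-tables trades consistency for performance: tables may be dumped at slightly different points in time.

```python
import gzip
import shutil
from pathlib import Path


def dump_command(database):
    """mysqldump argument list that skips table locking."""
    # --skip-lock-tables keeps the live site responsive, at the cost of a
    # dump that may not be consistent across tables.
    return ["mysqldump", "--skip-lock-tables", database]


def compress(dump_path):
    """Gzip `dump_path` next to itself and return the new path."""
    dump_path = Path(dump_path)
    gz_path = dump_path.with_name(dump_path.name + ".gz")
    with open(dump_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path


def scp_command(local_path, remote):
    """scp argument list; `remote` is a placeholder like user@host:/backups/."""
    return ["scp", str(local_path), remote]
```

Each command list can be run with `subprocess.run(cmd, check=True)`; for the dump, redirect stdout to the .sql file before compressing it.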
For bigger, more complex databases, look into MySQL binary logging, DRBD replication, and MySQL slaves. That should give you some keywords to get started.
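To make the binary logging keyword a bit more concrete: with binary logging enabled, point-in-time recovery usually means restoring the last full dump and then replaying the binary logs up to just before the bad event, using the mysqlbinlog tool. A small sketch of building that replay command; the helper name is made up, and the log file name and timestamp in the usage note are placeholders:

```python
def binlog_replay_command(binlogs, stop_datetime, start_datetime=None):
    """Build a mysqlbinlog argument list that replays events up to a point."""
    cmd = ["mysqlbinlog"]
    if start_datetime:
        # Skip events already contained in the restored full backup.
        cmd.append(f"--start-datetime={start_datetime}")
    # Stop just before the moment things went wrong.
    cmd.append(f"--stop-datetime={stop_datetime}")
    cmd.extend(binlogs)
    return cmd
```

For example, `binlog_replay_command(["mysql-bin.000042"], "2008-01-15 09:59:59")` produces a command whose output would be piped into the `mysql` client to apply the changes.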
This was not the only mismanagement of Couchsurfing. It is a great idea, but the organisation is a bit authoritarian. Read more at: http://www.opencouchsurfing.org/
This sucks. When I first heard about CouchSurfing I wrote it down on my list of "Ideas that will change the world". Maybe someone can make a new version and integrate it with Facebook?