I'm amazed at how much this has knocked me on my arse.
I first attempted to redo the README for a service I've just open-sourced, before realising Github was down.
Then I attempted to fix the company CI server (OOM errors because Carrierwave needs more than 500MB of memory to run a single spec, for some unknown reason), which failed because it couldn't check out the code.
After giving up on that, I attempted to install Graphite on a company server, where I hit another roadblock because the downloads are hosted on Github, so I had to use Launchpad, which I had an allergic reaction to.
Also, when I was shelling into the server, oh-my-zsh failed to update because, you guessed it, Github was down.
Still, shouts to the ops team in the trenches, we're rooting for you.
Your company runs its source revisions through Github without a backup solution? Do you really put all your eggs in a basket you have no control over?
I know that in theory a cloud solution should have higher uptime than an amateurishly set-up private server, but cloud solutions have a complexity and tight coupling that make them vulnerable to exactly these kinds of 'full failures', where nothing is in your control.
Maybe you should take this time to learn from this, and analyze what you could do to reduce the impact of this failure. For example, you could research what it would take for your company to move to another Git provider, perhaps even on your own server or a VM slice at some cloud provider.
I'm not saying you should drop Github, because they obviously provide great service, but be realistic about cloud services.
Cloud service is like RAID: it is not a backup.
Just as RAID is nice for recovering from errors without downtime, yet there's still a chance something bigger happens and you lose your data anyway, the cloud is nice for offering scalability and availability, but there's still a chance everything goes down and you can't run your operations.
* Git is decentralised: if Github drops off the face of the earth, it will take the two of us about half an hour to fix, as we each have a copy of the codebase on our laptops.
* If I wanted the build server to point to our internal git mirror, I could configure it that way, but I want it to build off of Github webhooks.
* The "eggs in the basket" analogy is probably best saved for a situation where I'm dependent on a cloud service, such as Twilio.
* I would expect an amateurishly set-up private server to have better uptime than a monolithic service such as Github, because there are far fewer moving parts and, in our case, far fewer people doing things with it.
* The "company impact" of Github going down is next to nil. It's one o'clock in the morning on a Sunday, I'm eating string cheese, and I feel like being productive. The company is not paying me for this and has incurred zero losses from it. We have a very simple mirror which we can use to push code to production if Github goes down, though we have never actually had to use it.
Personally, I find the "keep a torch in every corner of every room because the power was out for 15 minutes last year" attitude to life a bit overrated; you're planning for an edge case. I'd much rather remember where the torch is and learn to walk in the dark.
Add up your downtime over a three-year period from relying on Github (or Gmail, or AWS) versus the cost of trying to engineer some local-backup system, plus the downtime associated with that going awry.
Outages happen. As long as we're talking hours a year, pretty much everyone but life-safety systems, critical infrastructure, and payment/high-traffic commerce sites is probably better off just letting third-party cloud vendors manage their systems. Take the downtime and relax.
(Now, if downtime consistently gets into the 10s of hours/year, it's time to look for a new cloud provider.)
You make a very good point, but it took me about three minutes to build a git mirror which we can push/pull to, can re-point CI at if we need to, and can run a full deploy from on the company VPS.
* Create an unprivileged account & set a password that you don't need to remember -> sudo adduser git
* Add your public key from your laptop to the unprivileged user's authorized_keys file -> sudo su git; cd ~; mkdir .ssh && chmod 700 .ssh; vim .ssh/authorized_keys - then paste in your id_rsa.pub and chmod 600 .ssh/authorized_keys (sshd will refuse keys with looser permissions)
* Repeat that for all public keys on your engineering team
* In git's home directory, git init --bare <name of repo>.git
* On your local machine, git remote add doomsday git@<DOOMSDAY SERVER HOSTNAME>:<NAME OF REPO FOLDER>.git
* git push doomsday --all
* On colleague's box, git clone git@<DOOMSDAY SERVER HOSTNAME>:<NAME OF REPO>.git
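The steps above can be sketched end to end. This runs against local filesystem paths so it can be verified anywhere git is installed; on the real VPS the mirror would live in the git user's home directory and be addressed over SSH (git@<DOOMSDAY SERVER HOSTNAME>:...) instead. Paths and names here are illustrative.

```shell
set -e
tmp=$(mktemp -d)

# The bare repo: what `git init --bare <repo>.git` creates on the server.
git init --bare "$tmp/mirror.git"

# A working repo standing in for your laptop checkout.
git init "$tmp/work"
git -C "$tmp/work" -c user.email=dev@example.com -c user.name=dev \
    commit --allow-empty -m "initial commit"

# Point a 'doomsday' remote at the mirror and push every branch to it.
git -C "$tmp/work" remote add doomsday "$tmp/mirror.git"
git -C "$tmp/work" push doomsday --all

# A colleague clones straight from the mirror.
git clone "$tmp/mirror.git" "$tmp/colleague"
```

Swapping the local `$tmp/mirror.git` path for the SSH address is the only change needed for the server version.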
Let me know if there is a better way of doing this, or if it's monumentally screwed somehow.
Yup. Github going down barely breaks my stride, but for a real production outage (e.g. Heroku going down), I pour myself a tall glass of scotch and thank my lucky stars I'm not the one who has to scramble around tailing logs and restarting servers. I'm pretty sure their ops team is better at this than I am anyway.
It's not about downtimes and outages. It's incomprehensible to me how lax businesses are with their backups, especially businesses whose clients' data is everything. Yes, the brave new world of the cloud seems tantalizing, but even there, data can and will be lost. Don't rely on only one way / provider / service / mechanism for backing up your data.
A tape / LTO backup system doesn't cost the world. Yes, it introduces overhead and maintenance, but I'd rather be safe than sorry.
At my place of work we currently use a lot of virtual servers, hosted by external providers. We use their backup and snapshot mechanisms, but we also pull all the data to our local backup server, and from there we back up to tape daily.
I do have backups of all my (relevant) GH repos, since that's just a "git pull" away and can be automated nicely. But I'd probably be out of luck running my regular CI process or doing a deployment with Github down. Both rely on having a public-facing git server, and having a backup does not imply that I have a full git server running. I could set one up and administer it, but it's just too much effort given GH's uptime.
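Automating that pull is only a few lines. A hedged sketch of one way to do it (the function name and layout are mine, not a standard tool): using `git clone --mirror` plus `fetch --prune` rather than a plain `git pull` captures every branch and tag, not just the currently checked-out one.

```shell
# backup_repo URL DEST
# First run clones a full mirror; subsequent runs just fetch updates.
# Point it at your Github clone URLs from a nightly cron job.
backup_repo() {
  url=$1
  dest=$2
  if [ -d "$dest" ]; then
    # Mirror already exists: refresh all refs, dropping deleted ones.
    git -C "$dest" fetch --prune origin
  else
    # First run: --mirror copies all branches and tags, not one checkout.
    git clone --mirror "$url" "$dest"
  fi
}
```

Restoring is then just `git clone /backups/repo.git`, since a mirror is itself a full bare repository.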
I doubt it. (Assuming by "2 digits" you mean < 99.0%, i.e. that they don't have "two nines", though I guess two nines could even be 98.5% with rounding.)
1% downtime over a year is more than three days. They've had some big outages, but I think this may be their longest of the year, and it was under 6 hours. They could have one of those every month and still rack up only 72 hours of downtime, which is 99.18% uptime.
I agree with you, except for one thing: you don't have to build anything locally. Just pushing to Bitbucket as a backup would have done it. It doesn't have to be a locally hosted solution.