Out of curiosity, what's your 7TB setup? I've got about 10TB in about the worst setup imaginable: just a bunch of individual drives on a first-gen Atom running Windows Server 2008 (long story). I've been searching for a good home storage solution for a while, reading about various RAID cards and soft solutions like ZFS, but there's nothing that I'm willing to trust. I'm curious what your setup is and what you think of it.
1. If you are using a hardware RAID solution, then RAID1, and only if one of the drives can be pulled out and read by a standard SATA controller to get the data off. Controllers that munge the MBR just go in the trash.
2. Otherwise, software RAID1 or RAID10.
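For reference, the software option above is a one-liner on Linux with mdadm. This is just a sketch; the device names are placeholders for your own drives, and the config path varies by distro:

```shell
# Create a 4-drive software RAID10 array (device names are examples).
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Watch the initial sync progress.
cat /proc/mdstat

# Persist the array definition so it assembles at boot
# (/etc/mdadm/mdadm.conf on Debian; /etc/mdadm.conf elsewhere).
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
```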
RAID5 and RAID6 are, IMHO, more trouble than they are worth. Now, true, I'm not spending $6k on drives, so maybe at that quantity RAID6 makes more sense. But I would argue that the money you save on drives, you throw away as soon as the array goes tits up. I've tried RAID5 with Promise and Areca cards, and the performance was not as good as software RAID10. Plus, on several occasions the rebuild onto the hot spare failed and brought the array down. The data was there, but it required operator assistance (me), and until it went down, the array was staggeringly slow.
If you really want to make sure the array stays available, use RAID1 with three or more drives. They are just so cheap. I was responsible for a Windows server at a start-up a long while ago, and RAID5 sucked ass. Bonus: one day two drives failed, and tape restore is slow. Oh yeah: don't just buy four drives and slap them in. Quite often a whole batch of drives will be bad. Buy drives from different manufacturers, and beware that the sizes will be ever so slightly different, so make your partitions smaller than the smallest drive.
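The "partitions smaller than the smallest drive" trick plus a three-way mirror might look like this. A sketch only: device names and sizes are placeholder assumptions (e.g. 2TB drives, partitioned down to 1900GiB so a slightly smaller replacement from another manufacturer still fits):

```shell
# Partition each drive short of its full capacity (placeholder sizes).
parted -s /dev/sdb -- mklabel gpt mkpart primary 1MiB 1900GiB
parted -s /dev/sdc -- mklabel gpt mkpart primary 1MiB 1900GiB
parted -s /dev/sdd -- mklabel gpt mkpart primary 1MiB 1900GiB

# Three-way RAID1 mirror over the undersized partitions.
mdadm --create /dev/md0 --level=1 --raid-devices=3 \
    /dev/sdb1 /dev/sdc1 /dev/sdd1
```

The point of the undersized partitions is that a future replacement drive only needs to hold the 1900GiB partition, not match the original drive byte for byte.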
However, all of this is irrelevant if you really want a lot of storage. For that it seems that a distributed, redundant file system would be best. If you want, you should be able to pull a drive or two out of every server, or all the drives out of a few servers, and still have all your data.
I've been doing basic RAID stuff for a while, most notably for a client that does physics modeling (and other things). So far, my favorite build is:
- A box from abmx.com. Really nice people, good hardware.
- An Areca RAID controller (might have to talk to them about this).
- FreeBSD, which has good support for Areca controllers, thanks to Areca's friendliness towards the open source community.
- ZFS.
This particular box has had almost no downtime in the past year, and what downtime it has had was caused either by administration tasks (software updates or configuration changes) or by a USB flash drive failure. We've been experimenting with running the BSD and Linux server configurations on flash media separate from the server's data storage, so that we can show up with a replacement or upgraded configuration, plug it in, and be done. USB flash media is not reliable enough for this, though, even if you're not putting swap on it.
I just added a couple of 3TB Hitachi drives to the array last week. Resizing the volume was a little fiddly because the Areca firmware needed an update before it could recognize the 3TB drives (not a problem you'd have on newer models), but otherwise everything happened live, while the box was up and running and doing its job. When the Areca controller finished resizing the volume, ZFS happily said, "oh, I'm bigger now! I can handle that!"
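The ZFS side of that grow-in-place is roughly the following. A hedged sketch: the pool name "tank" and device name are placeholders, and which step you need depends on your ZFS version:

```shell
# Let the pool absorb new capacity automatically when the
# underlying device grows (placeholder pool name "tank").
zpool set autoexpand=on tank

# Or expand one device explicitly after it has grown.
zpool online -e tank da0

# Confirm the new size.
zpool list tank
```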
OpenBSD's softraid is also pretty good stuff, but I don't think it makes sense to use it unless you're cheaping out with a small 2-bay box and no controller, running RAID1 or something.
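For the curious, that small softraid RAID1 case is handled by bioctl. A sketch, assuming you've already created RAID-type partitions in disklabel on both disks (device names are placeholders):

```shell
# Two-disk softraid RAID1 on OpenBSD; sd0a and sd1a are assumed to be
# disklabel partitions of fstype "RAID".
bioctl -c 1 -l /dev/sd0a,/dev/sd1a softraid0
```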
I have an encrypted 5TB setup using LVM, dm-crypt, and XFS in two identical full-size ATX 8-bay chassis, with nightly rsyncs from the primary to the secondary (with a simple versioning script that I wrote). Every year, I buy the biggest hard drive I can for $100 and replace the oldest drive in each system. LVM makes it extremely easy to move all of the blocks off of a drive and remove it. I also run smartd on both systems and replace a drive at the first sign of a serious error.
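The yearly drive-retirement step with LVM is indeed short. A sketch with placeholder names (volume group "vg0", old drive partition /dev/sdd1):

```shell
# Migrate every allocated extent off the old physical volume, online.
pvmove /dev/sdd1

# Drop the now-empty PV from the volume group.
vgreduce vg0 /dev/sdd1

# Wipe the LVM label so the drive can be pulled and repurposed.
pvremove /dev/sdd1
```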
I used to run RAID hardware cards but changed to software-only solutions after a hardware card crashed. Unless you are going to buy two of the exact same RAID card (one as a hot spare), or unless performance is a big deal (which it shouldn't be on a backup system, I would think), software-only solutions are the way to go.
I complement this with off-site continuous backups of irreplaceable data (about 200GB) to CrashPlan.
I really wanted to use ZFS, but the Linux support isn't that great, and I don't think it's possible to remove drives from a ZFS pool. Both of those were deal killers for me. In any event, LVM+XFS has worked out great. XFS is a very stable file system and has given me no trouble.
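Growing the LVM+XFS setup after adding a new, bigger drive looks roughly like this. A sketch with placeholder names (volume group "vg0", logical volume "data", new partition /dev/sde1, mount point /mnt/data); note it skips the dm-crypt layer, so if the crypt mapping sits on top of the LV you'd also run `cryptsetup resize` on the mapping between the last two steps:

```shell
# Register the new drive's partition with LVM and add it to the VG.
pvcreate /dev/sde1
vgextend vg0 /dev/sde1

# Grow the logical volume into all newly free space.
lvextend -l +100%FREE /dev/vg0/data

# Grow XFS online; note XFS can grow but never shrink.
xfs_growfs /mnt/data
```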