--- Warning: I will sound like a sales droid for a second -----

Speaking of data replication, please visit Constant Data's booth (#1090)
at LinuxWorld in SF, August 4-7th. Constant Data has free trial downloads
of real-time bidirectional replication for Linux and other *NIX at
www.constantdata.com. They support 1 to N and N to 1.

--elhaddi

On Fri, 25 Jul 2003, Carl Wilhelm Soderstrom wrote:

> On Thu, Jul 24, 2003 at 04:10:00PM -0500, Adam Maloney wrote:
> > I believe Carl is "the-man" on this subject, but I'll put in $.02
>
> since I heard someone taking my name in vain, I suppose I ought to throw in
> my opinion.
>
> > The cost of your back-up solution should be reflective of the monetary
> > value of the data.
>
> first, most important rule, right there.
>
> there are times that it's worth building a whole replicated datacenter
> connected via private fiber and fiber-channel repeaters. some of the
> companies who had offices in the World Trade Center are probably glad they
> had something like that.
>
> a whole lot of them sure wish they did.
>
> needless to say, if all you're backing up is your blog on your co-lo'ed
> webserver; something less drastic is in order. :)
>
> I don't know what all kinds of data you're talking about, but keep in mind
> that a lot of services are easily replicable; DNS and SMTP have failover
> built into the protocol, and it's advantageous to have a DNS and a mail
> server somewhere offsite.
>
> I believe AFS has replication/failover built into it, but I could be wrong.
> (Amy?) In any case, AFS is more trouble than most people want to deal
> with. :)
>
> > 70GB burned to CD? Ick.
>
> I once looked at the economics of an automated backup solution using a CD or
> DVD autoloader. aside from the cost of the burner itself (not too many $K),
> the cost of media ends up making it more expensive than tape in not too long
> a time. Tape is fast and reusable; CD-Rs are not.
> CD-RWs are even slower;
> and one of the problems becomes the *huge* stacks of CDs that you'll need to
> back up your data. storing those things costs you money too. DVDs hold more
> data; but they are marginally more expensive per byte.
>
> 70GB/4.7GB(per DVD) = 15 discs.
> looks like DVDs are down to less than $1/disk
> (http://store.yahoo.com/blankcdcdr/dvdr-media-dvd-r.html); so I guess the
> economics have changed a bit since I last looked; but even so, spending $15
> (plus the amortized cost of a $3000 DVD autoloader) per backup is not
> something you'd want to do every night.
>
> I don't know how long it would take to burn those 15 DVDs either; but I'm
> sure good tape drives would be notably faster.
>
> it's not a bad idea for occasional, long-term permanent storage tho. (look
> at www.mondorescue.com).
>
> > Also, transferring 70GB to your off-site location might take a while.
> > Over a T-1 it will take more than 100 hours (70,000 MByte = 560,000 Mbit;
> > 560,000 Mbit / 1.5 Mbit/s = 373,333 sec = about 103.7h).
>
> this is why some sort of differential backup is a worthwhile thing. I've
> built workable systems with rsync scripts; which only require one full
> transfer of the data to the backup server (much like Nate described in his
> post), and ever after (at least in theory) only need to transfer the files
> that change that night.
>
> there's a couple of good pre-built systems that do this better than what
> I've cobbled together.
>
> I took a good look at this one:
> http://www.stearns.org/rsync-backup/
> and found it's pretty good. it's client-side-initiated; so it would be very
> good for backing up laptops and other occasionally-connected devices. it
> makes a nice live filesystem that you can browse, and you can even browse
> previous days' backups as a live filesystem (it uses hardlinks to avoid
> replicating identical files).
>
> some people didn't like it; because they believed that allowing the clients
> to initiate the backups made the security weaker.
> it uses a chroot'ed jail
> for each client's backup process tho; and in a lot of ways I'd rather that
> the backup server was exposed to a limited number of clients, rather than
> try to secure remote-initiation access to a large number of clients.
>
> I haven't tried these yet:
> http://rdiff-backup.stanford.edu/
> http://stitch.bentlogic.net/
> but they look pretty good. I've heard good things about rdiff-backup.
>
> > DLT4 can do 35GB raw/70GB compressed on 1 tape. Tapes are about $60-$70
> > each (last I bought them anyways). I think you can get DLT4 drives for
> > under $1,000 now.
>
> don't buy DLT. buy AIT.
> AIT is *amazingly* fast to search, because it keeps an index of filemarks in
> an NVRAM chip on the tape cartridge. this is OS-independent; and makes your
> restores blazing fast. (which is handy when the CEO deletes his spreadsheet
> by accident and wants it back 5 minutes ago, instead of 5 hours from now).
>
> also, AIT uses spinning (helical-scan) read/write heads, so the tape doesn't
> have to move as fast, which makes 'backhitching' or 'shoeshining' less of a
> problem, and is less wear on the tape.
>
> last I knew, cost was comparable to DLT, but that might have changed.
>
> > > 1. copy some files nightly to a central server (that is out of the
> > > datacenter, but in the same building :) ) and burn them to cd every now
> > > and then. It's about 70 gigs of data right now.
>
> this is something like what I've done for one client in the past. it's a
> good and workable scheme. just keep in mind (and I think you have it) that
> you need *historical* backups as well as replication. you can have
> differential historical backups on disk (like rsync-backup uses); but if you
> want to take it offsite, something more durable than a disk is desirable.
> that's what tape is still good for (still the cheapest alternative for
> short-term reliable offsite backup).
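
For what it's worth, here is a minimal sketch of the hardlink trick behind
"differential historical backups on disk" as described for rsync-backup
above. This is not rsync-backup's actual code. The function name `snapshot`
and the day1/day2 directory layout are made up for illustration, and it
assumes every snapshot lives on the same filesystem (hardlinks cannot cross
filesystems) and that the destination directory does not exist yet.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Copy `source` into `dest`, hardlinking files unchanged since `previous`.

    Hypothetical helper for illustration only.
    Returns a (linked, copied) tuple of file counts.
    """
    linked = copied = 0
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(dest, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(dest, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            # shallow=False compares file contents, not just size and mtime.
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged: share the data with the previous snapshot.
                os.link(old, dst)
                linked += 1
            else:
                # New or changed: store a fresh copy (keeps timestamps).
                shutil.copy2(src, dst)
                copied += 1
    return linked, copied


# Example usage (paths are placeholders):
#   snapshot("/home", "/backup/day1")
#   snapshot("/home", "/backup/day2", previous="/backup/day1")
```

Each day's directory browses like a full copy, but only changed files take up
new disk space. Deleting an old day's directory is safe, since a file's data
stays around until its last hardlink is removed.
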
>
> then again, if you only do offsite backups once a week, and want them for
> archival purposes, it may be worthwhile to get a DVD autoloader and just
> burn yourself a stack of DVDs.
>
> > > 2. Put tapes on each machine, get lots of tapes.
>
> this is really expensive, considering how much tape drives cost, relative to
> the price of a computer now. it's very convenient tho. possibly worthwhile
> for centralized servers at remote (netwise) locations.
>
> > > 3. Get a nicer tapedrive that can back up several machines on one tape
>
> considering the rate at which disk drives are growing (which makes people
> sloppy about what they put on disk, which means the drives fill up); this
> is becoming less and less viable.
>
> > > are there other options that we should look at?
>
> I think rewritable optical media will be the future of backups; but I don't
> know if the big backup tool vendors are adding that capability into their
> systems. I think we'll need the next generation of media (50-90GB disks)
> before it becomes really viable for smaller operations. certainly Plasmon is
> doing it right now; but their solutions are very expensive. (albeit very
> fast and reliable, and with write-once media, largely tamperproof, which has
> its advantages in some businesses).
>
> Carl Soderstrom.
>
_______________________________________________
TCLUG Mailing List - Minneapolis/St. Paul, Minnesota
http://www.mn-linux.org tclug-list at mn-linux.org
https://mailman.real-time.com/mailman/listinfo/tclug-list