SnapperTalk

June 4th, 2008

Archiving photos

Posted by Ben in Gear, Imaging, Macintosh, Software

Archiving photos is a tedious, time-consuming experience and the transition from storing negatives to digital files on CD, DVD, or hard drives hasn’t really improved matters all that much. On the other hand, losing images because they weren’t archived properly is even worse and the potential for losing large numbers of images is arguably even greater with digital.
Many friends and colleagues ask me how I archive my photos and what I recommend as a backup solution, so I wrote this post to illustrate the strategy I use:

Image sorting structure

I keep every image shot irrespective of what the subject is, because you can never know what will be important in the future. I also find going through and deleting so-called “unnecessary” photos just takes too much time.
Every card gets copied to my laptop using Photo Mechanic’s Ingest function, which is set up to automatically sort the photos into a hierarchical folder structure according to year, month, date, and card – for example:

HD > Archive > 2008 > 2008-04 April > 20080401 > card-01

I then use the same structure to save all the photos which I actually transmit, e.g.:

HD > Transmit > 2008 > 2008-06 June > 20080612 > card-04

Having a clear structure like this is not just to make it easy and quick to find photos – it is also designed to reduce the chances of accidental deletion/overwriting.
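
For anyone who wants to reproduce this folder layout without Photo Mechanic, here is a minimal Python sketch that builds the same year > month > date > card path. The function name `card_folder` and the root folder name are hypothetical, chosen only for illustration; this is not how Photo Mechanic itself works internally.

```python
from datetime import date
from pathlib import PurePosixPath

# Month names are spelled out here so the result does not depend on the system locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def card_folder(root: str, shot_on: date, card_number: int) -> PurePosixPath:
    """Build root/YYYY/YYYY-MM Month/YYYYMMDD/card-NN for one memory card."""
    year = f"{shot_on.year:04d}"
    month = f"{year}-{shot_on.month:02d} {MONTHS[shot_on.month - 1]}"
    day = f"{shot_on:%Y%m%d}"
    return PurePosixPath(root, year, month, day, f"card-{card_number:02d}")

print(card_folder("Archive", date(2008, 4, 1), 1))
# Archive/2008/2008-04 April/20080401/card-01
```

Because every level sorts correctly by name, a plain alphabetical folder listing is also a chronological one.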

Primary Photo Archive

Every week or two I manually copy the new images in these folders to my primary photo archive hard drive. Thanks to the strict folder structure, it’s easy to see what has already been copied. You may be wondering why I don’t use some kind of backup script to automate the process. Well, I’ve found automating this process to be just complex enough to run the risk of accidentally deleting or overwriting existing images, and so I feel safer doing it manually.
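
For readers who would still rather script this step, the main risk described above (overwriting or deleting existing images) can be reduced by writing a copy routine that only ever adds files. The following is a minimal Python sketch under that assumption; `copy_new_files` is a hypothetical helper, not part of any tool mentioned in this post, and it should be tried on test folders first.

```python
import shutil
from pathlib import Path

def copy_new_files(source: Path, dest: Path) -> list[Path]:
    """Copy files from source into dest, never overwriting or deleting anything.

    Returns the relative paths of the files that were actually copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        target = dest / rel
        if target.exists():
            # Already in the archive: leave it untouched, even if it differs.
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, target)  # copy2 also preserves timestamps
        copied.append(rel)
    return copied
```

A call such as `copy_new_files(Path("/path/to/laptop/Archive"), Path("/path/to/raid/Archive"))` (both paths are placeholders) would add only the missing files. Note that a file which exists in both places but differs is skipped silently, so this sketch is additive only and does not detect corruption.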

For my primary photo archive I use the Stardom SR3610 (model# SR3610-2S-SB2) external RAID-1 hard drive enclosure shown above (read a review here) which has both eSATA and USB 2.0 connections (or there’s a Firewire/USB unit if you prefer that – model# SR3610-2S-WBC).
I chose this model because it’s relatively affordable, has a hardware RAID-1 controller, hot swappable drive trays, eSATA for speed and USB 2.0 for compatibility, and most importantly has excellent cooling due to the temperature-regulated internal fans monitored by the controller. I am currently using 2 x 500GB 7200rpm Samsung SATA drives in a RAID-1 configuration, giving me 500GB of useable space. I chose Samsung drives because they tend to be cool-running and, in my opinion, very reliable. That said, these are almost full and I am looking to upgrade them to 1TB Samsung F1 drives.
In my opinion, particularly for those living in hot climates, overheating is a significant cause of hard drive failure so having a well-cooled enclosure is critical.
In addition, such countries often have irregular mains power, so all my hardware is connected to this 1500VA APC UPS unit to protect against data loss caused by power outages/blackouts, brownouts, overvoltage, surges, and line noise.
All my drives are formatted with the “Mac OS Extended (Journaled)” filesystem (now the system default) because this gives journalling protection should the system shut down in the middle of copying files.

RAID is not the same as a Backup

There’s a widespread misconception that RAID is all you need to protect your data – it is not. RAID-1 will protect you against DISK FAILURE of a single drive – if one drive dies your data is safe because the other is always a perfect mirror. That’s all fine and necessary, but disk failure is not the only cause of data loss. User error (e.g. accidentally deleting files or folders, or backing up an “old” folder over a “new” folder), applications corrupting the files themselves, and so on can also result in the loss of your images, and with any RAID system these errors will be replicated on all drives.
Therefore you also need a separate backup archive.

Secondary/Backup photo archive

One of the reasons I chose the SR3610 is that the drive trays are hot-swappable and interchangeable with other Stardom enclosures.

So for my backup photo archive I chose the Stardom iTank i302 shown above (read a review here) – a non-RAID single-SATA-drive enclosure with eSATA and USB 2.0 connections, which uses the exact same drive trays as the SR3610, so I can quickly swap drives around if I want to. Inside the enclosure I use the same model Samsung 500GB drive.

Maintaining the Backup

So we have a RAID-1-protected primary archive, and a secondary backup archive on a single disk – what is the best way to ensure that the first gets properly backed up to the second?
If software cost isn’t an issue I’d recommend using Retrospect. This is one of the best backup applications and is available for Macintosh and Windows. Initially it may seem hard to configure, but with a little time you’ll realise just how powerful it is.

What I would recommend is setting up a “Duplicate” script that clones the RAID volume to the secondary drive. This will compare all the files on the primary archive with the secondary one and then copy over just those images that have been added since the last backup (Note: and also delete any files on the secondary drive that are no longer present on the primary drive).
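
To make the behaviour of such a duplicate (mirror) operation concrete, here is a minimal Python sketch that only works out what a mirror would copy and what it would delete, without touching any files. It is an illustration, not Retrospect's implementation: `plan_duplicate` is a hypothetical name, and it compares files by relative path only, whereas a real backup tool also detects files that have changed.

```python
from pathlib import Path

def plan_duplicate(primary: Path, secondary: Path) -> tuple[list[Path], list[Path]]:
    """Return (to_copy, to_delete) needed to make secondary mirror primary.

    Dry run only: nothing is copied or deleted by this function.
    """
    primary_files = {p.relative_to(primary) for p in primary.rglob("*") if p.is_file()}
    secondary_files = {p.relative_to(secondary) for p in secondary.rglob("*") if p.is_file()}
    to_copy = sorted(primary_files - secondary_files)    # new since the last backup
    to_delete = sorted(secondary_files - primary_files)  # no longer on the primary
    return to_copy, to_delete
```

Reviewing the `to_delete` list before running a real mirror is worthwhile: if files were removed from the primary archive by accident, a duplicate operation will remove them from the backup too.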
Why not just use any old backup program to do this? Retrospect has an important option called “byte-by-byte-verification” which, once it has finished copying, goes back and compares every 1 and 0 between the original and copied files to ensure it has made a perfect copy with no file corruption. Many other backup programs verify the copy made, but most do this simply by looking at the date-stamp and size of the file – which confirms the file has been copied but does not confirm that no corruption has occurred.
The only disadvantage of this method is it takes almost twice as long – but is probably worth it for peace of mind.
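
The difference between the two kinds of verification described above can be shown with a short Python sketch. Both function names are hypothetical and this is not how Retrospect is implemented; it simply contrasts a date-stamp-and-size check with a full content comparison.

```python
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # read 1 MB at a time so large image files fit in memory

def same_size_and_date(original: Path, copy: Path) -> bool:
    """Quick check: compares only file size and modification time (to the second)."""
    a, b = original.stat(), copy.stat()
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)

def same_bytes(original: Path, copy: Path) -> bool:
    """Full check: reads both files and compares every byte."""
    if original.stat().st_size != copy.stat().st_size:
        return False
    with open(original, "rb") as fa, open(copy, "rb") as fb:
        while True:
            chunk_a = fa.read(CHUNK_SIZE)
            chunk_b = fb.read(CHUNK_SIZE)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached the end with no difference
                return True
```

A copy with a single corrupted byte but an identical size and date-stamp passes `same_size_and_date` and fails `same_bytes`. The full check has to read both files completely, which is why this kind of verification takes so much longer. Python's standard library offers a similar content comparison via `filecmp.cmp(a, b, shallow=False)`.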

Other respected backup applications on Mac OS X that are worth considering:

Carbon Copy Cloner: This is an excellent and free (donations are welcomed) utility for Mac OS X that lets you create a bit-for-bit copy of one volume or folder to another volume or folder.

SuperDuper: Another very well-respected general backup application.

For further Mac OS X backup applications look here.

What about Apple’s Time Machine? This is attractive because it keeps multiple versions of files if they do change, but in my mind isn’t quite ready for critical backups because of the lack of user-control over how it operates.

Improvements

If your level of data-loss paranoia is somewhat higher, you could use another SR3610 as the secondary drive instead of the iTank, providing increased protection against drive failure.

Another step that I would definitely recommend is to make an extra copy of all data on either a hard drive or DVDs and leave it in another physical location. This is really easy to accomplish with the above hardware because you can easily add more drives to the setup simply by purchasing extra drive trays as shown above.

Other storage ideas

Network Attached Storage (NAS) of different RAID varieties, though particularly RAID-5, is an increasingly popular option for data storage. Personally I’m not yet convinced of the reliability of large-volume data-transfer over ethernet, and I don’t need 24/7 network availability of my photos, so I continue to prefer direct-attached storage.

If you do like the idea of accessing your entire library 24/7 – including remotely – then a NAS such as those made by Netgear, QNAP, and Synology might be for you. For great comprehensive info on different NAS models go to the SmallNetBuilder site.

In the future I think the ZFS file-system will become the preferred option for photo archiving as it has some really impressive next-generation features, and Apple says ZFS read/write support will be included in the next version 10.6 of OS X Server codenamed “Snow Leopard” – but at the moment it is way too experimental on OS X for real-world use.

Conclusion

This setup works nicely for me, but your needs may vary – particularly if you have many terabytes of data to protect. In that case you are going to want to have multiple sets of drives that you rotate out as they get filled up – which you could do with the above hardware by purchasing more drive trays. Or, you’re going to have to start looking at a multi-drive RAID-5 setup.
So far I’ve avoided RAID-5 because I prefer the simplicity of RAID-1. If a drive dies in a RAID-5 setup the RAID controller should be able to rebuild the volume, but if for instance the RAID controller itself dies, rebuilding is not so easy a task. With RAID-1 you can just remove the good drive and connect it directly to the computer – this I see as a very great advantage for the average user.

RAID drives

Well, if you’ve made it this far (!) I guess you are somewhat interested in this subject, so you might want to take a look at this “RAID for photographers” article I wrote back in 2006 which has some more in-depth information.

If you use a backup system that has advantages over this in terms of either workflow or hardware, or have any other photo archiving tips, please share your experiences by leaving a comment below.
