My recent ramblings through trying to back up and recover my digital data (read: Amazon Glacier has been a serious pain in the ass!) have had me thinking about the relative ease, or not, of managing these kinds of data. One of the most pleasant and powerful software programs I’ve ever used is Calibre for eBook management. It has to be one of the most active open source projects going, certainly the most active I’ve ever seen. I swear, every time I launch the app, there’s a prompt for me to go download the latest release or update some of the myriad plugins. Like almost all great software, it started with one guy, Kovid Goyal, who is a freaking hero as far as I’m concerned. (His personal web site might be a little weird, but holy crap does the guy know how to write desktop software.) Calibre has grown to a pretty active community of contributors but managed to keep from becoming frankencode – at least on the surface.

Now, if we could just tune up Calibre to manage all digital file-based data and metadata, we’d be really cooking with gas! And photo management is where I’d start. Pretty much everything I’ve seen for managing photo files once they come off the device just really sucks in one way or another. All the actual apps, and especially the big popular guys like Apple iPhoto/Photos and Picasa, manage to just screw things up royally in terms of the actual data (the digital photos themselves). I have discovered one hero in this world, and that’s Phil Harvey, who wrote exiftool. The great thing about just about all the binary image formats is that they can take lots of metadata right in the file, so it goes along with the image wherever it travels. That is, until it travels into one of the many photo management apps out there that do stupid things like make a complete hash out of that wonderful metadata that might have been captured at the point of creation.

So, I’ve been playing around with how I can use exiftool and a few other methods to try and clean up the massive clusterf–k that is my photo collection. (Well, it’s actually more of my wife’s photo collection since she takes more pics around here than I do.) I won’t go into the long drawn out details of what I’m doing, but I will offer a few observations. I’ve decided that I really just need to know three things about our photos in order to lay them out really nicely for presentation and reminiscence, to discover important photos for some purpose, and to keep them well organized for the long haul. If I could just get time, place, and a flexible topic on everything, that would be fantastic. It’s a little confusing wading through the non-standard that is EXIF metadata and the various other attempts and community conventions, but these three concepts are pretty well covered in a few different fields of embedded metadata.
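To make the time, place, and topic idea a little more concrete, here is a rough Python sketch of checking which of the three a photo already carries. It assumes the metadata has been dumped to JSON first (something like `exiftool -json` will do that), and the tag names I check are just my guess at the most common homes for each concept, not an exhaustive list. The `CONCEPT_TAGS` table and the `coverage` function are made up for illustration.

```python
# Assumed mapping from each concept to tag names that commonly hold it.
# This is an illustrative guess, not a complete or authoritative list.
CONCEPT_TAGS = {
    "time": ["DateTimeOriginal", "CreateDate"],
    "place": ["GPSLatitude", "GPSLongitude"],
    "topic": ["Keywords", "Subject", "Description"],
}


def _has_value(record, tag):
    """True if the tag is present and not empty (a value of 0 still counts)."""
    value = record.get(tag)
    return value is not None and value != "" and value != []


def coverage(record):
    """Report which of time, place, and topic one metadata record covers."""
    result = {}
    for concept, tags in CONCEPT_TAGS.items():
        if concept == "place":
            # A point location needs both latitude and longitude.
            result[concept] = all(_has_value(record, t) for t in tags)
        else:
            # Any one of the candidate tags is enough for time or topic.
            result[concept] = any(_has_value(record, t) for t in tags)
    return result


# Hypothetical record for a photo that has a timestamp but only half a location.
sample = {
    "SourceFile": "a.jpg",
    "DateTimeOriginal": "2015:06:14 09:30:05",
    "GPSLatitude": 39.5,
}
print(coverage(sample))
```

Running something like this over a whole collection would at least tell me which photos still need work on which of the three concepts.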

It’s really sweet when everything works at the image creation end and our devices give us point location and time. Place is getting better and better as well with attributes that tell me what direction I was facing and the basic extent of the viewshed covered in the image. Time is pretty straightforward as long as we don’t do something to strip it out of there. I noticed when I tried out Google’s new “unlimited” photo upload deal that they create a new downscaled image and don’t transfer over the creation information that you’d want to keep. They hold onto the point location, but they lose other parts of the spatial information.

I think we’re still lacking the topical context getting embedded from the beginning, but I’m sure someone is working on it and maybe already has it solved. With a little bit of voice interaction now, I should be able to immediately tag a photo with a few keywords spoken to my device to record the significance of whatever I’m snapping a picture of. We should also be able to do some interesting proximity connections to geographic features or interactions with other people’s devices to let me know who’s around (e.g., your device just told my device that I was probably taking a picture of you because the camera was aimed your way when it went off). Ultimately, there should be so many registered things in the proximity of an image being captured that we’ll need to filter down to those we care about and might key in on when taking a picture. And even that ought to have a leg up by capturing my habits and propensities through time to narrow the field (e.g., Sky does a lot of camping, hiking, and skiing so is more likely to care about natural features in the area instead of anthropogenic stuff). And a lot should happen through the context whereby an image is captured, so if I take a picture as part of sending a text to someone or through some other app, there should be all kinds of information about that context in terms of people and subjects that could be inferred.

For now, I’m at least getting away from all the crazy and totally unsustainable methods of organizing our photo collection. I’m relying on place and time being captured up front from most of our devices, with the exception of the older SLR that still takes the best images when we get the settings right and are using the good lenses. I’m running through a bunch of scripts to throw a little bit of topical metadata into descriptions/captions, keywords, and location tags using exiftool and the convoluted set of folders and albums we had in various venues. All photos are going into one big directory after being named according to their creation date/time or a numeric sequence where that information has been lost. From there, I know I can rebuild any kind of folder structure I might need to, but I’m hoping I can find some user interface software that will honor all the hard work of embedding good metadata. The DSPhoto app that the Synology folks put together is really one of the best in terms of place and time. I’m not sure the cloud offerings from Amazon (and certainly not Google) have anything going for them other than “unlimited space for your photos,” so I’m going to have to keep poking around till I find the right combination.
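For the curious, here is a minimal Python sketch of the naming idea, not the actual scripts I’m running. It assumes the creation time arrives as an EXIF-style string like `2015:06:14 09:30:05` and falls back to a numeric sequence when that’s missing or unreadable. The function names, the `undated_` prefix, and the file name pattern are all made up for illustration. The second helper only builds an argument list for exiftool using its `-keywords+=` form for appending to the keyword list; it doesn’t run anything.

```python
from datetime import datetime


def dated_name(date_str, extension, sequence):
    """Name a file from its creation timestamp, or from a sequence number if
    the timestamp is missing or unparseable."""
    try:
        stamp = datetime.strptime(date_str or "", "%Y:%m:%d %H:%M:%S")
    except ValueError:
        # Creation time has been lost: fall back to a zero-padded sequence.
        return f"undated_{sequence:06d}{extension.lower()}"
    return f"{stamp:%Y%m%d_%H%M%S}{extension.lower()}"


def keyword_args(path, keywords):
    """Build (but do not run) an exiftool argument list that appends keywords."""
    return ["exiftool"] + [f"-keywords+={k}" for k in keywords] + [path]


print(dated_name("2015:06:14 09:30:05", ".JPG", 1))
print(dated_name(None, ".jpg", 42))
print(keyword_args("20150614_093005.jpg", ["camping", "Colorado"]))
```

One thing this sketch ignores is two photos taken in the same second, which would collide on the same name; a real script would need a suffix or some similar tie-breaker. It also skips timestamps that carry sub-seconds or time zone offsets.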

But I think I’ve landed on the only really sustainable way to try and manage the flow of digital imagery into our household that ultimately captures an incredible amount of value about the flow of our lives. If we capture, embed, and retain at least time and place, that is often all we need to go on. If I can also come up with a way to add in topics of interest through the lifetime of browsing and using the images, then that will be great too. I’m intrigued by the face recognition software in that regard. Some of it’s great, and some of it sucks (or maybe I just don’t want to admit that I sometimes look like my mother-in-law). People are certainly one interesting specific topic that we look for in our photos. But I haven’t found one of those yet that actually embeds that information into the image files. It seems like we should be storing names in the image files and then indexing that out to a registry that might include some interesting features like linking to an authority for people names that is pertinent to “my registry.”
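Here is a rough Python sketch of what that “names in the files, indexed out to a registry” idea could look like. It assumes the names have already been embedded in a tag (I’m using `PersonInImage` from the IPTC extension as a plausible candidate) and that the metadata has been dumped to JSON records carrying a `SourceFile` key, the way `exiftool -json` does. The `build_people_index` function is hypothetical, and the authority-linking part is left out entirely.

```python
from collections import defaultdict


def build_people_index(records, tag="PersonInImage"):
    """Invert per-file name lists into a name -> sorted list of files registry."""
    index = defaultdict(set)
    for record in records:
        names = record.get(tag, [])
        # A single name may come through as a bare string rather than a list.
        if isinstance(names, str):
            names = [names]
        for name in names:
            index[name.strip()].add(record["SourceFile"])
    return {name: sorted(files) for name, files in index.items()}


# Hypothetical records; the third photo has no people tagged at all.
records = [
    {"SourceFile": "a.jpg", "PersonInImage": ["Ann", "Bob"]},
    {"SourceFile": "b.jpg", "PersonInImage": "Ann"},
    {"SourceFile": "c.jpg"},
]
print(build_people_index(records))
```

The point of the design is that the image files stay the source of truth and the registry can be thrown away and rebuilt from them at any time.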

I’ll try to come back to this and write about what I figure out on next steps.