Saturday, July 25, 2009

Using Bridge to share an Image Library

To recap the situation:

The goal is to share a large and growing library of more than 107,000 images among a small group of designers and editors.

I spent some three months creating the central library of some 434 gigabytes on a hard drive, which was then moved to a 2-terabyte mirrored RAID drive. It will remain there until the client migrates to a full-blown digital asset management system in 6 months or a year.

All the images are tagged with metadata and properly keyworded and can be readily and easily accessed by multiple users.

So what's the challenge now?

"Searchability." That is, can multiple users search the library using the existing metadata and our keywording strategy?

The short answer is a (very) qualified "yes."

Very serious tip: indexing is the key to searchability and the cache is the key to indexing.

Recent tests have shown us that sharing one large central cache that would be regularly updated by the image librarian is NOT workable. Cache structures differ across platforms and across versions of Bridge. And it seems that Bridge enjoys the best performance with a local cache readily at hand.

The first time Bridge accesses a folder of images it creates the necessary metadata, thumbs and previews and stores all this information in the cache for future use. The next time you go to the folder, Bridge can easily find that information, so there is little waiting around for the thumbs to load and the metadata is readily available for searching through the folder. If all the folders in a given "superfolder" have been indexed, then Bridge can quickly search across all the subfolders.

The root problem lies with Adobe’s indexing feature. Or rather it lies with Bridge trying to index not only hundreds but thousands (or tens of thousands) of image files at one time.

Now match this with existing folder structures. In our case I had originally created dozens of folders and subfolders. We had literally thousands of duplicate images that needed to be weeded out in the early phases of creating the library; and in Bridge it was not possible to move thousands of image files around without putting stress on the software.

With an image library of 10,000 images, for example, indexing does not present much of a challenge.

But with tens of thousands of image files Bridge’s indexing ability is really put to the test.

Through a series of tests run during the past week it became clear that Bridge's indexing threshold is somewhere between 2500 and 8000 images. Our fastest machines running the latest version of Bridge simply could not index 8250 image files; indexing ground to a halt with about 1400 (+/-) remaining. Without more systematic testing we will never know where that exact threshold lies -- and may never know considering the variations that exist across platforms and versions.

Conclusion no. 1: Do NOT expect Bridge to index one large superfolder of images.

So that means you have to create some number of folders in the library. In our case the existing library has 31 top-level folders with probably three times as many subfolders.

But in order to index all the images and metadata each user has to manually go in and index each folder, sub-folder and sub-sub-folder.

Not reasonable.

Conclusion no. 2: Create a new folder structure that is manageable and workable for the intermediate term.
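Before restructuring, it helps to know which folders are already too big. What follows is only a rough sketch in Python, not anything Bridge itself provides: it walks the library and lists the folders holding more image files than a chosen limit. The limit of 2500 is simply the low end of the range we saw in testing, and the extension list, the function names and the library path in the usage line are all assumptions to adjust.

```python
import os

# Assumed set of image extensions; adjust to match the actual library.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".psd", ".png"}


def count_images_per_folder(library_root):
    """Return {folder_path: number of image files directly inside that folder}."""
    counts = {}
    for folder, _subfolders, files in os.walk(library_root):
        counts[folder] = sum(
            1 for name in files
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts


def folders_over_limit(library_root, limit=2500):
    """List (folder, count) pairs whose image count exceeds limit, largest first.

    The default limit is an assumption taken from the low end of the
    2500-8000 range observed in our tests, not a documented Bridge value.
    """
    counts = count_images_per_folder(library_root)
    over = [(folder, n) for folder, n in counts.items() if n > limit]
    return sorted(over, key=lambda pair: pair[1], reverse=True)
```

Calling folders_over_limit("/Volumes/ImageLibrary") (a made-up path) would give the list of folders worth splitting up first. Counts are per folder, not cumulative, since each folder still has to be indexed on its own.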

But wait a minute, how does this solve the indexing issue?

For one thing it helps to streamline the existing folders so they are more easily indexable by each user. But – and here’s the good part. . .

Very, Very Important: Remember when we tried to share one central cache among multiple machines? We learned that didn't work, and in fact created quite an indexing mess. But you can copy cache files from one machine to another, at least among Macs, and this will make indexing much easier and faster for multiple users.

This is how it works:

1. Using the "main library cache" on the primary (image librarian's) machine, all the library files are indexed on this one machine, folder by folder.

2. Once indexed, the cache is copied over to a second machine.

3. On the second machine, launch Bridge and navigate to a (previously unaccessed/unindexed) folder of images. The thumbs will load almost instantly and the metadata is readily available for searching.

4. Repeat on other machines.
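The copy in step 2 can be done by hand in the Finder, but for several machines a small script saves time. This is a minimal sketch in Python under a few assumptions: both paths are placeholders that you supply (check where each machine actually keeps its Bridge cache), the function name is made up, and Bridge is presumably best closed on the receiving machine while the copy runs. Any cache already at the destination is moved aside with a ".bak" suffix, not deleted.

```python
import shutil
from pathlib import Path


def copy_cache(source_cache, target_cache):
    """Copy the librarian's cache folder to another machine's cache location.

    Both arguments are placeholder paths supplied by the caller. An existing
    folder at target_cache is renamed with a ".bak" suffix before the copy.
    """
    source = Path(source_cache)
    target = Path(target_cache)
    if not source.is_dir():
        raise FileNotFoundError(f"No cache folder at {source}")

    if target.exists():
        backup = target.with_name(target.name + ".bak")
        if backup.exists():
            # Keep only the most recent backup.
            shutil.rmtree(backup)
        target.rename(backup)

    # Copy the whole cache tree, subfolders included.
    shutil.copytree(source, target)
    return target
```

Keeping the old cache as a backup means a machine can be put back the way it was if the copied cache misbehaves, which matters given that we have only tried this on Macs running CS4.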

Note: this has only been tested on Macs and only on CS4. I tried two sets of folders, one with 634 image files and another with 1500 files. The first folder of thumbs and metadata loaded almost at once; the second folder took a few seconds longer loading the thumbs.

While this does not completely resolve the indexing problem, it does appear to eliminate the need for each new machine to create all the thumbs, previews and metadata files from scratch. And the other users do NOT have to manually access each folder the first time to index all those files. That's a step forward for us.

But what happens when you add images to the library?

Our plan is simply to add new folders -- which we would have to do anyway from the librarian's end -- and then have the other users index those new folders regularly. It's another byte to add to the workflow but indexing a smaller number of files (say a few hundred or so every week) is manageable.
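To tell the other users which folders are new each week, one option is to keep a simple list of the folders that existed at the last check and compare it with the library as it stands now. The sketch below, in Python, does that with a small JSON file. That file is our own bookkeeping, not anything Bridge reads or writes, and the function names and paths are assumptions.

```python
import json
import os


def list_folders(library_root):
    """Return the sorted relative paths of every folder under library_root."""
    found = []
    for folder, _subfolders, _files in os.walk(library_root):
        rel = os.path.relpath(folder, library_root)
        if rel != ".":
            found.append(rel)
    return sorted(found)


def new_folders_since(library_root, manifest_path):
    """Return folders added since the manifest was last written.

    The manifest is a hypothetical JSON list kept outside Bridge. It is
    rewritten on every call, so the next run reports only newer folders.
    """
    try:
        with open(manifest_path) as handle:
            known = set(json.load(handle))
    except FileNotFoundError:
        # First run: nothing recorded yet, so every folder counts as new.
        known = set()

    current = list_folders(library_root)
    added = [folder for folder in current if folder not in known]

    with open(manifest_path, "w") as handle:
        json.dump(current, handle, indent=2)
    return added
```

Run once a week, this gives the short list of folders each user needs to open in Bridge. It only notices new folders, not new files dropped into old folders, which fits the plan of always adding images in new folders.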

Caution: Please bear in mind that this recommendation is still in the preliminary phase. We'll know if there are any serious limitations in the next week or so as we try this in real life across several machines.

And we must always, always remember that Bridge is really not designed for any of what we are trying to do.

Thanks to the people on the Adobe message boards for bouncing ideas back and forth, and especially to Ramon Castaneda for clarifying the whole cache business.

And thanks to my colleague Mark Baker. Mark took an incredible amount of time away from his own pressing projects to help me work through these issues. He helped me take a more systematic approach to the problems and in particular to the testing. Thanks, Mark.
