Welcome to the world of collecting, collating, cataloging and tagging digital photographs for the Image Library at Johnson & Wales University
Monday, October 19, 2009
Central image library update
And speaking of re-indexing: based on what I've seen so far, I urge each person in our department to develop a plan to re-index the library continually -- doing so seems to have removed a few obstacles to fully successful searches.
We have seven designers and four writers accessing and searching the library.
The one thing that will take time is for each person to develop his or her own searching strategy -- in other words, what variation of search criteria works best for any given search. This continues to evolve since we will undoubtedly be adding to and modifying the keyword list over the coming months.
Sunday, September 13, 2009
Progress on Sharing Images in a Central Library using Bridge
The remainder of the group's design teams, for both print and web, will finish indexing this week and begin searching the library.
I just added a new folder of more than 2k images this last Friday. After I informed those designers who had already finished indexing, they just went in and a few moments later had indexed the new set of image files -- again, no problem. So adding new folders of image files seems to present little if any challenge.
The next month or so will see virtually everyone in Creative Services with access to the library. The next challenge will be fine-tuning the keyword strategy.
Another ongoing set of issues surrounds the varying levels of competency with computers and with the specific OS, and the varying ability to pick up new software tools.
Wednesday, September 2, 2009
Important update on image file sharing!
My initial idea about copying the cache files to each machine's Bridge program is now dubious. On my second testing machine (a Mac Pro tower), copying the cache files did, on the face of it, seem to expedite the process of indexing the many folders in the library. However, I observed that Bridge was creating a duplicate folder in both the 256 and 1024 folders (thumb and preview folders respectively) and also apparently duplicating the BridgeStor files as well.
Naturally this would double the already huge cache size.
Perhaps it was a phenomenon unique to that particular machine but at this point I don't have the time -- or the inclination -- to keep testing to see whether we can in fact share cache files.
We do know that once each member of the group accesses the library they can begin the process of indexing via their own machine/Bridge and ultimately be ready to search the library.
So, what I've opted to do is to simply provide the path to the new central library and then have each designer begin indexing the folders locally. This way it should keep each person's cache clean -- or at least as clean as possible.
Saturday, August 22, 2009
Sharing image files on a network using Adobe Bridge
Here's what I've discovered over the past several weeks of testing:
1. Trying to share a centralized cache is quite out of the question. But, as it turns out, it appears that one can share cache files across multiple computers, provided these are Macs using CS4 of course. This is not the best alternative, but short of going with a cataloging system -- not always a choice for some organizations -- this is perhaps the only option available.
2. I indexed our image library on my laptop, which took the better part of a day. Organizationally it consists of 56 folders, containing some 104,000 images for more than 434 gigabytes of space.
3. I then copied the cache (and all files) to an external hard drive.
What I copied was on the following path: user>library>caches>Adobe>Bridge CS4>Cache
4. I then went to a second computer -- networked to connect to the central image library drive -- and before launching Bridge I deleted the old cache file, and then copied the new cache file to the second computer. I then launched Bridge, navigated to the image library in the folders panel, and clicked on several folders. One by one, all the thumbs and previews loaded quickly and the metadata was right there as well, allowing for fairly quick and easy searching. And for us "searchability" is the KEY to having a central image library.
5. To add images -- and here's where it gets a bit clunky -- I intend to add subsequent folders as new images come in to be cataloged. I will then inform the designers that a new folder has been uploaded to the library -- they already have the path -- and that they will need to index that folder to make the images "searchable."
Hopefully this will work on all our designers' machines as well. I'll know more by the end of next week.
BUT, there are a couple of things to consider:
1. Once a folder is indexed it cannot be moved nor can any of the files inside that folder be moved or manipulated in any way. For example, if you returned to a file to add additional metadata, then each user would have to reindex that particular folder on their local copy of Bridge. Having said that, it's not terribly difficult or challenging. In our case, since the image library is permanent once these folders are indexed and "accessed" on each machine that's it. It's the initial indexing that can take so much time.
2. Make sure that permissions are in place before you go live. In our situation only my director and I have read/write permissions. All others have read-only access. This allows anyone to pull an image out of the library and use it as they see fit, once the copy is on their desktop of course. They simply cannot make any changes, additions or alterations in the existing folders and files ON THE LIBRARY DRIVE.
That's it for the moment.
Stay tuned.
Saturday, July 25, 2009
Using Bridge to share an Image Library
To recap the situation.
The goal is to share a large and growing library of more than 107,000 images among a small group of designers and editors.
I spent some three months creating the central library of some 434 gigabytes on a hard drive, which was then moved to a 2-terabyte mirrored RAID drive. It will remain there until the client migrates to a full-blown digital asset management system in 6 months or a year.
All the images are tagged with metadata and properly keyworded and can be readily and easily accessed by multiple users.
So what's the challenge now?
"Searchability." That is, can multiple users search the library using the existing metadata and our keywording strategy?
The short answer is a (very) qualified "yes."
Very serious tip: indexing is the key to searchability and the cache is the key to indexing.
Recent tests have shown us that sharing one large central cache that would be regularly updated by the image librarian is NOT workable. Cache structures differ across platforms and across versions of Bridge. And it seems that Bridge enjoys the best performance with a local cache readily at hand.
The first time Bridge accesses a folder of images it creates the necessary metadata, thumbs and previews and stores all this information in the cache for future use. That way the next time you go to the folder, Bridge can easily find the metadata, there is little waiting around for the thumbs to load, and the metadata is readily available to allow searching through the folder. If all the folders in a given "superfolder" have been indexed, then Bridge can quickly search across all the subfolders.
The root problem lies with Adobe’s indexing feature. Or rather it lies with Bridge trying to index not only hundreds but thousands (or tens of thousands) of image files at one time.
Now match this with existing folder structures. In our case I had originally created dozens of folders and subfolders. We had literally thousands of duplicate images that needed to be weeded out in the early phases of creating the library; and in Bridge it was not possible to move thousands of image files around without putting stress on the software.
With an image library of 10,000 images, for example, indexing does not present much of a challenge.
But with tens of thousands of image files Bridge’s indexing ability is really put to the test.
Through a series of tests run during the past week it became clear that Bridge's indexing threshold is somewhere between 2500 and 8000 images. Our fastest machines running the latest version of Bridge simply could not index 8250 image files; Bridge ground to a halt with about 1400 (-/+) files remaining. Without more systematic testing we will never know where that exact threshold lies -- and may never know considering the variations that exist across platforms and versions.
Conclusion no. 1: Do NOT expect Bridge to index one large superfolder of images.
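A quick way to see which folders are likely to give Bridge trouble is to count the image files in each one before anybody tries to index them. What follows is only a minimal Python sketch, not anything Bridge provides: the function names, the list of file extensions and the 2500-file limit are my own assumptions (the limit is simply the low end of the range we observed in testing).

```python
import os

# Assumed values for illustration only: adjust the extensions and the limit
# to match your own library and your own test results.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".psd", ".png"}


def count_images(folder):
    """Count image files directly inside `folder` (subfolders are not included)."""
    return sum(
        1
        for entry in os.scandir(folder)
        if entry.is_file()
        and os.path.splitext(entry.name)[1].lower() in IMAGE_EXTENSIONS
    )


def oversized_folders(library_root, limit=2500):
    """Return sorted (folder, count) pairs for folders holding more than `limit` images."""
    flagged = []
    for dirpath, _dirnames, _filenames in os.walk(library_root):
        n = count_images(dirpath)
        if n > limit:
            flagged.append((dirpath, n))
    return sorted(flagged)
```

Calling `oversized_folders` with the path to the library drive returns the folders worth splitting up. The script only reads the drive, so it is safe to run against a read-only library.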
So that means you have to create some number of folders in the library. In our case the existing library has 31 top-level folders with probably three times as many subfolders.
But in order to index all the images and metadata each user has to manually go in and index each folder, sub-folder and sub-sub-folder.
Not reasonable.
Conclusion no. 2: Create a new folder structure that is manageable and workable for the intermediate term.
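One way to think about a "manageable" structure is to cap the number of files per folder. The Python sketch below is hypothetical: the function name `plan_batches`, the `batch_NNN` folder names and the default of 2000 files per folder are all assumptions of mine, chosen to stay under the threshold we saw in testing. It only plans the split and does not move any files.

```python
def plan_batches(filenames, batch_size=2000, prefix="batch"):
    """Map hypothetical subfolder names to lists of at most `batch_size` files."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(filenames)
    plan = {}
    for start in range(0, len(ordered), batch_size):
        # Folder names are numbered from 001 upward.
        name = f"{prefix}_{start // batch_size + 1:03d}"
        plan[name] = ordered[start:start + batch_size]
    return plan
```

With these defaults, our 8250-file test folder would be planned as five folders: four of 2000 files and one of 250. In practice you would probably want to split by subject rather than by count alone, so treat this as a starting point.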
But wait a minute, how does this solve the indexing issue?
For one thing it helps to streamline the existing folders so they are more easily indexable by each user. But – and here’s the good part. . .
Very, Very Important: Remember when we tried to share one central cache among multiple machines? We learned that didn't work, and in fact created quite an indexing mess. But you can share cache files across machines, at least among Macs at any rate, and this will make indexing much easier and faster for multiple users.
This is how it works:
1. Using the "main library cache" on the primary (image librarian's) machine, all the library files are indexed on this one machine, folder by folder.
2. Once indexed, the cache is copied over to a second machine.
3. Launch Bridge and navigate to a (previously unaccessed/unindexed) folder of images and the thumbs will load almost instantly and the metadata is readily available for searching.
4. Repeat on other machines.
Note: this has only been tested on Macs and only on CS4. I tried two sets of folders, one with 634 image files and another with 1500 files. The first folder of thumbs and metadata loaded almost at once; the second folder took a few seconds longer loading the thumbs.
While this does not completely resolve the indexing problem, it does appear to eliminate the need for each new machine to have to create all the thumbs, previews and metadata files from scratch. And the other users do NOT have to manually access each folder the first time to index all those files. That's a step forward for us.
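For anyone who would rather script the copy than drag folders around, here is a minimal Python sketch of the "delete the old cache, copy in the new one" step. It is an illustration under my own assumptions, not an Adobe-supported procedure: `replace_cache` is a made-up name, both locations are passed in as arguments so the sketch assumes nothing about where Bridge keeps its cache, and Bridge should be closed while it runs.

```python
import shutil
from pathlib import Path


def replace_cache(master_copy, local_cache):
    """Replace `local_cache` with a copy of `master_copy`. Run with Bridge closed."""
    master_copy = Path(master_copy)
    local_cache = Path(local_cache)
    if not master_copy.is_dir():
        raise FileNotFoundError(f"no cache folder at {master_copy}")
    if local_cache.exists():
        # Remove the old cache first, mirroring the manual steps.
        shutil.rmtree(local_cache)
    shutil.copytree(master_copy, local_cache)
    return local_cache
```

Because the old cache is deleted outright, keep a backup of it until you are sure the copied one behaves properly on that machine.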
But what happens when you add images to the library?
Our plan is simply to add new folders -- which we would have to do anyway from the librarian's end -- and then have the other users index those new folders regularly. It's another byte to add to the workflow, but indexing a smaller number of files (say a few hundred or so every week) is manageable.
Caution: Please bear in mind that this recommendation is still in the preliminary phase. We'll know if there are any serious limitations in the next week or so as we try this in real life across several machines.
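To tell the other users which folders need indexing, the librarian could list the top-level folders that have changed since the last notice went out. This is a rough Python sketch with a hypothetical function name; it relies on folder modification times, which on most file systems change when files are added to or removed from a folder, so treat the result as a hint rather than a guarantee.

```python
import os


def folders_added_since(library_root, since_timestamp):
    """Return top-level folders modified after `since_timestamp` (seconds since the epoch)."""
    recent = []
    for entry in os.scandir(library_root):
        if entry.is_dir() and entry.stat().st_mtime > since_timestamp:
            recent.append(entry.name)
    return sorted(recent)
```

The timestamp of the last notice could come from `time.time()` recorded when the previous list was sent.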
And we must always, always remember that Bridge is really not designed for any of what we are trying to do.
Thanks to the people on the Adobe message boards for bouncing ideas back and forth, and especially to Ramon Castaneda for clarifying the whole cache business.
And thanks to my colleague Mark Baker. Mark took an incredible amount of time away from his own pressing projects to help me work through these issues. He helped me take a more systematic approach to the problems and in particular to the testing. Thanks Mark.
Wednesday, July 22, 2009
Test 2: File handling
OK, so we ruled out any possibility of multiple machines sharing a central cache. Next, we wanted to test how Bridge handled a very large number of files within one folder.
I created a test “Images” folder of 8250 image files, taking up 45+ gigabytes of space.
Machine 1 (PC/Bridge CS3)
Machine 2 (Mac/CS3)
Machine 3 (Mac tower/CS4)
Machine 4 (Mac lap/CS4)
Copying files within drive –
Machine 4. Copy files within library drive – 837 files/4.5 gigs:
Using Bridge copy/rename: “could not process some files” – failed to copy/rename
Using Bridge: “could not process some files” – failed to copy
Using Mac Finder (drag and drop): 17 mins
Machine 1. Copy files within library drive -- smaller test of 202 files/800mb:
Using Bridge copy/rename: 17 mins
Using “window to window”: 4 mins
Conclusion: Move rather than copy files.
Performance within drive -
Machine 1. Navigated to image library and accessed “Images” folder – machine hung up at 40% indexing (-/+ 20 minutes) and stopped after 62 mins.
Clicked off onto another folder in library – which worked fine and loaded previews and metadata – and then clicked back on “Images”: stalled again and hung up the entire program after that.
Machine 2. Navigated to “Images” folder: after 45 mins Bridge had loaded only two image files with thumbs and metadata; all other previews remained blank. Progress wheel stopped but no further files loaded.
Machine 1. Attempted to purge and then rebuild cache with the same result as Machine 2: loaded two images with thumbs and metadata but then stopped indexing.
Machine 3. Navigated to “Images” folder and it completed indexing in 15 mins – keyword indexing complete only by manually scrolling down through the image files.
Machine 4. Navigated to “Images” folder and it completed indexing in 18 mins – keyword indexing complete only by manually scrolling down through the image files.
Note: CS4 Preferences allow for adjustment of the number of items stored in a given cache – CS3 on the PC does not. Further confusion here: one Mac/CS3 had a slider allowing for "smaller" to "larger" in the cache prefs but another Mac/CS3 did NOT.
Conclusion: Significant disparity between CS3 and CS4 in accessing one folder with large number of image files.
Additionally, both late-model Mac CS4 machines experienced a significant slowdown when adding existing keywords/metadata to the last 1400 or so files of the "superfolder" of 8250 files.
Sunday, July 19, 2009
Test 1: Accessibility
First I tested Bridge's ability to access files using a centralized cache. Based on speculation in our office as well as discussions found online, we posted a central cache in the library itself and pointed the various machines to that one location.
1. On PC/CS3 I created a test image file, tagged it with a unique keyword and moved it into the “Working” folder on the library drive.
2. Placed cursor at root library level and ran search for unique keyword on new image: NO RESULTS. (Expected.)
3. Returned to “Working” folder to “index” the new image.
4. Back to top level and ran search again: NO RESULTS. (Unexpected.)
5. Back to “Working” folder and selected test image – noted metadata/keywords visible in the metadata panel in Bridge.
6. Back to top level and run search: NO RESULTS.
7. Back to “Working” and moved test file to “Hospitality” folder.
8. Back to top and run search: NO RESULTS. (Expected.)
9. Return to “Hospitality” and select image.
10. Run search at top level: NO RESULTS. (Unexpected.)
11. Run search of “Hospitality” keyword: NO RESULTS.
12. Return to “Hospitality” folder and attempted search: NO RESULTS.
13. Various search criteria attempts all produce NO RESULTS.
Repeated test on a Mac/CS3 and with a second PC/CS3 with the same results.
Attempted to repeat test on Mac laptop/CS4 – Bridge produced an error message stating that it could not see the cache it was originally pointed to and wanted to recreate a new LOCAL cache folder.
(Light bulb.)
1. Quit Bridge and disconnected from library drive.
2. Re-launched Bridge CS4 and let it build a new local cache.
3. Reconnected to library, and let it re-index a folder: it opened the thumbs almost immediately and indexed a 2.6 gig folder of 564 images in 3-5 secs. I immediately commenced a search and it found the desired images in 4-6 secs.
Conclusion: Centralizing the cache to be shared by different versions of Bridge is a problem.
How/why remains unclear: online forum discussions lead me to believe CS3 and CS4 create different “types” of caches. Adobe has not confirmed.
In any case, since each machine would have to be indexed on its own and continue its own indexing, having a central cache seems irrelevant and the benefits (?) don’t justify the potential risks.
Saturday, July 18, 2009
Using Adobe Bridge to share image files across a network
Still, for me, as for many others on this forum, Bridge is the tool of choice for creating a central image library and then organizing and tagging existing and new images. After a series of meetings with the other members of the Creative Services team I began the creation and organization phases in late March, finishing in early July.
But first the basics:
1. We're running a network of a half dozen machines using both Windows and Mac operating systems, some with Bridge CS3, most with CS4.
2. The image library is made up of more than 107,000 image files, taking up roughly 434 gigabytes on a 2-terabyte mirrored RAID system in our IT data center. (New images are being added nearly every day.)
3. Cataloging software is not an option, and a move to a full-blown server-based DAM system is projected but details remain uncertain at the moment.
Lessons learned so far:
1. Creating a large library requires some form of initial or permanent folder structure in order to move files around easily and quickly -- and to permit accurate and effective metadata tagging. Attempting to move more than a few hundred image files at a time can be a challenge for Bridge and will require the peppiest of workstations.
2. Using the primary work machine I pointed all our machines toward a centralized set of cache files (including the camera raw cache files) as suggested in this forum: right in the library itself, in a specially designated folder. I also use this to work on collections of images before placing them in their respective library folders.
Warning! We discovered what may be a serious issue here: a day after pointing our machines to a centralized cache each copy of Bridge could see thumbnails and see the metadata but they could not search using the metadata. Even when they could search, each time a machine opened Bridge and accessed the library the thumbs would load painfully slowly. This happens on both CS3 and CS4 across platforms.
3. BTW, sharing master keyword lists is a breeze in CS4: just go to keywords panel and export the list to the location of your choice. It creates a .txt file (on the Mac). You can change it like any other text file and then import it right back again. Importing is equally easy: just go to Keywords panel and click on Import. Navigate to the changed keyword list and that’s it.
4. The central problem with Bridge is that each machine needs to fully index the library the first time – you can already see how long that is going to take with tens of thousands of images.
Moreover, there is no effective way to update the library from one source (for example, from the image librarian’s machine) and have all the other copies of Bridge automatically update with the new images or modified images.
Remember! Every time you move a file or modify the metadata for a file each cache on each copy of Bridge needs to be updated. One senses this could easily become a logistical nightmare.
5. Next phase is to duplicate the library and push all the files into one large folder. This should achieve two goals:
a. This should allow for easy updating of each copy of Bridge.
b. All the image files can be renamed using an agreed-upon renaming convention. (Another issue is that our image files have a wide variety of file names, many using unacceptable characters (asterisks, pound signs, ampersands), spaces, etc.)
c. Then we test this across platforms and versions of Bridge.
I should say that as a freelance digital photographer I use Adobe Lightroom 2 for my own image library; however, my present client cannot/does not want to purchase multiple licenses for such an expensive program.
I have also tested Microsoft's Expression Media 2 and have found this to be a reliable, inexpensive, handy little program for creating catalogs as well as simple web galleries to share. And the cool thing is that MS distributes a cross-platform catalog reader for free!
Wednesday, July 8, 2009
Issue no. 614 - filename problems
Part of this is probably the quirkiness in migrating from the Mac to a PC. But it should also be noted that many of the filenames used -- the images come from a large number of sources -- themselves have unusual and generally unacceptable characters (pound signs, ampersands, spaces, and even tildes).
Lesson learned: clean up the filenames prior to migration.
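Here is one way the cleanup could be scripted before migration. It is a minimal Python sketch built on my own assumptions: the function name `clean_filename` is made up, and the rule it applies (keep letters, digits, hyphens and underscores; turn everything else into an underscore; lowercase the extension) is just one possible convention, not an agreed standard.

```python
import re


def clean_filename(name):
    """Return `name` with spaces and unusual characters replaced by underscores."""
    stem, dot, ext = name.rpartition(".")
    if not dot or not stem:
        # No extension (or a name starting with a dot): treat it all as the stem.
        stem, ext = name, ""
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")
    ext = re.sub(r"[^A-Za-z0-9]+", "", ext).lower()
    if not stem:
        stem = "unnamed"
    return f"{stem}.{ext}" if ext else stem
```

For example, `clean_filename("Chef's Table #3 & bar~final.JPG")` returns `Chef_s_Table_3_bar_final.jpg`. Two different originals can end up with the same cleaned name, so check for collisions before renaming anything for real.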
Sunday, July 5, 2009
Hardware for less than $500?
1. Digitized photos
2. A computer
3. Cataloging software
4. A backup strategy
If you don't have the photos digitized then you'll have to either buy a scanner or have the images put on disk.
I'll assume you already have a computer. Ideally you would want a dedicated computer station for archiving photos -- particularly for very large collections. Moving hundreds or sometimes several thousand image files around at one time can tax an older machine; especially if it's bloated with games, lots of other software etc.
Aside from the computer the biggest cost is going to come from software and the backup system used to protect your image library. (Software will be discussed in our next episode.) In the ideal world a serious backup strategy centers on 2+1:
a primary and a secondary external hard drive and a disk (CD or DVD)
However, backing up to CD is time-consuming and tedious, and CD-Rs hold such a small amount of data. DVD-Rs are a step upwards, but they, too, can easily be outpaced by the growth of your library. Blu-ray is the new disk standard -- each disk can hold up to ten times the amount of information on the average DVD-R and for the moment that makes for a sensible disk component in any backup system.
But for most amateurs and many professionals, a disk component is simply not a viable alternative. For cost reasons or time-management, most will rely on the external hard drive as the basis for a backup management program.
So, where does that leave us? In simple terms:
Scanner: $150
Hard drives, two 1-terabyte drives: $350
You can certainly spend less on a scanner but consider what you're scanning for. Also, if you have lots of slides and negatives that will also narrow your selection of available scanners -- and probably increase your cost as well. You'll want the best quality digital images so I'd rather err on the side of spending a little too much than not enough.
The same is true of hard drives. Buy new, high-quality drives and get the largest size you can afford. Believe me you'll grow into them.
External hard drives are more reliable today than ever and their cost continues to fall. My suggestion is to get the largest pair of drives you can afford. And since many drives come with backup software as part of the purchase package, that's one less cost for you. Look for drives that have as many different types of connections as possible (3 or 4 is optimum, but don't settle for just a single connection type).
Also, the drive costs noted above are for single-drive systems, not for RAID or multiple-drive configurations. (RAID stands for "redundant array of independent disks".) If you decide to boost your backup options by using a mirrored RAID then plan to increase your costs by half again as much.
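To make the arithmetic concrete, here is a tiny Python sketch of the budget. The dollar figures are the estimates given above; reading "half again as much" as a 50% increase applied to the drive cost is my own interpretation, and the function name is made up.

```python
SCANNER = 150        # scanner estimate from the list above
SINGLE_DRIVES = 350  # two 1-terabyte single drives


def setup_cost(scanner, drives, raid_mirror=False):
    """Total hardware cost; a mirrored RAID is assumed to add 50% to the drive cost."""
    drive_cost = drives * 1.5 if raid_mirror else drives
    return scanner + drive_cost


print(setup_cost(SCANNER, SINGLE_DRIVES))                    # prints 500
print(setup_cost(SCANNER, SINGLE_DRIVES, raid_mirror=True))  # prints 675.0
```

So the basic setup lands right at $500, and opting for a mirrored RAID pushes it to roughly $675 under that assumption.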
For a larger image library and with a somewhat more generous budget, consider the following configuration:
scanner: $200
hard drives (on-site), 2+2-terabyte RAID mirror: $800
blu-ray recorder: $300
blu-ray discs: $250 (cakebox of 50)
OR
hard drives (off-site, optional), 1-terabyte: $200
To find out more about RAID systems and whether it's right for you visit Wikipedia's in-depth discussion online.
Personally, I use both LaCie and OWC hard drives. No, they don't pay me to say that -- in fact they probably aren't even paying attention. Anyway, I've used LaCie for years and never had one fail yet -- and OWC is equally reliable. I like them for their size and portability. Both come bundled with good backup software: LaCie bundles Genie and Intego backup; OWC bundles NovaStor for Windows and Prosoft's Data Backup for the Mac.
For example, right now I am using a pair of 500-gigabyte OWC "On-the-Go" drives for working and primary backup and then a 1-terabyte OWC desktop drive as a secondary backup drive for a client. Eventually we'll move their library to a 2-terabyte RAID-mirror system.
You can find LaCie online at: http://www.lacie.com/us/index.htm. And OWC can be found at: http://eshop.macsales.com/
Oh, and you can research and buy LaCie drives at OWC as well.
If you're the least bit paranoid -- and frankly you should be -- I'd recommend spending a bit more and using a desktop and portable off-site backup system. (Say in a safety deposit box or a basement.) We'll talk more of this later when we discuss "process" but you might consider using portable drives as a secondary backup.
OK, so that's the easy part.
Next week things get a bit harder as we dive into cataloging software. We'll also talk about using metadata and later we'll get to the really hard part: workflow and process.
Sunday, June 28, 2009
So you say you want to organize your photos?
Maybe some of the lessons I learned from those mistakes will help you navigate safely and easily around the world of digital asset management. I'll also provide you with some links to valuable online resources and talk about a few of the more handy print references as well.
Whether you're a serious amateur seeking to organize all those family and travel photos or a professional who has been tasked with making sense of a corporate image collection, there are three basic components to any digital asset management (DAM) program: hardware, software and process. We'll talk about these each in turn.
But before we do anything I want you to get out a piece of paper or open a text editor and jot down the answers to the following three questions:
1. Why do I want to organize my photos?
2. How much time can I spend creating my image library?
3. How much money can I spend?
If you're planning to organize a personal photo collection, the necessary tools will probably cost about $500 to set up properly. You will also have to allocate the time necessary to learn any new software as well as the time to actually do the cataloging and tagging of your photos.
A large corporate collection might cost no more than $2000 to set up but the real cost there is maintenance. Corporate entities will most likely have a full time asset manager as part of their print and editorial departments.
Naturally the digital asset management workflow you develop will arise out of what suits you and your needs, not what works for me or anyone else. It's your time and your money. I just want to help you spend them both wisely and efficiently.
The rest is up to you.
Enough twaddle -- let's talk hardware!