Feature - 01/08/09

Managing Media: Adobe CS4 Metadata Part 2

CS4 is one of the most sophisticated and forward thinking suites the digital content creation industry has seen

By Mike Jones

Part one of Managing Media: Adobe CS4 Metadata can be found here

Soundbooth had a shaky start to life, seemingly being designed as an audio 'tweaker' specifically for those who know nothing about audio production (Flash and website designers seemed to have been the primary demographic). Consequently the original Sb 1.0 was a hobbled, simplistic, under-powered waveform editor. Adobe already have in place the superbly powerful Audition (built on the firm foundation of Syntrillium's Cool Edit Pro, which Adobe acquired a few years back) and so the place of Sb seemed to be only for those who weren't brave enough to open Audition.

Sb 2.0, however, shows Adobe is committed to developing Soundbooth both as a very functional DAW in its own right and as one with a distinctive bent, built on an internal logic quite different to that of Audition. Having moved beyond simple 2-track stereo, Soundbooth 2.0 is now a multitracker with a more complete toolset for editing, cleaning out noise and mastering audio files, but what sets it apart is its dedicated tools for generating and managing metadata.


Sb 2 includes a Speech-to-Text transcription system that embeds the text transcribed from the audio recording as metadata in the file itself. With a variety of languages and dialects available (especially useful is UK English rather than just American English) the system analyses the audio waveform, identifies words and phrases, transcribes them to text and tags the individual words to individual audio peaks on the waveform.

Soundbooth. Transcribe language

From there this body of metadata text becomes a searchable database for the audio itself, allowing you to instantly locate a specific word or phrase. A handy search field allows you to simply type the keyword you're looking for and Soundbooth will drop the timeline play-head right on that spot. Moreover the system will also attempt to identify individual speakers through their voices. So in the case of a dialogue scene or a documentary interview with multiple voices, the system will tag the keywords as belonging to a particular speaker. Obviously such a system has huge ramifications and possibilities for documentary editors working with vast amounts of interview material.
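To make the idea concrete, here is a minimal sketch of a word-level transcript used as a search index. It is not Adobe's implementation or file format; the `TimedWord` class, the `find_word` function and the sample data are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimedWord:
    text: str       # the transcribed word
    start: float    # position in the audio, in seconds
    speaker: str    # label for the detected speaker


def find_word(words: List[TimedWord], query: str,
              speaker: Optional[str] = None) -> List[float]:
    """Return the start time of every occurrence of `query`,
    optionally restricted to one speaker."""
    q = query.lower()
    return [
        w.start
        for w in words
        if w.text.lower() == q and (speaker is None or w.speaker == speaker)
    ]


# Invented sample data, not real Soundbooth output.
words = [
    TimedWord("fight", 1.2, "Speaker 1"),
    TimedWord("beaches", 2.0, "Speaker 1"),
    TimedWord("fight", 5.4, "Speaker 2"),
]

print(find_word(words, "fight"))                       # every "fight"
print(find_word(words, "fight", speaker="Speaker 2"))  # only Speaker 2's
```

Each returned time is a spot where a play-head could be dropped; the speaker filter mirrors the idea of tagging keywords to a particular voice.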

I tested the audio-to-text system with a variety of audio recordings to see how accurate the results would be. If you have ever dabbled with other voice-to-text systems you'll know what I mean when I say sometimes the 'spirit is willing but the flesh is weak' - in other words the idea is fantastic and promises a great deal but the results can be underwhelming in their accuracy in real-world use.

As a case in point, I took the famous Winston Churchill wartime speech "we shall never surrender." Running this section through Soundbooth's Speech-to-Text I got a speech very far from the one old Winston would remember. Now certainly Churchill was a horrific mumbler with a distinct set of very British inflections and of course the recording quality itself is rather poor. Still, I was ready to dismiss the Soundbooth transcription system as a nice idea as yet unfulfilled when I realized that despite the frequent word misinterpretations and grammar issues, the keywords were intact. The metadata that Soundbooth had applied to the file was not a useful literary transcription but it was a very useful set of keywords and tags that made the whole speech functionally searchable. The interpretation had totally misconstrued 'we shall fight on the beaches' but it had picked up the words 'fight' and 'beaches'. The result therefore is a very useful search and tag system automatically applied to dialogue and voice-over tracks.
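The point that a rough transcript can still be searchable is easy to illustrate. The sketch below is an assumption-laden toy, not how Soundbooth works: the stopword list, the `keywords` and `matches` functions and the garbled sample sentence are all invented for this example.

```python
import re

# A tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"we", "shall", "on", "the", "them", "in", "a", "and", "of", "to"}


def keywords(text):
    """Lowercase the text and keep only the words that are not stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def matches(transcript, query):
    """True when every keyword in the query appears in the transcript."""
    return keywords(query) <= keywords(transcript)


# Invented garbled transcript, standing in for an imperfect machine result.
rough = "we shell fight him on the beaches"

print(matches(rough, "fight beaches"))
print(matches(rough, "we shall fight on the beaches"))
print(matches(rough, "surrender"))
```

Because only the keywords are compared, the misheard filler words do not stop the line from being found. Note that a query made up entirely of stopwords would match everything in this toy version.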

For those still looking for a real solution and true transcription all is not lost either. Whilst the interpretation was very rough it is a head start on a manual transcription. Soundbooth allows the full text to be edited, spell-corrected and altered inside the application to be a more faithful reproduction. So whilst not eliminating a manual process it can certainly cut down the amount of time such a manual process would otherwise take.

Of course, as we touched on before, all the metadata in the world is useless if it can't be shared and exchanged and to this end the interaction between Soundbooth and Premiere Pro is very good. All the metadata generated from the transcription in Soundbooth is carried over into the metadata browser in Premiere Pro. So from here the editor can use the keywords to search for particular lines of dialogue and locate specific shots based on the audio rather than just a visual description, shot or scene numbers.

Premiere Pro audio transcript

If Adobe can find a way to mesh this kind of broad-based functionality in Premiere Pro with a script-based editing solution such as ScriptSync from Avid, they will really be pushing open some very exciting doors.

Aside from Soundbooth, the Adobe CS4 suite also offers another means of garnering metadata for media assets before they arrive at the NLE. OnLocation is a direct-to-disk recording software system that Adobe picked up under the guise of DV Rack when they bought out Serious Magic a while back. Now fully integrated into CS4 and sharing common and consistent GUI traits, OnLocation provides a dynamic means of shooting video direct to laptop and employing a wide range of monitoring tools for the process. OnLocation is designed not just for the efficiency of shooting direct to computer, and so avoiding any capture and transfer downtime, but also for the power and precision that comes with detailed image monitoring and calibration tools. Going hand in hand with this process of direct acquisition is, of course, metadata. OnLocation provides all the same fields and metadata items available for access and manipulation in the rest of the CS4 suite, from Bridge to Premiere Pro.

OnLocation provides a live image monitor from camera along with vectorscopes, histograms and waveforms; the camera's record functions are controlled from the software. In tandem a dedicated file manager and metadata tagging window in OnLocation allows for all clips recorded direct to disk to be tagged with keywords, technical information, comments and descriptions, scene and take numbers on the fly. Assuming the shooting environment is conducive to using a direct to laptop process (namely studio based work, especially greenscreen and effects shots) the system provides an extraordinary level of production workflow efficiency. Shots are tagged with all relevant metadata as they are shot direct to hard drive, audio is processed and auto transcribed in Soundbooth, and then all those assets are imported into the Premiere Pro NLE environment where they retain full database search functionality of all their metadata.

OnLocation monitors
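The tag-as-you-shoot workflow described above can be pictured as a simple record per clip that stays searchable later on. This is only a sketch under stated assumptions: the `Clip` class, its field names and the `search` function are hypothetical and do not reflect OnLocation's actual metadata schema.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Clip:
    filename: str
    scene: int
    take: int
    keywords: Set[str] = field(default_factory=set)
    comment: str = ""


def search(clips: List[Clip], term: str) -> List[str]:
    """Return filenames of clips whose keywords or comment mention `term`."""
    t = term.lower()
    return [
        c.filename
        for c in clips
        if t in {k.lower() for k in c.keywords} or t in c.comment.lower()
    ]


# Invented sample clips, tagged at the moment of recording.
clips = [
    Clip("sc01_tk01.avi", 1, 1, {"greenscreen", "wide"}, "Good focus"),
    Clip("sc01_tk02.avi", 1, 2, {"greenscreen", "close-up"}, "Mic bump at end"),
    Clip("sc02_tk01.avi", 2, 1, {"interview"}, "Best take"),
]

print(search(clips, "greenscreen"))
print(search(clips, "best"))
```

The value of the workflow lies in the tags travelling with the clip, so the same search works at the editing stage without anyone re-logging the footage.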

Over the past five years all the major software developers for creative tools have moved their products consistently toward offering a complete package rather than discrete units. Avid went on a buying spree purchasing other software tools, including audio and 3D assets, and then attempted to bundle them together. Apple went the whole hog and discontinued virtually all their creative applications as separate entities and instead created the singular Final Cut Studio. Sony for their part always had strong audio tools in Sound Forge and ACID Pro to string with Vegas, and to this they added DVD Architect and bundled third party tools like Magic Bullet plugins. Amid all this, Adobe has followed much the same path. On the surface they have been incredibly aggressive with their pricing; if you were to need, say, Photoshop and After Effects then you may as well buy the whole CS4 Production Premium suite because the cost of the suite is virtually the same as buying any two apps separately. But cut-throat business economics aside, looking deeper, Adobe has actually achieved tangibly and dynamically what their competitors have only paid lip service to.

Avid may have taken over a host of other companies but all their tools remain as disparate as ever. Final Cut may bundle all its apps into one FCS box but the level of true integration between the contained applications is rudimentary at best and at times non-existent. Sony's applications, powerful and forward thinking as each one is in its own right, have virtually no integration at all. Adobe seems to have been the lone soldier in this regard and while CS2 and CS3 showed promise but didn't quite deliver, CS4 shows the fruits of that labor. Project files can be swapped seamlessly between applications like Premiere Pro, After Effects, Encore, and Photoshop and a complete database-built metadata system is supported across all these applications for all the assets in use. The result is arguably the first truly integrated software suite built on that most powerful of contemporary digital production ideas - Metadata.

Adobe should be applauded for its efforts in CS4. The future holds bright possibilities for what other efficiencies and creative flexibility may be found as the concept of metadata management and integrated workflow is pushed even further. What we must all hope for now is that the other players - Avid, Apple, Sony - take Adobe's cue and move quickly to keep up. Right now CS4 stands as one of the most sophisticated and forward thinking suites the digital content creation industry has ever seen, and if Adobe's competitors take heed and catch on we can all look forward to better tools and more powerful workflows across the board in the future.


Mike Jones is a digital media producer, author and educator from Sydney, Australia. He has a diverse background across all areas of media production including film, video, TV, journalism, photography, music and on-line projects. Mike is the author of three books and more than 200 published essays, articles and reviews covering all aspects of cinematic form, technology and culture. Mike is currently Head of Technological Arts at the International Film School Sydney (www.ifss.edu.au), has an online home at www.mikejones.net and can be found profusely blogging for DMN at www.digitalbasin.net

Related Keywords: adobe metadata, video editing, metabrowser, post production, workflow, audio editing

