Voice Detective at Work

Whether you use audio to sell cheese or catheters, or to warn the world about epidemics, there’s an effective process for creating the right sound for your audience. I start with some detective work.

I’m a high-tech “messenger”: I do voiceovers for a wide range of clients who want their message delivered with impact. Let’s say my client needs a voice for an animated presentation showing how a newfangled system does lickety-split medical transcriptions. Not the sexiest assignment, but that’s why it’s a good example. Since I’ve been chosen to be the client’s “audio messenger,” I discuss the goals of the copy, the target audience, how and where the piece will be used, tech considerations, etc. In this case the client knows that he wants a lot of “enthusiasm” but beyond that doesn’t give much information about his expectations.

Digging deeper, I glean that the piece is for a kiosk at a trade show booth. I’ll be given a storyboard. Good. Now I know some key things: It’s a sales piece, and the people seeing it are likely unfamiliar with the product, but probably know something about medical devices and systems if they’re attending the trade show. With the storyboard I can also “see” what I’m selling.

These clues tell me:

  • The tone should be more upbeat with a bigger “smile” and lots of energy to compensate for somewhat dry material that has to be interesting (hence, “enthusiastic”);
  • Clear enunciation is critically important for brand name and technical terms;
  • Pacing needs to keep the piece moving for people who will be overloaded with info but need to remain engaged for the length of the piece (about 5 minutes);
  • Pauses between sections are critical so that the animator can more easily time the animation to the voice (in this case).

With this in mind, here’s the audition sample I submit:


Oops. I pronounced the company name incorrectly. (Writers: Make sure you provide phonetic information.) With the understanding that the name will be corrected, the sample is approved, with the caveat that it be “much more enthusiastic.” Uh-oh. To me this means 1) he’s not really hearing what he wants, though he’s approving the sample; and 2) he wants a “hard sell,” though my gut tells me this is not the best approach to serve the piece. But the customer is always right. To avoid a lot of second-guessing about what he actually wants to hear, I offer to have the client direct while I record so that he can give immediate feedback. This provides a “what you hear is what you get” approach rather than doing retake after retake in a vacuum. While the vast majority of my clients approve my samples and let me run with copy to final output, there will always be those who don’t quite know what they want until they hear it.

The client likes the “directing” option and before the session sends me final copy with words in bold that he wants emphasized. Fair enough – the client is clearer about what he hears in his head.

We do the session in no time – I record while he listens via muted speakerphone (a great alternative to a fancier phone patch if your client is willing) and we make only minor copy changes (for “ear-friendliness”). I edit out the takes he doesn’t want, and presto. I send a finished, clean audio file to the animator. Easy, right? Umm…

When the client hears the approved v/o in the context of the animation, he doesn’t like the added emphasis he’d provided in bold. It’s too over-the-top, so my initial instinct was right – but now he can clearly hear it. So I re-record a few offending sections ‘to time’ (now that the animation is complete) and all is right with the world. Here is the “before” take with emphasis and the final one that was approved. Notice the emphasis in the first clip on the words unique, fast, accurate, etc.


In the second clip, these words have been “smoothed out” into the overall read, and the read is also softer, with less of a “smile.” This is the approach that was approved – fairly different from the original “approved” take.


One challenge in doing retakes is to maintain consistent audio levels, room sound and vocal tone quality. Sometimes re-writes or new copy come weeks after a job is “complete.” If Barnes and Noble.com adds some new voice prompts, they’d better match the ones I initially created!

I accomplish this by using the same mic as I did for the original job, keeping a reference file for matching, making sure my CPU is quiet so that no noise is added to new takes, and by giving a good listen before sending off a new file. I also make sure I do retakes after being warmed up vocally so there’s no difference in pitch.
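For readers who like to put a number on “matching,” here is a minimal Python sketch of one way to compare a retake’s level against a reference file. It is not my actual workflow, just an illustration: the function names (`rms_dbfs`, `gain_to_match`, `apply_gain`) are hypothetical, and it assumes the audio has already been loaded as lists of samples scaled between -1.0 and 1.0. It only looks at average (RMS) level; room sound and vocal tone still need your ears.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level in dBFS of samples scaled to [-1.0, 1.0]."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no finite level
    return 10 * math.log10(mean_square)


def gain_to_match(reference, retake):
    """Return the gain in dB that brings the retake's RMS up or down to the reference's."""
    return rms_dbfs(reference) - rms_dbfs(retake)


def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


if __name__ == "__main__":
    # Stand-in data (an assumption): two test tones, the retake at half the amplitude.
    n = 1000
    reference = [0.5 * math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
    retake = [0.25 * math.sin(2 * math.pi * 10 * i / n) for i in range(n)]

    gain = gain_to_match(reference, retake)
    print(round(gain, 2))  # roughly 6.02 dB, since the retake is half as loud

    matched = apply_gain(retake, gain)
    print(round(rms_dbfs(reference) - rms_dbfs(matched), 2))
```

A difference of a decibel or so between the reference and the retake is a cue to adjust gain before sending the file; the final check is still a good listen.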

It’s all very subjective, this business of hearing – so save yourself unnecessary work and client frustration by trusting your instincts, getting as much info up front as possible and then “acting” accordingly.
