In the May issue of EContent, we talked about text-based services and applications for handheld devices ("It's a Small World After All: Content for Wireless & Mobile Appliances"). This time around, we'll take a look at audio content—specifically, downloadable audio and dial-up voice browser applications, which offer the promise of Web-based services from any smartphone or voice-enabled PDA. The dream of accessing the Web anytime, anywhere may actually soon be a reality.
Without a doubt, audio-enabled mobile and Wireless Information Devices (WIDs) will be part of our near-term future. In fact, there are already a number of consumer and B2B services on the market—and many more poised to leap into the audio space just as soon as the hardware devices, standards, and wireless networks catch up.
The audio/voice arena is complex; it doesn't fit easily into neat, pigeonholed categories. As an input mechanism, there are the voice-driven automated menu systems we're all already familiar with ("For customer service, press or say two"). There's passive pre-recorded audio content (voice, music, radio broadcast, or any other sound) played over a handheld device in either downloaded mobile or streaming wireless format. There are pre-recorded consumer dial-in information services. And there are B2B enterprise applications for dial-in listening and voice updating of proprietary databases.
Some commercial services you may have already heard of: Tellme and Indicast for current news, stock quotes, and horoscopes; customizable Web-based "audio content portal" Live365.com; or wakeup/reminder ring-you-up service iPing. In the B2B space, dial-in solutions for ecommerce, business productivity, and staff management are becoming prevalent. While audio downloads and dial-in voice information are plentiful, true wireless applications are still in nascent form—on the drawing board or in beta testing.
Voice Web: Voice-Enabling Web Content
Web-based services are transformed into phone-accessible voice applications with "voice browsers" using speech recognition, pre-recorded audio, and speech synthesis (i.e., machine-generated voice). The real benefit of voice interaction is that it overcomes the limitations of tiny keypads and diminutive screens on most wireless devices. A news story that takes five minutes to read on a WAP phone can be heard in half the time using streaming audio. Voice interaction also holds the possibility of speech dialogs to interact with Web services, giving users the choice of responding by pressing a key or speaking a command. And, as aggregator and distributor Audible Inc. is fond of pointing out, it provides an accessible alternative "when the eyes are busy but the mind is free."
Short and Sweet
What types of content are appropriate for mobile and wireless audio delivery? Dynamic, changing information and brief snippets: email headers, custom news clips, stock quotes, sports scores, and movie listings, for instance. Longer offerings can include speeches, vintage radio shows, entertainment (jokes and comedy shows), and short stories or poetry. But, warns Jonathan Korzen, senior manager of media relations for Audible, "Don't ask your customers to download large files. Paying attention to compression and file size are the most important things…and a good compression algorithm is key." Audible resolves this issue by offering its customers four different formats in various qualities (AM radio quality, for instance, takes only 2MB of space for one hour of listening).
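As a rough check on that last figure, the arithmetic below estimates the average bitrate implied by 2MB per hour of audio. It is a back-of-the-envelope sketch, not a published Audible specification: it assumes 1MB means 1,000,000 bytes and a constant bitrate, and the function name is purely illustrative.

```python
def implied_bitrate_kbps(size_mb: float, duration_hours: float) -> float:
    """Average bitrate, in kilobits per second, for a file of the given size."""
    bits = size_mb * 1_000_000 * 8   # assumption: decimal megabytes
    seconds = duration_hours * 3600
    return bits / seconds / 1000


rate = implied_bitrate_kbps(2, 1)
print(f"{rate:.1f} kbps")  # prints "4.4 kbps"
```

Under those assumptions, an hour of listening in 2MB works out to a little over 4 kilobits per second, which gives a sense of how aggressively speech has to be compressed to reach that size.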
Of course, Korzen admits, it's not only what's short, but whatever people want to hear and are willing to pay for. "It's marketability—if we think people want it, we offer it to them."
Got Content? Two Recipes for Enterprise Audio
Initially, as with text conversion from HTML to, say, WML for wireless, I expected to find transcoding software solutions being employed to repurpose content from text to audio. In fact, that's not the case at all. Because of voice quality issues (synthesized voice is not yet considered ready for prime time with consumers), most Web-to-voice conversion is occurring in customized enterprise solutions for internal use. Two companies that are developing applications for on-the-fly, machine-generated voice are VocalPoint and Informio [See VocalPoint's Profile in the May 2001 issue of EContent, p. 56].
VocalPoint applies style sheets; Informio transcodes HTML to VoiceXML (an XML-based markup language that defines voice segments and supports the creation of menu prompts to enable Internet access over smartphones and wireless PDAs).
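To make the idea of HTML-to-VoiceXML transcoding concrete, here is a minimal sketch that turns the links on an HTML page into a spoken VoiceXML menu. It is not Informio's method, which the company has not detailed here; the class and function names are hypothetical, and the sketch uses only the menu, prompt, and choice elements defined in the VoiceXML specification.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from <a> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def html_to_vxml_menu(html: str, greeting: str) -> str:
    """Build a VoiceXML <menu> with one <choice> per HTML link."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()

    vxml = ET.Element("vxml", version="1.0")
    menu = ET.SubElement(vxml, "menu")
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = greeting
    for href, text in parser.links:
        choice = ET.SubElement(menu, "choice", next=href)
        choice.text = text
    return ET.tostring(vxml, encoding="unicode")


page = '<ul><li><a href="/news">News</a></li><li><a href="/stocks">Stock quotes</a></li></ul>'
print(html_to_vxml_menu(page, "Say news or stock quotes."))
```

A voice browser reading the resulting document would speak the prompt and then listen for one of the choices, which is the voice equivalent of clicking a link. A production transcoder would also have to handle forms, tables, and dynamic content, which this sketch ignores.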
VocalPoint: Style Sheets
VocalPoint describes its service as a "voice Web browser" targeting the B2B space for proprietary applications by enterprise customers. It focuses on healthcare, employee self-service (HR), utilities, the financial and insurance markets, and sales force automation (SFA).
Garry Chinn, VocalPoint CTO, clarifies, "We work with customers to figure out which content to voice-enable. Static HTML is not difficult to voice-enable, but in ecommerce sites, there are dynamic pages with database information. If you need to enable dynamic information, that's our specialty."
Using HTML on existing Web sites, VocalPoint extends style sheets to voice applications. To vocalize MyYahoo!, Chinn explains, you need to add about 30 lines of style sheet code. (Five to 30 lines is the standard amount, depending on the complexity of the page.) "We can embed at least part of the data with pre-recorded static info; this is put in the style sheet as ‘don't read this content, play the recording instead.’ Dynamic information will still need to be synthesized, though."
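The rule Chinn describes can be pictured as a small lookup table that tells the voice browser what to do with each part of a page. The sketch below is only an illustration of that idea and does not reproduce VocalPoint's style sheet syntax; the rule names, region names, and file names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceRule:
    """Hypothetical voice style rule for one named region of a page."""
    action: str                      # "skip", "play", or "speak"
    recording: Optional[str] = None  # audio file used when action == "play"


def render_for_voice(regions, rules):
    """Turn (region_name, text) pairs into an ordered list of voice steps."""
    steps = []
    for name, text in regions:
        # Regions without a rule fall back to speech synthesis.
        rule = rules.get(name, VoiceRule("speak"))
        if rule.action == "skip":
            continue
        if rule.action == "play":
            steps.append(("audio", rule.recording))
        else:
            steps.append(("tts", text))
    return steps


rules = {
    "banner-ad": VoiceRule("skip"),
    "welcome": VoiceRule("play", "welcome.wav"),
}
regions = [
    ("banner-ad", "Buy now!"),
    ("welcome", "Welcome to your portal."),
    ("stock-quote", "ACME is trading at 42."),
]
print(render_for_voice(regions, rules))
# prints [('audio', 'welcome.wav'), ('tts', 'ACME is trading at 42.')]
```

Static regions are mapped to recordings, while anything without a rule, such as the dynamic stock quote here, is handed to the speech synthesizer.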
One particularly interesting application is a dial-in, vocalized "employee self-service" solution. By dialing a toll-free number from any old telephone, employees can learn about their personalized HR benefits, find out how many sick days they have, get new membership cards, etc.
"Most companies have already built out their Web applications," Chinn explains, "so we leverage that into voice delivery. For instance, we're currently testing a mobility solution for a large utility-energy company. We've voice-enabled their sales force so they can call in for up-to-date customer and product information."