A speech-recognition industry expert and old friend of mine, PJ at Nuance Communications, helped us out with a few questions about Google Voice and the troubling civil-liberties issues mass adoption of this technology might present.
BELLTOWN MESSENGER: It really does take a compu-genius to try and wade through this higher-level telco stuff. As an example: I don’t do texting, but Google Voice not only sends you an email but also a text message, with no option to turn off the feature. Finally I got T-Mobile to disable my expensive text message service. Complicated crap. Would surely be easier if I just switched over to the Google phone (not a free service), as Google so gently prods one to do when signing up for Google Voice.
PJ: I’ll jump in and say that you CAN opt to have Google not SMS you. My wife uses Google Voice, but I’m too cheap to buy her an SMS plan, so I went through this. It isn’t completely obvious, though. It took me a bit of searching through message boards before I figured it out:
- Google Voice/Settings
- Click Edit button under your mobile phone
- Uncheck “Receive SMS on this phone” (mobile phones only)
BELLTOWN MESSENGER: Should we be afraid of biometrics? Do governments not use this voice-recognition technology to track the speech of citizens on a mass scale?
PJ: Biometrics, such as using a “voiceprint,” can of course be helpful. In typical “Speaker Verification” applications, companies like banks can use your voice as another way to make sure that it really is you before you access your account online, for example. This can add an extra layer of protection against identity theft. In “Speaker Identification,” someone might attempt to identify you simply by matching a sample of your voice against a database of voiceprints. This is a much more difficult task technically, but the government is playing with it (for example, this is one technique they use to determine if that latest audio recording really is Osama bin Laden).
At the end of the day, biometric technology can be helpful or harmful depending on who is using it and what they are trying to do. There are much easier ways for governments to figure out who is using telephone technology than biometrics (for example, by tracking your Caller ID), so I think it’ll be a while before the government tracks you simply by your voiceprint. (And I’m already putting together my plan for a vocal cord replacement business when they do.)
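To make PJ's distinction between "Speaker Verification" and "Speaker Identification" concrete, here is a minimal sketch in Python. It assumes a voiceprint can be reduced to a short list of numbers and compared with cosine similarity. The three-number voiceprints, the names `verify` and `identify`, and the 0.9 threshold are all hypothetical illustrations, not how any bank, vendor, or government system actually works.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two equal-length vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(sample, claimed_voiceprint, threshold=0.9):
    """Verification: a one-to-one check against the voiceprint of the
    person the caller claims to be. The threshold is an assumed value."""
    return cosine_similarity(sample, claimed_voiceprint) >= threshold

def identify(sample, database, threshold=0.9):
    """Identification: a one-to-many search over every enrolled voiceprint.
    Returns the best-matching name, or None if nothing clears the threshold."""
    best_name, best_score = None, threshold
    for name, voiceprint in database.items():
        score = cosine_similarity(sample, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical enrolled voiceprints (toy numbers, not real audio features).
VOICEPRINTS = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}

if __name__ == "__main__":
    sample = [0.88, 0.12, 0.31]  # a new recording that resembles "alice"
    print(verify(sample, VOICEPRINTS["alice"]))
    print(identify(sample, VOICEPRINTS))
```

Verification compares the sample against one stored voiceprint, while identification has to compare it against every voiceprint in the database. That is one reason the second task is harder: more candidates mean more work and more chances of a false match.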
-Interview by Rex Lameray
BELLTOWN MESSENGER: Now that I’ve gotten the paranoia out of the way, I mean, these days you should take for granted that everything you say and do, on the Internet and off, could be recorded in some form and stored for eternity. So ... is there a live person, say in an office in Bangalore, listening to the voicemail and transcribing it? I notice Google Voice has a delay, and even gives you messages like “your voicemail is being transcribed.”
PJ: I should first say that I don’t know the details of how Google does their transcription, but there are some generally accepted methods for how transcription works. In all but the cheapest systems, transcription gets better as it “learns” your voice. For example, if you call me five times from the same phone number, the fifth call should do better, since the system can use what it learned about your voice on the first four calls. Checking back into Paranoia Mode, this means that some entity (Google, the phone company, the government) is building up a model of your voice which might make the Speaker Identification problem I talked about earlier more manageable technically.
Commercial customers of voicemail transcription may sign up for a higher quality of service, which might mean that live people are listening to the voice-mail you left. A person might always get involved for the first five or so calls (until your voice model is built up), or they might “check the work” of the automated transcription system if the confidence score is low.
A friend of mine was curious whether Indian transcriptionists were being used on the voicemail transcription system he subscribed to, so he called up and left a message to the effect of “I will meet John Thompson and Fred McMurray at the corner of Chikkapete Street and Doddapete Street” (these being street names in Indian cities). He found that the street names were transcribed perfectly, whereas the personal names were transcribed incorrectly.
Going back to what you said earlier, I think that you should always assume that someone you don’t know may be listening to the message you just left.
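The human-in-the-loop arrangement PJ describes (a person handles a caller's first few messages, then only checks the machine's work when its confidence score is low) could be sketched roughly as follows. PJ says he does not know how Google does it, so this is only an illustration: the `Transcript` type, the `needs_human_review` function, the five-call warm-up, and the 0.8 confidence cutoff are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Transcript:
    caller: str        # e.g. the caller's phone number
    text: str          # the automated transcription
    confidence: float  # engine's self-reported confidence, 0.0 to 1.0

def needs_human_review(transcript, prior_calls, warmup_calls=5, min_confidence=0.8):
    """Decide whether a live transcriptionist should look at this message.

    prior_calls is how many earlier messages this caller has left.
    warmup_calls and min_confidence are assumed values for illustration.
    """
    # Until the caller's voice model is built up, always involve a person.
    if prior_calls < warmup_calls:
        return True
    # After that, a person only checks the work when confidence is low.
    return transcript.confidence < min_confidence

if __name__ == "__main__":
    message = Transcript(caller="555-0100", text="call me back", confidence=0.95)
    print(needs_human_review(message, prior_calls=2))
    print(needs_human_review(message, prior_calls=10))
```

Under these assumptions, a frequent caller with a clear message would skip human review entirely, while a first-time caller or a noisy recording would not.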
BELLTOWN MESSENGER: It’s illegal in many states to record someone’s conversation without their permission, but I’m guessing that if someone leaves you a voice message there’s implied consent: obviously they know they are being recorded. Here’s the legal question: if some random person leaves me a voice-mail message, can I then post it on the Internet, through Google Voice (which makes it easy) or otherwise?
PJ: I’m a technology guy and not a lawyer, but I’m going to go out on a limb here and say that this is probably a “legal gray area.” Expect to see lots of privacy-related cases before the courts in the coming years as access to personal information becomes more ubiquitous.
BELLTOWN MESSENGER: When will Google Voice improve to the point where I can get a fairly good transcription of recordings, in all conditions?
PJ: It should get better over time as the people who call you frequently leave more messages and get their “models” built up more completely. Having said that, it’ll never work well in situations that an ordinary person would struggle with, e.g. calling from a subway station when the train is going by.
There was a company out there that marketed its transcription as giving you the “gist” of the message (not every word would be transcribed perfectly), and I think this is a great way to use this technology. I’ve been using it for years, and 99% of the time I get enough that I know if I need to call the person back or not. These days I think people who actually dial into a voice-mail system to listen to their messages are chumps. (In the worst case, I’ll just click on the audio file that the transcription service sent me.)
BELLTOWN MESSENGER: If one were to decide to launch a career as an underground left-wing talk radio DJ but would like his words of wisdom converted into text as well, what software would you recommend? For Mac users.
PJ: Nuance’s Dragon Dictation is definitely the best tool out there, but it doesn’t run on the Mac. There is a product called MacSpeech that uses the underlying Dragon technology. From what I understand, it’s always going to be 1-2 years behind the Dragon product technology-wise, but it should still be pretty good.
Dragon for iPhone might work for you. There are actually two apps for that one. One lets you do a web search, and that works really, really well. The dictation app is kind of hindered by the ability of the iPhone to run only one app at a time. So you have to dictate into the tool and then make a selection to copy the text to an email or to
BELLTOWN MESSENGER: I’ll stick with Google Voice for now. Maybe break down and buy a cheap PC to run Dragon on someday. Not.
PJ: Hey, everyone knows that left-wing talk radio DJs have deep pockets. Buy a PC!
BELLTOWN MESSENGER: Nuance purchased Seattle-based jott.com in July. Can you tell me how much Jott was acquired for, and why? What did they have that Nuance wanted?
PJ: I did a quick Google search and determined that “terms of the deal were not disclosed,” so, sorry, I can’t tell you. Jott is an interesting company. They have a voice-mail transcription product (as do Nuance, Google and a few other companies; there’s a race to see who the leaders in that space will be), but they also took a unique approach on what you could do with that information. People use Jott to essentially allow them to make notes to themselves any time inspiration strikes. I think that is compelling, and we’re interested in seeing where you can go with that idea.
I’ve interacted with a few of the Jott folks since the acquisition, and I can say that they are lovely people, like most Seattleites I know.