Java Buzz Forum - Playing with Speech Recognition


Elliotte Rusty Harold

Elliotte Rusty Harold is an author, developer, and general kibitzer.

Playing with Speech Recognition

Posted: Jun 29, 2008 9:20 AM

This post originated from an RSS feed registered with Java Buzz by Elliotte Rusty Harold.
Original Post: Playing with Speech Recognition
Feed Title: Mokka mit Schlag
Feed URL: http://www.elharo.com/blog/feed/atom/?
Feed Description: Ranting and Raving

Can we type in WordPad? Yes we can.

I’m testing out Windows speech recognition. I last used speech recognition about 10 years ago on the Macintosh with a now-defunct product called Power Secretary. I even wrote an entire book using Power Secretary: the first edition of Java Network Programming. However, I gave up on it fairly quickly because it was simply too difficult.

The first thing I noted when trying out Windows speech recognition today was that it doesn’t seem to work in Firefox. I have to dictate into WordPad and then copy the results into Firefox to post them on the blog here.

Windows speech recognition on my new 2.2 GHz dual-core Dell system with a couple of gigabytes of memory is much more accurate than Power Secretary ever was, even with minimal training. Even when I get something wrong, it’s much easier to correct it than correcting mistakes in Power Secretary was. (The word “was” does seem to confuse Windows speech recognition fairly frequently. I’ve had to correct it several times in this article already.)

I can actually type fairly quickly despite talking like this. Having to stop at the end of every sentence just to insert the punctuation marks, though, is tricky. I can tell I’m going to have to do a lot of editing on this article to make it worthy of publication. Or perhaps I should just leave it in its unedited, uncorrected form. The punctuation will probably get better as I remember and learn to think that way, which I don’t normally do with my voice. One of the things that is not normally considered in working with speech recognition is that speaking is a different way of thinking than talking (sorry, than writing).

I can already see that this article is going to be rather poor compared to the ones that I compose by typing them, and I think that has more to do with the way one thinks when speaking vs. writing than with a failure of the speech recognition system to accurately transcribe my words.

Besides the enhanced accuracy, one thing I’m noticing about Windows speech recognition is that the corrections, and just the whole user interface, are much more fluid and much more intuitive than the old Power Secretary commands ever were. For example, when I need to spell a word, as I did with “intuitive” in the previous sentence because speech recognition recognized it as “in to it if”, I simply spell out the letters, I space. In Power Secretary, I had to use the radio alphabet: instead of saying A, I would say Alpha; instead of saying B, I would say Bravo; instead of saying C, I would say Charlie; etc.

It’s pretty obvious that speech recognition has come a hell of a long way in the last 12 years. I’m not even giving it an especially fair test. I’m using a fairly crappy microphone that I got for essentially free. And I’m also using what is known not to be the best speech recognition program available. Pretty much all reviews agree that Nuance Dragon NaturallySpeaking is the superior program, and if I really want to start doing this then I should probably buy a copy of that.

Still, I could see getting used to this. I’ll have to learn to think more clearly and to speak articles rather than writing them. I suspect they would still benefit heavily from a full edit cycle with the mouse and a keyboard rather than with my voice. I think you can see that from the rather disconnected stream-of-consciousness approach you find in this article itself. I might also have to consider getting an office with a door, because I’m really not sure my wife wants to listen to me talk through each of my individual articles. It may also be relatively challenging to speak more code-heavy articles like the ones I write for java.net. Nonetheless, as a 20 words per minute typist at best, I can certainly crank out the text much, much faster using voice recognition than I can while typing.
