Geeks With Blogs
Ulterior Motive Lounge UML Comics and more from Martin L. Shoemaker (The UML Guy),
Offering UML Instruction and Consulting for your projects and teams.
I wrote Dee Jay as an example for a proposed talk for the Ann Arbor Day of .NET, and as a way to learn more about the Managed Speech API in Microsoft Windows Vista. Dee Jay works with M-SAPI and Windows Media Player to give you a totally voice-controlled way to play your music. You simply say a command like "Dee Jay, play some Dire Straits", and it searches your song catalog for songs by Dire Straits, picks one, and plays it. Or you can name a specific title, or even a genre. If there are multiple matches for a given name or title, Dee Jay will list them until you choose one by saying "Play." And there are a number of other commands, which you can learn by saying "What can I say?"

Now Dee Jay is available as a free download. Just download the zip file, unzip it, and run Setup.exe. I can't promise any support for it right now, but I can try to answer questions. And I look forward to your feedback. I'm already enjoying the freedom of voice-controlled music on my daily commute, and I hope you will enjoy it, too!

Now to forestall the obvious first questions... No, it doesn't work on any OS but Vista (or if it does, it's news to me). It doesn't work with any media software but Windows Media Player. I wrote this code for a demo for a one-hour presentation. It had to be simple; and with Vista, Microsoft has made speech recognition programming extremely simple. While I've been thinking about this program for about three weeks, I wrote the actual code in my spare time over the past week. And I billed 62 hours this week, plus probably 8 hours of travel, so there wasn't a lot of spare time. And of that coding time, over 75% was spent writing code to catalog your music library! The speech code was so easy, it felt like cheating. (I programmed .NET speech recognition with SAPI 5.1. Now that was a challenge. I would've needed weeks, maybe months to do this same work with SAPI 5.1.)

This is why I upgraded to Vista: not for Dee Jay, but for the ability to write Dee Jay and other voice-controlled applications. There have been pretty decent commercially available speech recognition tools out there for a while, but they were a royal pain to program. With Vista, writing speech applications just got as easy as writing desktop applications (and the recognition accuracy took a giant leap, too). Designing a good speech grammar and a good conversation model takes some work (maybe even some UML to think through it), but implementing that design is nearly effortless. I'll be exploring the code in subsequent blog posts; but for those who don't want the gory techie details, just download Dee Jay, start it up, and say "What can I say?" Dee Jay will talk you through the rest.

It's a great time to be a programmer!

(P.S. If anyone has Vista and a really large song library, I would be curious to know how long the Dee Jay catalog takes to build. My catalog loads in less than a second, but I've only got 135 albums.)

UPDATE: In response to a question from Ben Day, I've added this list of the Dee Jay commands. Note that you can change Dee Jay's name, so replace "Dee Jay" with your chosen name in these commands.


  • Dee Jay, Play MUSICKEY. Plays a song, an album, or a named collection. Replace MUSICKEY with a phrase that identifies the music you want. (See below for details on MUSICKEY.) If there are multiple matches for the MUSICKEY, Dee Jay lists them one at a time, giving you a chance to say "Play" (which also ends the list), "Back up", "Next", or "Cancel".
  • Dee Jay, Play Some MUSICKEY. Dee Jay picks one song from the MUSICKEY at random.
  • Dee Jay, Play Any MUSICKEY. Same as Play Some.
  • Dee Jay, Play All MUSICKEY. Plays all songs from a MUSICKEY, in a random order.
  • Dee Jay, Add MUSICKEY. Adds a single song to the current playlist.
  • Dee Jay, Add Some MUSICKEY. Dee Jay adds one song from the MUSICKEY at random to the current playlist.
  • Dee Jay, Add Any MUSICKEY. Same as Add Some.
  • Dee Jay, Add All MUSICKEY. Adds all songs from a MUSICKEY to the current playlist, in a random order.
  • Dee Jay, Pause. Pauses play.
  • Dee Jay, Resume. Resumes play.
  • Dee Jay, Next. Skips to the next song in the play list.
  • Dee Jay, Back. Jumps to the previous song in the play list.
  • Dee Jay, 5 Stars. Rates the current song as 5 stars. Other commands (of course) are 4 Stars, 3 Stars, 2 Stars, and 1 Star.
  • Dee Jay, Louder. Raise volume by 10%.
  • Dee Jay, Softer. Lower volume by 10%.
  • Dee Jay, Hush. Drop volume to 10%.
  • Dee Jay, Shout. Raise volume to 100%.
  • Dee Jay, About. Describe Dee Jay and its current version.
  • Dee Jay, Exit. Exit Dee Jay.
  • Dee Jay, Hello. Dee Jay greets you.
  • Dee Jay, Rescan. Looks for new music.
  • Dee Jay, What's playing? Identifies the current song.
  • Dee Jay, Rename NAME. Changes the name Dee Jay responds to. Replace NAME with your Dee Jay name.
  • Dee Jay, Reset Name. Changes the name back to Dee Jay.
  • Reset Name. Same as Dee Jay, Reset Name. I figured people might forget their Dee Jay name and need a way to default it.
  • Dee Jay, What can I say? Describes the commands.
  • Dee Jay, Help. Same as Dee Jay, What can I say?
  • What can I say? Same as Dee Jay, What can I say?
  • Help. Same as Dee Jay, What can I say?
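
The command list above has a regular shape: a name, a verb, an optional Some/Any/All modifier, and then a MUSICKEY or other argument. As a rough illustration of that shape, here is a minimal Python sketch that splits already-recognized text into those parts. It is not Dee Jay's actual .NET code; the names `Command` and `parse_command` are hypothetical, and real speech grammars would do this work inside the recognizer.

```python
from typing import NamedTuple, Optional

# Verbs that take a MUSICKEY, and the optional modifiers from the list above.
MUSIC_VERBS = ("play", "add")
MODIFIERS = ("some", "any", "all")
# Commands the list says also work without the Dee Jay name.
NAMELESS = ("reset name", "what can i say", "help")


class Command(NamedTuple):
    verb: str
    modifier: Optional[str] = None
    argument: Optional[str] = None


def parse_command(utterance: str, name: str = "Dee Jay") -> Optional[Command]:
    """Split recognized text into verb, modifier, and argument (hypothetical helper)."""
    text = utterance.strip().rstrip("?.!").strip()
    lowered = text.lower()
    prefix = name.lower()
    if lowered.startswith(prefix):
        # Drop the name and any comma or spaces that follow it.
        text = text[len(prefix):].lstrip(" ,")
    elif lowered not in NAMELESS:
        return None  # not addressed to Dee Jay
    words = text.split()
    if not words:
        return None
    verb = words[0].lower()
    rest = words[1:]
    if verb in MUSIC_VERBS:
        modifier = None
        if rest and rest[0].lower() in MODIFIERS:
            modifier = rest[0].lower()
            rest = rest[1:]
        if modifier == "any":
            modifier = "some"  # "Any" is described as the same as "Some"
        return Command(verb, modifier, " ".join(rest) or None)
    if verb == "rename":
        return Command("rename", None, " ".join(rest) or None)
    # Everything else is treated as a fixed phrase, e.g. "pause" or "5 stars".
    return Command(text.lower())
```

Calling `parse_command("Dee Jay, Play Some Dire Straits")` yields the verb `play`, the modifier `some`, and the argument `Dire Straits`; passing `name="..."` models a renamed Dee Jay.
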
A MUSICKEY is a phrase which helps identify a song, an album, or a collection. (It also ought to identify play lists, but I forgot to implement that.) Dee Jay scans your music library and finds the following information for each song (not every song has all of these fields):

  • Title. This doesn't form a collection (see below for collections), but is used to uniquely identify a song. (What if two songs have the same name? See below...)
  • Album. This doesn't form a collection, but is used to identify all songs in a single album.
  • Author.
  • Artist.
  • Composer.
  • Conductor.
  • Publisher.
  • Category. No, I don't know what this means; but it's one of the fields Media Player will report.
  • Genre.
  • Language.
  • Mood. Another one that Media Player reports, but I don't know where it's defined.
  • Period. Another one that Media Player reports, but I don't know where it's defined.
  • User Rating. This is on a 0 to 100 scale; but I convert it to 1 to 5 stars, like the Media Player UI does. This is supposed to define 5 different collections; but honestly, I haven't rated enough of my songs to test it yet.
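
The rating conversion mentioned in the last item can be sketched as a simple bucketing function. This is only an illustration in Python, not Dee Jay's code: the function name is hypothetical, the bucket boundaries are an assumption rather than values taken from the post, and treating 0 as "unrated" is also an assumption.

```python
def stars_from_rating(rating: int) -> int:
    """Map a 0-100 user rating to 0-5 stars, where 0 means unrated.

    The bucket boundaries below are assumed for illustration only.
    """
    if not 0 <= rating <= 100:
        raise ValueError("rating must be between 0 and 100")
    if rating == 0:
        return 0  # assumption: 0 means the song has not been rated
    # Assumed upper bounds for 1, 2, 3, and 4 stars; anything higher is 5.
    for stars, upper in enumerate((12, 37, 62, 86), start=1):
        if rating <= upper:
            return stars
    return 5
```

Each star value would then name one collection, so a song rated 75 on the 0 to 100 scale would land in the 4-star collection under these assumed boundaries.
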
Except for Title and Album (as described above), each of these fields is used to define collections of related songs, one collection per value. So for example, my library includes songs by Pat Benatar, Kronos Quartet, and Adrianna Culcanhotto (among others); and it also includes comedy albums by Bill Cosby and Bob Newhart. From these examples, Dee Jay would create the following collections:


  • Pat Benatar.
  • Rock.
  • Kronos Quartet.
  • Classical.
  • Adrianna Culcanhotto.
  • World.
  • Bill Cosby.
  • Bob Newhart.
  • Comedy.
It would create a lot of other collections as well, for publisher, composer, star rating, etc. Then all collections, songs, and albums are entered into a phrase map which will recognize a particular phrase and find the corresponding music.
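
To make the phrase map idea concrete, here is a minimal Python sketch, assuming each song is a plain dictionary keyed by the field names listed earlier. It is not Dee Jay's actual .NET implementation; `build_phrase_map` and `COLLECTION_FIELDS` are hypothetical names, and the star-rating collections are left out to keep it short.

```python
from collections import defaultdict

# Fields that define collections; Title and Album are handled separately.
COLLECTION_FIELDS = ("Author", "Artist", "Composer", "Conductor", "Publisher",
                     "Category", "Genre", "Language", "Mood", "Period")


def build_phrase_map(songs):
    """Map each lowercased phrase to the list of song titles it selects."""
    phrase_map = defaultdict(list)

    def add(phrase, title):
        key = phrase.lower()
        if title not in phrase_map[key]:  # avoid duplicates, e.g. a title track
            phrase_map[key].append(title)

    for song in songs:
        title = song["Title"]
        add(title, title)  # a title identifies the song itself
        if song.get("Album"):
            add(song["Album"], title)  # an album identifies all of its songs
        for field in COLLECTION_FIELDS:
            if song.get(field):
                add(song[field], title)  # one collection per field value
    return dict(phrase_map)
```

With a library that contains one track each by Bill Cosby and Bob Newhart, both tagged with the genre Comedy, the key `"comedy"` would map to both tracks while each artist's name would map to just one.
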

Note also that, thanks to the magic of M-SAPI, you don't have to precisely match phrases in the phrase map. You simply have to get some of the non-articles right and in sequence. If you have the song "After All [Love Theme from Chances Are]", no user is going to remember that whole title (I can't, and it was Sandy's and my wedding song); but they don't have to. Dee Jay will recognize any of these phrases as possible matches for that title:


  • After All [Love Theme from Chances Are].
  • After All.
  • Chances Are.
  • Love Theme from Chances Are.
  • Love Theme.
  • Theme from Chances.

But it won't recognize a jumbled phrase, like "After Are All Chances Love". (M-SAPI does include a mode which would recognize that; but I decided that it was better to require the user to get the words in the right sequence. Otherwise, a lot of songs with similar titles can too easily be confused.)
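
The matching rule described above (some of the non-article words, in the right order) can be approximated in a few lines of Python. This is a sketch of the behavior only, not of how M-SAPI implements it, and the function names are hypothetical. It assumes "articles" means a, an, and the.

```python
import re

ARTICLES = {"a", "an", "the"}


def content_words(phrase: str) -> list[str]:
    """Lowercase the phrase, split it into words, and drop articles."""
    words = re.findall(r"[a-z0-9']+", phrase.lower())
    return [w for w in words if w not in ARTICLES]


def matches_in_order(spoken: str, title: str) -> bool:
    """True if the spoken content words appear in the title in the same order."""
    wanted = content_words(spoken)
    if not wanted:
        return False
    remaining = iter(content_words(title))
    # `word in remaining` consumes the iterator up to the match,
    # so later words can only match later positions in the title.
    return all(word in remaining for word in wanted)
```

Under this sketch, every phrase in the list above matches "After All [Love Theme from Chances Are]", while the jumbled "After Are All Chances Love" does not, because "All" never appears after "Are" in the title. Note that the sketch also accepts phrases that skip words in the middle, which is one reading of "some of the non-articles"; the real recognizer may be stricter.
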
Posted on Saturday, November 15, 2008 4:28 PM | .NET, M-SAPI


Comments on this post: Dee Jay: A Voice-Controlled Juke Box for Windows Vista!

# re: Dee Jay: A Voice-Controlled Juke Box for Windows Vista!
Hi Martin,

I'd love to try DeeJay out - I've been trying to kludge something similar using WSR macros and not really succeeding. Is there an alternate link to the file? The inkon website appears to be down.

Thanks
Left by VR on Dec 04, 2008 10:44 PM

# re: Dee Jay: A Voice-Controlled Juke Box for Windows Vista!
VR,

I'll try to fix that link. Thanks!
Left by Martin L. Shoemaker on Dec 05, 2008 2:15 PM



Copyright © Martin L. Shoemaker | Powered by: GeeksWithBlogs.net