Tuesday, April 2, 2013

Introduction to Speech Capabilities in Windows Phone 8 – Part 1

After a long break I am writing a blog post again, and I hope to resume blogging the way I used to. A lot has happened in the past few months: I changed my job, got married and what not! Well, life!

Today I am going to share a few things about speech capabilities in Windows Phone 8. I haven’t talked about this for Windows Phone 7 to 7.5, because there were lots of limitations in this area in terms of both APIs and accuracy. With Windows Phone 8 things are totally different. Up to 7.5, speech was totally dependent on the Bing service, so a network or internet connection was mandatory. Now it works offline, without any data or internet connection.

Thanks to Microsoft for this improvement, and a little thanks to Microsoft MVPs like me! (little pat on the back) Surprised? Well, I was a volunteer and part of a secret mission, and I am proud to have been associated with it. Our contribution was small compared to the efforts of the Microsoft product group members, but it was recognized at the recent Microsoft TechEd 2013 in Pune, India by Sanket Akerkar, Managing Director, Microsoft India.

SanketAkerkar

Well, let’s come back to the main topic. I was planning to write one big article, but I now plan to break it into a few parts. So today let’s build a Hello World type app to understand the TTS (Text To Speech) capabilities.

Initial Work :

Open a brand new Windows Phone Project from Visual Studio 2012

Open

Choose Windows Phone OS 8.0

OSChoice

Design :

<phone:Pivot Title="Speech Capability">
    <!--Pivot item one-->
    <phone:PivotItem Header="howdy">
        <Button x:Name="TTSHowdy" Content="Hello World !" HorizontalAlignment="Left" Width="456" Height="87" VerticalAlignment="Top" Margin="0,82,0,0" Click="TTSHowdy_Click"/>
    </phone:PivotItem>
</phone:Pivot>

I am putting it in a Pivot because I want to demonstrate a couple more speech features within a single app. In your own design you can of course change the layout.

Namespace Required :

using Windows.Phone.Speech.Synthesis;
using Windows.Phone.Speech.Recognition;

C# Code :

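private async void TTSHowdy_Click(object sender, RoutedEventArgs e)
{
    // Speak a fixed greeting with the phone's default voice;
    // the using block releases the synthesizer afterwards.
    using (var tts = new SpeechSynthesizer())
    {
        await tts.SpeakTextAsync("Welcome to Microsoft TechEd India 2013 in Pune");
    }
}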

SpeakTextAsync is an async method with two overloads: one takes only the content to speak, and the other takes the content plus a user-state object. In the same way we can pass a long string or the text of a TextBlock to this method, and it will speak the content with the default voice installed on your phone.
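To illustrate passing on-screen text instead of a literal, here is a minimal sketch. The names SpeechDemo, txtMessage and SpeakTextBlock_Click are hypothetical and not part of the app above; it assumes the page XAML declares a TextBlock named txtMessage and a button wired to this handler. The try/catch is only a precaution, since the call may fail, for example when the speech capability is not enabled in the manifest.

```csharp
using System;
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Synthesis;

namespace SpeechDemo // hypothetical namespace
{
    public partial class MainPage : PhoneApplicationPage
    {
        // Hypothetical handler: speaks whatever the TextBlock "txtMessage" currently shows.
        private async void SpeakTextBlock_Click(object sender, RoutedEventArgs e)
        {
            string content = txtMessage.Text;
            if (string.IsNullOrEmpty(content))
            {
                return; // nothing to read out
            }

            try
            {
                using (var synth = new SpeechSynthesizer())
                {
                    // Single-argument overload: only the content to speak
                    await synth.SpeakTextAsync(content);
                }
            }
            catch (Exception ex)
            {
                MessageBox.Show("Could not speak the text: " + ex.Message);
            }
        }
    }
}
```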

Here is the output (on an actual device or the emulator, you can hear the sound):

howdy

Now, after this Hello World, let’s build another pivot item which will display as well as play all the voices installed on your phone. To showcase this, I am using a LongListSelector in my UI.

Design :

<phone:PivotItem Header="voices" DoubleTap="LoadTTSAllVoices">
    <phone:LongListSelector x:Name="llstNames" HorizontalAlignment="Left" Width="456" Height="232" VerticalAlignment="Top" Margin="0,3,0,0"/>
</phone:PivotItem>

C# Code :

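// Backing list for the LongListSelector
// (needs System.Collections.Generic and System.Linq)
List<string> lstVoices = new List<string>();

private async void LoadTTSAllVoices(object sender, RoutedEventArgs e)
{
    // Start fresh so a second double-tap does not duplicate entries
    lstVoices.Clear();

    // Get all the voices
    foreach (var voice in InstalledVoices.All)
    {
        lstVoices.Add(voice.DisplayName + ", " + voice.Language + ", " + voice.Gender);

        // Let the current voice introduce itself
        using (var text2speech = new SpeechSynthesizer())
        {
            text2speech.SetVoice(voice);
            await text2speech.SpeakTextAsync("Hello world! I'm " + voice.DisplayName + ".");
        }

        // Rebind with a copy so the list grows one voice at a time
        llstNames.ItemsSource = lstVoices.ToList();
    }
}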

Basically, this async method loops over the collection of installed voices and adds the details of each one to the List<string>. Once a voice is set with the SetVoice method, we use the same SpeakTextAsync method as above to read the text. After each voice has finished speaking, the updated list is bound to the LongListSelector again. So the app reads the content in every voice, and the voices appear in the list one after another.
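The same building blocks can also be used to pick one particular voice rather than playing all of them. The sketch below filters InstalledVoices.All by language and speaks with the first match. The namespace, the handler name SpeakInBritishEnglish_Click and the "en-GB" language tag are assumptions chosen for illustration; whether that voice exists depends on the speech languages installed on the phone, so the code checks for null first.

```csharp
using System.Linq;
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Synthesis;

namespace SpeechDemo // hypothetical namespace
{
    public partial class MainPage : PhoneApplicationPage
    {
        // Hypothetical handler: speaks with the first installed voice for a given language.
        private async void SpeakInBritishEnglish_Click(object sender, RoutedEventArgs e)
        {
            // "en-GB" is only an example tag; use any language installed on the phone
            VoiceInformation voice =
                InstalledVoices.All.FirstOrDefault(v => v.Language == "en-GB");

            if (voice == null)
            {
                MessageBox.Show("No voice is installed for that language.");
                return;
            }

            using (var synth = new SpeechSynthesizer())
            {
                synth.SetVoice(voice);
                await synth.SpeakTextAsync("Hello world! I'm " + voice.DisplayName + ".");
            }
        }
    }
}
```

The filter could just as well use voice.Gender or voice.DisplayName, the same properties shown in the list above.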

Here is the output (on an actual device or the emulator, you can hear the sound):

voices

I will post more interesting stuff in the upcoming parts; I am planning 2-3 more. Meanwhile you can try this out, keeping these points in mind:

1. Your PC/laptop speakers should be turned on to hear the voices.

2. There are no separate SDKs or tools to install; these speech APIs come by default with the Windows Phone SDK.

3. You need to turn on the Microphone and Speech capability options in WMAppManifest.xml, like this:

Capabilities 
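For reference, the same setting in the XML view of WMAppManifest.xml looks roughly like the fragment below. This is a sketch showing only the two speech-related entries; it assumes the rest of the file, including any other capabilities, is left as the project template generated it.

```xml
<Capabilities>
  <!-- Speech APIs (text to speech and speech recognition) -->
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
  <!-- Microphone access, used when the app listens to the user -->
  <Capability Name="ID_CAP_MICROPHONE" />
</Capabilities>
```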

So that is all I wanted to cover in Part 1. Part 2 is already in progress, and you can expect more deep-dive material on speech capabilities in the coming parts. Do try out the above capabilities, enjoy, and feel free to share your feedback.

Vikram.
