Tuesday, April 2, 2013

Introduction to Speech Capabilities in Windows Phone 8 – Part 2

I hope you enjoyed my last article on speech capabilities in Windows Phone 8. Today I am posting the next part, or you could say a small extension of what I did in Part 1.

In the first part we saw how to incorporate the built-in speech capability using the Speech APIs in the Windows Phone 8 SDK, and how they have an edge over earlier Windows Phone releases (7 and up). We saw a simple Hello World kind of demo. Today I am going to demonstrate how we can leverage SSML (Speech Synthesis Markup Language) using the Speech APIs in Windows Phone.

What is SSML ? :

As per W3C, SSML can be defined as :

SSML is part of a larger set of markup specifications for voice browsers developed through the open processes of the W3C. It is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications.

Possible Scenarios of SSML Implementation :

SSML is very useful in a multilingual app where you need text-to-speech for content in different languages. It also provides high-level control over pronunciation, choice of language, male or female voice and so on, with the help of the tags defined in SSML.

So let's see a simple demo of incorporating SSML in Windows Phone 8. How you then use it in your own app is your call!

Namespaces :

using Windows.Phone.Speech.Synthesis;

Design (XAML) :

<phone:PivotItem Header="SSML" DoubleTap="LoadSSML">
                <TextBlock x:Name="TTSSSML" HorizontalAlignment="Left" Height="500" Margin="33,26,0,0" TextWrapping="Wrap" VerticalAlignment="Top" Width="389"/>
</phone:PivotItem>

C# Code :

private async void LoadSSML(object sender, RoutedEventArgs e)
{ … }

I am using an async method here which has two parts: the first simply displays the markup text in the TextBlock, and the second actually reads that SSML markup aloud using the speech synthesizer. Here is the first part :

//Speech Synthesis Markup Language for Display
          TTSSSML.Text = @"<speak version=""1.0""
           xmlns=""http://www.w3.org/2001/10/synthesis"" xml:lang=""ja-JP"">
           <voice gender=""male"">       
               趣味は日本語を勉強することです
               趣味はいろんな新しい食べ物に挑戦することです
               パソコンいじりが得意なので、何か手伝えることがありましたら声をかけて下さい。               
           </voice>                       
           </speak>";

Here you can see the SSML markup. I admit I am not an SSML expert; I took this piece of SSML from some research on the internet and spent a little time converting it to Japanese instead of keeping it in plain English (I can actually read and write Japanese :) ..but that's a different story).

In your scenario, all you need to do is change the xml:lang attribute from "ja-JP" to your own language, such as en-US, and try it out with content in that language. You can also switch the voice between male and female with the gender attribute of the voice element: <voice gender="male"> or <voice gender="female">.

All of this assumes that speech is enabled on your phone and that you have enabled the speech capability in the manifest file, as I demonstrated in my first article. The rest is just routine coding, nothing else.

Now here is part two of the snippet. After looking at it, you will realize that I have hardly made any changes here :

//Actual Speech in Japanese Language using SSML
            var ttsJP = new SpeechSynthesizer();
            await ttsJP.SpeakSsmlAsync(@"<speak version=""1.0""
            xmlns=""http://www.w3.org/2001/10/synthesis"" xml:lang=""ja-JP"">
            <voice gender=""male"">       
                趣味は日本語を勉強することです
                趣味はいろんな新しい食べ物に挑戦することです
                パソコンいじりが得意なので、何か手伝えることがありましたら声をかけて下さい。
            </voice>                      
            </speak>");
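
The two parts can also be put together in a single handler. The following is only a minimal sketch of one way to do that, not the exact code of the demo: it keeps the markup in one string (the variable name ssml is illustrative) so the displayed text and the spoken text cannot drift apart, and it wraps the call in a try/catch on the assumption that synthesis can fail, for example when no voice for the requested language is available. The page name MainPage is also an assumption; the class must sit in the same namespace as the XAML page that hosts the TTSSSML TextBlock.

```csharp
using System;
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Synthesis;

// "MainPage" is an assumed name; use the code-behind class of the page
// that hosts the pivot, inside your project's namespace.
public partial class MainPage : PhoneApplicationPage
{
    private async void LoadSSML(object sender, RoutedEventArgs e)
    {
        // One copy of the markup, used for both display and speech.
        string ssml = @"<speak version=""1.0""
            xmlns=""http://www.w3.org/2001/10/synthesis"" xml:lang=""ja-JP"">
            <voice gender=""male"">
                趣味は日本語を勉強することです
            </voice>
            </speak>";

        // Part one: show the markup in the TextBlock.
        TTSSSML.Text = ssml;

        // Part two: read the same markup aloud.
        try
        {
            var ttsJP = new SpeechSynthesizer();
            await ttsJP.SpeakSsmlAsync(ssml);
        }
        catch (Exception ex)
        {
            // Assumption: any synthesis failure is simply reported to the user.
            MessageBox.Show("Speech failed: " + ex.Message);
        }
    }
}
```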

All set! Now just press F5 and enjoy! Here are a few screenshots to help you visualize how it will look on the device.

English version of SSML :

[Screenshot: SSML]

Japanese version of SSML :

[Screenshot: JPSSML]
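
To try another language or gender, as suggested earlier, the markup does not have to be edited by hand each time. Below is a hedged sketch of a small helper; the class SsmlHelper and its methods BuildSsml and HasVoiceFor are hypothetical names for this illustration, not part of the SDK. It assumes the project references System.Xml.Linq, and it uses InstalledVoices.All from the Speech APIs to check whether a voice for the language is installed before speaking.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using Windows.Phone.Speech.Synthesis;

// Hypothetical helper; the names here are illustrative only.
public static class SsmlHelper
{
    private static readonly XNamespace Ssml =
        "http://www.w3.org/2001/10/synthesis";

    // Builds a <speak> document for a language tag such as "en-US",
    // a gender of "male" or "female", and the text to be spoken.
    // XElement escapes characters like & and < inside the text.
    public static string BuildSsml(string language, string gender, string text)
    {
        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", language),
            new XElement(Ssml + "voice",
                new XAttribute("gender", gender),
                text));
        return speak.ToString();
    }

    // Returns true when at least one installed voice reports the language tag.
    public static bool HasVoiceFor(string language)
    {
        return InstalledVoices.All.Any(v =>
            string.Equals(v.Language, language, StringComparison.OrdinalIgnoreCase));
    }
}
```

Inside an async handler it could then be used like this: check SsmlHelper.HasVoiceFor("en-US") first, and if it returns true, call await new SpeechSynthesizer().SpeakSsmlAsync(SsmlHelper.BuildSsml("en-US", "female", "Hello from Windows Phone 8")).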

That's all! I hope you liked this part. So far, in both parts, we have seen the text-to-speech capability in a nutshell. In my next article, which might be the last in this short speech capability series, I am going to talk about speech to text. After these parts, I will move on to Maps for a while and then come back with a few more interesting, deep-dive articles. Till then, enjoy Windows Phone 8!

Vikram.
