Phoneme - AI vowel

  • Question

  • Hi,

     

    I'm having trouble with one of the phonetic vowels - AI. The documentation says that the word "hive" should be written "H AI V", so the phoneme tag would be:

     

    statementActivity1.MainPrompt.AppendSsmlMarkup("Bees live in a bee <phoneme alphabet=\"x-microsoft-ups\" ph=\"H AI V\">hive</phoneme>");

     

    This sounds fine. However, we have a town, "Carlisle", where the "i" should be pronounced the same as it is in "hive". Speech Server is pronouncing the "i" as "ee" as in "keep", so I am trying to utilise the phoneme tag:

     

    statementActivity1.MainPrompt.AppendSsmlMarkup("The town is <phoneme alphabet=\"x-microsoft-ups\" ph=\"K AA L AI L\">Carlisle</phoneme>");

     

    However, it still pronounces the "i" as "ee" (UPS Label "I"). I checked the pronunciation using the lexicon pronunciation lookup and it suggested "K AA L AI L" as one of the options.

     

    One thing I have noticed in testing this is that if you have a valid value in the ph property, the text within the tag is completely ignored. So:

     

    statementActivity1.MainPrompt.AppendSsmlMarkup("Elephants are <phoneme alphabet=\"x-microsoft-ups\" ph=\"F AA S T\">slow</phoneme>");

     

    ...actually reads "Elephants are fast". So, to test it further I tried:

     

    statementActivity1.MainPrompt.AppendSsmlMarkup("Birds live in a <phoneme alphabet=\"x-microsoft-ups\" ph=\"H AI V\">nest</phoneme>");

     

    And it reads "Birds live in a nest" even though the ph is using the "hive" example from the documentation. This suggests that in the first example ("Bees live in a bee hive"), it was just reading the text also. Another example in the documentation is the word "price" and this seems to have a similar problem:

     

    statementActivity1.MainPrompt.AppendSsmlMarkup("The <phoneme alphabet=\"x-microsoft-ups\" ph=\"P RA AI S\">cost</phoneme> is £10");

     

    Reads "The cost is £10".
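
    The mismatched-text test used above can be generated for any transcription. The following is only a minimal sketch: `PhonemeProbe` and `Build` are hypothetical helper names, not part of Speech Server, and the only Speech Server call assumed is the `AppendSsmlMarkup` method already shown (left as a comment so the sketch runs on its own). `SecurityElement.Escape` is the standard .NET helper for escaping XML attribute and text values.

```csharp
using System;
using System.Security;

// Hypothetical helper (not part of Speech Server): builds a <phoneme> element
// whose inner text deliberately differs from the transcription, so you can
// hear whether the engine used the ph attribute or fell back to the text.
static class PhonemeProbe
{
    public static string Build(string alphabet, string ph, string decoyText)
    {
        return string.Format(
            "<phoneme alphabet=\"{0}\" ph=\"{1}\">{2}</phoneme>",
            SecurityElement.Escape(alphabet),
            SecurityElement.Escape(ph),
            SecurityElement.Escape(decoyText));
    }

    static void Main()
    {
        // Transcriptions from the examples above, each paired with a decoy word.
        string[,] probes = {
            { "H AI V", "nest" },
            { "P RA AI S", "cost" },
            { "F AA S T", "slow" }
        };

        for (int i = 0; i < probes.GetLength(0); i++)
        {
            string markup = Build("x-microsoft-ups", probes[i, 0], probes[i, 1]);
            Console.WriteLine(markup);
            // In a TurnStarting handler the markup would be passed on, e.g.:
            // statementActivity1.MainPrompt.AppendSsmlMarkup("Test " + markup);
        }
    }
}
```

    If the decoy word is what you hear, the engine presumably did not accept the transcription in ph.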

     

    Is this a bug or am I using this incorrectly?

     

    I'm running version 2.0.61285.0 with the SIP Debugging Phone.

     

    Thanks in advance,

     

    Kev

    Thursday, May 17, 2007 3:28 PM

All replies

  • Hi Kevin,

     

    It looks like you have two questions here. Regarding the first (Carlisle), I'm having some trouble reproducing it. Are you using a language pack other than en-us?  Strangely enough, for me Carlisle is pronounced correctly without the phoneme tag, but I can alter the tag to get bad pronunciations (e.g. I can set it to "K AA L I L" to reproduce what I think you're hearing).

     

    Regarding your second question, you are correct that the text in the ssml tag is ignored. The documentation states that:

     

    "Even though the TTS engine ignores the content of the ssml:phoneme element (instead producing a pronunciation of the string specified in the ph attribute), place the text content in the element so that devices without speech capability can still render something intelligible in place of speech."

    Friday, May 18, 2007 3:55 PM
  • I'm not sure I believe the statement from the docs in all cases. The following code shows the TTS engine first ignoring the plain text in the <phoneme> element, and then using the plain text alternative in that element when it is unable to pronounce the AI phoneme, which leads me to think that the TTS engine might have a problem with the AI phoneme, as Kevin mentioned earlier.

    Here's the body of a TurnStarting handler for a StatementActivity:

     

    statementActivity1.MainPrompt.AppendSsmlMarkup("The <phoneme alphabet=\"x-microsoft-ups\" ph=\"L O + UH F\">life</phoneme>");

    // Says "The loaf": uses the UPS transcription in the ph attribute and ignores the plain text of the <phoneme> element.

    statementActivity1.MainPrompt.AppendSsmlMarkup("The <phoneme alphabet=\"x-microsoft-ups\" ph=\"L AI F\">loaf</phoneme>");

    // Says "The loaf": uses the plain text of the <phoneme> element and ignores the UPS transcription in the ph attribute.

    Friday, May 18, 2007 7:41 PM
    I'm using the English UK language pack, Shane.

     

    I've tried the code samples you attached, Mark, and I'm getting exactly the same results as you, which to me points to an issue with the AI phoneme. Which language pack do you have installed?

     

    Thanks,

     

    Kev

    Monday, May 21, 2007 8:46 AM
  • Kevin,

    I'm using the en-us lang pack, which I would imagine has the same or similar phonemes as en-uk, which you are using.

     

    As it turns out, the TTS engine (which I believe was acquired from Nuance) uses a subset of the UPS phone set. We're still looking into this, so we'll keep you posted.

    Mark

    Monday, May 21, 2007 7:34 PM
  • Hi Mark,

     

    Sorry to be the bearer of bad news but I'm having trouble with a couple of the other vowels too. In the documentation, the examples for rose (R O + U Z) and lake (L EI K) don't seem to work if you do something like this:

     

    The flower is a <phoneme alphabet=\"x-microsoft-ups\" ph=\"R O + U Z\">daisy</phoneme>

     

    (says "The flower is a daisy")

     

    I swam in the <phoneme alphabet=\"x-microsoft-ups\" ph=\"L EI K\">river</phoneme>

     

    (says "I swam in the river")

     

    Obviously I'm just using the above as examples; there are real-life situations where we would use this functionality. I promise you I'm not just being "difficult"! :o)

     

    Kind regards,

     

    kev

    Thursday, May 24, 2007 3:49 PM
  • Kevin,

    No, I don't think you're being difficult. We appreciate hearing about things that aren't working correctly. Through my own research and the efforts of others, we've found that, with UPS, several vowel phones don't work in the Nuance ScanSoft voices available in the en-us language pack (and probably the en-gb pack as well).

     

    We tried using a different voice, namely Microsoft Samantha, which is one of the desktop voices. With UPS as the alphabet, Samantha accepted the AI diphthong but pronounced it incorrectly as "ee." With IPA, however, Samantha pronounced this sound correctly. The prompt below renders as "The life." The 3rd character in the ph attribute is Latin Letter Small Capital I, U+026A. One way to enter it into your code is to find it in Character Map, select it, copy it, and then paste it into the ph attribute string. The editor for Visual Studio needs to be set up to use a Unicode font; I used Lucida Sans Unicode.

     

    Code Snippet

    statementActivity1.MainPrompt.AppendSsmlMarkup("The <phoneme alphabet=\"ipa\" ph=\"laɪf\">loaf</phoneme>");
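
    If pasting the character from Character Map is awkward, the same string can be written with a C# Unicode escape, which keeps the source file plain ASCII and does not depend on the editor font. This is only a sketch: the `IpaEscapeDemo` class is a hypothetical wrapper that makes the snippet runnable on its own, and the commented-out call is the one from the snippet above.

```csharp
using System;

// Hypothetical wrapper class, only here so the sketch compiles and runs alone.
static class IpaEscapeDemo
{
    static void Main()
    {
        // \u026A is LATIN LETTER SMALL CAPITAL I. A \u escape always takes
        // exactly four hex digits, so the trailing 'f' is a separate character.
        string ph = "la\u026Af";

        string markup = "The <phoneme alphabet=\"ipa\" ph=\"" + ph + "\">loaf</phoneme>";

        Console.WriteLine(ph.Length);             // 4 (l, a, U+026A, f)
        Console.WriteLine(ph[2] == '\u026A');     // True
        Console.WriteLine(markup.Length);         // markup is what would be sent to the prompt

        // In the TurnStarting handler:
        // statementActivity1.MainPrompt.AppendSsmlMarkup(markup);
    }
}
```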

     

    The fellow I've been working with has opened a bug to fix the phone maps in the Nuance voices. I'll forward him the other diphthongs you had problems with.

    Probably the easiest alternative (and one I think you said you had tried) is to create a lexicon of the words whose pronunciations you want to control.

    Thanks,
    Mark 

    Friday, May 25, 2007 10:44 PM
  • Hi Mark,

     

    Thanks very much for all your efforts on this, very much appreciated. I will try using IPA and see how that pans out.

     

    Thanks again,

     

    Kevin

    Tuesday, May 29, 2007 6:44 AM