We are a team at Carnegie Mellon University doing research on speech recognition. While looking at the data in the RecognitionResult of the SpeechRecognitionEngine, we have the following questions about the Alternates property and the confidence score of each RecognizedPhrase in Alternates:
We noticed that the top-1 choice in the Alternates property always has the highest confidence score, but the rest of the list is not sorted by confidence.
- What is the right way to think about the rank of hypotheses in the Alternates property?
- What is the right way to think about confidence scores?
- Which is more informative, rank or confidence? For example, hypothesis B is in position 10 in the Alternates property but has a higher confidence score than hypothesis C, which is in position 2. Which hypothesis is more likely to be correct?
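
To make the last question concrete, here is a minimal sketch in Python rather than against the .NET API. The hypothesis texts and confidence values are invented for illustration and are not real recognizer output, and `rank_confidence_inversions` is a hypothetical helper name. It lists every pair of positions where a lower-ranked alternate has a higher confidence than a higher-ranked one, which is the situation we are asking about.

```python
from typing import List, Tuple

# Hypothetical alternates, in the order the recognizer returned them.
# Texts and confidence values are invented for illustration only.
alternates: List[Tuple[str, float]] = [
    ("recognize speech", 0.91),
    ("wreck a nice beach", 0.42),
    ("recognise peach", 0.67),
    ("wreck an ice beach", 0.55),
]


def rank_confidence_inversions(alts: List[Tuple[str, float]]) -> List[Tuple[int, int]]:
    """Return 1-based position pairs (upper, lower) where the alternate at
    the lower-ranked position has a strictly higher confidence score."""
    inversions = []
    for i in range(len(alts)):
        for j in range(i + 1, len(alts)):
            if alts[j][1] > alts[i][1]:
                inversions.append((i + 1, j + 1))
    return inversions


for upper, lower in rank_confidence_inversions(alternates):
    print(
        f"position {lower} ({alternates[lower - 1][1]:.2f}) outscores "
        f"position {upper} ({alternates[upper - 1][1]:.2f})"
    )
```

With this made-up list, position 1 has the highest confidence, matching what we observe, while positions 3 and 4 both outscore position 2. Our question is how such pairs should be interpreted.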