Raymond Sastraputera
HW5 Speech Recognition

Problem 1: Generating the model from the an4_train corpus features.
Problem 2: Testing the an4_test corpus features on the trained model.

Overall result:

SENTENCE RECOGNITION PERFORMANCE
sentences                     130
with errors         56.2% (  73)
with substitutions  51.5% (  67)
with deletions      14.6% (  19)
with insertions      6.2% (   8)

WORD RECOGNITION PERFORMANCE
Percent Total Error   = 17.7% ( 137)
Percent Correct       = 83.3% ( 644)
Percent Substitution  = 14.0% ( 108)
Percent Deletions     =  2.7% (  21)
Percent Insertions    =  1.0% (   8)
Percent Word Accuracy = 82.3%

The most frequent confusion pairs:
1: 11 -> o ==> oh
2: 5 -> d ==> t
3: 4 -> eight ==> h
4: 4 -> m ==> n
5: 4 -> n ==> m
6: 4 -> o ==> l

There are 7 insertions, each occurring the same number of times.

There are two main deletions:

1: 7 -> e
2: 6 -> a

Among substitutions, o is substituted far more often than any other word. This may be caused by ambiguity in the model for o, which in these cases is recognized as an incorrect word.

The most frequently substituted words:
1: 20 -> o
2: 7 -> a
3: 6 -> e
4: 6 -> eight

The words most often falsely recognized:
1: 14 -> oh
2: 12 -> t
3: 10 -> eighty

Summary:
The test results show that the largest share of errors comes from incorrect recognition of o (20 of the 108 substitutions). This may be due to insufficient o data in the training set, or to o being inherently similar to other models. To reduce the error rate, we can train with better or more training data containing o....
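
The word-level percentages in the overall result can be cross-checked from the raw counts. The sketch below assumes the usual convention that the number of reference words equals correct + substitutions + deletions, and that word accuracy is 100% minus the total error rate; the report does not state these definitions, and the function name `word_metrics` is only illustrative.

```python
def word_metrics(correct, substitutions, deletions, insertions):
    """Derive word-level percentages from alignment counts.

    Assumption (not stated in the report): every reference word is
    either correct, substituted, or deleted, so the reference length
    is the sum of those three counts. Insertions are extra hypothesis
    words and are not part of the reference length.
    """
    reference_words = correct + substitutions + deletions
    errors = substitutions + deletions + insertions

    def percent(count):
        return 100.0 * count / reference_words

    return {
        "reference_words": reference_words,
        "errors": errors,
        "percent_error": percent(errors),
        "percent_correct": percent(correct),
        "percent_substitution": percent(substitutions),
        "percent_deletion": percent(deletions),
        "percent_insertion": percent(insertions),
        # Accuracy penalizes insertions, so it is lower than percent_correct.
        "word_accuracy": 100.0 - percent(errors),
    }


# Counts taken from the WORD RECOGNITION PERFORMANCE block of the report.
metrics = word_metrics(correct=644, substitutions=108, deletions=21, insertions=8)

for name, value in metrics.items():
    if name.startswith("percent") or name == "word_accuracy":
        print(f"{name:22s} {value:5.1f}%")
    else:
        print(f"{name:22s} {value:5d}")
```

Under these assumptions the counts give 773 reference words and 137 errors, and the rounded percentages agree with the reported figures. Percent Correct (83.3%) is higher than Word Accuracy (82.3%) because only the latter counts the 8 insertions against the recognizer.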
