Engineering: Scoring technique
Limitations of strict interpretation of TIMIT labeling
- Vowel-vowel boundaries set by convention (1/2 and 1/2)
- Vowel-semivowel boundaries set by convention (2/3 and 1/3)
- Lexical knowledge influence
- Ambiguities, e.g. "year" as /y iy r/ or /y iy er/
Solution: Reference Vowel Landmarks (RVLMs)
- Generated from TIMIT aligned labeling
- Syllabified with TSYLB (ref Kahn 1980)
- Result is sequence of syllabic peaks and dips
- Extension for optional matching
  - Each RVLM can match between 1 and N DVLMs
    - For singleton vowels, N = 1
    - For sequences of vowels, N = number of vowels in the sequence
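A minimal sketch of how N could be derived from aligned labels, assuming each run of adjacent vowels becomes one RVLM. The vowel set, the `RVLM` class, and `build_rvlms` are illustrative names, not taken from the slides, and the TSYLB syllabification step that produces the dips is left out.

```python
from dataclasses import dataclass

# Assumed vowel inventory in TIMIT-style symbols; the slides do not list one.
VOWELS = {"iy", "ih", "eh", "ey", "ae", "aa", "aw", "ay", "ah", "ao",
          "oy", "ow", "uh", "uw", "ux", "er", "ax", "ix", "axr"}


@dataclass
class RVLM:
    start: float  # start time of the vowel run, in seconds
    end: float    # end time of the vowel run, in seconds
    n_max: int    # N: most DVLMs this RVLM can absorb without an insertion


def build_rvlms(labels):
    """Group each run of adjacent vowels in (phone, start, end) labels into one RVLM."""
    rvlms = []
    run = []  # (start, end) spans of the vowels in the current run

    def flush():
        # Close the current run, if any, and record it as a single RVLM.
        if run:
            rvlms.append(RVLM(run[0][0], run[-1][1], len(run)))
            run.clear()

    for phone, start, end in labels:
        if phone in VOWELS:
            run.append((start, end))
        else:
            flush()
    flush()
    return rvlms


# The two labelings of "year" from the slide; the times are made up.
year_r = [("y", 0.00, 0.08), ("iy", 0.08, 0.20), ("r", 0.20, 0.35)]
year_er = [("y", 0.00, 0.08), ("iy", 0.08, 0.20), ("er", 0.20, 0.35)]

print(build_rvlms(year_r))   # one RVLM with n_max == 1
print(build_rvlms(year_er))  # one RVLM with n_max == 2
```

Under this assumption both labelings of "year" give a single RVLM, and the /y iy er/ labeling simply allows one or two detected landmarks.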
Matching Detected Vowel Landmarks (DVLMs) to RVLMs
- For each RVLM, find all DVLMs between its bounding dips
  - If zero DVLMs, mark one deletion error
  - If 1 to N DVLMs, no error
  - If M > N DVLMs, mark M - N insertion errors
- Compute Token Error Rate (TER)
  - Deletion errors plus insertion errors
  - Expressed as a percentage of RVLMs
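The matching rules and the TER computation can be sketched as follows. This is an illustration under stated assumptions: each RVLM is a hypothetical `(left_dip, right_dip, n_max)` tuple, each DVLM is a single time in seconds, and a DVLM falling exactly on a dip is assigned to the window that starts there. The function name `score` is not from the slides.

```python
def score(rvlms, dvlm_times):
    """Count deletions and insertions, and return them with TER in percent.

    rvlms: list of (left_dip, right_dip, n_max) tuples (assumed layout).
    dvlm_times: times of the detected vowel landmarks.
    """
    if not rvlms:
        raise ValueError("TER is undefined without reference landmarks")

    deletions = 0
    insertions = 0
    for left_dip, right_dip, n_max in rvlms:
        # M: number of detected landmarks between the two dips.
        # The half-open interval is an assumption about boundary handling.
        m = sum(1 for t in dvlm_times if left_dip <= t < right_dip)
        if m == 0:
            deletions += 1            # nothing detected: one deletion
        elif m > n_max:
            insertions += m - n_max   # surplus detections: M - N insertions
        # 1 <= m <= n_max: no error

    ter = 100.0 * (deletions + insertions) / len(rvlms)
    return deletions, insertions, ter


# Made-up example: four RVLMs, the second one a two-vowel sequence.
rvlms = [(0.0, 0.3, 1), (0.3, 0.7, 2), (0.7, 1.0, 1), (1.0, 1.4, 1)]
dvlms = [0.1, 0.4, 0.5, 0.75, 0.8, 0.9]

print(score(rvlms, dvlms))  # (1, 2, 75.0)
```

In the example the third RVLM receives three detections against N = 1 (two insertions) and the fourth receives none (one deletion), so TER = 3 / 4 = 75%. Because insertions are counted against the number of RVLMs, TER can exceed 100%.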
Corpus
- TIMIT sentences whose sentence numbers end in 8 or 9
- Original train/test division kept
- Train: 619 utterances, 7578 RVLMs
- Test: 373 utterances, 4404 RVLMs