Phonetic Similarity Matching of Non-Literal Transcripts in Automatic Speech Recognition (2009)
Abstract (truncated to 115 words)
Large vocabulary continuous speech recognition (LVCSR) systems require large amounts of labelled audio data for training. While literal transcriptions of audio recordings, i.e., highly accurate textual reproductions of the utterances, are expensive and therefore only available in limited amounts, non-literal field data from commercial automatic dictation systems can be collected on a large scale but with quality limitations. Automatic draft transcriptions from the dictation system contain misrecognitions, and the manual corrections of the draft transcriptions, produced by professional transcriptionists, have been reformulated to comply with stylistic guidelines. In this work, phonetic similarity matching is utilised to bridge this gap between literal and non-literal text resources such that large amounts of non-literal transcripts can be ...
speech communication – automatic speech recognition – phonetic similarity – edit distance – pronunciation modelling – dictation
Information
- Author
- Petrik, Stefan
- Institution
- Graz University of Technology
- Supervisors
- Publication Year
- 2009
- Upload Date
- Aug. 31, 2025