Phonetic Similarity Matching of Non-Literal Transcripts in Automatic Speech Recognition

Abstract

Large vocabulary continuous speech recognition (LVCSR) systems require large amounts of labelled audio data for training. While such literal transcriptions of audio recordings, i.e., highly accurate textual reproductions of the utterances, are expensive and therefore only available in limited amounts, non-literal field data from commercial automatic dictation systems can be collected on a large scale, but with quality limitations. Automatic draft transcriptions from the dictation system contain misrecognitions, and the manual corrections of the draft transcriptions produced by professional transcriptionists have been reformulated to comply with stylistic guidelines. In this work, phonetic similarity matching is utilised to bridge this gap between literal and non-literal text resources such that large amounts of non-literal transcripts can be ...
speech communication – automatic speech recognition – phonetic similarity – edit distance – pronunciation modelling – dictation

Information

Author

Petrik, Stefan

Institution

Graz University of Technology

Publication Year

2009

Upload Date

Aug. 31, 2025
