Phonetic Similarity Matching of Non-Literal Transcripts in Automatic Speech Recognition
Large vocabulary continuous speech recognition (LVCSR) systems require large amounts of labelled audio data for training. Literal transcriptions of audio recordings, i.e., highly accurate textual reproductions of the utterances, are expensive and therefore available only in limited amounts, whereas non-literal field data from commercial automatic dictation systems can be collected on a large scale…
