Deep Learning for i-Vector Speaker and Language Recognition

Abstract

Over the last few years, i-vectors have been the state-of-the-art technique in speaker and language recognition. Recent advances in Deep Learning (DL) technology have improved the quality of i-vectors, but the DL techniques in use are computationally expensive and need speaker and/or phonetic labels for the background data, which are not easily accessible in practice. On the other hand, the lack of speaker-labeled background data creates a large performance gap in speaker recognition between the two well-known i-vector scoring techniques, cosine and Probabilistic Linear Discriminant Analysis (PLDA). Closing this gap without speaker labels, which are expensive to obtain in practice, has recently become a challenge. Although some unsupervised clustering techniques have been proposed to estimate the ...
Keywords: deep learning – speaker recognition – language recognition – i-vector – deep neural network – deep belief network – restricted Boltzmann machine – ReLU – variable ReLU – i-vector backend – speaker embedding – NIST i-vector challenge

Information

Author

Ghahabi, Omid

Institution

Universitat Politècnica de Catalunya

Supervisor

Javier Hernando

Publication Year

2018

Upload Date

Dec. 14, 2018
