Abstract

In typical x-vector-based speaker recognition systems, standard linear discriminant analysis (LDA) is used to transform the x-vector space so as to maximise the between-speaker discriminant information while minimising the within-speaker variability. It is customary to estimate LDA using all the available speakers in the speaker recognition development dataset. In this study, the authors investigate whether, in LDA-based channel compensation, it is more beneficial to estimate the between-speaker discriminant information from the most confusing samples and the within-speaker variability from the samples most distant from the target speaker mean. The between-speaker variance is estimated using a pairwise approach in which the most confusing non-target speaker samples are found from the Euclidean distance between the speaker mean and adjacent speakers' samples. The within-speaker variance is estimated using the mean of each speaker and the furthest samples in that speaker's sessions. Experimental results demonstrate that the proposed LDA approach achieves over 17% relative improvement in equal error rate over standard LDA-based x-vector speaker recognition systems on the NIST2010 corext-corext condition.
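The idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the exact form of the scatter matrices, the function name `pairwise_lda`, and the parameters `n_confusing`, `n_distant` and `ridge` are all assumptions made for illustration. The sketch builds the between-speaker scatter from the non-target samples closest (in Euclidean distance) to each speaker mean, builds the within-speaker scatter from each speaker's own samples furthest from its mean, and then solves the usual LDA generalised eigenproblem.

```python
import numpy as np


def pairwise_lda(X, labels, n_confusing=5, n_distant=5, n_components=2, ridge=1e-6):
    """Hypothetical pairwise-LDA sketch; returns a (dim, n_components) projection.

    X      : (n_samples, dim) array of x-vectors
    labels : (n_samples,) array of speaker identities
    The sample-selection sizes and the ridge term are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dim = X.shape[1]
    S_b = np.zeros((dim, dim))  # between-speaker scatter
    S_w = np.zeros((dim, dim))  # within-speaker scatter

    for spk in np.unique(labels):
        own = X[labels == spk]
        others = X[labels != spk]
        mean = own.mean(axis=0)

        # Most confusing samples: non-target samples nearest to this speaker's mean.
        d_other = np.linalg.norm(others - mean, axis=1)
        confusing = others[np.argsort(d_other)[:n_confusing]]
        diff_b = confusing - mean
        S_b += diff_b.T @ diff_b

        # Most distant samples: this speaker's own samples furthest from the mean.
        d_own = np.linalg.norm(own - mean, axis=1)
        distant = own[np.argsort(d_own)[-n_distant:]]
        diff_w = distant - mean
        S_w += diff_w.T @ diff_w

    # Small ridge (an assumption) keeps S_w positive definite for the Cholesky step.
    S_w += ridge * np.eye(dim)

    # Solve S_b v = lambda * S_w v by whitening with the Cholesky factor of S_w.
    L = np.linalg.cholesky(S_w)           # S_w = L @ L.T
    A = np.linalg.solve(L, S_b)           # L^{-1} S_b
    M = np.linalg.solve(L, A.T)           # L^{-1} S_b L^{-T} (S_b is symmetric)
    M = 0.5 * (M + M.T)                   # remove round-off asymmetry
    eigvals, U = np.linalg.eigh(M)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return np.linalg.solve(L.T, U[:, order])  # map back: v = L^{-T} u
```

Given a development set `X` with speaker labels, `W = pairwise_lda(X, labels)` yields a projection matrix, and `X @ W` gives the compensated x-vectors. Replacing the two selection steps with all non-target and all own samples would bring the sketch closer to a standard LDA estimate, which is the baseline the abstract compares against.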

Study on pairwise LDA for x-vector-based speaker recognition

Keywords

between-speaker discriminant information; LDA-based x-vector speaker recognition systems; equal error rate; pairwise approach; linear discriminant analysis; LDA-based channel; the x-vector space; within-speaker variability

Cite this article

A. Kanagasundaram, S. Sridharan, S. Ganapathy, C. Fookes. Study on pairwise LDA for x-vector-based speaker recognition. 55 (14), 813-816.
