Speaker Recognition

The BGU 2018 NIST Speaker Recognition Evaluation System Git

Basic knowledge of speaker recognition (additional reading):
 * Speaker recognition by machine and human
 * AUTOMATIC SPEECH RECOGNITION (ASR) lecture about Speaker verification
 * A tutorial on speaker verification

Frontend:

Feature Extraction (MFCC):
 * The dummy’s guide to MFCC
 * Mel Frequency Cepstral Coefficient (MFCC) tutorial

Voice Activity Detection (VAD):
 * Voice Activity Detection (VAD) Tutorial

X-VECTOR and TDNN architecture: in this section it is best to watch each YouTube tutorial before reading the corresponding article, or to read and watch at the same time.
 * A time delay neural network (TDNN) architecture for efficient modeling of long temporal contexts
 * YouTube explanation on TDNN
 * X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION
 * YouTube summary of X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION article

Backend:

Linear Discriminant Analysis (LDA): in this section it is best to watch the YouTube tutorial before reading.
 * YouTube explanation of Linear Discriminant Analysis (LDA)
 * LINEAR DISCRIMINANT ANALYSIS - A BRIEF TUTORIAL

Probabilistic Linear Discriminant Analysis (PLDA): the first article in this section is enough for understanding PLDA; read the additional articles for more extensive knowledge.

Mandatory reading:
 * Discriminatively trained Probabilistic Linear Discriminant Analysis for speaker verification

Additional reading:
 * Probabilistic Linear Discriminant Analysis: a good explanation, but most of its examples are for image recognition rather than speaker recognition
 * Nonparametrically Trained Probabilistic Linear Discriminant Analysis for i-Vector Speaker Verification
 * From single to multiple enrollment i-vectors: practical PLDA scoring variants for speaker verification

Speaker diarization:
 * Domain Adaptation and Speaker Diarization for Speaker Recognition

Docker for Speaker Recognition
Here are the steps to build a Docker container that supports Kaldi, SoX, Python and other speaker recognition tools. It is highly recommended to build the container on a server and to mount into it a disk with enough space to hold the Kaldi project, the data and the automatic speaker recognition components.

1. Download the image:
 docker pull kaldiasr/kaldi

2. Docker initialization: to start a new container, use the following command:
 sudo nvidia-docker run -it --mount type=bind,source=/*path to the disk*,target=/common_space_docker/ -p 1234:22 --name <container name> --runtime=nvidia kaldiasr/kaldi:gpu-latest
 * -p 1234:22 maps port 1234 on the server to port 22 (SSH) inside the container. Check first that port 1234 is free on the server.

3. Alias python to python3: the "python" command in the container refers to python2, so change it to python3:
 alias python='python3'

4. Connect to the container over SSH:
 ssh -p <port> root@<server address>
 for example:
 ssh -p 1234 root@132.72.48.87
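
Step 2 above (checking that the port is free, then starting the container) could be scripted roughly as follows. This is a minimal sketch, not part of the original setup: the function names `port_is_free` and `build_run_command`, and the example values `/path/to/disk` and `my_kaldi`, are hypothetical. It writes the runtime flag as `--runtime=nvidia`, which is the form Docker accepts, and it only prints the command rather than running it.

```python
import shlex
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port).

    A failed bind usually means the port is already taken. Binding a
    port below 1024 without root privileges also fails, so such ports
    are reported as not free.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def build_run_command(disk_path, container_name, host_port=1234):
    """Assemble the step-2 command as an argument list.

    disk_path and container_name are supplied by the caller; nothing
    here is specific to a particular server.
    """
    return [
        "sudo", "nvidia-docker", "run", "-it",
        "--mount",
        f"type=bind,source={disk_path},target=/common_space_docker/",
        "-p", f"{host_port}:22",  # server port -> SSH port in the container
        "--name", container_name,
        "--runtime=nvidia",
        "kaldiasr/kaldi:gpu-latest",
    ]


def main():
    port = 1234
    if not port_is_free(port):
        print(f"Port {port} is already in use; choose another one.")
        return
    # Hypothetical example values; replace with the real disk path and name.
    cmd = build_run_command("/path/to/disk", "my_kaldi", port)
    print(shlex.join(cmd))


if __name__ == "__main__":
    main()
```

Returning the command as a list keeps each argument separate, so it can also be passed to `subprocess.run` unchanged once the values have been checked.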