Deep_Speaker-speaker_recognition_system

by Walleclipse

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition).


Deep Speaker: speaker recognition system

Data Set: LibriSpeech
Reference paper: Deep Speaker: an End-to-End Neural Speaker Embedding System
Reference code: https://github.com/philipperemy/deep-speaker (Thanks to Philippe Rémy)


About the Code

train.py

This is the main file; it contains the training, evaluation, and model-saving functions.
models.py

The neural networks used for the experiments. This file contains three models: a CNN model (the same as the paper's CNN), a GRU model (the same as the paper's GRU), and a simplecnn model. The simplecnn model performs similarly to the original CNN model, but the number of trained parameters drops from 24M to 7M.
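To give a concrete (hedged) picture of what such an embedding network looks like, here is a minimal Keras sketch of a small CNN that maps an fbank spectrogram to an L2-normalized embedding; the input shape, filter counts, and embedding size are illustrative assumptions, not the exact layers in `models.py`:

```python
# Minimal sketch of a convolutional speaker-embedding network with an
# L2-normalized output (shapes, filter counts, and embedding size are
# assumptions, not the exact architecture in models.py).
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_toy_embedding_cnn(input_shape=(160, 64, 1), embedding_dim=512):
    inputs = layers.Input(shape=input_shape)              # (frames, fbank bins, channels)
    x = inputs
    for filters in (64, 128, 256):
        x = layers.Conv2D(filters, 3, strides=2, padding='same', activation='relu')(x)
        x = layers.BatchNormalization()(x)
    x = layers.GlobalAveragePooling2D()(x)                # pool over time and frequency
    x = layers.Dense(embedding_dim)(x)
    # Normalize so that cosine similarity between utterances is a simple dot product
    outputs = layers.Lambda(lambda t: tf.math.l2_normalize(t, axis=1))(x)
    return Model(inputs, outputs, name='toy_embedding_cnn')

if __name__ == '__main__':
    build_toy_embedding_cnn().summary()
```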
select_batch.py

Selects the optimal batch to feed to the network. This is one of the core components of this experiment.
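As a rough illustration of the idea (not necessarily the exact strategy in `select_batch.py`), one common approach is to embed a pool of candidate utterances and, for each anchor, pick the negative from another speaker that currently looks most similar:

```python
# Sketch of hardest-negative selection using cosine similarity over a candidate
# pool (illustrative only; the repository's batch-selection logic may differ).
import numpy as np

def hardest_negative(embeddings, speaker_ids, anchor_idx):
    """embeddings: (n, dim) L2-normalized; speaker_ids: (n,); returns an index."""
    sims = embeddings @ embeddings[anchor_idx]               # cosine similarity to the anchor
    sims[speaker_ids == speaker_ids[anchor_idx]] = -np.inf   # ignore same-speaker utterances
    return int(np.argmax(sims))                              # the most confusable negative
```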
triplet_loss.py

Computes the triplet loss for network training. The implementation follows the paper.
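For reference, the loss in the paper is based on cosine similarity between L2-normalized anchor, positive, and negative embeddings; a minimal NumPy sketch (the batch layout and margin value here are assumptions) is:

```python
# Sketch of the cosine-similarity triplet loss described in the paper
# (batch layout and margin value are assumptions).
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.1):
    """anchor/positive/negative: L2-normalized embeddings, shape (batch, dim)."""
    sim_ap = np.sum(anchor * positive, axis=1)     # cos(anchor, positive)
    sim_an = np.sum(anchor * negative, axis=1)     # cos(anchor, negative)
    # Hinge: the positive should be at least `margin` more similar than the negative
    return float(np.mean(np.maximum(sim_an - sim_ap + margin, 0.0)))
```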
test_model.py

Evaluates (tests) the model in terms of EER.
eval_matrics.py

Computes the equal error rate, F-measure, accuracy, and other metrics.
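As one common way to compute the equal error rate (a sketch, not necessarily the exact code in this file), the ROC curve from scikit-learn can be used:

```python
# Sketch: equal error rate (EER) from pairwise verification scores.
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(labels, scores):
    """labels: 1 = same speaker, 0 = different speaker; scores: similarity values."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))          # point where FPR is closest to FNR
    return float((fpr[idx] + fnr[idx]) / 2.0)
```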
pretraining.py

Pre-trains the network with a softmax classification loss.
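Conceptually, pre-training adds a softmax classification head over the training speakers on top of the embedding network and trains it with cross-entropy before switching to the triplet loss; a hedged Keras sketch (not the repository's exact code):

```python
# Sketch: attach a softmax speaker-classification head to an embedding model
# for pre-training (illustrative; pretraining.py may be organized differently).
from tensorflow.keras import layers, Model

def with_softmax_head(embedding_model, num_speakers):
    probs = layers.Dense(num_speakers, activation='softmax')(embedding_model.output)
    classifier = Model(embedding_model.input, probs)
    classifier.compile(optimizer='adam',
                       loss='sparse_categorical_crossentropy',
                       metrics=['accuracy'])
    return classifier
```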
pre_process.py

Loads the utterances, filters out silence, extracts fbank features, and saves them in .npy format.
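A rough sketch of such a pipeline, using librosa for loading and trimming and python_speech_features for the log filter-bank features (the exact silence filtering and feature parameters used in the repository are assumptions):

```python
# Sketch: load a wav file, trim near-silent regions, extract log filter-bank
# features, and save them as a .npy file (parameters are illustrative).
import numpy as np
import librosa
from python_speech_features import logfbank

def wav_to_fbank_npy(wav_path, out_path, sr=16000, n_filt=64, top_db=30):
    signal, _ = librosa.load(wav_path, sr=sr)
    signal, _ = librosa.effects.trim(signal, top_db=top_db)  # crude silence removal
    feats = logfbank(signal, samplerate=sr, nfilt=n_filt)    # shape: (frames, n_filt)
    np.save(out_path, feats.astype(np.float32))
    return feats
```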

Experimental Results

The model was trained on the LibriSpeech train-clean dataset and tested on the LibriSpeech test-clean dataset. With the CNN model, this code achieves roughly 5% EER on LibriSpeech.

More Details

If you want to know more details, please read deepspeakerreport.pdf (English) or deep_speaker实验报告.pdf (Chinese).

## Simple Use

1. Prepare data.
   I provide sample data in `audio/LibriSpeechSamples/`, or you can download the full LibriSpeech dataset or prepare your own data.
2. Preprocessing.
   Extract features and preprocess the data: `python preprocess.py`.
3. Training.
   If you want to train the model with the triplet loss: `python train.py`.
   If you want to pretrain with a softmax loss first: `python pretraining.py`, then `python train.py`.
   Note: whether or not you pretrain, you need to set the `PRE_TRAIN` flag (in `constants.py`) to `True` or `False`; see the snippet after this list.
4. Evaluation.
   Evaluate the model in terms of EER with `test_model.py`.
   Note: during training, `train.py` also evaluates the model.
5. Plot loss curve.
   Plot the loss curve and the EER curve with `utils.py`:

   ```python
   import constants as c
   from utils import plot_loss

   loss_file = c.CHECKPOINT_FOLDER + '/losses.txt'  # path to the loss file
   plot_loss(loss_file)
   ```
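For reference, the pretraining switch mentioned in step 3 is just a module-level constant; a hypothetical `constants.py` entry might look like:

```python
# constants.py (hypothetical entry): assumed to control whether train.py starts
# from the weights produced by pretraining.py.
PRE_TRAIN = True   # set to False to train with the triplet loss from scratch
```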
    
