# End-to-End Neural Diarization
EEND (End-to-End Neural Diarization) is a neural-network-based speaker diarization method.

- BLSTM EEND (INTERSPEECH 2019)
  - https://www.isca-speech.org/archive/Interspeech_2019/abstracts/2899.html
- Self-attentive EEND (ASRU 2019)
  - https://ieeexplore.ieee.org/abstract/document/9003959/
This repository also provides an EEND extension for a variable number of speakers.

- Self-attentive EEND with encoder-decoder based attractors
  - https://arxiv.org/abs/2005.09921
## Install tools

### Install Kaldi and the Python environment

```bash
cd tools
make
```

- This command builds Kaldi at `tools/kaldi`.
- If you want to use a pre-built Kaldi instead:

```bash
cd tools
make KALDI=/your/path/to/kaldi
```

  This option makes a symlink at `tools/kaldi`.
- It also installs miniconda3 at `tools/miniconda3` and creates a conda environment named 'eend'.
### Install CuPy

If your CUDA toolkit is not at `/usr/local/cuda/`, pass its path explicitly:

```bash
cd tools
make CUDA_PATH=/your/path/to/cuda-8.0
```

This command installs `cupy-cudaXX` according to your CUDA version. See https://docs-cupy.chainer.org/en/stable/install.html#install-cupy
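The `XX` in `cupy-cudaXX` is the CUDA version with the dot removed. A minimal sketch of that mapping, assuming an example version string of "10.0" (on your machine, read the real version from `nvcc --version`):

```shell
# Derive the cupy wheel name from a CUDA version string.
# "10.0" is an example value, not a detected one.
cuda_version="10.0"
wheel="cupy-cuda$(printf '%s' "$cuda_version" | tr -d '.')"
echo "$wheel"   # cupy-cuda100
```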
## Test recipe: mini_librispeech

### Configuration

Modify `egs/mini_librispeech/v1/cmd.sh` according to your job scheduler:

- If you use your local machine, use "run.pl".
- If you use Grid Engine, use "queue.pl".
- If you use SLURM, use "slurm.pl".

For more information about `cmd.sh`, see http://kaldi-asr.org/doc/queue.html.

### Data preparation
```bash
cd egs/mini_librispeech/v1
./run_prepare_shared.sh
```

### Run training, inference, and scoring
```bash
./run.sh
```
- If you want to use the encoder-decoder-attractor (EDA) model, modify `run.sh` to use `config/eda/{train,infer}.yaml`.
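As an illustration, the edit inside `run.sh` might look like the fragment below; the variable names here are assumptions, so check the ones actually defined in the script before editing:

```shell
# Hypothetical run.sh fragment: point the recipe at the EDA configs.
# Variable names are assumptions; use the ones defined in the real run.sh.
train_config=config/eda/train.yaml
infer_config=config/eda/infer.yaml
```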
- See `RESULT.md` for reference results and compare them with yours.
## CALLHOME experiment

### Configuration

Modify `egs/callhome/v1/cmd.sh` according to your job scheduler: use "run.pl" on your local machine, "queue.pl" for Grid Engine, or "slurm.pl" for SLURM. For more information about `cmd.sh`, see http://kaldi-asr.org/doc/queue.html.
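As an illustration, a local-machine setup could look like the fragment below. The exported variable names follow common Kaldi-recipe conventions and are assumptions; check the actual `cmd.sh` for the names it uses:

```shell
# Hypothetical cmd.sh for running every stage on the local machine.
# Swap "run.pl" for "queue.pl" (Grid Engine) or "slurm.pl" (SLURM).
export train_cmd="run.pl"
export infer_cmd="run.pl"
```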
Modify `egs/callhome/v1/run_prepare_shared.sh` according to the storage paths of your corpora.
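For example, a corpus location might be set like this; the variable name below is hypothetical, so edit the ones actually defined in `run_prepare_shared.sh`:

```shell
# Hypothetical corpus-path variable; use the names and paths
# actually present in egs/callhome/v1/run_prepare_shared.sh.
callhome_dir=/your/path/to/callhome_corpus
```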
### Data preparation

```bash
cd egs/callhome/v1
./run_prepare_shared.sh
# If you want to conduct 1-4 speaker experiments, run below.
# You also have to set paths to your corpora properly.
./run_prepare_shared_eda.sh
```
### Run training, inference, and scoring

Self-attentive EEND on 2-speaker mixtures:

```bash
./run.sh
```
BLSTM EEND on 2-speaker mixtures:

```bash
local/run_blstm.sh
```
Self-attentive EEND with EDA on 1-4-speaker mixtures:

```bash
./run_eda.sh
```
## References

[1] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Permutation-free Objectives," Proc. Interspeech, pp. 4300-4304, 2019

[2] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Self-attention," Proc. ASRU, pp. 296-303, 2019

[3] Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Kenji Nagamatsu, "End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors," Proc. INTERSPEECH, 2020
## Citation

```
@inproceedings{Fujita2019Interspeech,
  author={Yusuke Fujita and Naoyuki Kanda and Shota Horiguchi and Kenji Nagamatsu and Shinji Watanabe},
  title={{End-to-End Neural Speaker Diarization with Permutation-free Objectives}},
  booktitle={Interspeech},
  pages={4300--4304},
  year=2019
}
```