Facebook AI Research's Automatic Speech Recognition Toolkit
The toolkit started from models predicting letters directly from the raw waveform and has since evolved into an all-purpose end-to-end ASR research toolkit, supporting a wide range of models and learning techniques. It also includes an efficient, modular beam-search decoder for both structured learning (CTC, ASG) and seq2seq approaches.
Important disclaimer: as a number of models from this repository could be used for other modalities, we moved most of the code to flashlight.
This repository includes recipes to reproduce the following research papers as well as pre-trained models:
- [NEW] Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets
- [NEW SOTA] Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
- Kahn et al. (2020): Self-Training for End-to-End Speech Recognition
- Likhomanenko et al. (2019): Who Needs Words? Lexicon-free Speech Recognition
- Hannun et al. (2019): Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions
Data preparation for our training and evaluation can be found in the data folder.
The previous iteration of wav2letter can be found in the:
- (before merging codebases for wav2letter and flashlight) wav2letter-v0.2 branch.
- (written in Lua)
First, install flashlight with all its dependencies. Then run:

    mkdir build && cd build && cmake .. && make -j8

If flashlight or ArrayFire are installed in nonstandard paths via CMAKE_INSTALL_PREFIX, they can be found by passing

    -Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake

when running cmake.
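As a minimal sketch of how the build command and the two path flags fit together, the snippet below assembles the full command for a nonstandard install. The variable names (`DEPS_PREFIX`, `FLASHLIGHT_DIR`, `ARRAYFIRE_DIR`, `BUILD_CMD`) and the `/opt/fl` default are assumptions made for illustration and are not part of the repository. It only prints the command so it can be checked before running.

```shell
#!/bin/sh
# Sketch only. DEPS_PREFIX is a hypothetical variable holding whatever was
# passed as CMAKE_INSTALL_PREFIX when flashlight and ArrayFire were installed.
# The /opt/fl default is a placeholder, not a recommended location.
DEPS_PREFIX="${DEPS_PREFIX:-/opt/fl}"

# The two CMake package directories named in the instructions above.
FLASHLIGHT_DIR="${DEPS_PREFIX}/usr/share/flashlight/cmake/"
ARRAYFIRE_DIR="${DEPS_PREFIX}/usr/share/ArrayFire/cmake"

# Assemble the configure-and-build command as a string so it can be inspected.
BUILD_CMD="mkdir build && cd build && cmake .. -Dflashlight_DIR=${FLASHLIGHT_DIR} -DArrayFire_DIR=${ARRAYFIRE_DIR} && make -j8"

# Print the command instead of executing it.
echo "$BUILD_CMD"
```

Once the printed command looks right, it can be pasted into a shell at the repository root. Adjust `-j8` to the number of parallel build jobs that suits your machine.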
See the CONTRIBUTING file for how to help out.
wav2letter++ is BSD-licensed, as found in the LICENSE file.