Facebook AI Research's Automatic Speech Recognition Toolkit
Future wav2letter development will occur in Flashlight.
To build the old, pre-consolidation version of wav2letter, check out the wav2letter v0.2 release, which depends on the old Flashlight v0.2 release. The `wav2letter-lua` project can be found on the `wav2letter-lua` branch.
For more information on wav2letter++, see or cite this arXiv paper.
This repository includes recipes to reproduce the following research papers, as well as pre-trained models:
- Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets
- Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
- Kahn et al. (2020): Self-Training for End-to-End Speech Recognition
- Likhomanenko et al. (2019): Who Needs Words? Lexicon-free Speech Recognition
- Hannun et al. (2019): Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions
Data preparation for training and evaluation can be found in the data directory.
First, install Flashlight with the ASR application. Then, after cloning the project source:
```shell
mkdir build && cd build
cmake .. && make -j8
```

If Flashlight or ArrayFire are installed in nonstandard paths via a custom `CMAKE_INSTALL_PREFIX`, they can be found by passing

```shell
-Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake
```

when running `cmake`.
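
As a convenience, the prefix-dependent flags can be assembled by a small shell helper. This is a minimal sketch, not part of the project: the function name `w2l_cmake_flags` and the default prefix `$HOME/local` are hypothetical, and the `usr/share/.../cmake` layout is taken directly from the paths given above. It only prints the configure command (a dry run) so you can inspect it before running it yourself.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with wav2letter): print the two CMake
# flags for a given install prefix, using the directory layout shown above.
w2l_cmake_flags() {
    prefix="$1"
    printf '%s %s\n' \
        "-Dflashlight_DIR=${prefix}/usr/share/flashlight/cmake/" \
        "-DArrayFire_DIR=${prefix}/usr/share/ArrayFire/cmake"
}

# Assumed example prefix; substitute the CMAKE_INSTALL_PREFIX you used
# when installing Flashlight and ArrayFire.
PREFIX="${PREFIX:-$HOME/local}"

# Dry run: show the configure command instead of executing it.
echo "cmake .. $(w2l_cmake_flags "$PREFIX")"
```

Once the printed command looks right, run it from inside the `build` directory, followed by `make -j8` as in the standard build.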
wav2letter++ is BSD-licensed, as found in the LICENSE file.