# pytorch/fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

8.4K Stars · 2.1K Forks · Last release: 7 months ago (v0.9.0) · MIT License · 1.4K Commits · 11 Releases



Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers:

List of implemented papers

### Features:


  • multi-GPU training on one machine or across multiple machines (data and model parallel)
  • fast generation on both CPU and GPU with multiple search algorithms implemented:
  • large mini-batch training even on a single GPU via delayed updates
  • mixed precision training (trains faster with less GPU memory on NVIDIA tensor cores)
  • extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers
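To make the "multiple search algorithms" point concrete, here is a minimal, self-contained sketch of beam search over a toy next-token scorer. This is not fairseq's implementation, and the `toy_scores` model is invented purely for illustration; it only shows the pruning idea behind the `beam` parameter used in generation.

```python
import math

def beam_search(step_scores, beam=5, max_len=3):
    """Toy beam search: keep the `beam` highest-scoring partial
    sequences at each step. `step_scores(seq)` returns a dict
    mapping candidate next tokens to log-probabilities."""
    hyps = [((), 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in hyps:
            for tok, logp in step_scores(seq).items():
                candidates.append((seq + (tok,), score + logp))
        # Prune: keep only the top `beam` hypotheses by score.
        hyps = sorted(candidates, key=lambda h: h[1], reverse=True)[:beam]
    return hyps

# Hypothetical toy model: always prefers token "a" over "b".
def toy_scores(seq):
    return {"a": math.log(0.6), "b": math.log(0.4)}

best_seq, best_score = beam_search(toy_scores, beam=2, max_len=3)[0]
print(best_seq)  # ('a', 'a', 'a')
```

A larger `beam` explores more hypotheses per step at the cost of more computation, which is the trade-off behind the `beam=5` argument in the generation examples below.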

We also provide pre-trained models for translation and language modeling with a convenient `torch.hub` interface:

```python
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'
```

See the PyTorch Hub tutorials for [translation](_fairseq_translation/) and [RoBERTa](_fairseq_roberta/) for more examples.

# Requirements and Installation

* PyTorch version >= 1.4.0
* Python version >= 3.6
* For training new models, you'll also need an NVIDIA GPU and NCCL
* **To install fairseq** and develop locally:

```bash
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./

# on MacOS:
# CFLAGS="-stdlib=libc++" pip install --editable ./
```
  • For faster training install NVIDIA's apex library:

```bash
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
```
  • For large datasets install PyArrow: `pip install pyarrow`
  • If you use Docker, make sure to increase the shared memory size, either with `--ipc=host` or `--shm-size` as command line options to `nvidia-docker run`.
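Before installing, it can help to sanity-check the prerequisites listed above. The snippet below is a small illustrative check (the PyTorch line is commented out since PyTorch may not be installed yet); the version thresholds are the ones from the requirements list:

```shell
# Verify Python >= 3.6 (required by fairseq) before running pip install:
python3 -c 'import sys; assert sys.version_info >= (3, 6), "fairseq requires Python >= 3.6"'

# Once PyTorch is installed, confirm it meets the >= 1.4.0 requirement:
# python3 -c "import torch; print(torch.__version__)"
```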

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

We also have more detailed READMEs to reproduce results from specific papers:

- Training with Quantization Noise for Extreme Model Compression
- Neural Machine Translation with Byte-Level Subwords (Wang et al., 2020)
- Multilingual Denoising Pre-training for Neural Machine Translation (Liu et al., 2020)
- Jointly Learning to Align and Translate with Transformer Models (Garg et al., 2019)
- Levenshtein Transformer (Gu et al., 2019)
- Facebook FAIR's WMT19 News Translation Task Submission (Ng et al., 2019)
- RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)
- wav2vec: Unsupervised Pre-training for Speech Recognition (Schneider et al., 2019)
- Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)
- Pay Less Attention with Lightweight and Dynamic Convolutions (Wu et al., 2019)
- Understanding Back-Translation at Scale (Edunov et al., 2018)
- Classical Structured Prediction Losses for Sequence to Sequence Learning (Edunov et al., 2018)
- Hierarchical Neural Story Generation (Fan et al., 2018)
- Scaling Neural Machine Translation (Ott et al., 2018)
- Convolutional Sequence to Sequence Learning (Gehring et al., 2017)
- Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)

Join the fairseq community


fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.


Please cite as:

```bibtex
@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}
```
