tacotron_pytorch

PyTorch implementation of Tacotron speech synthesis model.


Inspired by keithito/tacotron. The speech quality is currently not as good as what keithito/tacotron can generate, but the model seems to be basically working. Some generated speech examples, trained on the LJ Speech Dataset, can be found here.

If you are comfortable working with TensorFlow, I'd recommend trying that project instead. The reason for rewriting it in PyTorch is that, at least for me, it is easier to debug and extend (multi-speaker architecture, etc.).


Requirements

  • PyTorch
  • TensorFlow (needed only to run the training script; this could be made optional, but is required for now)


Installation

git clone --recursive
pip install -e . # or python setup.py develop

If you want to run the training script, then you need to install additional dependencies.

pip install -e ".[train]"


Training

The package relies on keithito/tacotron (added as a submodule) for text processing, audio preprocessing and audio reconstruction. Please follow the quick start section of keithito/tacotron and prepare your dataset accordingly.

If you have your data prepared, assuming your data is in

(which is the default), then you can train your model by:

Alignments, predicted spectrograms, target spectrograms, predicted waveforms and checkpoints (model and optimizer states) are saved every 1000 global steps in

directory. Training progress can be monitored with TensorBoard:
tensorboard --logdir=log
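
To make the checkpointing cadence above concrete, here is a minimal, framework-free sketch of "save model and optimizer states every 1000 global steps". It is not this repository's training script: the function names, the file name pattern, the dictionary keys and the placeholder states are all assumptions for illustration, and `pickle` stands in for the serializer. In PyTorch the same pattern would typically apply `torch.save` to the results of `model.state_dict()` and `optimizer.state_dict()`.

```python
import pickle
from pathlib import Path

CHECKPOINT_INTERVAL = 1000  # from the text above: save every 1000 global steps


def should_checkpoint(global_step, interval=CHECKPOINT_INTERVAL):
    """Return True on the steps where a checkpoint should be written."""
    return global_step > 0 and global_step % interval == 0


def save_checkpoint(model_state, optimizer_state, global_step, out_dir):
    """Write model and optimizer states for one step into a single file.

    pickle is a stand-in serializer (assumption); the keys below are
    hypothetical and not taken from the project.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"checkpoint_step{global_step}.pkl"  # hypothetical name
    payload = {
        "state_dict": model_state,
        "optimizer": optimizer_state,
        "global_step": global_step,
    }
    with open(path, "wb") as f:
        pickle.dump(payload, f)
    return path


def train(num_steps, out_dir):
    """Toy loop: the states are placeholders, only the save cadence is real."""
    saved = []
    for global_step in range(1, num_steps + 1):
        model_state = {"weight": global_step * 0.1}  # placeholder for model state
        optimizer_state = {"lr": 0.002}  # placeholder for optimizer state
        if should_checkpoint(global_step):
            saved.append(
                save_checkpoint(model_state, optimizer_state, global_step, out_dir)
            )
    return saved
```

Saving the optimizer state alongside the model state is what allows training to resume from a checkpoint rather than only run inference from it.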

Testing model

Open the notebook in

directory and change
to your model.
