License: Apache License 2.0


PyTorch codebase for zero-shot dialog generation (SIGDIAL 2018). It is released by Tiancheng Zhao (Tony) from the Dialog Research Center, LTI, CMU.

Zero-shot Dialog Generation (ZSDG) for End-to-end Neural Dialog Models

Codebase for Zero-Shot Dialog Generation with Cross-Domain Latent Actions, published as a long paper at SIGDIAL 2018. Reference information is listed below. Presentation slides can be found here.

This work won the best paper award at SIGDIAL 2018.

If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

  title={Zero-Shot Dialog Generation with Cross-Domain Latent Actions},
  author={Zhao, Tiancheng and Eskenazi, Maxine},
  journal={arXiv preprint arXiv:1805.04803},


Python 2.7
PyTorch >= 0.3.0.post4


The data folder contains the following datasets:

  • SimDial Data: synthetic multi-domain dialog generator. The data generator can be found here.
  • Stanford Multi-domain Dialog: human-woz task-oriented dialogs.

Getting Started

The following scripts implement 4 different models:

  • Baseline: a standard attentional encoder-decoder, and an encoder with a pointer-sentinel-mixture decoder (see the paper for details).
  • Our Models: cross-domain Action Matching training for the above two baseline systems.


Run the following to experiment on the SimDial dataset


Run the following to experiment on the Stanford Multi-Domain Dataset


Switching Model

The hyperparameters are exactly the same for the above two scripts. To train different models, use the following configurations. The examples below are written for one of the scripts and apply equally to the other.

For the baseline model with attention decoder:

python --action_match False --use_ptr False

For the baseline model with pointer-sentinel mixture decoder:

python --action_match False --use_ptr True    

For the action matching model with attention decoder:

python --action_match True --use_ptr False

For the action matching model with pointer-sentinel mixture decoder:

python --action_match True --use_ptr True    
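
The four configurations above are the four combinations of two boolean flags. As a minimal sketch of how such flags could be parsed, the snippet below uses a small string-to-boolean converter. The helper names `str2bool`, `build_parser` and `describe_model`, and the default values, are assumptions for illustration and are not taken from the repository. A converter like this is needed because argparse's `type=bool` would turn the string "False" into `True`.

```python
import argparse


def str2bool(value):
    """Convert strings such as 'True' or 'False' into real booleans."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "1"):
        return True
    if lowered in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    """Hypothetical parser exposing only the two flags discussed above."""
    parser = argparse.ArgumentParser()
    # Defaults are assumptions made for this sketch.
    parser.add_argument("--action_match", type=str2bool, default=True)
    parser.add_argument("--use_ptr", type=str2bool, default=True)
    return parser


def describe_model(action_match, use_ptr):
    """Map a flag combination to the model it selects."""
    training = "action matching" if action_match else "baseline"
    decoder = "pointer-sentinel mixture" if use_ptr else "attention"
    return "%s model with %s decoder" % (training, decoder)


# Parse an explicit argument list instead of sys.argv for the demo.
args = build_parser().parse_args(["--action_match", "False", "--use_ptr", "True"])
print(describe_model(args.action_match, args.use_ptr))
# prints: baseline model with pointer-sentinel mixture decoder
```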


The following are some of the key hyperparameters:

  • action_match: whether to use the proposed Action Matching (AM) algorithm for training
  • target_example_cnt: the number of seed responses from each domain used for training
  • use_ptr: whether to use the pointer-sentinel-mixture decoder
  • black_domains: the domains that are excluded from training
  • black_ratio: the fraction of training data from black_domains that is excluded. Range=[0,1], where 1 means 100% of that training data is removed.
  • forward_only: whether to use an existing model (True) or train a new one (False)
  • load_sess: the path to the existing model
  • rnn_cell: the type of RNN cell, either LSTM or GRU
  • dropout: the dropout probability
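
To make the interaction between the excluded-domain list and the exclusion ratio concrete, here is a minimal sketch of one way such filtering could work. It is not the repository's implementation: the function name `filter_training_dialogs`, the `seed` parameter, and the assumption that each dialog is a dict with a `"domain"` key are all hypothetical.

```python
import random


def filter_training_dialogs(dialogs, black_domains, black_ratio, seed=0):
    """Drop a fraction `black_ratio` of the dialogs whose domain is excluded.

    Dialogs from other domains are always kept. The returned list is not
    guaranteed to preserve the input order.
    """
    if not 0.0 <= black_ratio <= 1.0:
        raise ValueError("black_ratio must be in [0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    kept = [d for d in dialogs if d["domain"] not in black_domains]
    held_out = [d for d in dialogs if d["domain"] in black_domains]
    rng.shuffle(held_out)
    num_removed = int(round(black_ratio * len(held_out)))
    kept.extend(held_out[num_removed:])
    return kept


# Toy data: 4 dialogs from an excluded domain, 2 from a training domain.
dialogs = [{"domain": "movie"}] * 4 + [{"domain": "restaurant"}] * 2
remaining = filter_training_dialogs(dialogs, {"movie"}, black_ratio=1.0)
print(len(remaining))  # prints: 2
```

With a ratio of 1 every dialog from the excluded domains is removed, which corresponds to the zero-shot setting for those domains; a ratio of 0 leaves the training data unchanged.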

Test an existing model

All trained models and log files are saved to the log folder. To run an existing model, you can:

  • Set the forward_only argument to be True
  • Set the load_sess argument to be the path to the model folder in log
  • Run the script
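
The steps above amount to a simple branch on two arguments. The sketch below shows one possible shape of that logic. It assumes that `load_sess` names a folder inside `log`, which the text does not state precisely, and the helper name `resolve_run` is hypothetical rather than part of the codebase.

```python
import os


def resolve_run(forward_only, load_sess, log_dir="log"):
    """Return a (mode, model_dir) pair for a run.

    Training runs need no existing model. Test runs must point at a
    previously saved model folder.
    """
    if not forward_only:
        return ("train", None)
    if not load_sess:
        raise ValueError("forward_only=True requires load_sess to be set")
    # Assumption: load_sess is a folder name relative to the log folder.
    return ("test", os.path.join(log_dir, load_sess))


mode, model_dir = resolve_run(forward_only=False, load_sess=None)
print(mode)  # prints: train
```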
