About the developer

suragnair
488 Stars 120 Forks 31 Commits 2 Opened issues

Description

A simplified PyTorch implementation of "SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient." (Yu, Lantao, et al.)

seqGAN

A PyTorch implementation of "SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient" (Yu, Lantao, et al.). The code is highly simplified, commented, and (hopefully) straightforward to understand. The policy gradients implemented are also much simpler than in the original work (https://github.com/LantaoYu/SeqGAN/) and do not involve rollouts: a single reward is used for the entire sentence (inspired by the examples in http://karpathy.github.io/2016/05/31/rl/).
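To make the "single reward for the entire sentence" idea concrete, here is a minimal sketch of a REINFORCE-style loss in which every token of a sampled sequence shares one scalar reward. The function name, tensor shapes, and the random stand-in inputs are assumptions for illustration and are not taken from this repository's code.

```python
import torch
import torch.nn.functional as F


def sentence_level_pg_loss(logits, samples, rewards):
    """Policy-gradient loss with one reward per sequence (no rollouts).

    logits:  (batch, seq_len, vocab) unnormalised generator outputs
    samples: (batch, seq_len) token ids sampled from the generator
    rewards: (batch,) one scalar per sequence, e.g. the discriminator's
             estimate that the sequence is real (an assumption here)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Log-probability of each sampled token: (batch, seq_len)
    chosen = log_probs.gather(2, samples.unsqueeze(-1)).squeeze(-1)
    # Every token in a sequence is weighted by that sequence's single reward.
    # detach() keeps gradients from flowing into whatever produced the reward.
    per_sequence = chosen.sum(dim=1) * rewards.detach()
    # Minimising the negative maximises reward-weighted log-likelihood.
    return -per_sequence.mean()


# Random stand-ins for generator outputs and discriminator rewards.
torch.manual_seed(0)
logits = torch.randn(4, 6, 10, requires_grad=True)
samples = torch.randint(0, 10, (4, 6))
rewards = torch.rand(4)

loss = sentence_level_pg_loss(logits, samples, rewards)
loss.backward()
```

With rollouts, each position would instead get its own reward estimated by completing the partial sequence several times; the version above trades that finer credit assignment for simplicity.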

The architectures used differ from those in the original work. Specifically, a recurrent bidirectional GRU network is used as the discriminator.
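As a rough picture of what a bidirectional GRU discriminator can look like, the sketch below embeds a token sequence, runs a bidirectional GRU over it, and maps the two final hidden states to a probability that the sequence is real. The class name, layer sizes, and dropout rate are hypothetical, and the repository's actual discriminator may be structured differently.

```python
import torch
import torch.nn as nn


class GRUDiscriminator(nn.Module):
    """Hypothetical bidirectional GRU classifier over token sequences."""

    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64, dropout=0.2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.out = nn.Linear(2 * hidden_dim, 1)

    def forward(self, tokens):
        emb = self.embed(tokens)            # (batch, seq_len, embed_dim)
        _, h = self.gru(emb)                # h: (2, batch, hidden_dim)
        # Concatenate the final forward and backward hidden states.
        h = torch.cat([h[0], h[1]], dim=1)  # (batch, 2 * hidden_dim)
        # One probability per sequence: P(sequence is real).
        return torch.sigmoid(self.out(self.dropout(h))).squeeze(1)


disc = GRUDiscriminator(vocab_size=50)
scores = disc(torch.randint(0, 50, (8, 20)))  # 8 sequences of length 20
```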

The code performs the experiment on synthetic data as described in the paper.
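In the paper's synthetic-data setup, a fixed, randomly initialised LSTM serves as the "oracle" that defines the true data distribution, and the generator is scored by the negative log-likelihood (NLL) of its samples under that oracle. The sketch below illustrates only that evaluation step. `TinyOracle`, `oracle_nll`, the start token, and all sizes are assumptions made for this example, not names from this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyOracle(nn.Module):
    """Hypothetical fixed, randomly initialised language model."""

    def __init__(self, vocab_size=20, hidden_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.rnn = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)  # (batch, seq_len, vocab) next-token logits


def oracle_nll(oracle, sequences, start_token=0):
    """Mean per-token NLL of `sequences` under the oracle."""
    batch = sequences.size(0)
    start = torch.full((batch, 1), start_token, dtype=torch.long)
    # Shift right so the oracle predicts token t from tokens before t.
    inputs = torch.cat([start, sequences[:, :-1]], dim=1)
    with torch.no_grad():
        logits = oracle(inputs)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           sequences.reshape(-1)).item()


torch.manual_seed(0)
oracle = TinyOracle()
samples = torch.randint(0, 20, (8, 12))  # stand-in for generator samples
nll = oracle_nll(oracle, samples)
```

A lower value means the samples look more like sequences the oracle itself would produce.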

You are encouraged to raise any doubts regarding the working of the code as Issues.

To run the code:

    python main.py

main.py should be your entry point into the code.

Hacks and Observations

The following hacks (borrowed from https://github.com/soumith/ganhacks) seem to have worked in this case:

  • Training Discriminator a lot more than Generator (Generator is trained only for one batch of examples, and increasing the batch size hurts stability)
  • Using Adam for Generator and Adagrad for Discriminator
  • Tweaking learning rate for Generator in GAN phase
  • Using dropout in both training and testing phase

  • Stability is extremely sensitive to almost every parameter :/
  • The GAN phase may not always lead to massive drops in NLL (sometimes very minimal) - I suspect this is due to the very crude nature of the policy gradients implemented (without rollouts).
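The optimizer choice, the training schedule, and the always-on dropout described above can be sketched with toy stand-in networks. This is not the repository's training loop: the models are small feed-forward placeholders on continuous data, so the generator update here is ordinary backpropagation rather than policy gradient, and `D_STEPS`, the learning rate, and the layer sizes are guesses.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in networks (hypothetical); the real models are sequence models.
generator = nn.Sequential(
    nn.Linear(4, 16), nn.ReLU(), nn.Dropout(0.2), nn.Linear(16, 4))
discriminator = nn.Sequential(
    nn.Linear(4, 16), nn.ReLU(), nn.Dropout(0.2), nn.Linear(16, 1))

# Adam for the generator, Adagrad for the discriminator.
gen_opt = torch.optim.Adam(generator.parameters(), lr=1e-2)
dis_opt = torch.optim.Adagrad(discriminator.parameters())

# Staying in train() mode keeps dropout active; eval() is never called.
generator.train()
discriminator.train()

D_STEPS = 5  # discriminator updates per generator update (assumed value)
bce = nn.BCEWithLogitsLoss()
step_counts = {"generator": 0, "discriminator": 0}

for _ in range(3):
    # Generator: a single batch per adversarial round.
    fake = generator(torch.randn(32, 4))
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    gen_opt.zero_grad()
    g_loss.backward()
    gen_opt.step()
    step_counts["generator"] += 1

    # Discriminator: several batches per adversarial round.
    for _ in range(D_STEPS):
        real = torch.randn(32, 4) + 2.0  # toy "real" data
        fake = generator(torch.randn(32, 4)).detach()
        d_loss = (bce(discriminator(real), torch.ones(32, 1))
                  + bce(discriminator(fake), torch.zeros(32, 1)))
        dis_opt.zero_grad()
        d_loss.backward()
        dis_opt.step()
        step_counts["discriminator"] += 1
```

Given the sensitivity noted above, values such as `D_STEPS` and the generator learning rate are the kind of parameters that would need tuning per setup.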

Sample Learning Curve

Learning curve obtained after MLE training for 100 epochs followed by adversarial training. (Your results may vary!)

