Using Fast Weights to Attend to the Recent Past

This repo is a TensorFlow implementation of

Using Fast Weights to Attend to the Recent Past
Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu
NIPS 2016

Specifically, we follow the experiments in Sec. 4.1 (Associative retrieval) and try to reproduce the results in Table 1 and Figure 2. On the R=50 setting, the fast weights model reaches 100% accuracy (0% error rate) in about 30K iterations.
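For readers unfamiliar with the associative retrieval task, the snippet below is a minimal sketch of how one example could be generated. It is not taken from this repo's data pipeline. It assumes each sequence consists of letter/digit key-value pairs followed by `??` and a query key, with the target being the digit paired with that key. The function name `make_example` and the default of 4 pairs are hypothetical choices for illustration.

```python
import random
import string


def make_example(num_pairs=4, rng=random):
    """Build one associative-retrieval example, e.g. ('c9k8j3f1??c', '9').

    Assumption: keys are distinct lowercase letters and values are single
    digits. The number of pairs is a hypothetical default, not a value
    confirmed by this repo.
    """
    # Sample distinct keys so that the query has exactly one correct answer.
    keys = rng.sample(string.ascii_lowercase, num_pairs)
    values = [rng.choice(string.digits) for _ in keys]

    # Pick one of the stored keys as the query.
    query = rng.choice(keys)

    # Interleave keys and values, then append the '??' marker and the query.
    sequence = "".join(k + v for k, v in zip(keys, values)) + "??" + query
    target = values[keys.index(query)]
    return sequence, target


if __name__ == "__main__":
    seq, tgt = make_example(rng=random.Random(0))
    print(seq, "->", tgt)
```

The model reads the sequence one character at a time and must output the target digit after seeing the query, so it has to hold the key-value bindings in some form of short-term memory.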

The results are as follows:

Fast Weights (with layernorm):

Fast Weights (without layernorm):

LSTM (with layernorm):

LSTM (without layernorm):

Both models were trained on a GTX 980 Ti with TensorFlow 0.11rc1.

All runs use the R=50 setting and the Adam optimizer with default parameters.
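As a reminder of what the fast weights model computes at each time step, here is a minimal NumPy sketch of the recurrence described in the paper: the fast weight matrix is decayed and updated with the outer product of the current hidden state, and the next hidden state is then refined by a short inner loop that attends to the recent past through that matrix. This is not the repo's TensorFlow code. The function names, the ReLU nonlinearity, and the values of `decay`, `fast_lr`, and `inner_steps` are assumptions chosen for illustration.

```python
import numpy as np


def layer_norm(z, eps=1e-5):
    """Normalize a vector to zero mean and (roughly) unit variance."""
    return (z - z.mean()) / (z.std() + eps)


def fast_weights_step(h, x, A, W, C, decay=0.95, fast_lr=0.5,
                      inner_steps=1, use_layernorm=True):
    """One time step of a fast weights RNN (illustrative sketch).

    h: previous hidden state, shape (hidden,)
    x: current input, shape (inputs,)
    A: fast weight matrix, shape (hidden, hidden)
    W, C: slow recurrent and input weights
    decay, fast_lr, inner_steps: placeholder values, not from this repo.
    """
    # Fast weights update: A(t) = decay * A(t-1) + fast_lr * h(t) h(t)^T
    A = decay * A + fast_lr * np.outer(h, h)

    # Sustained boundary condition from the slow weights.
    boundary = W @ h + C @ x

    # Preliminary next state, then the inner loop that adds A @ h_s.
    h_s = np.maximum(boundary, 0.0)  # ReLU is an assumption here
    for _ in range(inner_steps):
        pre = boundary + A @ h_s
        if use_layernorm:
            pre = layer_norm(pre)
        h_s = np.maximum(pre, 0.0)
    return h_s, A


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hidden_size, input_size = 50, 8  # arbitrary sizes for the demo
    W = 0.05 * rng.normal(size=(hidden_size, hidden_size))
    C = 0.05 * rng.normal(size=(hidden_size, input_size))
    h = np.zeros(hidden_size)
    A = np.zeros((hidden_size, hidden_size))
    for _ in range(3):
        h, A = fast_weights_step(h, rng.normal(size=input_size), A, W, C)
    print(h.shape, A.shape)
```

The `use_layernorm` flag mirrors the with/without layernorm comparison reported above: without normalization, the `A @ h_s` term can grow quickly as outer products accumulate.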

Train the fast weights model


Evaluate the fast weights model


Run the LSTM baseline model in a similar way.


Fan Wu

