Need help with ubottu?
Click the “chat” button below for chat support from the developer who created it, or find similar developers for support.

About the developer

npow
132 Stars 45 Forks 133 Commits 2 Opened issues

Description

Next Utterance Classification

Services available

!
?

Need anything else?

Contributors list

# 214,870
JavaScr...
Python
Shell
119 commits

Ubottu

This repository contains the source code for the models used in the following paper:

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems arXiv:1506.08909.

Dependencies

  • Python 2.7
  • Theano bleeding-edge
  • Lasagne
  • Pyprind

Usage

Fetch the pickled data:

cd src
wget http://dataset.cs.mcgill.ca/ubuntu-corpus-1.0/ubuntu_blobs.tgz
tar zxvf blobs.tgz

To reproduce the results in the original paper, use the following incantations.

RNN:

python main.py --encoder rnn --batch_size=512 --hidden_size=50 --optimizer adam --lr 0.001 --fine_tune_W=True --fine_tune_M=True --input_dir dataset_1MM

LSTM:

python main.py --encoder lstm --batch_size=256 --hidden_size=300 --optimizer adam --lr 0.001 --fine_tune_W=True --fine_tune_M=True --input_dir dataset_1MM

TFIDF:

python tfidf.py

We use cookies. If you continue to browse the site, you agree to the use of cookies. For more information on our use of cookies please see our Privacy Policy.