Modeling Multi-turn Conversation with Deep Utterance Aggregation (COLING 2018)

Code and sample data accompanying the paper Modeling Multi-turn Conversation with Deep Utterance Aggregation.

Dataset

We release the E-commerce Dialogue Corpus, comprising a training set, a development set, and a test set for retrieval-based chatbots. The statistics of the corpus are shown in the following table.

| | Train | Val | Test |
| ------------- |:-------------:|:-------------:|:-------------:|
| Session-response pairs | 1M | 10k | 10k |
| Avg. positive responses per session | 1 | 1 | 1 |
| Min turns per session | 3 | 3 | 3 |
| Max turns per session | 10 | 10 | 10 |
| Avg. turns per session | 5.51 | 5.48 | 5.64 |
| Avg. words per utterance | 7.02 | 6.99 | 7.11 |

The full corpus can be downloaded from https://drive.google.com/file/d/154J-neBo20ABtSmJDvm7DK0eTuieAuvw/view?usp=sharing.

Data template

label \t conversation utterances (split by \t) \t response
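
As an illustration of the template above, here is a minimal sketch of a line parser. It is not taken from the released code: the names `Example` and `parse_line` are hypothetical, the sample line is made up, and the label is kept as a raw string because the template does not state its type.

```python
from collections import namedtuple

# Hypothetical container for one line of the corpus (not part of the released code).
Example = namedtuple("Example", ["label", "context", "response"])


def parse_line(line):
    """Split one tab-separated line into label, context utterances, and response."""
    fields = line.rstrip("\n").split("\t")
    # A valid line needs a label, at least one context utterance, and a response.
    if len(fields) < 3:
        raise ValueError("expected at least 3 tab-separated fields, got %d" % len(fields))
    # First field is the label, last field is the response,
    # everything in between is the conversation history.
    return Example(label=fields[0], context=fields[1:-1], response=fields[-1])


# Made-up sample line, for illustration only.
sample = "1\thello\thow can I help you\tis this item in stock\tyes it is\n"
example = parse_line(sample)
print(example.label, len(example.context), example.response)
```

The sketch runs under both python2 and python3. It only separates the fields; tokenization of the utterances is left to the preprocessing script.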

Source Code

We also release our source code to help others reproduce our results.

Instruction

Our code is compatible with python2, so in all commands listed below, python refers to python2.

We strongly suggest using conda to manage the virtual environment.

  • Install requirements

    pip install -r requirements.txt

  • Pretrain word embeddings

    python train_word2vec.py ./ECD_sample/train embedding

  • Preprocess the data

    python PreProcess.py --train_dataset ./ECD_sample/train --valid_dataset ./ECD_sample/valid --test_dataset ./ECD_sample/test --pretrained_embedding embedding --save_dataset ./ECD_sample/all

  • Train the model

    bash train.sh

Tips

If you encounter CUDA issues, please check your environment. For reference, we used:

Theano 0.9.0
CUDA 8.0
cuDNN 5.1

Reference

If you use this code, please cite our paper:

@inproceedings{zhang2018dua,
    title = {Modeling Multi-turn Conversation with Deep Utterance Aggregation},
    author = {Zhang, Zhuosheng and Li, Jiangtong and Zhu, Pengfei and Zhao, Hai},
    booktitle = {Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018)},
    pages={3740--3752},
    year = {2018}
}
