“Chorus” of recommendation models: a light and flexible PyTorch framework for Top-K recommendation.


ReChorus is a general PyTorch framework for Top-K recommendation with implicit feedback, designed primarily for research purposes. It aims to provide a fair benchmark for comparing state-of-the-art algorithms. We hope this can partially alleviate the problem that different papers adopt non-comparable experimental settings, so as to form a "Chorus" of recommendation algorithms.

This framework is especially suitable for researchers to compare algorithms under the same experimental setting, and newcomers to get familiar with classical methods. The characteristics of our framework can be summarized as follows:

  • Swift: concentrate on your model design in a single file and implement new models quickly.

  • Easy: the framework is implemented in fewer than a thousand lines of code, with clean code and adequate comments, which makes it easy to use.

  • Efficient: multi-thread batch preparation, special implementations for the evaluation, and around 90% GPU utilization during training for deep models.

  • Flexible: implement new readers or runners for different datasets and experimental settings, and assign specific helpers to each model.

Structure

Generally, ReChorus decomposes the whole process into three modules:

  • Reader: read the dataset into a DataFrame and append necessary information to each instance
  • Runner: control the training process and model evaluation
  • Model: define how to generate ranking scores and prepare batches
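
The division of labour between the three modules can be illustrated with a toy sketch. All class and method names below (`ToyReader`, `ToyModel`, `ToyRunner`, `prepare_batch`, `score`, `evaluate`) are hypothetical and are not the actual ReChorus API; plain Python lists stand in for the DataFrame and the PyTorch model so that the example stays self-contained.

```python
import random


class ToyReader:
    """Hypothetical reader: loads interactions and derives dataset statistics."""

    def __init__(self, interactions):
        # interactions: list of (user_id, item_id, timestamp) tuples,
        # sorted by user and then by time.
        self.data = sorted(interactions, key=lambda x: (x[0], x[2]))
        self.n_items = max(item for _, item, _ in interactions) + 1


class ToyModel:
    """Hypothetical model: prepares batches and produces ranking scores."""

    def __init__(self, n_items, seed=0):
        rng = random.Random(seed)
        # Stand-in for learned parameters: one random score per item.
        self.item_bias = [rng.random() for _ in range(n_items)]

    def prepare_batch(self, instances):
        return [(user, item) for user, item, _ in instances]

    def score(self, batch):
        return [self.item_bias[item] for _, item in batch]


class ToyRunner:
    """Hypothetical runner: drives the evaluation loop."""

    def evaluate(self, model, reader):
        batch = model.prepare_batch(reader.data)
        scores = model.score(batch)
        return sum(scores) / len(scores)


reader = ToyReader([(0, 1, 10), (0, 2, 20), (1, 0, 5)])
model = ToyModel(reader.n_items)
mean_score = ToyRunner().evaluate(model, reader)
```

The point of the split is that a new model only has to change the model class, while the reader and runner can be reused across experiments.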


Getting Started

  1. Install Anaconda with Python >= 3.5
  2. Clone the repository
     git clone https://github.com/THUwangcy/ReChorus.git
  3. Install requirements and step into the src folder
     cd ReChorus
     pip install -r requirements.txt
     cd src
  4. Run a model with the built-in dataset
     python main.py --model_name BPRMF --emb_size 64 --lr 1e-3 --l2 1e-6 --dataset Grocery_and_Gourmet_Food
  5. (optional) Run the jupyter notebook in the data folder to download and build new datasets, or prepare your own datasets according to the Guideline in data
  6. (optional) Implement your own models according to the Guideline in src

Arguments

The main arguments are listed below.

| Args            | Default   | Description                                                             |
| --------------- | --------- | ----------------------------------------------------------------------- |
| model_name      | 'BPRMF'   | The name of the model class.                                            |
| lr              | 1e-3      | Learning rate.                                                          |
| l2              | 0         | Weight decay in optimizer.                                              |
| test_all        | 0         | Whether to rank all the items during evaluation.                        |
| metrics         | 'NDCG,HR' | The list of evaluation metrics (separated by comma).                    |
| topk            | '5,10,20' | The list of K in evaluation metrics (separated by comma).               |
| num_workers     | 5         | Number of processes when preparing batches.                             |
| batch_size      | 256       | Batch size during training.                                             |
| eval_batch_size | 256       | Batch size during inference.                                            |
| load            | 0         | Whether to load model checkpoint and continue to train.                 |
| train           | 1         | Whether to perform model training.                                      |
| regenerate      | 0         | Whether to regenerate intermediate files.                               |
| random_seed     | 0         | Random seed of everything.                                              |
| gpu             | '0'       | The visible GPU device (pass an empty string '' to only use CPU).       |
| buffer          | 1         | Whether to buffer batches for dev/test.                                 |
| history_max     | 20        | The maximum length of history for sequential models.                    |
| num_neg         | 1         | The number of negative items for each training instance.                |
| test_epoch      | -1        | Print test set metrics every test_epoch during training (-1: no print). |
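
As a rough illustration of how comma-separated arguments such as the metric list and the list of K can be consumed, here is a minimal `argparse` sketch. It is not the parser from `main.py`; `build_parser` and `parse_lists` are hypothetical helpers, and only a handful of the arguments are reproduced, with the defaults listed above.

```python
import argparse


def build_parser():
    """Hypothetical parser covering a few of the arguments listed above."""
    parser = argparse.ArgumentParser(description="Toy argument parser")
    parser.add_argument("--model_name", type=str, default="BPRMF")
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--l2", type=float, default=0.0)
    parser.add_argument("--test_all", type=int, default=0)
    parser.add_argument("--metrics", type=str, default="NDCG,HR")
    parser.add_argument("--topk", type=str, default="5,10,20")
    return parser


def parse_lists(args):
    """Turn the comma-separated strings into Python lists."""
    topk = [int(k) for k in args.topk.split(",")]
    metrics = [m.strip().upper() for m in args.metrics.split(",")]
    return topk, metrics


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(
    ["--model_name", "BPRMF", "--lr", "1e-3", "--l2", "1e-6"]
)
topk, metrics = parse_lists(args)
```

Passing the lists as strings keeps the command line short, at the cost of one extra parsing step after `parse_args`.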

Models

We have implemented the following methods (still updating):

General Recommender

Sequential Recommender

The table below lists the results of these models on the Grocery_and_Gourmet_Food dataset (151.3k entries). Leave-one-out is applied to split the data: the most recent interaction of each user is used for testing, the second most recent for validation, and the remaining interactions for training. We randomly sample 99 negative items for each test case to rank together with the ground-truth item (ranking over all the items is also supported with --test_all 1).
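
The evaluation protocol described here can be sketched in a few lines of plain Python. This is an illustrative sketch, not the ReChorus implementation: the function names are hypothetical, users with fewer than three interactions are kept entirely in the training set, negatives are drawn only from items the user has not interacted with, and score ties are resolved in favour of the ground-truth item. All three choices are assumptions.

```python
import math
import random
from collections import defaultdict


def leave_one_out(interactions):
    """Split (user, item, timestamp) tuples into train/dev/test lists.

    The most recent item of each user goes to test, the second most
    recent to dev, and everything earlier to train.
    """
    by_user = defaultdict(list)
    for user, item, ts in interactions:
        by_user[user].append((ts, item))
    train, dev, test = [], [], []
    for user, events in by_user.items():
        events.sort()  # chronological order
        items = [item for _, item in events]
        if len(items) < 3:  # assumption: short histories stay in train
            train += [(user, i) for i in items]
            continue
        train += [(user, i) for i in items[:-2]]
        dev.append((user, items[-2]))
        test.append((user, items[-1]))
    return train, dev, test


def sample_negatives(all_items, seen, n_neg, rng):
    """Draw n_neg distinct items the user has not interacted with."""
    candidates = [i for i in all_items if i not in seen]
    return rng.sample(candidates, n_neg)


def hr_ndcg_at_k(target_score, negative_scores, k):
    """HR@K and NDCG@K for one test case with a single ground-truth item."""
    # Rank of the ground-truth item among all candidates (1 = best).
    rank = 1 + sum(s > target_score for s in negative_scores)
    if rank > k:
        return 0.0, 0.0
    return 1.0, 1.0 / math.log2(rank + 1)


interactions = [(0, 10, 1), (0, 11, 2), (0, 12, 3), (0, 13, 4)]
train, dev, test = leave_one_out(interactions)
negatives = sample_negatives(range(200), {10, 11, 12, 13}, 99, random.Random(0))
hit, ndcg = hr_ndcg_at_k(0.5, [0.9, 0.1, 0.2], k=5)
```

With a single ground-truth item per test case, the ideal DCG is 1, so NDCG@K reduces to `1 / log2(rank + 1)` when the item is ranked within the top K.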

| Model    | HR@K   | NDCG@K | Time/iter | Sequential | Knowledge | Time-aware |
| :------- | :----: | :----: | :-------: | :--------: | :-------: | :--------: |
| BPRMF    | 0.3574 | 0.2480 | 2.5s      |            |           |            |
| NeuMF    | 0.3248 | 0.2235 | 3.4s      |            |           |            |
| BUIR     | 0.3639 | 0.2542 | 3.3s      |            |           |            |
| GRU4Rec  | 0.3664 | 0.2597 | 4.9s      | √          |           |            |
| NARM     | 0.3621 | 0.2586 | 8.2s      | √          |           |            |
| Caser    | 0.3576 | 0.2518 | 7.8s      | √          |           |            |
| SASRec   | 0.3888 | 0.2923 | 7.2s      | √          |           |            |
| TiSASRec | 0.3916 | 0.2922 | 35.7s     | √          |           | √          |
| CFKG     | 0.4228 | 0.3010 | 8.7s      |            | √         |            |
| SLRC+    | 0.4514 | 0.3329 | 4.3s      | √          | √         | √          |
| Chorus   | 0.4739 | 0.3443 | 4.9s      | √          | √         | √          |
| KDA      | 0.5174 | 0.3876 | 9.9s      | √          | √         | √          |

For fair comparison, the batch size is fixed to 256 and the embedding size is set to 64. We strive to tune all the other hyper-parameters to obtain the best performance for each model (the current settings may not be optimal and will be updated if better scores are achieved). Current commands are listed in run.sh. We repeat each experiment 5 times with different random seeds and report the average score (see exp.py). All experiments are conducted with a single GTX-1080Ti GPU.
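
The repeat-and-average procedure can be sketched as follows. This is not the logic of exp.py: `run_once` is a hypothetical placeholder that returns a random number instead of training a model, and reporting the standard deviation alongside the mean is an addition made here for illustration.

```python
import random
import statistics


def run_once(seed):
    """Hypothetical placeholder for one training run.

    A real run would train and evaluate a model with the given seed;
    here a seeded random number stands in for the resulting metric.
    """
    rng = random.Random(seed)
    return rng.random()


def repeat_experiment(seeds):
    """Run the experiment once per seed and aggregate the scores."""
    scores = [run_once(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


# Five runs with seeds 0..4; mean_score is the number that would be reported.
mean_score, std_score = repeat_experiment(range(5))
```

Fixing the list of seeds in advance makes the averaged score reproducible across machines, as long as each run is deterministic given its seed.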

Citation

This is also our public implementation for the following papers (code and datasets to reproduce the results can be found in the corresponding branches):

git clone -b SIGIR20 https://github.com/THUwangcy/ReChorus.git
git clone -b TOIS21 https://github.com/THUwangcy/ReChorus.git

Please cite this paper if you use our code. Thanks!

@inproceedings{wang2020make,
  title={Make it a chorus: knowledge-and time-aware item modeling for sequential recommendation},
  author={Wang, Chenyang and Zhang, Min and Ma, Weizhi and Liu, Yiqun and Ma, Shaoping},
  booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages={109--118},
  year={2020}
}

Contact

Chenyang Wang ([email protected])
