Zhongdao / gcn_clustering

Code for CVPR'19 paper Linkage-based Face Clustering via GCN

MIT License

Linkage-based Face Clustering via Graph Convolution Network

This repository contains the code for our CVPR'19 paper Linkage-based Face Clustering via GCN, by Zhongdao Wang, Liang Zheng, Yali Li and Shengjin Wang, Tsinghua University and Australian National University.


Introduction

We present an accurate and scalable approach to the face clustering task, which aims to group a set of faces by their potential identities. We formulate this task as a link prediction problem: a link exists between two faces if they share the same identity. The key idea is that the local context in the feature space around an instance (face) contains rich information about the linkage relationship between this instance and its neighbors. By constructing sub-graphs around each instance as input data, which depict the local context, we use a graph convolution network (GCN) to perform reasoning and infer the likelihood of linkage between pairs in the sub-graphs.
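
The link prediction formulation can be illustrated with a toy example. The sketch below is not taken from this repository: `link_targets` and the toy arrays are hypothetical, and it only shows how ground-truth links follow from identity labels (a link exists when two faces share an identity), not how the GCN predicts them.

```python
import numpy as np


def link_targets(neighbors, labels):
    """Return an (N, K) 0/1 array of ground-truth links.

    neighbors[i] lists the indices of the K neighbors of face i (hypothetical layout).
    Entry (i, j) is 1 when face i and its j-th neighbor share an identity.
    """
    labels = np.asarray(labels).reshape(-1)
    return (labels[neighbors] == labels[:, None]).astype(np.int64)


# Toy data: 4 faces, 2 identities, K = 2 neighbors per face.
labels = np.array([0, 0, 1, 1])
neighbors = np.array([[1, 2],
                      [0, 3],
                      [3, 0],
                      [2, 1]])
links = link_targets(neighbors, labels)
print(links)
```

In this toy case each face is linked to its first neighbor (same identity) and not to its second (different identity).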


Requirements

  • PyTorch 0.4.0
  • Python 2.7
  • sklearn >= 0.19.1

Data Format

First, extract features for the IJB-B data and save them as an NxD-dimensional file, in which each row is the D-dimensional feature of one sample. Then, save the labels as an Nx1-dimensional file, in which each row is an integer indicating the identity. Lastly, generate the KNN graph (either by brute force or by approximate nearest neighbor search, ANN). The KNN graph should be saved as an Nx(K+1)-dimensional file; in each row, the first element is the node index and the following K elements are the indices of its K nearest neighbors.
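
The three arrays described above could be produced roughly as follows. This is a minimal sketch under several assumptions that the text does not confirm: the files are stored in NumPy `.npy` format, neighbors are ranked by cosine similarity, and a node is not counted as its own neighbor. The function `build_knn_graph`, the file names, and the random stand-in data are all hypothetical.

```python
import os
import tempfile

import numpy as np


def build_knn_graph(features, k):
    """Brute-force cosine KNN graph of shape (N, k + 1).

    Column 0 holds the node index; columns 1..k hold its k nearest neighbors.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = np.dot(feats, feats.T)        # (N, N) cosine similarities
    np.fill_diagonal(sims, -np.inf)      # assumption: exclude the node itself
    neighbors = np.argsort(-sims, axis=1)[:, :k]
    node_ids = np.arange(len(feats)).reshape(-1, 1)
    return np.hstack([node_ids, neighbors])


# Random stand-ins for real face features and identity labels.
rng = np.random.RandomState(0)
features = rng.randn(100, 16).astype(np.float32)   # N x D
labels = rng.randint(0, 10, size=(100, 1))         # N x 1
knn_graph = build_knn_graph(features, k=5)         # N x (K + 1)

# Hypothetical file names, written to a temporary directory.
out_dir = tempfile.mkdtemp()
np.save(os.path.join(out_dir, "features.npy"), features)
np.save(os.path.join(out_dir, "labels.npy"), labels)
np.save(os.path.join(out_dir, "knn_graph.npy"), knn_graph)
```

The brute-force version builds a full N x N similarity matrix, so it only suits small N; for large datasets an ANN library would replace the `np.dot` and `np.argsort` steps.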

For training, features, labels, and KNN graphs are all needed. For testing, only features and KNN graphs are needed; labels are required only if you want to compute accuracy. We also provide the ArcFace features / labels / knn_graphs of the IJB-B/CASIA datasets at OneDrive and Baidu NetDisk (extract code: 8wj1).


Testing

python test.py --val_feat_path path/to/features --val_knn_graph_path path/to/knn/graph --val_labels_path path/to/labels --checkpoint path/to/gcn_weights

During inference, the test script continuously prints the pairwise precision/recall/accuracy. After all subgraphs have been processed, it outputs the final B-Cubed precision/recall/F-score (note that these are not the same as the pairwise precision/recall/accuracy) and the NMI score.
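
For readers unfamiliar with B-Cubed, the metric can be sketched as below. This follows the standard definition (per-sample precision and recall averaged over all samples) and is not necessarily how the test script computes it; the function name `bcubed` and the toy labels are hypothetical.

```python
from collections import Counter


def bcubed(labels_true, labels_pred):
    """Return B-Cubed (precision, recall, F-score) for a clustering."""
    n = len(labels_true)
    # Size of each (predicted cluster, true identity) intersection.
    overlap = Counter(zip(labels_pred, labels_true))
    cluster_sizes = Counter(labels_pred)
    class_sizes = Counter(labels_true)
    # Each of the c samples in an intersection contributes c / |cluster| to
    # precision and c / |class| to recall.
    precision = sum(c * c / float(cluster_sizes[p])
                    for (p, t), c in overlap.items()) / n
    recall = sum(c * c / float(class_sizes[t])
                 for (p, t), c in overlap.items()) / n
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example: 5 faces, 2 identities, one face assigned to the wrong cluster.
labels_true = [0, 0, 0, 1, 1]
labels_pred = [0, 0, 1, 1, 1]
precision, recall, f_score = bcubed(labels_true, labels_pred)
print(precision, recall, f_score)
```

For NMI, scikit-learn (already listed in the requirements) offers `sklearn.metrics.normalized_mutual_info_score(labels_true, labels_pred)`.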


Training

python train.py --feat_path path/to/features --knn_graph_path path/to/knn/graph --labels_path path/to/labels

We employ the CASIA dataset to train the GCN. Usually, 4 epochs are sufficient. We provide pre-trained model weights in



Citation

If you find GCN-Clustering helps your research, please cite our paper:

  title={Linkage-based Face Clustering via Graph Convolution Network},
  author={Zhongdao Wang and Liang Zheng and Yali Li and Shengjin Wang},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},


Acknowledgement

I borrowed some code on pseudo label propagation from CDP; many thanks to Xiaohang Zhan!
