
313 Stars 79 Forks MIT License 19 Commits 28 Opened issues


Code for CVPR'19 paper Linkage-based Face Clustering via GCN


Linkage-based Face Clustering via Graph Convolution Network

This repository contains the code for our CVPR'19 paper Linkage-based Face Clustering via GCN, by Zhongdao Wang, Liang Zheng, Yali Li and Shengjin Wang, Tsinghua University and Australian National University.


We present an accurate and scalable approach to the face clustering task, which aims to group a set of faces by their potential identities. We formulate this task as a link prediction problem: a link exists between two faces if they share the same identity. The key idea is that the local context in the feature space around an instance (face) contains rich information about the linkage relationship between this instance and its neighbors. By constructing sub-graphs around each instance as input data, which depict the local context, we use a graph convolution network (GCN) to perform reasoning and infer the likelihood of linkage between pairs in the sub-graphs.
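To make the link-prediction framing concrete, here is a minimal sketch of how scored links can be turned into clusters. It is a simplification, not the procedure used in this repository: it thresholds the link scores and takes connected components with union-find. The function name `cluster_from_links`, the `(i, j, score)` tuple layout, and the `threshold` value are assumptions made for illustration.

```python
def cluster_from_links(num_nodes, scored_links, threshold=0.5):
    """Group nodes into clusters from scored pairwise links.

    scored_links: iterable of (i, j, score) tuples (hypothetical layout),
    where score is the predicted likelihood that i and j share an identity.
    Returns a list of cluster ids, one per node.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, score in scored_links:
        if score >= threshold:  # keep only confident links
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Relabel roots as consecutive cluster ids in order of first appearance.
    relabel = {}
    labels = []
    for node in range(num_nodes):
        root = find(node)
        if root not in relabel:
            relabel[root] = len(relabel)
        labels.append(relabel[root])
    return labels


links = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.1), (3, 4, 0.95)]
print(cluster_from_links(5, links))
```

A single fixed threshold merges everything that is transitively linked, so one false link can join two identities. That is one reason a more careful merging scheme may be preferable in practice.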


Requirements

  • PyTorch 0.4.0
  • Python 2.7
  • sklearn >= 0.19.1

Data Format

First, extract features for the IJB-B data and save them as an NxD dimensional file, in which each row is the D-dimensional feature of one sample. Then save the labels as an Nx1 dimensional file, in which each row is an integer indicating the identity. Lastly, generate the KNN graph (either by brute force or ANN). The KNN graph should be saved as an Nx(K+1) dimensional file; in each row, the first element is the node index and the following K elements are the indices of its KNN nodes.
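The three arrays described above can be produced along the following lines. This is a rough sketch under stated assumptions rather than the authors' preprocessing: it uses brute-force search with cosine similarity (the metric is an assumption), excludes each node from its own neighbor list, and saves the arrays with NumPy's `.npy` format (the file format and file names are also assumptions). `build_knn_graph` is a hypothetical helper name.

```python
import os
import tempfile

import numpy as np


def build_knn_graph(features, k):
    """Brute-force KNN graph: row i is [i, nn_1, ..., nn_k]."""
    # L2-normalize rows so the dot product equals cosine similarity
    # (cosine is an assumed metric, not one fixed by the data format).
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.maximum(norms, 1e-12)
    sims = normed.dot(normed.T)
    # Exclude each node from its own neighbor list.
    np.fill_diagonal(sims, -np.inf)
    # Indices of the k most similar nodes, most similar first.
    neighbors = np.argsort(-sims, axis=1)[:, :k]
    index = np.arange(features.shape[0]).reshape(-1, 1)
    return np.hstack([index, neighbors])


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    features = rng.randn(100, 16).astype(np.float32)  # N x D
    labels = rng.randint(0, 10, size=(100, 1))        # N x 1
    knn_graph = build_knn_graph(features, k=5)        # N x (K+1)

    out_dir = tempfile.mkdtemp()
    # File names below are placeholders chosen for this example.
    np.save(os.path.join(out_dir, "features.npy"), features)
    np.save(os.path.join(out_dir, "labels.npy"), labels)
    np.save(os.path.join(out_dir, "knn_graph.npy"), knn_graph)
    print(features.shape, labels.shape, knn_graph.shape)
```

The brute-force version materializes an NxN similarity matrix, so it only suits small N. For larger sets, an approximate nearest neighbor (ANN) library would replace the `argsort` step while keeping the same Nx(K+1) output layout.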

For training, features + labels + KNN graphs are needed. For testing, only features + KNN graphs are needed, but labels are also required if you want to compute accuracy. We also provide the ArcFace features / labels / knn_graphs of the IJB-B/CASIA datasets at OneDrive and Baidu NetDisk, extract code: 8wj1


Testing

python --val_feat_path path/to/features --val_knn_graph_path path/to/knn/graph --val_labels_path path/to/labels --checkpoint path/to/gcn_weights

During inference, the test script prints the running pairwise precision/recall/accuracy. Once all subgraphs have been processed, it outputs the final B-Cubed precision/recall/F-score (note that these are not the same as the pairwise p/r/acc) and the NMI score.
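For readers unfamiliar with B-Cubed metrics, the sketch below shows the standard definition, which averages a per-sample precision and recall instead of counting pairs. It is an illustration written for this note, not the evaluation code of the test script; the function name `bcubed` is hypothetical.

```python
from collections import Counter


def bcubed(labels_true, labels_pred):
    """Return B-Cubed (precision, recall, F-score) for two label lists."""
    n = len(labels_true)
    class_size = Counter(labels_true)      # samples per ground-truth identity
    cluster_size = Counter(labels_pred)    # samples per predicted cluster
    overlap = Counter(zip(labels_true, labels_pred))

    # Each of the c samples in an (identity, cluster) cell has
    # precision c / |cluster| and recall c / |identity|.
    precision = sum(c * c / float(cluster_size[p])
                    for (t, p), c in overlap.items()) / n
    recall = sum(c * c / float(class_size[t])
                 for (t, p), c in overlap.items()) / n
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


truth = [0, 0, 0, 1, 1]
pred = [0, 0, 1, 1, 1]
print(bcubed(truth, pred))
```

Because every sample contributes equally, B-Cubed is less dominated by large clusters than pairwise metrics, where the number of pairs grows quadratically with cluster size.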


Training

python --feat_path path/to/features --knn_graph_path path/to/knn/graph --labels_path path/to/labels

We employ the CASIA dataset to train the GCN. Usually, 4 epochs are sufficient. We provide pre-trained model weights in



If you find GCN-Clustering helps your research, please cite our paper:

  title={Linkage-based Face Clustering via Graph Convolution Network},
  author={Zhongdao Wang, Liang Zheng, Yali Li and Shengjin Wang},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},


I borrowed some code for pseudo label propagation from CDP. Many thanks to Xiaohang Zhan!
