Project

(Deep) Image Clustering

Graph degree linkage (GDL) [1] is a hierarchical agglomerative clustering algorithm based on a cluster similarity measure defined on a directed K-nearest-neighbour graph.
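
The idea can be illustrated with a minimal Python sketch. This is not the GACluster code: the function names (`knn_digraph`, `degree_affinity`, `gdl_sketch`), the Gaussian edge weights, the bandwidth choice and the singleton initialisation are all assumptions made for illustration, and the exact weighting and normalisation in [1] may differ. The sketch builds a directed K-NN graph, scores a pair of clusters by the product of the indegree and outdegree of each vertex with respect to the other cluster, and greedily merges the highest-scoring pair.

```python
import numpy as np

def knn_digraph(X, k):
    """Directed K-NN graph: W[i, j] > 0 only if j is one of the k nearest neighbours of i."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                         # no self-edges
    nbrs = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    edge_d2 = d2[rows, cols]
    sigma2 = edge_d2.mean()                              # assumed bandwidth: mean squared edge length
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-edge_d2 / sigma2)            # assumed Gaussian edge weights
    return W

def degree_affinity(W, a, b):
    """Affinity of clusters a, b (lists of vertex indices) via indegree * outdegree products."""
    Wab = W[np.ix_(a, b)]  # edges from a to b
    Wba = W[np.ix_(b, a)]  # edges from b to a
    # For each vertex in b: (indegree from a) * (outdegree to a), normalised by |a|^2.
    s_b = (Wab.sum(axis=0) * Wba.sum(axis=1)).sum() / len(a) ** 2
    # Symmetric term for the vertices in a with respect to b.
    s_a = (Wba.sum(axis=0) * Wab.sum(axis=1)).sum() / len(b) ** 2
    return s_b + s_a

def gdl_sketch(X, n_clusters, k):
    """Naive agglomeration from singletons (an assumption); suitable for tiny inputs only."""
    W = knn_digraph(X, k)
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best, pair = -1.0, None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                aff = degree_affinity(W, clusters[p], clusters[q])
                if aff > best:
                    best, pair = aff, (p, q)
        p, q = pair
        clusters[p] = clusters[p] + clusters[q]  # merge the most similar pair
        del clusters[q]
    labels = np.empty(len(X), dtype=int)
    for c, members in enumerate(clusters):
        labels[members] = c
    return labels

# Two well-separated groups of points on a line.
X = np.array([[0.0], [1.0], [2.0], [3.0], [100.0], [101.0], [102.0], [103.0]])
labels = gdl_sketch(X, n_clusters=2, k=2)
```

Because no K-NN edge crosses the gap between the two groups, their affinity stays at zero and they are never merged. The exhaustive pairwise search is only meant to make the merge rule visible; it does not scale to datasets such as MNIST.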

GDL is a competitive alternative to conventional algorithms such as k-means, spectral clustering and average linkage. It provides strong baselines on popular image clustering datasets such as MNIST, USPS and Fashion-MNIST.

Download reproducible code from GACluster [github].

Benchmark Results

Without any deep representation learning, GDL [1] performs close to state-of-the-art deep clustering algorithms.
The code to reproduce the following results can be downloaded from GACluster [github].

Table 1. Clustering performance of different algorithms in terms of NMI/ACC. "--" marks a value that is not available.

Algorithm	MNIST	MNIST-test	USPS	Fashion-MNIST
GDL [1]	0.910/0.964	0.864/0.933	0.860/0.922	0.660/0.627
Deep Clustering
DEC [9]	--/0.843	--/--	--/--	--/--
JULE [3]	0.913/0.964	0.915/0.961	0.913/--	--/--
DEPICT [4]	0.917/0.965	0.915/0.963	0.927/0.964	--/--
VaDE [10]	--/0.945	--/--	--/--	--/--
DAC [11]	0.935/0.978	--/--	--/--	--/--
DBC [12]	0.917/0.964	--/--	0.724/0.743	--/--
ConvDEC-DA [7]	0.960/0.985	0.958/0.983	0.962/0.987	0.636/0.586
DDC-DA [13]	0.941/0.969	0.927/0.970	0.939/0.977	0.661/0.609
DSC-DAN [18]	0.941/0.978	0.946/0.980	0.857/0.869	0.645/0.662
ClusterGAN [19]	0.921/0.964	--/--	0.931/0.970	--/--
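
For readers unfamiliar with the two metrics, the following Python sketch shows one common way to compute them. It is an illustration, not the evaluation script behind Table 1: the function names are hypothetical, and the geometric-mean normalisation of NMI is an assumption (other normalisations exist and give slightly different numbers). ACC is the fraction of correctly labelled samples under the best one-to-one matching between clusters and classes, found here with the Hungarian algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def contingency(y_true, y_pred):
    """Count matrix: rows are predicted clusters, columns are ground-truth classes."""
    classes, t = np.unique(np.ravel(y_true), return_inverse=True)
    clusters, p = np.unique(np.ravel(y_pred), return_inverse=True)
    counts = np.zeros((clusters.size, classes.size), dtype=np.int64)
    np.add.at(counts, (p, t), 1)
    return counts

def clustering_accuracy(y_true, y_pred):
    """ACC: best one-to-one cluster-to-class matching (Hungarian algorithm)."""
    counts = contingency(y_true, y_pred)
    row, col = linear_sum_assignment(counts, maximize=True)
    return counts[row, col].sum() / counts.sum()

def nmi(y_true, y_pred):
    """NMI with geometric-mean normalisation (an assumed convention)."""
    pij = contingency(y_true, y_pred) / len(np.ravel(y_true))
    pi, pj = pij.sum(axis=1), pij.sum(axis=0)
    nz = pij > 0
    mi = (pij[nz] * np.log(pij[nz] / np.outer(pi, pj)[nz])).sum()
    h_i = -(pi[pi > 0] * np.log(pi[pi > 0])).sum()
    h_j = -(pj[pj > 0] * np.log(pj[pj > 0])).sum()
    denom = np.sqrt(h_i * h_j)
    # Degenerate case (a single cluster or class): return 0.0 by convention assumed here.
    return mi / denom if denom > 0 else 0.0

y_true = [0, 0, 1, 1, 2, 2]
perfect = [1, 1, 0, 0, 2, 2]   # same partition, permuted cluster ids
noisy = [0, 0, 0, 1, 2, 2]     # one sample placed in the wrong cluster
acc_perfect = clustering_accuracy(y_true, perfect)
acc_noisy = clustering_accuracy(y_true, noisy)
nmi_perfect = nmi(y_true, perfect)
```

Both metrics are invariant to how cluster ids are numbered, which is why a permuted but otherwise identical labelling still scores 1.0.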

Citations

Please cite the following papers if you find the code helpful.

@inproceedings{zhang2012graph,
title={Graph degree linkage: Agglomerative clustering on a directed graph},
author={Zhang, Wei and Wang, Xiaogang and Zhao, Deli and Tang, Xiaoou},
booktitle={European Conference on Computer Vision},
pages={428--441},
year={2012}
}

@article{zhang2013agglomerative,
title={Agglomerative clustering via maximum incremental path integral},
author={Zhang, Wei and Zhao, Deli and Wang, Xiaogang},
journal={Pattern Recognition},
volume={46},
number={11},
pages={3056--3065},
year={2013}
}

(Deep) Image Clustering Literature

[1] W. Zhang, X. Wang, D. Zhao and X. Tang. Graph degree linkage: Agglomerative clustering on a directed graph. ECCV, 2012.
[2] W. Zhang, D. Zhao and X. Wang. Agglomerative clustering via maximum incremental path integral. Pattern Recognition, 46(11), pp.3056-3065, 2013.
[3] J. Yang, D. Parikh and D. Batra. Joint unsupervised learning of deep representations and image clusters. CVPR, 2016. [paper] [code]
[4] K.G. Dizaji, A. Herandi, C. Deng, W. Cai and H. Huang. Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization. ICCV, 2017. [paper] [code]
[5] S.A. Shah and V. Koltun. Robust continuous clustering. PNAS, 114(37), pp.9814-9819, 2017. [paper] [code]
[6] S.A. Shah and V. Koltun. Deep Continuous Clustering. arXiv, 2018. [paper] [code]
[7] X. Guo, E. Zhu, X. Liu and J. Yin. Deep embedded clustering with data augmentation. ACML, 2018. [paper] [code]
[8] X. Guo, L. Gao, X. Liu and J. Yin. Improved deep embedded clustering with local structure preservation. IJCAI, 2017. [paper] [code]
[9] J. Xie, R. Girshick and A. Farhadi. Unsupervised deep embedding for clustering analysis. ICML, 2016. [paper]
[10] Z. Jiang, Y. Zheng, H. Tan, B. Tang and H. Zhou. Variational deep embedding: An unsupervised and generative approach to clustering. IJCAI, 2017. [paper] [code]
[11] J. Chang, L. Wang, G. Meng, S. Xiang and C. Pan. Deep adaptive image clustering. ICCV, 2017. [paper] [code]
[12] F. Li, H. Qiao and B. Zhang. Discriminatively boosted image clustering with fully convolutional auto-encoders. Pattern Recognition, 83, 2017. [paper]
[13] Y. Ren, N. Wang, M. Li and Z. Xu. Deep Density-based Image Clustering. arXiv preprint arXiv:1812.04287, 2018. [paper]
[14] M. Caron, P. Bojanowski, A. Joulin and M. Douze. Deep clustering for unsupervised learning of visual features. ECCV, 2018. [paper]
[15] W. Hu, T. Miyato, S. Tokui, E. Matsumoto and M. Sugiyama. Learning Discrete Representations via Information Maximizing Self-Augmented Training. ICML, 2017. [paper] [code]
[16] U. Shaham, K. Stanton, H. Li, B. Nadler, R. Basri and Y. Kluger. SpectralNet: Spectral Clustering Using Deep Neural Networks. ICLR, 2018. [paper] [code]
[17] X. Guo, X. Liu, E. Zhu, X. Zhu, M. Li, X. Xu and J. Yin. Adaptive Self-paced Deep Clustering with Data Augmentation. IEEE TKDE, 2019. [paper] [code]
[18] X. Yang, C. Deng, F. Zheng, J. Yan and W. Liu. Deep Spectral Clustering using Dual Autoencoder Network. CVPR, 2019. [paper]
[19] K.G. Dizaji, X. Wang, C. Deng and H. Huang. Balanced Self-Paced Learning for Generative Adversarial Clustering Network. CVPR, 2019. [paper]
[20] X. Ji, J. F. Henriques and A. Vedaldi. Invariant information distillation for unsupervised image segmentation and clustering. ICCV, 2019. [paper]
[21] J. Wu, K. Long, F. Wang, C. Qian, C. Li, Z. Lin and H. Zha. Deep comprehensive correlation mining for image clustering. ICCV, 2019. [paper] [code]
[22] J. Huang, S. Gong and X. Zhu. Deep Semantic Clustering by Partition Confidence Maximisation. CVPR, 2020. [paper] [code]

Many Other Applications of Graph Degree Linkage (GDL)

GDL has been demonstrated to be a good alternative to conventional clustering algorithms such as k-means, DBSCAN, mean-shift, normalized cut, spectral clustering, linkage methods and Ward's method. Since its invention, GDL has been applied in many research areas, including:

Computer vision: image clustering [1], face grouping [1, R17], image matching [1, R7], image segmentation [R3, R10], image search [R2], person re-identification [R4, R14], crowd analysis [R5, R6], saliency detection [R11], action recognition [R1];
Medical imaging [R8, R16];
Data mining [R12, R13], community detection [R9, R18], compiler optimization [R15].

[R1] Directed Acyclic Graph Kernels for Action Recognition. ICCV, 2013.
[R2] Visual semantic complex network for web images. ICCV, 2013.
[R3] Object co-segmentation based on directed graph clustering. VCIP, 2013.
[R4] Learning mid-level filters for person re-identification. CVPR, 2014.
[R5] Scene-independent group profiling in crowd. CVPR, 2014.
[R6] Crowd tracking with dynamic evolution of group structures. ECCV, 2014.
[R7] A Low-Dimensional Representation for Robust Partial Isometric Correspondences Computation. Graphical Models 76(2), March 2014.
[R8] Hierarchical organization of the functional brain identified using floating aggregation of functional signals. ISBI, 2014.
[R9] Considerations about multistep community detection. PAKDD Workshops 2014.
[R10] Constrained directed graph clustering and segmentation propagation for multiple foregrounds cosegmentation. TCSVT, 2015.
[R11] Saliency Detection Based on Graph-Structural Agglomerative Clustering. ACM MM, 2015.
[R12] Spatial and temporal distribution and pollution assessment of trace metals in marine sediments in Oyster Bay, NSW, Australia. Bulletin of Environmental Contamination and Toxicology, 2015.
[R13] Spatial distribution of sediment particles and trace element pollution within Gunnamatta Bay, Port Hacking, NSW, Australia. Regional Studies in Marine Science, 2015.
[R14] Person re-identification based on hierarchical bipartite graph matching. ICIP, 2016.
[R15] Micomp: Mitigating the compiler phase-ordering problem using optimization sub-sequences and machine learning. TACO, 2017.
[R16] Suprathreshold fiber cluster statistics: Leveraging white matter geometry to enhance tractography statistical analysis. NeuroImage, 2018.
[R17] Merge or not? learning to group faces via imitation learning. AAAI, 2018.
[R18] CDlib: a Python Library to Extract, Compare and Evaluate Communities from Complex Networks. Applied Network Science Journal. 2019. CDlib - Community Discovery Library

Footnote: The link in [1] (http://mmlab.ie.cuhk.edu.hk/research/gdl/) is no longer available.
