Deep Clustering with Robust Autoencoder (DCRA)

Authors

  • Connor Lee, Saratoga High School
  • Albert Wang
  • Stefano Rizzo, Saratoga High School

DOI:

https://doi.org/10.47611/jsrhs.v11i2.2722

Keywords:

Machine learning, Deep learning, Clustering

Abstract

According to Science Daily, 90 percent of all the data in the world has been generated in the last two years, yet less than 1 percent of that data has been analyzed so far. With advances in high-performance computing, deep learning methods are now readily applied to large-scale, high-dimensional datasets, offering efficient training and inference along with more accurate predictions. Clustering is an unsupervised machine learning method that identifies similar data points and groups them into the same cluster, and it plays a fundamental role in data mining and machine learning. To process large volumes of high-dimensional data, deep learning has become a key technique for learning feature representations in a latent space for many real-world applications. In this paper, we propose deep clustering with robust autoencoder (DCRA), which jointly uses a robust autoencoder and deep clustering to perform feature representation and cluster assignment simultaneously. We conducted multiple experiments on public datasets to evaluate the model's performance. Our results show that DCRA generates high-quality clusters, with clustering accuracy above 90% on high-dimensional datasets. Training and test loss both decrease as the number of epochs increases, which is consistent with these results.
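
The abstract does not spell out the DCRA objective, so the following is only a minimal sketch of how a joint reconstruction and clustering loss might look. It assumes the soft-assignment and target-distribution formulation of Deep Embedded Clustering (Xie, Girshick, and Farhadi, 2016), stands in for the "robust" autoencoder with simple Gaussian input corruption, and uses an untrained linear encoder and decoder. The names `soft_assign`, `target_distribution`, `joint_loss`, the weight `gamma`, and all sizes are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_assign(z, centroids, alpha=1.0):
    """Student's t kernel between latent points and cluster centroids."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets that emphasize confident assignments."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def joint_loss(x, x_hat, q, p, gamma=0.1):
    """Reconstruction error plus gamma-weighted KL(p || q) clustering term."""
    recon = np.mean((x - x_hat) ** 2)
    kl = np.sum(p * np.log(p / q)) / q.shape[0]
    return recon + gamma * kl

# Toy setup (all sizes are assumptions): 6 samples, 8 features,
# a 2-dimensional latent space, and 3 clusters.
x = rng.normal(size=(6, 8))
W_enc = rng.normal(scale=0.1, size=(8, 2))   # untrained linear encoder
W_dec = rng.normal(scale=0.1, size=(2, 8))   # untrained linear decoder
centroids = rng.normal(size=(3, 2))

# Assumed form of robustness: encode a corrupted input, reconstruct the clean one.
x_noisy = x + 0.1 * rng.normal(size=x.shape)
z = x_noisy @ W_enc
x_hat = z @ W_dec

q = soft_assign(z, centroids)      # soft cluster assignments
p = target_distribution(q)         # self-training targets
loss = joint_loss(x, x_hat, q, p)  # scalar to minimize during training
labels = q.argmax(axis=1)          # hard cluster assignment per sample
```

In a real training loop the encoder, decoder, and centroids would be updated by gradient descent on this loss, with `p` recomputed periodically. Minimizing both terms together is what lets feature representation and cluster assignment be learned simultaneously.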

Author Biography

Stefano Rizzo, Saratoga High School

Advisor

Published

05-31-2022

How to Cite

Lee, C., Wang, A., & Rizzo, S. (2022). Deep Clustering with Robust Autoencoder (DCRA). Journal of Student Research, 11(2). https://doi.org/10.47611/jsrhs.v11i2.2722

Issue

Vol. 11 No. 2 (2022)

Section

HS Research Projects