Classifying the Objects of the Universe with Machine Learning

Authors

  • Justin Wu, Greenhill School
  • Abdulla Kerimov
  • Steve Szabados

DOI:

https://doi.org/10.47611/jsrhs.v13i4.7515

Keywords:

Machine Learning, Galaxy, Quasar, Star, Astronomy, Light, Artificial Intelligence, Redshift

Abstract

As space exploration expands and the search for other planets intensifies worldwide, identifying astronomical objects has become an increasingly important task, one that could help locate habitable planets around stars or asteroids with important minerals. The goal of my research was to find the most effective way to use machine learning to identify these celestial objects. This study used the 2017 dataset from the Sloan Digital Sky Survey. The key features were each object's photometric values, its redshift, and its label as a Galaxy, Quasar, or Star. Several baseline models were trained, tuned, and tested: logistic regression, decision tree, random forest, ridge classifier, and neural network. The best-performing model was the tuned random forest, which had the highest F1-score, precision, and accuracy. Its average accuracy was 99%, and its F1-score was 99% for galaxies, 97% for quasars, and 100% for stars. Several neural network architectures were also trained and tested, but none outperformed the hyperparameter-tuned random forest. I therefore met my goal by showing that the random forest identified astronomical objects with very high accuracy. This model could potentially aid astronomers in identifying objects across the universe.
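The abstract reports accuracy together with a separate F1-score for each class. As a minimal sketch of how such per-class figures are conventionally computed, the Python below derives precision, recall, and F1 from a list of true and predicted labels. The function name per_class_scores and the toy labels are hypothetical, chosen only for illustration; they are not the study's code or data.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return (accuracy, {label: (precision, recall, f1)})."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but it was wrong
            fn[truth] += 1  # the true class was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, scores


# Made-up toy labels for illustration only (not Sloan Digital Sky Survey data).
y_true = ["Galaxy", "Galaxy", "Galaxy", "Quasar", "Quasar", "Star", "Star", "Star"]
y_pred = ["Galaxy", "Galaxy", "Quasar", "Quasar", "Quasar", "Star", "Star", "Star"]

accuracy, scores = per_class_scores(y_true, y_pred)
for label, (precision, recall, f1) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.3f}")
```

In this toy case one galaxy is mistaken for a quasar, which lowers recall for galaxies and precision for quasars while leaving stars unaffected. This is why a per-class F1-score can differ between classes even when overall accuracy is high.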

Published

11-30-2024

How to Cite

Wu, J., Kerimov, A., & Szabados, S. (2024). Classifying the Objects of the Universe with Machine Learning. Journal of Student Research, 13(4). https://doi.org/10.47611/jsrhs.v13i4.7515

Section

HS Review Articles