Classifying the Objects of the Universe with Machine Learning
DOI:
https://doi.org/10.47611/jsrhs.v13i4.7515
Keywords:
Machine Learning, Galaxy, Quasar, Star, Astronomy, Light, Artificial Intelligence, Redshift
Abstract
As space exploration and the search for other planets expand worldwide, identifying astronomical objects has become an important task: it could help locate habitable planets around stars or asteroids that carry valuable minerals. The goal of my research was to find the most effective way to use machine learning to classify these celestial objects. The study used the 2017 Sloan Digital Sky Survey dataset, whose key features are each object's photometric values, its redshift, and its label as a galaxy, quasar, or star. Several baseline models were trained, tuned, and tested: logistic regression, decision tree, random forest, ridge classifier, and neural network. The best-performing model was the tuned random forest, which had the highest F1 score, precision, and accuracy. Its average accuracy was 99%, and its F1 score was 99% for galaxies, 97% for quasars, and 100% for stars. Several neural network architectures were also trained and tested, but none outperformed the hyperparameter-tuned random forest. I therefore met my goal by showing that a random forest identifies these astronomical objects with high accuracy. This model could potentially help astronomers identify objects across the universe.
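The abstract reports overall accuracy together with per-class precision and F1 scores. As a minimal sketch of how such figures are derived from true and predicted labels, the example below computes them in plain Python. The label strings, the function name `per_class_metrics`, and the ten toy labels are illustrative assumptions; they are not the study's code, data, or results.

```python
from collections import Counter

# Hypothetical label names; the actual dataset's label strings may differ.
CLASSES = ("GALAXY", "QSO", "STAR")


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return (accuracy, report) where report maps each class to
    its precision, recall, and F1 score."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count every (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))

    report = {}
    for c in classes:
        tp = pairs[(c, c)]  # predicted c and truly c
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = {"precision": precision, "recall": recall, "f1": f1}

    # Accuracy is the fraction of objects whose prediction matches the label.
    correct = sum(n for (t, p), n in pairs.items() if t == p)
    return correct / len(y_true), report


# Toy example (made-up labels, not survey data).
y_true = ["GALAXY"] * 5 + ["QSO"] * 3 + ["STAR"] * 2
y_pred = (["GALAXY"] * 4 + ["QSO"]      # one galaxy mistaken for a quasar
          + ["QSO"] * 2 + ["GALAXY"]    # one quasar mistaken for a galaxy
          + ["STAR"] * 2)               # both stars classified correctly

accuracy, report = per_class_metrics(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for name, scores in report.items():
    print(name, {k: round(v, 2) for k, v in scores.items()})
```

Reporting F1 per class alongside accuracy matters when classes are imbalanced: a model can reach high overall accuracy while still performing worse on a smaller class, which is consistent with the quasar F1 score in the abstract being lower than the scores for galaxies and stars.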
Copyright (c) 2024 Justin Wu; Abdulla Kerimov, Steve Szabados

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.


