Cooperated Supervised and Semi-supervised Machine Learning for Identification of Exoplanet Habitability

Authors

  • Haoxuan Xu, Dulwich International High School Suzhou
  • Yong Zhang, China University of Mining and Technology

DOI:

https://doi.org/10.47611/jsrhs.v13i3.7056

Keywords:

Exoplanet habitability, feature selection, KNN, semi-supervised learning, habitability identification

Abstract

Planetary habitability has attracted widespread attention, but most existing studies analyze the habitability of a single planet based on a single feature, which makes it difficult to process large amounts of planetary data quickly. In this paper, we propose a machine learning-based identification method for efficiently assessing the habitability of a batch of planets. First, we collect a planet dataset comprising 5476 unlabeled records from the NASA Exoplanet Archive and 63 entries with habitability labels from the Habitable Worlds Catalog. Next, a binary particle swarm optimization approach is used to select the most relevant features based on the 63 labeled planets. A standardized median imputation technique is then applied to handle the missing values in the NASA data. Two distinct methods, K-means clustering and distance-based filtering, are developed to label a subset of uninhabitable exoplanets by combining the 5476 unlabeled and 63 labeled data points. Finally, a KNN classifier and a semi-supervised label spreading classifier are trained and used cooperatively to carry out the final classification task. The experimental results demonstrate the viability and effectiveness of the proposed method.
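
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two features (planet radius and equilibrium temperature), the five toy rows, the distance threshold of 1.0, and all function names are assumptions made purely for illustration. The sketch interprets "standardized median imputation" as median imputation followed by z-score standardization, which is also an assumption, and it omits the binary particle swarm feature selection, the K-means labeling, and the label spreading classifier.

```python
import math
from statistics import mean, median, pstdev

def impute_median(rows):
    """Replace None entries with the median of the observed values in that column."""
    medians = [median(v for v in col if v is not None) for col in zip(*rows)]
    return [[m if v is None else v for v, m in zip(row, medians)] for row in rows]

def standardize(rows):
    """Z-score each column (population standard deviation; 1.0 if a column is constant)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - mu) / sd for v, mu, sd in zip(row, mus, sds)] for row in rows]

def far_from_habitable(rows, habitable_idx, threshold):
    """Distance-based filtering: indices of unlabeled rows whose nearest
    habitable row is farther away than `threshold` (a hypothetical cutoff)."""
    far = []
    for i, row in enumerate(rows):
        if i in habitable_idx:
            continue
        nearest = min(math.dist(row, rows[h]) for h in habitable_idx)
        if nearest > threshold:
            far.append(i)
    return far

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

# Toy data, invented for illustration: [radius in Earth radii, temperature in K].
raw = [
    [1.0, 255.0],     # 0: labeled habitable
    [1.5, 280.0],     # 1: labeled habitable
    [11.0, 1500.0],   # 2: unlabeled
    [None, 1800.0],   # 3: unlabeled, radius missing
    [1.25, 270.0],    # 4: unlabeled
]
habitable_idx = {0, 1}

imputed = impute_median(raw)
scaled = standardize(imputed)

# Rows far from every habitable planet are treated as uninhabitable (label 0).
far = far_from_habitable(scaled, habitable_idx, threshold=1.0)

train_idx = sorted(habitable_idx) + far
train_x = [scaled[i] for i in train_idx]
train_y = [1 if i in habitable_idx else 0 for i in train_idx]

# Classify the remaining unlabeled planet (row 4) with KNN.
prediction = knn_predict(train_x, train_y, scaled[4], k=3)
print(far, prediction)
```

In this toy setting, the two hot planets are filtered out as uninhabitable and the remaining Earth-like row is classified as habitable. The semi-supervised stage described in the abstract would typically come from a library such as scikit-learn's `LabelSpreading`; how the two classifiers are combined is not specified in the abstract and is left out here.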

Author Biography

Yong Zhang, China University of Mining and Technology

Yong Zhang received the Ph.D. degree in control theory and control engineering from the China University of Mining and Technology in 2009. He is a professor with the School of Information and Control Engineering, China University of Mining and Technology. His research interests include intelligence optimization and data mining.

Published

08-31-2024

How to Cite

Xu, H., & Zhang, Y. (2024). Cooperated Supervised and Semi-supervised Machine Learning for Identification of Exoplanet Habitability. Journal of Student Research, 13(3). https://doi.org/10.47611/jsrhs.v13i3.7056

Section

HS Research Projects