Cooperated Supervised and Semi-supervised Machine Learning for Identification of Exoplanet Habitability
DOI:
https://doi.org/10.47611/jsrhs.v13i3.7056
Keywords:
Exoplanet habitability, feature selection, KNN, semi-supervised learning, habitability identification
Abstract
Planetary habitability has attracted widespread attention, but most existing studies analyze the habitability of a single planet based on a single feature, which makes it difficult to process large volumes of planetary data quickly. In this paper, we propose a machine learning-based identification method that efficiently assesses the habitability of a batch of planets. First, we collect a planet dataset comprising 5476 unlabeled records from the NASA Exoplanet Archive and 63 entries with habitability labels from the Habitable Worlds Catalog. Next, a binary particle swarm optimization approach selects the most relevant features based on the 63 labeled planets. A standardized median imputation technique is then applied to handle the missing values in the NASA data. Two distinct methods, K-means clustering and distance-based filtering, are developed to label a subset of uninhabitable exoplanets by combining the 5476 unlabeled and 63 labeled data points. Finally, a KNN classifier and a semi-supervised label spreading classifier are trained and used cooperatively to perform the final classification. Experimental results demonstrate the viability and effectiveness of the proposed method.
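To make the pipeline in the abstract more concrete, the following is a minimal Python sketch using scikit-learn on synthetic data. It is not the authors' implementation, and several parts are assumptions made only for illustration: the data are random stand-ins for the two catalogs, the labeled entries are treated as habitable, imputation is done with the column median followed by standardization, distance-based filtering marks the points farthest from the labeled centroid as uninhabitable, and the two classifiers "cooperate" by requiring both to agree before a planet is flagged as habitable. The feature selection step (binary particle swarm optimization) and the K-means labeling variant are omitted. All function names and parameters are hypothetical.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.semi_supervised import LabelSpreading


def build_synthetic_catalog(n_unlabeled=300, n_habitable=20, n_features=4, seed=0):
    """Return synthetic stand-ins for the labeled and unlabeled catalogs."""
    rng = np.random.default_rng(seed)
    habitable = rng.normal(0.0, 0.5, size=(n_habitable, n_features))
    unlabeled = rng.normal(3.0, 1.5, size=(n_unlabeled, n_features))
    # Blank out roughly 10% of the unlabeled values to mimic missing data.
    unlabeled[rng.random(unlabeled.shape) < 0.1] = np.nan
    return habitable, unlabeled


def label_far_points(habitable, unlabeled, fraction=0.3):
    """Assumed distance-based filter: the `fraction` of unlabeled points
    farthest from the habitable centroid get label 0 (uninhabitable);
    all other unlabeled points keep label -1 (unknown)."""
    centroid = habitable.mean(axis=0)
    dist = np.linalg.norm(unlabeled - centroid, axis=1)
    n_far = max(1, round(fraction * len(unlabeled)))
    labels = np.full(len(unlabeled), -1)
    labels[np.argsort(dist)[-n_far:]] = 0
    return labels


def run_pipeline(seed=0):
    habitable, unlabeled = build_synthetic_catalog(seed=seed)
    n_hab = len(habitable)

    # Median imputation, then standardization (the order is an assumption).
    X_raw = np.vstack([habitable, unlabeled])
    X = SimpleImputer(strategy="median").fit_transform(X_raw)
    X = StandardScaler().fit_transform(X)

    # Labels: 1 = habitable, 0 = filtered as uninhabitable, -1 = unknown.
    y = np.concatenate([
        np.ones(n_hab, dtype=int),
        label_far_points(X[:n_hab], X[n_hab:]),
    ])
    known = y != -1

    # Supervised model sees only the labeled rows.
    knn = KNeighborsClassifier(n_neighbors=5).fit(X[known], y[known])
    # Semi-supervised model sees every row; -1 marks unlabeled samples.
    spread = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y)

    knn_pred = knn.predict(X)
    spread_pred = spread.transduction_

    # Assumed cooperation rule: habitable only if both classifiers agree.
    final = np.where((knn_pred == 1) & (spread_pred == 1), 1, 0)
    return y, final


if __name__ == "__main__":
    y, final = run_pipeline()
    print("planets flagged habitable:", int(final.sum()), "of", len(final))
```

Requiring agreement between the two classifiers is a conservative choice: it reduces false positives at the cost of possibly missing some habitable candidates. Other combination rules, such as averaging class probabilities, would fit the same structure.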
Copyright (c) 2024 Haoxuan Xu; Yong Zhang

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.


