From Structure to Function: Biological Activity Prediction of Phytochemicals Using Molecular Fingerprints with Convolutional Neural Networks

Authors

  • Rachel Choi, Gangnam International School
  • Sinae Kim, Gangnam International School

DOI:

https://doi.org/10.47611/jsrhs.v14i1.8615

Keywords:

Classification, Molecular Structure, Machine Learning

Abstract

Phytochemicals, naturally occurring compounds in plants, offer significant potential for drug development due to their diverse structures and biological activities. They exhibit antioxidant, anti-inflammatory, antimicrobial, anticancer, cardiovascular, and neuroprotective properties, making them beneficial for treating various health conditions. Advantages of phytochemicals include their natural origin, better safety profiles, multi-targeted actions, synergistic effects with other compounds, and sustainability. However, the traditional knowledge-based approach to analyzing phytochemicals is limited by its reliance on well-known, locally available plants, resulting in a narrow scope and frequent redundancy. This approach often depends on anecdotal and subjective evidence, risks knowledge loss, and raises ethical and legal issues. To address these limitations, I propose a systematic approach based on a convolutional neural network that predicts potential biological activities from molecular structure inputs. The proposed system converts molecular structures, written in SMILES notation, into one-hot vector representations using molecular fingerprint algorithms. These vectors are then fed into a biological activity prediction network to estimate possible biological activities. In experiments, the convolutional neural network-based approach yields promising results, achieving an accuracy of 87.8%.
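The data flow described in the abstract (SMILES string, then fixed-length bit vector, then convolution) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_fingerprint` hashes character n-grams of the SMILES string and is only a stand-in for a real molecular fingerprint algorithm, `conv1d_relu` is a single hand-written 1D convolution with ReLU and not the authors' network, and the kernel weights, vector length, and the aspirin example are arbitrary choices.

```python
import hashlib


def toy_fingerprint(smiles, n_bits=64, max_len=3):
    """Hash every substring of length 1..max_len into a fixed-length bit vector.

    Hypothetical stand-in for a molecular fingerprint: it works on the raw
    SMILES characters, not on the molecular graph.
    """
    bits = [0] * n_bits
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            # hashlib gives the same bit index on every run (built-in hash() does not).
            digest = hashlib.sha256(fragment.encode("utf-8")).digest()
            bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits


def conv1d_relu(signal, kernel):
    """Slide the kernel over the signal without padding, then apply ReLU."""
    width = len(kernel)
    out = []
    for i in range(len(signal) - width + 1):
        total = sum(signal[i + j] * kernel[j] for j in range(width))
        out.append(max(0.0, total))
    return out


# Aspirin written in SMILES notation (illustrative input only).
aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
fp = toy_fingerprint(aspirin)
features = conv1d_relu(fp, [0.5, 1.0, 0.5])  # arbitrary example kernel
print(len(fp), len(features))
```

In a trained model the kernel weights would be learned, and the resulting feature map would pass through further layers that output one score per biological activity class. A cheminformatics toolkit would normally supply the fingerprint in place of the toy hash used here.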

References or Bibliography

AI Hub. (2023, Dec 14). Plant functionality prediction genomic data. AI Hub. https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71316

Hossin, M., & Sulaiman, M. N. (2015). A review on evaluation metrics for data classification evaluations. International Journal of Data Mining & Knowledge Management Process, 5(2), 1.

Muegge, I., & Mukherjee, P. (2016). An overview of molecular fingerprint similarity search in virtual screening. Expert Opinion on Drug Discovery, 11(2), 137-148.

Pathania, S., Ramakrishnan, S. M., & Bagler, G. (2015). Phytochemica: a platform to explore phytochemicals of medicinal plants. Database, 2015, bav075.

Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 28(1), 31-36.

Willis, K. (2017). State of the world's plants 2017. Royal Botanic Gardens, Kew.

Published

02-28-2025

How to Cite

Choi, R., & Kim, S. (2025). From Structure to Function: Biological Activity Prediction of Phytochemicals Using Molecular Fingerprints with Convolutional Neural Networks. Journal of Student Research, 14(1). https://doi.org/10.47611/jsrhs.v14i1.8615

Section

HS Research Projects