Prediction of EPA Pesticide Tolerance Using Machine Learning and Publicly Available Data
Keywords:
Pesticides, Machine Learning, Cheminformatics, Tolerance Levels, EPA
Abstract
The EPA Tolerance Level for a pesticide/commodity pair (Tol) is an important indicator in the environmental risk assessment of common pesticides. This metric specifies how much pesticide residue, in parts per million (ppm), is tolerated on food. Pesticides must go through rigorous and costly testing to be approved for public use. For this reason, it is necessary to estimate the Tol of a pesticide accurately. This study uses publicly available pesticide data, along with collected values of physicochemical properties and molecular descriptors of chemical structures, to develop a reproducible model capable of predicting whether a pesticide can be tolerated. More specifically, the accuracies of models based on the Support Vector Machine, Decision Tree, Logistic Regression, and K-Nearest Neighbors algorithms were compared and evaluated. The experimental results suggest that it is possible to reach a relatively high accuracy using molecular descriptors and specific values from publicly available data. Compared to previous models, these models are more transparent in their methodology and inputs. Therefore, while not as accurate, the generalizable and modular workflow can be used in the preliminary evaluation of pesticides and reproduced in more data-intensive studies.
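The model comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn as the modeling library. This is not the study's actual pipeline: the function name `compare_models`, the default hyperparameters, the 5-fold cross-validation, and the synthetic feature matrix are all assumptions made for illustration. In practice, the rows of `X` would hold physicochemical properties and molecular descriptors for each pesticide, and `y` would hold the binary tolerance label.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, folds=5):
    """Return the mean cross-validated accuracy of four classifiers.

    Hypothetical helper: the choice of models follows the abstract, but the
    hyperparameters and the validation scheme are assumptions.
    """
    models = {
        # Distance- and margin-based models are scaled first so that
        # descriptors with large numeric ranges do not dominate.
        "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "K-Nearest Neighbors": make_pipeline(
            StandardScaler(), KNeighborsClassifier()
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Placeholder data only: synthetic features stand in for real descriptors.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=0
)
scores = compare_models(X, y)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Keeping each classifier behind the same interface is what makes such a workflow modular: a different descriptor set or an additional algorithm can be swapped in without changing the evaluation code.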
License
Copyright (c) 2021 Sahej Singh
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The copyright holder for this article has granted JSR.org a license to display the article in perpetuity.