Preprint / Version 1

Prediction of EPA Pesticide Tolerance Using Machine Learning and Publicly Available Data

Authors

  • Sahej Singh, Massachusetts Academy of Mathematics and Science

Keywords:

Pesticides, Machine Learning, Cheminformatics, Tolerance Levels, EPA

Abstract

The EPA Tolerance Level for a pesticide/commodity pair (Tol) is an important indicator in the environmental risk assessment of common pesticides. It specifies how much residue, in parts per million (ppm), is tolerated on food. Pesticides must go through rigorous and costly testing to be approved for public use. For this reason, it is necessary to estimate the Tol of a pesticide accurately. This study uses publicly available pesticide data, together with collected values of physicochemical properties and molecular descriptors of chemical structures, to develop a reproducible model that predicts whether a pesticide can be tolerated. More specifically, the accuracies of models based on the Support Vector Machine, Decision Tree, Logistic Regression, and K-Nearest Neighbors algorithms were compared and evaluated. The experimental results suggest that a relatively high accuracy can be reached using molecular descriptors and specific values from publicly available data. Compared to previous models, these models are more transparent in their methodology and inputs. Therefore, while not as accurate as those previous models, the generalizable and modular workflow can be used in the preliminary evaluation of pesticides and reproduced in more data-intensive studies.
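
As an illustration of the model comparison described in the abstract, the sketch below scores the four named classifiers on the same data with scikit-learn. It is not the study's code. The feature table is synthetic (a hypothetical stand-in for the physicochemical properties and molecular descriptors), the binary label encoding (1 = tolerated, 0 = not tolerated) is assumed, and the use of 5-fold cross-validation, feature scaling, and default hyperparameters are all assumptions rather than details taken from the study.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a descriptor table (assumption, not the study's data):
# 200 pesticide/commodity pairs, 10 numeric features, binary label
# (1 = tolerated, 0 = not tolerated).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=0
)

# The four algorithms named in the abstract. Scaling is applied to the
# distance- and margin-based models; hyperparameters are illustrative defaults.
models = {
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "K-Nearest Neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
}


def compare_models(models, X, y, folds=5):
    """Return the mean cross-validated accuracy of each model."""
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


scores = compare_models(models, X, y)

# Print the models from most to least accurate on this synthetic data.
for name, accuracy in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {accuracy:.3f}")
```

Scoring every model with the same folds and the same metric keeps the comparison like for like. Replacing `X` and `y` with a real descriptor table and real tolerance labels would be the only change needed to reuse the loop.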

References or Bibliography

About Pesticide Registration. (n.d.). Environmental Protection Agency. Retrieved from https://www.epa.gov/pesticide-registration/about-pesticide-registration

Fishel, F. M. (n.d.). EPA Approval of Pesticide Labeling. Retrieved from https://edis.ifas.ufl.edu/publication/PI203

Kobayashi, Y., Uchida, T., & Yoshida, K. (2020). Prediction of soil adsorption coefficient in pesticides using physicochemical properties and molecular descriptors by machine learning models. Environmental Toxicology and Chemistry, 39(7), 1451–1459.

Landrum, G., Sforna, G., De Winter, H., & Deric. (n.d.). RDKit. Retrieved from https://www.rdkit.org/

NCI/CADD Chemical Identifier Resolver. (n.d.). U.S. Department of Health and Human Services. Retrieved from https://cactus.nci.nih.gov/chemical/structure

PDP Database Search. (n.d.). Retrieved from https://apps.ams.usda.gov/pdp

PubChemPy documentation. (n.d.). Retrieved from https://pubchempy.readthedocs.io/en/latest/

Taylor, J. (2021, March). New Federal Study: Extremely Toxic Pesticide Breakdown Products Found in 90% of Streams Sampled Across U.S. Retrieved from https://biologicaldiversity.org/w/news/press-releases/new-federal-study-extremely-toxic-pesticide-breakdown-products-found-in-90-of-streams-sampled-across-us-2021-03-26/10

Posted

12-27-2021