Unintended Bias in Artificial Intelligence Driven Diagnosis of Melanoma: A Systematic Review

Authors

  • Kemka Ihemelandu, McDonogh School
  • Chris Albanese

DOI:

https://doi.org/10.47611/jsrhs.v12i1.4142

Keywords:

Bias, Artificial intelligence, Melanoma

Abstract

Melanoma remains a public health crisis, with incidence rates increasing rapidly over the past decades. The use of artificial intelligence (AI) to improve diagnostic accuracy and reduce misdiagnosis continues to be documented. Unfortunately, unintended racially biased outcomes have increasingly been recognized as a problem. These outcomes are a product of the lack of diversity in the datasets used, which show a noted class imbalance favoring lighter over darker skin tones, and they limit the accuracy of convolutional neural network (CNN) models. CNN models are prone to biased output because of biases in the datasets used to train them. Although the incidence of melanoma is lower in patients with darker skin tones, it is associated with a worse prognosis than in Caucasians, underscoring the need for accurate early diagnosis in these patients. Our objective in this systematic review was to assess the degree to which race/ethnicity, specifically the Black/African American patient cohort, was included in the training datasets used to generate machine learning algorithms for automated melanoma diagnosis. Our review documents a remarkable lack of, and inconsistency in, reporting of patient demographics, especially race/ethnicity, with notable under-representation of patients of color. This highlights a critical unmet need for diversity in publicly available skin image datasets. These datasets are inherently unbalanced and unintentionally biased; as a result, AI models built on them for the diagnosis of melanoma have restricted applicability to real-life clinical scenarios and limited population representation, preventing generalizability to the community as a whole.
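
As an illustration of the two quantities the review is concerned with (how often skin tone is reported at all, and how the reported cases are distributed), a minimal sketch follows. It is not the authors' method. The records, the field name `fitzpatrick`, and the helper `audit_skin_tone_reporting` are hypothetical, and Fitzpatrick skin type is used here only as an assumed stand-in for the skin tone or race/ethnicity metadata a real dataset might carry.

```python
from collections import Counter

# Hypothetical metadata records; real datasets use different field names
# and may not record skin tone at all (None marks a missing value here).
records = [
    {"image_id": "img_001", "fitzpatrick": "II"},
    {"image_id": "img_002", "fitzpatrick": "I"},
    {"image_id": "img_003", "fitzpatrick": None},
    {"image_id": "img_004", "fitzpatrick": "II"},
    {"image_id": "img_005", "fitzpatrick": "VI"},
    {"image_id": "img_006", "fitzpatrick": None},
]


def audit_skin_tone_reporting(records, field="fitzpatrick"):
    """Return (fraction of records reporting `field`, share of each reported value)."""
    reported = [r[field] for r in records if r.get(field) is not None]
    reporting_rate = len(reported) / len(records) if records else 0.0
    counts = Counter(reported)
    # Shares are computed among records that report the field, not all records.
    shares = {k: v / len(reported) for k, v in counts.items()} if reported else {}
    return reporting_rate, shares


reporting_rate, shares = audit_skin_tone_reporting(records)
print(f"Reporting rate: {reporting_rate:.0%}")
for skin_type, share in sorted(shares.items()):
    print(f"Type {skin_type}: {share:.0%} of reported images")
```

A low reporting rate corresponds to the inconsistent demographic reporting described above, and a distribution concentrated in the lighter types corresponds to the class imbalance.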

Published

02-28-2023

How to Cite

Ihemelandu, K., & Albanese, C. (2023). Unintended Bias in Artificial Intelligence Driven Diagnosis of Melanoma: A Systematic Review. Journal of Student Research, 12(1). https://doi.org/10.47611/jsrhs.v12i1.4142

Section

HS Review Articles