Audio Classification of Bird Species Using Convolutional Neural Networks

Authors

  • Jocelyn Wang, Jericho High School
  • Guillermo Goldsztein

DOI:

https://doi.org/10.47611/jsrhs.v12i1.4108

Keywords:

Sound classification, CNN, Spectrogram

Abstract

The total number of birds has declined by billions over the last 50 years, so an accurate method for classifying bird species is necessary for conservation efforts and population monitoring. One promising approach uses machine learning models to classify birds by their sounds. Compared with alternatives such as image classification, audio-based classification is less affected by environmental factors (e.g., habitat, time of day) and causes less disturbance to birds during data collection. Because audio processing may eventually become the main method of classifying birds and an important conservation tool, it is imperative to understand the challenges that must be overcome before it can be applied successfully. In this work, convolutional neural networks (CNNs), implemented in Python, were used to process and classify audio recordings from over 150 bird species. The study shows that although audio classification is promising, significant challenges remain: the variety of calls produced by a single bird, the presence of background noise in many recordings, and the difficulty of efficiently representing an audio signal as an image. Overcoming these challenges is important for conservation efforts.
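
To make the idea of representing an audio signal as an image concrete, the following is a minimal sketch of a log-magnitude spectrogram computed with NumPy. It is an illustration only and not the authors' pipeline: the function name `log_spectrogram`, the sample rate, the frame length, the hop size, and the synthetic 3 kHz tone standing in for a bird call are all assumptions made for this example.

```python
import numpy as np


def log_spectrogram(signal, frame_len=512, hop=256):
    """Return a log-magnitude spectrogram with shape (freq_bins, frames).

    Hypothetical helper for illustration; frame_len and hop are assumed values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each with a Hann window.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT of each frame gives one spectrogram column.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10).T


# Synthetic stand-in for a recording: one second of a 3 kHz tone (assumed values).
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 3000 * t)

spec = log_spectrogram(signal)

# Frequency (in Hz) of the strongest bin, averaged over time.
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 512
```

The resulting 2-D array can be treated like a single-channel image, which is the kind of input a CNN typically consumes. Real recordings would also need handling for noise and variable clip lengths, two of the challenges the abstract names.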

Published

02-28-2023

How to Cite

Wang, J., & Goldsztein, G. (2023). Audio Classification of Bird Species Using Convolutional Neural Networks. Journal of Student Research, 12(1). https://doi.org/10.47611/jsrhs.v12i1.4108

Section

HS Research Articles