Understanding Dog Behavior through Visual and Auditory Sensing Using Machine Learning

Authors

  • Amy Lin, Princeton High School
  • Mark Eastburn, Princeton High School

DOI:

https://doi.org/10.47611/jsrhs.v12i4.5801

Keywords:

machine learning, animal behavior, Convolutional Neural Network, image motion estimation, classification, stimuli

Abstract

This work aims to understand how a dog behaves in response to environmental stimuli. Unlike previous work, we collect multimodal data, comprising both video and audio recorded from the dog’s egocentric perspective. We use machine learning to model the association between the dog’s reaction and the visual and auditory stimuli it perceives, specifically through an extended Convolutional Neural Network (eCNN). The eCNN model takes color images, the Short-Time Fourier Transform (STFT) of the audio, and motion fields extracted from image sequences as input, and outputs a prediction of the dog’s reaction, classified as Sit, Stand, Walk, or Smell. The proposed model achieves promising prediction results, with an average accuracy of 79.02% over all four classes. We also evaluate the model’s performance when it is given only one of the image, audio, or motion inputs. Our results show that the dog responds strongly to low-frequency sounds and to color differences in its field of view. These findings offer insights into animal behavior and intelligence, and may inform the design of robotic companion dogs.
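
As an illustration of the audio input described above, the following minimal sketch computes an STFT magnitude spectrogram using only the Python standard library. It is not the authors' implementation: the abstract does not state the frame length, hop size, window, or sample rate, so `frame_len`, `hop`, `SAMPLE_RATE`, `TONE_HZ`, and the function name `stft_magnitude` are all assumptions chosen for illustration. The test signal is a synthetic low-frequency tone, not data from the study.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len//2) per frame.

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of one windowed frame (O(N^2); for illustration only).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical test signal: a 250 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
TONE_HZ = 250
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(512)]

spectrogram = stft_magnitude(signal)

# Index of the strongest frequency bin in each frame.
peak_bins = [max(range(len(frame)), key=frame.__getitem__)
             for frame in spectrogram]

# Each bin spans SAMPLE_RATE / frame_len Hz (125 Hz with the values above).
bin_width_hz = SAMPLE_RATE / 64
peak_hz = peak_bins[0] * bin_width_hz
```

With these assumed settings, the energy of the tone concentrates in a single low-frequency bin in every frame. A two-dimensional array of this kind (time frames by frequency bins) is the sort of image-like input that a convolutional network can consume alongside color images and motion fields.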

Published

11-30-2023

How to Cite

Lin, A., & Eastburn, M. (2023). Understanding Dog Behavior through Visual and Auditory Sensing Using Machine Learning. Journal of Student Research, 12(4). https://doi.org/10.47611/jsrhs.v12i4.5801

Section

HS Research Articles