Deep Learning for Comic Book Emotion Analysis
DOI:
https://doi.org/10.47611/jsrhs.v13i4.7960
Keywords:
Computer Vision, Convolutional Neural Network, Comic Book, Deep Learning, Sentiment Analysis
Abstract
Deep learning techniques have been applied successfully in a number of fields, including computer vision and image processing. This study presents an approach to analyzing the sentiment conveyed in comic book panels through the emotions depicted on characters' faces. We hypothesize that emotional content can be interpreted accurately using deep learning techniques even without a pre-existing sentiment dataset. Optical character recognition and pretrained sentiment analysis models form the basis of an NLP model that infers emotional context from characters' dialogue and thoughts. A neural network then categorizes the emotions shown in characters' facial expressions. Our findings confirm that sentiment analysis can be performed on comic book data: tests on the Digital Comic Museum dataset show a sentiment analysis efficacy of 89%. The best-performing convolutional neural network configuration used 7x7 filters, 200 neurons, and 64 filters per layer, and achieved an accuracy of 86%. This research extends facial recognition technology from humans to fictional characters in comic books. It also lays the groundwork for future research on generating datasets when specialized data is not available, and demonstrates the practicality of performing sentiment analysis on comic book faces.
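To give a rough sense of scale for the reported configuration (7x7 filters, 64 filters per layer, 200 neurons), the following sketch works through the layer shapes and parameter counts of one network that fits those numbers. Only those three values come from the abstract. The 48x48 grayscale input, the two convolutional layers, unpadded convolutions, 2x2 pooling, the reading of "200 neurons" as one dense layer, and the three output classes are all assumptions made for illustration and may differ from the authors' architecture.

```python
"""Size arithmetic for a small CNN (illustrative assumptions throughout)."""

# Values stated in the abstract.
KERNEL = 7          # 7x7 convolution filters
FILTERS = 64        # filters per convolutional layer
DENSE_UNITS = 200   # "200 neurons", assumed here to be one dense layer

# Assumptions NOT taken from the abstract.
INPUT_SIDE = 48      # assumed square face crop, 48x48 pixels
INPUT_CHANNELS = 1   # assumed grayscale input
NUM_CONV_LAYERS = 2  # assumed depth
POOL = 2             # assumed 2x2 max pooling after each convolution
NUM_CLASSES = 3      # assumed label set, e.g. negative/neutral/positive


def summarize_cnn():
    """Return ([(layer name, output shape, parameter count)], total)."""
    side, channels = INPUT_SIDE, INPUT_CHANNELS
    layers = []
    total = 0
    for i in range(NUM_CONV_LAYERS):
        # One KERNEL x KERNEL x channels weight block plus a bias per filter.
        params = KERNEL * KERNEL * channels * FILTERS + FILTERS
        side = side - KERNEL + 1  # unpadded ("valid") convolution, stride 1
        channels = FILTERS
        layers.append((f"conv{i + 1}", (side, side, channels), params))
        total += params
        side //= POOL  # pooling has no trainable parameters
        layers.append((f"pool{i + 1}", (side, side, channels), 0))
    flat = side * side * channels
    dense_params = flat * DENSE_UNITS + DENSE_UNITS
    layers.append(("dense", (DENSE_UNITS,), dense_params))
    out_params = DENSE_UNITS * NUM_CLASSES + NUM_CLASSES
    layers.append(("output", (NUM_CLASSES,), out_params))
    total += dense_params + out_params
    return layers, total


if __name__ == "__main__":
    layers, total = summarize_cnn()
    for name, shape, params in layers:
        print(f"{name:<7} {str(shape):<14} {params:>8,d}")
    print(f"total trainable parameters: {total:,d}")
```

Under these assumptions the dense layer holds most of the weights (627,400 of 831,971), so the assumed input size and pooling scheme would affect model size far more than the 7x7 filter choice itself.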
Copyright (c) 2024 Walter Hsieh; Eric Sakk

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.


