Unsupervised Gaze Representation Learning with Conjugate Gaze Consistency Loss for Enhanced Gaze Estimation
DOI: https://doi.org/10.47611/jsrhs.v14i1.8682

Keywords: Gaze Estimation, Communication Board, Convolutional Neural Network

Abstract
The number of patients with quadriplegia and other forms of paralysis that impair communication has grown over the past decade, and with it the need for better communication boards. Current communication boards, both physical and digital, have shortcomings ranging from dependence on an assistant to the cost of the device itself. Gaze estimation, which tracks the movement of the eye via a camera to infer what the user is trying to communicate, has gained attention as a way to improve the quality of communication boards. However, previous studies on gaze estimation have shown that collecting accurately labeled gaze data is an arduous task, and that the accuracy of existing gaze estimation models remains unsatisfactory for practical use. I therefore propose a gaze estimation-based digital communication board system that combines gaze representation learning with transfer learning. In the representation learning phase, I introduce a random sign-reversal module to efficiently isolate gaze-related features; in the transfer learning phase, I implement a medically driven loss function to further improve accuracy. The proposed system achieved an angular error of 9.42 degrees, which represents state-of-the-art performance compared to previous studies.
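The abstract does not specify how the sign-reversal module or the angular-error metric are implemented. The sketch below is purely illustrative: it assumes a hypothetical embedding whose first components encode gaze direction (which should flip sign when the face image is horizontally mirrored) while the remaining components encode appearance (which should not change), and it includes the standard angular-error metric used to report the 9.42-degree result. The function names and the embedding layout are assumptions, not the authors' code.

```python
import numpy as np

def conjugate_consistency_loss(z, z_mirrored, gaze_dims):
    """Illustrative sign-reversal consistency loss (assumed formulation).

    z          : embedding of the original eye/face image
    z_mirrored : embedding of the horizontally mirrored image
    gaze_dims  : number of leading components assumed to encode gaze

    Hypothesis: mirroring reverses the sign of the gaze components and
    leaves the appearance components unchanged, so we penalize deviation
    from the conjugate relation (g_m == -g, a_m == a).
    """
    g, a = z[:gaze_dims], z[gaze_dims:]
    g_m, a_m = z_mirrored[:gaze_dims], z_mirrored[gaze_dims:]
    return float(np.mean((g_m + g) ** 2) + np.mean((a_m - a) ** 2))

def angular_error_deg(pred, true):
    """Angular error between two 3D gaze vectors, in degrees."""
    cos = np.dot(pred, true) / (np.linalg.norm(pred) * np.linalg.norm(true))
    # Clip to guard against floating-point values slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```

For a perfect conjugate pair, e.g. `z = [1, -2, 0.5, 0.3]` and `z_mirrored = [-1, 2, 0.5, 0.3]` with `gaze_dims=2`, the loss is zero; any embedding that fails to reverse its gaze components under mirroring is penalized.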
References or Bibliography
Funes Mora, K. A., Monay, F., & Odobez, J. M. (2014, March). EYEDIAP: A database for the development and evaluation of gaze estimation algorithms from RGB and RGB-D cameras. In Proceedings of the Symposium on Eye Tracking Research and Applications (pp. 255-258).
Gamper, J., Koohbanani, N. A., Benes, K., Graham, S., Jahanifar, M., Khurram, S. A., ... & Rajpoot, N. (2020). PanNuke dataset extension, insights and baselines. arXiv preprint arXiv:2003.10778.
Kar, A., & Corcoran, P. (2017). A review and analysis of eye-gaze estimation systems, algorithms and performance evaluation methods in consumer platforms. IEEE Access, 5, 16495-16519.
Kim, J. H., & Jeong, J. W. (2020). Gaze in the dark: Gaze estimation in a low-light environment with generative adversarial networks. Sensors, 20(17), 4935.
Park, S., Aksan, E., Zhang, X., & Hilliges, O. (2020). Towards end-to-end video-based eye-tracking. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XII 16 (pp. 747-763). Springer International Publishing.
Patak, L., Gawlinski, A., Fung, N. I., Doering, L., Berg, J., & Henneman, E. A. (2006). Communication boards in critical care: patients' views. Applied nursing research, 19(4), 182-190.
Suwarno, S., & Kevin, K. (2020). Analysis of face recognition algorithm: Dlib and OpenCV. Journal of Informatics and Telecommunication Engineering, 4(1), 173-184.
Zhang, D., Yin, J., Zhu, X., & Zhang, C. (2018). Network representation learning: A survey. IEEE Transactions on Big Data, 6(1), 3-28.
Zhang, X., Sugano, Y., Fritz, M., & Bulling, A. (2015). Appearance-based gaze estimation in the wild. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4511-4520).
Copyright (c) 2025 Junho Kee; Giyoung Yang

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute and display this article.


