A Method of Disentanglement of Latent Factor using Geometric Feature for Gaze Estimation Network Training

Authors

  • Seung-woo Ko, Elite Open School LRC, Korea
  • Bo Kyoung Park, Elite Open School LRC, Korea

DOI:

https://doi.org/10.47611/jsrhs.v12i1.4075

Keywords:

Gaze Estimation, Autoencoder, Representation Learning

Abstract

Because the anatomy of the eye differs from person to person, gaze estimation is a challenging task. Although numerous gaze estimation methods have been proposed, their precision must improve before they can be applied to real-world scenarios. To this end, I propose a novel training strategy for gaze representation learning. The strategy has two phases: an autoencoder-based representation learning phase and a gaze estimation network training phase. It forces the model to disentangle the gaze-related latent code, which yields more accurate gaze estimation. I also present a real-world application built on the proposed method to demonstrate its practicality. Experiments on the Gaze360 dataset show that the proposed method outperforms the other methods compared.


References or Bibliography

Cheng, Y., Bao, Y., & Lu, F. (2022). PureGaze: Purifying gaze feature for generalizable gaze estimation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 1, pp. 436-443).

Sun, Y., Zeng, J., Shan, S., & Chen, X. (2021). Cross-encoder for unsupervised gaze representation learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3702-3711).

Gideon, J., Su, S., & Stent, S. (2022). Unsupervised Multi-View Gaze Representation Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5001-5009).

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).

Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Park, S. C., Park, M. K., & Kang, M. G. (2003). Super-resolution image reconstruction: A technical overview. IEEE Signal Processing Magazine, 20(3), 21-36.

Kellnhofer, P., Recasens, A., Stent, S., Matusik, W., & Torralba, A. (2019). Gaze360: Physically unconstrained gaze estimation in the wild. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 6912-6921).

Liu, Y., Liu, R., Wang, H., & Lu, F. (2021). Generalizing gaze estimation with outlier-guided collaborative adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3835-3844).

Published

02-28-2023

How to Cite

Ko, S.-W., & Park, B. K. (2023). A Method of Disentanglement of Latent Factor using Geometric Feature for Gaze Estimation Network Training. Journal of Student Research, 12(1). https://doi.org/10.47611/jsrhs.v12i1.4075

Section

HS Research Projects