Apple Recognition Method in Orchards Based on SAM and YOLOv8
DOI: https://doi.org/10.47611/jsrhs.v14i1.8463
Keywords: Apple Recognition, SAM (Segment Anything Model), Computer Vision, YOLOv8
Abstract
The aim of this research is to improve the automatic harvesting of orchard apples through an efficient detection method. By applying TensorRT, the YOLOv8 model runs considerably more efficiently while making better use of computational resources. In particular, we expected the depth and complexity of the different YOLOv8 variants to play a key role in detection performance. We therefore evaluated several YOLOv8 recognition models, including YOLOv8n, YOLOv8s, YOLOv8m, and YOLOv8l, together with several annotation strategies: traditional manual annotation, unsupervised annotation, and SAMsaa, a semi-automatic annotation tool based on the large-scale SAM (Segment Anything Model). We also compared automatic apple detection in orchards with and without hardware acceleration. To test these hypotheses, we conducted a series of experiments and found that the overall detection performance of the YOLOv8m model improved significantly when the dataset was labelled with the SAMsaa tool and the model was optimised with TensorRT, and improved further when this TensorRT-optimised configuration was deployed on the Jetson Xavier computing platform. Detection mAP50 improved by 33% and 32.7%, respectively, and the average detection accuracy for apple detection reached 90.41%. These results validate the effectiveness and superiority of our method.
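The SAMsaa annotation tool described in the abstract is not publicly available, so the sketch below only illustrates the general pipeline under stated assumptions: SAM (via Meta's segment-anything package) proposes candidate regions that are written out as YOLO-format box labels for human review, and a YOLOv8m model trained on the reviewed labels is exported to a TensorRT engine with the Ultralytics API. All file names, the dataset YAML, and the area threshold are hypothetical placeholders, not the authors' actual configuration.

```python
# Illustrative sketch only: SAM-assisted labelling plus TensorRT-optimised YOLOv8 inference.
# File names, paths, and thresholds are hypothetical placeholders.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
from ultralytics import YOLO

# --- Semi-automatic annotation: SAM proposes regions, a human reviews them ---
image = cv2.cvtColor(cv2.imread("orchard_001.jpg"), cv2.COLOR_BGR2RGB)
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
masks = SamAutomaticMaskGenerator(sam).generate(image)

h, w = image.shape[:2]
with open("orchard_001.txt", "w") as f:            # YOLO-format label file
    for m in masks:
        x, y, bw, bh = m["bbox"]                   # SAM returns XYWH pixel boxes
        if bw * bh < 400:                          # drop tiny proposals (hypothetical threshold)
            continue
        # class 0 = apple; normalised centre-x, centre-y, width, height
        f.write(f"0 {(x + bw / 2) / w:.6f} {(y + bh / 2) / h:.6f} {bw / w:.6f} {bh / h:.6f}\n")

# --- Detection: train YOLOv8m on the reviewed labels, then optimise with TensorRT ---
model = YOLO("yolov8m.pt")
model.train(data="apples.yaml", epochs=100, imgsz=640)    # dataset YAML is hypothetical
engine_path = model.export(format="engine", half=True)    # FP16 TensorRT engine, e.g. for Jetson Xavier

trt_model = YOLO(engine_path)
results = trt_model("orchard_002.jpg")                    # accelerated inference
print(results[0].boxes.xyxy)                              # detected apple bounding boxes
```

In practice, the SAM proposals would be inspected and corrected in an annotation tool before training, which is what makes the workflow semi-automatic rather than fully unsupervised.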
Copyright (c) 2025 Sunny Lu; Zhiquan Jiao

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute and display this article.


