Machine Learning for the Visually Impaired: Benchmarking Object Detection Models
DOI: https://doi.org/10.47611/jsrhs.v13i2.6630
Keywords: Machine Learning, Object Detection
Abstract
This paper benchmarks object detection models, a form of machine learning, to determine which algorithm would best help visually impaired people locate objects used in everyday life. Most benchmarking experiments test object detection models on a wide variety of objects, so finding the most suitable algorithm for this purpose requires testing on images of the objects that are actually relevant. The models are tested on still images from the COCO dataset. To simulate real-life scenarios, the objects in these images may be partially hidden or at a distance. Pretrained models employing five of the most popular object detection algorithms process the images, and each model's detection accuracy is measured. For each image, a model returns a list of detections giving the name, confidence rating, and location of each object detected. These results are filtered to remove detections with low confidence ratings as well as detections of irrelevant objects. The remaining results are compared with the object names and locations provided by the COCO dataset to calculate the distance between each predicted object location and the true location. The algorithms are ranked by the number of failed detections, the time taken to analyze each image, and the accuracy of each object's determined location.
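The filtering and comparison steps described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the `Detection` type, the 0.5 confidence threshold, the set of relevant labels, the use of box-center distance as the location error, and the greedy nearest-match rule are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


@dataclass
class Detection:
    """One model output: object name, confidence rating, and location."""
    label: str
    confidence: float
    box: Box


def box_center(box: Box) -> Tuple[float, float]:
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def filter_detections(detections: List[Detection],
                      relevant_labels: Set[str],
                      min_confidence: float = 0.5) -> List[Detection]:
    """Drop low-confidence detections and detections of irrelevant objects.

    The 0.5 default is an assumed threshold, not a value from the paper.
    """
    return [d for d in detections
            if d.confidence >= min_confidence and d.label in relevant_labels]


def score_image(detections: List[Detection],
                ground_truth: List[Tuple[str, Box]]) -> Tuple[List[float], int]:
    """Compare filtered detections with ground-truth (label, box) pairs.

    Returns the center-to-center distance for every matched object and the
    number of ground-truth objects with no matching detection (failed detections).
    Matching is greedy: each true object takes the nearest unused detection
    with the same label.
    """
    distances: List[float] = []
    missed = 0
    remaining = list(detections)
    for label, true_box in ground_truth:
        tx, ty = box_center(true_box)

        def distance_to_truth(d: Detection) -> float:
            cx, cy = box_center(d.box)
            return hypot(cx - tx, cy - ty)

        candidates = [d for d in remaining if d.label == label]
        if not candidates:
            missed += 1
            continue
        best = min(candidates, key=distance_to_truth)
        distances.append(distance_to_truth(best))
        remaining.remove(best)
    return distances, missed


# Example with made-up numbers: one confident cup, one weak cup, one irrelevant kite.
raw = [
    Detection("cup", 0.90, (10, 10, 30, 30)),
    Detection("cup", 0.20, (0, 0, 5, 5)),
    Detection("kite", 0.95, (0, 0, 10, 10)),
]
kept = filter_detections(raw, relevant_labels={"cup", "chair"})
truth = [("cup", (13, 14, 33, 34)), ("chair", (50, 50, 90, 90))]
distances, missed = score_image(kept, truth)
print(distances, missed)  # [5.0] 1
```

Ranking models would then amount to aggregating these per-image distances and failure counts, together with measured inference time, across the whole test set.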
Copyright (c) 2024 Aditya Patra; Mrs. Crandall
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute and display this article.