Machine Learning for the Visually Impaired: Benchmarking Object Detection Models

Aditya Patra; Rae Crandall

doi:10.47611/jsrhs.v13i2.6630

Authors

Aditya Patra, California High School
Mrs. Crandall, California High School

DOI:

https://doi.org/10.47611/jsrhs.v13i2.6630

Keywords:

Machine Learning, Object Detection

Abstract

This paper benchmarks object detection models, a class of machine learning models, to determine which algorithm would best help visually impaired people locate objects used in everyday life. Most benchmarking experiments test object detection models on a wide variety of objects, so finding the most suitable algorithm requires testing on images of the objects most relevant to this use case. The models are tested on still images from the COCO dataset. To simulate real-life scenarios, the objects in these images may be partially hidden or at a distance. Pretrained models implementing five of the most popular object detection algorithms process the images, and each model’s detection accuracy is measured. For each image, a model returns a list of detections giving the name, confidence rating, and location of each detected object. These results are filtered to remove detections with low confidence ratings and detections of irrelevant objects. The remaining results are compared with the object names and locations provided by the COCO dataset to calculate the distance between each predicted object location and the true location. The algorithms are ranked by the number of failed detections, the time taken to analyze each image, and the accuracy of each object’s determined location.
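
The filtering and comparison steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the names Detection, filter_detections, and score_image are hypothetical, the 0.5 confidence threshold is arbitrary, boxes are assumed to use the COCO annotation layout of (x, y, width, height) in pixels, and "distance" is taken to mean the Euclidean distance between box centers, which the abstract does not specify.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Detection:
    """One model output (hypothetical structure, not from the paper)."""
    label: str
    confidence: float
    box: tuple  # (x_min, y_min, width, height) in pixels, COCO-style


def filter_detections(detections, relevant_labels, min_confidence=0.5):
    """Drop low-confidence detections and detections of irrelevant objects."""
    return [
        d for d in detections
        if d.confidence >= min_confidence and d.label in relevant_labels
    ]


def box_center(box):
    """Return the (x, y) center of an (x_min, y_min, width, height) box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def center_distance(box_a, box_b):
    """Euclidean distance between two box centers (assumed metric)."""
    ax, ay = box_center(box_a)
    bx, by = box_center(box_b)
    return hypot(ax - bx, ay - by)


def score_image(detections, ground_truth):
    """Greedily pair each true object with the nearest same-label detection.

    ground_truth is a list of (label, box) pairs. Returns the list of
    center distances for paired objects and the count of true objects
    that no remaining detection could be paired with.
    """
    distances = []
    missed = 0
    unused = list(detections)
    for label, true_box in ground_truth:
        candidates = [d for d in unused if d.label == label]
        if not candidates:
            missed += 1  # counts as a failed detection
            continue
        best = min(candidates, key=lambda d: center_distance(d.box, true_box))
        unused.remove(best)  # each detection is paired at most once
        distances.append(center_distance(best.box, true_box))
    return distances, missed


# Illustrative values only; these are not results from the paper.
raw = [
    Detection("cup", 0.9, (10, 10, 20, 20)),
    Detection("cup", 0.2, (100, 100, 20, 20)),   # removed: low confidence
    Detection("giraffe", 0.95, (0, 0, 50, 50)),  # removed: irrelevant object
]
kept = filter_detections(raw, relevant_labels={"cup", "chair"})
truth = [("cup", (13, 14, 20, 20)), ("chair", (200, 200, 40, 40))]
distances, missed = score_image(kept, truth)
```

In this toy input, one true object is paired with a detection and one is counted as missed. A fuller benchmark would also time each model's inference call, since the abstract lists analysis time as a ranking criterion.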
