IMPROVING THE ACCURACY OF SMALL OBJECT DETECTION ON YOLO BY INCREASING THE NUMBER OF INPUT GRIDS

Herdianto Herdianto; Indri Sulistianingsih; Iwan Fitrianto Rahmad

doi:10.46576/prosundhar.v4i1.366

Abstract

Object detection is a core capability of every Driver Autonomous System (DAS). However, the object detectors currently in use perform well mainly on large objects; for small objects of less than 80 × 80 pixels, detection accuracy with YOLO can fall below 60%. To address this low accuracy, this research increases the number of grid cells applied to the YOLO input image, testing grids of 7 × 7, 10 × 10, 13 × 13, 16 × 16, and 19 × 19, with the aim of improving detection accuracy for small objects. The image data is divided into two parts: 70% for training and 30% for testing. In tests on objects measuring 80 × 80 pixels with a 7 × 7 grid, detection accuracy reaches 90%. The 10 × 10, 13 × 13, 16 × 16, and 19 × 19 grids are still being tested.
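To make the grid sizes concrete, the following minimal sketch computes how large each grid cell is in pixels and how many cells an 80 × 80 pixel object spans per side. In the original YOLO formulation the image is divided into an S × S grid, and the cell containing an object's center is responsible for detecting it. The 448 × 448 input resolution and the helper names `cell_size` and `responsible_cell` are assumptions made for illustration; the abstract does not state the input size or describe the authors' implementation.

```python
from typing import Tuple


def cell_size(image_size: int, grid: int) -> float:
    """Side length in pixels of one grid cell for a square input image."""
    return image_size / grid


def responsible_cell(cx: float, cy: float, image_size: int, grid: int) -> Tuple[int, int]:
    """Return (row, col) of the grid cell that contains the object center (cx, cy)."""
    cell = cell_size(image_size, grid)
    # Clamp so that a center lying exactly on the right or bottom edge
    # still maps to the last cell.
    col = min(int(cx // cell), grid - 1)
    row = min(int(cy // cell), grid - 1)
    return row, col


IMAGE_SIZE = 448             # assumption: YOLOv1-style input, not stated in the abstract
OBJECT_SIDE = 80             # object size discussed in the abstract
GRIDS = [7, 10, 13, 16, 19]  # grid sizes listed in the abstract

for s in GRIDS:
    cell = cell_size(IMAGE_SIZE, s)
    print(f"{s} x {s} grid: cell = {cell:.1f} px, "
          f"object spans {OBJECT_SIDE / cell:.2f} cells per side")
```

Under the assumed 448-pixel input, a 7 × 7 grid gives 64-pixel cells, so an 80 × 80 object is only slightly larger than one cell. Finer grids shrink the cells, so the same object covers more of them and nearby small objects are less likely to share a cell.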

Keywords

Object Detection, YOLO, Driver Autonomous Systems, Deep Learning.
