Page 34 - Informatics, July 2021
Technology Update
Object Detection Technologies
A simplified explanation of YOLO class of algorithms
Edited by Dr. DIBAKAR RAY

Dr. A.K. Hota, Dy. Director General & SIO, ak.hota@nic.in
A.K. Somasekhar, Sr. Technical Director, som@nic.in
Shom C. Abraham, Scientific Assistant - A, shom.abraham@nic.in

It is of the highest importance in the art of detection to be able to recognise out of a number of facts which are incidental and which are vital
- Arthur Conan Doyle

Object detection is the task of localizing as well as classifying objects of interest in an image or video. It covers a wide range of techniques, including image processing, pattern recognition, artificial intelligence, and machine learning. Object detection has a variety of uses, some of which are surveillance and security, traffic monitoring, video communication, image annotation, activity detection, face recognition, robot vision and animation.

Classes of Algorithms
The three prominently used techniques in Object Detection are
• R-CNN and its variations like Fast R-CNN, Faster R-CNN, Mask R-CNN etc.
• Single Shot Detectors
• YOLO

R-CNN
Girshick et al. first proposed R-CNN in 2013, wherein the system would make region proposals and then pass these regions to a CNN for classification and bounding-box output. The problem with this approach is that it is painstakingly slow. Another version by the name Fast R-CNN was published by Girshick in 2015. It ran the convolutional layers once over the whole image, in sliding-window fashion, and reused the result for all the proposed regions. However, it was still slow. It wasn't until the third paper, Faster R-CNN, that this technique was used in practical applications. It replaced the external region proposal algorithm, such as Selective Search, with a CNN that proposes the regions.

YOLO
YOLO is the acronym for "You Only Look Once"; its first version was published in 2016 by Redmon et al. Unlike previous approaches, the image is passed only once through the network rather than through a pipeline of region proposal, classification etc., and the network simultaneously predicts the co-ordinates of the bounding box and the class of the object. This improved performance, most notably the speed of detection. Subsequently, there have been many versions of it, namely YOLO v2, YOLO v3, YOLO v4 and YOLO v5, with the most recent one being YOLO v5, published in 2021. The concepts of YOLO v3 form the basis for all subsequent works.

YOLO v3
YOLO v3 uses only convolutional layers, as the pooling layers are also simulated by convolutional layers. The training network's input is of the form (n, X, X, 3), with n denoting the number of images, X denoting both the width and the height, and 3 denoting the number of channels. The number X is chosen such that it is divisible by 32. YOLO v3 has 106 layers, with 53 CNN layers (Darknet-53) stacked on top of each other. The predictions are made at three different layers corresponding to strides 32, 16 and 8. For each cell of the image, we predict 3 bounding boxes at every scale. The bounding boxes are predicted as offsets to the prior boxes, also known as anchors.

Common Objects in Context (COCO) is a dataset containing 80 classes of commonly occurring real-life objects and is the standard dataset to test object detection algorithms. For the COCO dataset, YOLO v3 produces, for each cell, a tensor of the shape 3 * (4 + 1 + 80), where 3 is the number of bounding boxes, 4 is for the location offsets of the bounding box, 1 is for the objectness score and 80 is for the class confidence probabilities, one per class. The offsets are given by t_x, t_y, t_w and t_h, where t_x and t_y are the center co-ordinates and t_w and t_h represent the width and height. The objectness score represents the IOU between the predicted box and any ground truth box.

[Figure omitted. Image courtesy - http://medium.com]

Each grid cell also predicts 80 conditional class probabilities, Pr(Class_i | Object). These probabilities are conditioned on the grid cell containing an object. At test time we multiply the conditional class probabilities and the individual box confidence predictions.
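
The test-time multiplication of conditional class probabilities by the box confidence can be illustrated with a minimal sketch. The function names `class_scores` and `best_class` are hypothetical and the numbers are made up for illustration; this is not the reference YOLO implementation, only the arithmetic the text describes.

```python
def class_scores(objectness, class_probs):
    """Return per-class confidence scores for one predicted box.

    objectness  -- box confidence (objectness score), expected in [0, 1]
    class_probs -- conditional probabilities Pr(Class_i | Object), one per class
    """
    if not 0.0 <= objectness <= 1.0:
        raise ValueError("objectness must lie in [0, 1]")
    # Multiply each conditional class probability by the box confidence.
    return [objectness * p for p in class_probs]


def best_class(objectness, class_probs):
    """Return (class_index, score) for the highest-scoring class."""
    scores = class_scores(objectness, class_probs)
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Toy example with 3 classes instead of COCO's 80 (values are invented).
index, score = best_class(0.5, [0.25, 0.5, 0.25])
print(index, score)
```

In practice a detector would typically discard boxes whose best score falls below a chosen threshold; that threshold is a tuning choice and is not specified in this article.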
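
The shape arithmetic described for YOLO v3 (input size divisible by 32, predictions at strides 32, 16 and 8, and a depth of 3 * (4 + 1 + 80) per cell) can be checked with a short sketch. The function names are hypothetical, and the input size of 416 is an assumption chosen for illustration, since the article only requires that X be divisible by 32.

```python
def yolo_v3_output_shapes(input_size, num_classes=80, boxes_per_cell=3,
                          strides=(32, 16, 8)):
    """Return the (rows, cols, depth) shape of the prediction at each scale."""
    if input_size <= 0 or input_size % 32 != 0:
        raise ValueError("input_size must be a positive multiple of 32")
    # Per box: 4 offsets + 1 objectness score + one probability per class.
    depth = boxes_per_cell * (4 + 1 + num_classes)
    return [(input_size // s, input_size // s, depth) for s in strides]


def total_boxes(input_size, boxes_per_cell=3, strides=(32, 16, 8)):
    """Count the bounding boxes predicted across all scales."""
    shapes = yolo_v3_output_shapes(input_size, boxes_per_cell=boxes_per_cell,
                                   strides=strides)
    return sum(rows * cols * boxes_per_cell for rows, cols, _ in shapes)


# Assumed input size X = 416 (any multiple of 32 would do).
for shape in yolo_v3_output_shapes(416):
    print(shape)
print(total_boxes(416))
```

A larger X gives finer grids and therefore more candidate boxes, at the cost of more computation.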