Introduction To Object Detection
● Classification:
○ Input: Image
○ Output: Class label
○ Loss: Cross entropy (softmax log loss)
○ Evaluation metric: Accuracy
● Localization:
○ Input: Image
○ Output: Box in the image (x, y, w, h)
○ Loss: L2 Loss (Euclidean distance)
○ Evaluation metric: Intersection over Union
● Classification + Localization:
○ Input: Image
○ Output: Class label + box in the image
○ Loss: Sum of both losses
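The three loss definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used by any particular model: the function names and the `box_weight` parameter are hypothetical, and the L2 loss is taken here to be the squared Euclidean distance between box coordinates, which is an assumption the slides do not spell out.

```python
import math

def softmax_cross_entropy(scores, label):
    """Softmax log loss for one example: -log(softmax(scores)[label])."""
    m = max(scores)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[label]

def l2_box_loss(pred_box, true_box):
    """Squared Euclidean distance between two (x, y, w, h) boxes (assumed form of the L2 loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred_box, true_box))

def combined_loss(scores, label, pred_box, true_box, box_weight=1.0):
    """Classification + localization loss as a sum of both terms.

    box_weight is a hypothetical knob; with the default of 1.0 this is the plain sum.
    """
    return softmax_cross_entropy(scores, label) + box_weight * l2_box_loss(pred_box, true_box)
```

With two equal class scores and a box that is off by one unit in both x and y, `combined_loss([0.0, 0.0], 0, (1, 1, 2, 2), (0, 0, 2, 2))` evaluates to log(2) + 2.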
Classification + Localization: ImageNet Challenge
● Dataset
○ 1000 Classes.
○ Each image has 1 class with at least one
bounding box.
○ ~800 Training images per class.
● Evaluation
○ Algorithm produces 5 (class + bounding box)
guesses.
○ An example is correct if at least one guess has the
correct class AND a bounding box with at least 50%
intersection over union.
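The evaluation rule above can be expressed as a short check. This is a sketch under stated assumptions: boxes are (x, y, w, h) tuples with (x, y) as the top-left corner, and the names `iou` and `is_correct` are illustrative rather than part of any official evaluation kit.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0

def is_correct(guesses, true_label, true_boxes, iou_threshold=0.5):
    """Return True if any of the first 5 (label, box) guesses has the right class
    and overlaps some ground-truth box with IoU of at least iou_threshold."""
    for label, box in guesses[:5]:
        if label != true_label:
            continue
        if any(iou(box, gt) >= iou_threshold for gt in true_boxes):
            return True
    return False
```

A guess with the right class but a poorly placed box still counts as wrong, which is what separates this task from plain top-5 classification.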
Intersection Over Union (IoU)
IoU(A, B) = Intersection(A, B) / Union(A, B)
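For axis-aligned boxes, the ratio above reduces to a few lines of arithmetic. The sketch below assumes boxes are given as (x, y, w, h) with (x, y) the top-left corner, matching the box format used earlier in the slides; the corner convention itself is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap region; zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    # Union = area(A) + area(B) - intersection, so the overlap is not counted twice.
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0
```

Two 2×2 boxes offset by one unit in each direction share an area of 1 and cover a union of 7, so their IoU is 1/7, well below the 50% threshold.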
Classification + Localization: Model
Classification Head:
● C scores for C classes
Localization Head:
● Class agnostic: (x, y, w, h)
● Class specific: (x, y, w, h) × C
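The difference between the two localization heads is mostly one of output size: 4 numbers versus 4 × C numbers. A minimal sketch of reading a box out of a class-specific head follows. It assumes the 4 × C outputs are laid out class by class in a flat list and that the box of the top-scoring class is the one reported; both are assumptions for illustration, and `select_box` is a hypothetical helper.

```python
def select_box(class_scores, box_outputs):
    """Pick the box belonging to the top-scoring class from a class-specific head.

    class_scores: list of C scores from the classification head.
    box_outputs:  flat list of 4 * C numbers, assumed to be ordered class by class
                  as (x, y, w, h) for class 0, then class 1, and so on.
    """
    num_classes = len(class_scores)
    if len(box_outputs) != 4 * num_classes:
        raise ValueError("expected 4 box values per class")
    best = max(range(num_classes), key=lambda c: class_scores[c])
    return best, tuple(box_outputs[4 * best:4 * best + 4])
```

A class-agnostic head needs no such lookup, since it emits a single (x, y, w, h) regardless of the predicted class.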
Computer Vision Tasks
● Was the de facto standard; currently used as a quick benchmark to evaluate new detection algorithms.
● Essentially a scaled-up version of PASCAL VOC, with similar object statistics.
● More categories and more object instances in every image. Only 10% of images contain a single object category, versus 60% in Pascal. More small objects than large objects.
Pascal Examples
COCO Examples
Object Detection
● Input: Image
● Output: For each object class c and each
image i, an algorithm returns predicted
detections: locations with confidence scores.
Object Detection: Evaluation
● Mean Average Precision (mAP) across all classes, based on Average Precision
(AP) per class, based on Precision and Recall.
Precision And Recall For a Threshold
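Fixing a confidence threshold turns a ranked list of detections into a single precision and recall pair. The sketch below assumes each detection has already been matched against the ground truth and is represented as a hypothetical `(score, is_true_positive)` pair; the matching step itself (for example by IoU) is not shown.

```python
def precision_recall_at(detections, num_ground_truth, threshold):
    """Precision and recall of the detections whose score is >= threshold.

    detections: list of (score, is_true_positive) pairs for one class.
    num_ground_truth: number of ground-truth objects of that class.
    """
    kept = [is_tp for score, is_tp in detections if score >= threshold]
    tp = sum(kept)
    # With no detections kept, precision is undefined; 1.0 is used here as a convention.
    precision = tp / len(kept) if kept else 1.0
    recall = tp / num_ground_truth if num_ground_truth else 0.0
    return precision, recall
```

Raising the threshold typically trades recall for precision: fewer detections survive, but a larger share of them are correct.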
Precision-Recall Curve
● [In the vision community] AP is the estimated area under the PR curve
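One common way to estimate the area under the PR curve is to sweep the detections in order of decreasing confidence and add up the precision at every rank where recall increases. The sketch below uses this uninterpolated sum; benchmark toolkits often apply an interpolated variant instead, so treat this as an illustration of the idea rather than the official scoring code. Detections are again hypothetical `(score, is_true_positive)` pairs.

```python
def average_precision(detections, num_ground_truth):
    """Uninterpolated AP: sum of precision values at each true positive,
    each weighted by the recall step 1 / num_ground_truth."""
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = 0
    ap = 0.0
    for rank, (_score, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            tp += 1
            ap += (tp / rank) / num_ground_truth  # precision at this rank times recall step
    return ap
```

Ground-truth objects that are never detected contribute nothing to the sum, which is how missed objects lower AP even when every reported detection is correct.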
Mean Average Precision (mAP)
● The winner of each object class is the team with the highest average precision
● The winner of the challenge is the team with the highest mean Average
Precision (mAP) across all classes.
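The ranking rule above can be sketched as follows. The `results` structure (team name mapped to per-class AP values) and the function names are hypothetical, and the sketch assumes every team reports an AP for every class.

```python
def mean_average_precision(ap_per_class):
    """mAP: the unweighted mean of the per-class AP values."""
    if not ap_per_class:
        raise ValueError("need at least one class")
    return sum(ap_per_class.values()) / len(ap_per_class)

def rank_teams(results):
    """results: {team: {class_name: AP}}.

    Returns (winner per class, overall winner by mAP).
    """
    classes = sorted({c for aps in results.values() for c in aps})
    class_winners = {
        c: max(results, key=lambda team: results[team].get(c, 0.0)) for c in classes
    }
    overall = max(results, key=lambda team: mean_average_precision(results[team]))
    return class_winners, overall
```

Because mAP weights every class equally, a team can win the challenge without winning the most classes, as long as it avoids very low AP on any of them.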
cv@brodmann17.com