AdaScale: Towards Real-time Video Object Detection using Adaptive Scaling

Uploaded On 2020-01-21





Presentation Transcript

AdaScale: Towards Real-time Video Object Detection using Adaptive Scaling
Ting-Wu (Rudy) Chin*, Ruizhuo Ding*, Diana Marculescu
ECE Dept., Carnegie Mellon University
SysML 2019

1 Video object detection is one of the key tasks in various emerging applications
- Autonomous cars [1]
- Autonomous drones [2]
- Household robots [3]

[1] https://medium.com/udacity/how-the-udacity-self-driving-car-works-575365270a40
[2] https://software.intel.com/en-us/articles/object-detection-on-drone-videos-using-caffe-framework
[3] Loghmani, Mohammad Reza, Barbara Caputo, and Markus Vincze. "Recognizing objects in-the-wild: Where do we stand?" 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018.

2 Prior art uses scales to trade speed for accuracy
- RetinaNet [5]
- YOLOv2 [6]

[5] Lin, Tsung-Yi, et al. "Focal loss for dense object detection." ICCV. 2017.
[6] Redmon, Joseph, and Ali Farhadi. "YOLO9000: better, faster, stronger." CVPR. 2017.

3 Scaling down an image (resolution) may sometimes help
- Down-sampling can reduce noise, which in turn reduces false positives.
- How do we determine which image to scale, and by how much?
- A regression problem.

4 Outline
- Motivation
- From images to scales: a regression problem
- AdaScale methodology
- Results

5 Regressing scales from input images
To generate target labels:
- Choose a set of discrete scales to broadly cover the scales of interest.
- For each image in the training set, evaluate every scale with a metric to identify the best scale.
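
The two label-generation steps above can be sketched as follows. This is a minimal illustration, not the authors' code: `generate_scale_labels`, the `metric` callable, and the toy score table are all hypothetical, and the sketch assumes a lower metric value means a better scale.

```python
from typing import Callable, Dict, Hashable, Sequence


def generate_scale_labels(
    image_ids: Sequence[Hashable],
    scales: Sequence[int],
    metric: Callable[[Hashable, int], float],
) -> Dict[Hashable, int]:
    """Return the best scale per image (assumption: lower metric is better)."""
    labels = {}
    for image_id in image_ids:
        # Evaluate every candidate scale for this image.
        scores = {scale: metric(image_id, scale) for scale in scales}
        # Ties go to the scale listed first in `scales`.
        labels[image_id] = min(scores, key=scores.get)
    return labels


# Toy metric values (made up for illustration only).
toy_scores = {
    ("frame_a", 600): 0.9, ("frame_a", 480): 0.7,
    ("frame_a", 360): 0.4, ("frame_a", 240): 0.8,
    ("frame_b", 600): 0.2, ("frame_b", 480): 0.3,
    ("frame_b", 360): 0.6, ("frame_b", 240): 1.1,
}
labels = generate_scale_labels(
    ["frame_a", "frame_b"],
    [600, 480, 360, 240],
    lambda image_id, scale: toy_scores[(image_id, scale)],
)
print(labels)
```

In a real pipeline the metric would come from running the detector on the image resized to each candidate scale; the labels then serve as regression targets.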

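6 Current loss function favors extreme scales
- A predicted bounding box will not introduce regression loss if it is in the background.
- The summed loss therefore depends on how many boxes count as foreground at each scale, which biases a direct comparison across scales.
- Legend: predicted bounding box; ground truth bounding box.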

7 Our proposal: only consider the foreground boxes
- Sort by loss.
- Legend: foreground bounding box; background bounding box.
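
A small sketch of how a foreground-only comparison might work. The slide does not spell out how the sorted losses are used, so this example assumes that the same number of lowest-loss foreground boxes is summed at every scale; the function name and the loss values are hypothetical.

```python
from typing import Dict, List


def foreground_only_metric(fg_losses: Dict[int, List[float]]) -> Dict[int, float]:
    """Compare scales on an equal number of foreground boxes.

    fg_losses maps each scale to the per-box losses of its predicted
    foreground boxes. Assumption: every scale is scored on its n_min
    lowest-loss boxes, where n_min is the smallest foreground count.
    """
    n_min = min(len(losses) for losses in fg_losses.values())
    return {
        scale: sum(sorted(losses)[:n_min])  # sort by loss, keep the n_min best
        for scale, losses in fg_losses.items()
    }


# Made-up losses: scale 600 finds four foreground boxes, scale 240 only two.
toy_losses = {600: [0.2, 0.9, 0.4, 1.5], 240: [0.3, 0.5]}

naive = {scale: sum(losses) for scale, losses in toy_losses.items()}
fair = foreground_only_metric(toy_losses)

naive_best = min(naive, key=naive.get)  # 240: fewer boxes means less total loss
fair_best = min(fair, key=fair.get)     # 600: lower loss on the two best boxes
print(naive_best, fair_best)
```

With the naive sum, the scale that produces fewer foreground boxes wins simply because it accumulates fewer loss terms; equalizing the box count removes that bias in this toy case.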

8 Outline
- Motivation
- From images to scales: a regression problem
- AdaScale methodology
- Results

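9 The overall flow of AdaScale
Training:
- Fine-tune object detectors with multi-scale training.
- Generate labels for the scale regressor.
- Multi-scale training for the scale regressor (freezing the object detector).
Testing:
- Frame t passes through the backbone CNN, which feeds both the object detector and the scale regressor.
- The scale regressor outputs a real value that sets the scaling for frame t+1.
- The same procedure repeats for frames t, t+1, ..., t+n.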
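
The testing flow above can be sketched as a simple loop. Everything here is a hypothetical stand-in: `detect`, `regress_scale`, and `resize` are placeholder callables, and the sketch assumes the regressor returns a shortest-side size in pixels that is clipped to the range [240, 600].

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_adaptive_scaling(
    frames: Sequence[Any],
    resize: Callable[[Any, int], Any],
    detect: Callable[[Any], Tuple[list, Any]],
    regress_scale: Callable[[Any], float],
    init_scale: int = 600,
    min_scale: int = 240,
    max_scale: int = 600,
) -> List[Tuple[int, list]]:
    """Detect on each frame at the scale predicted from the previous frame."""
    scale = init_scale  # the first frame has no prediction to use
    outputs = []
    for frame in frames:
        resized = resize(frame, scale)
        detections, features = detect(resized)  # backbone features are shared
        outputs.append((scale, detections))
        # The real-valued prediction from frame t sets the scale for frame t+1.
        predicted = regress_scale(features)
        scale = int(round(min(max(predicted, min_scale), max_scale)))
    return outputs


# Stub components so the loop can be exercised without a real model.
stub_resize = lambda frame, scale: (frame, scale)
stub_detect = lambda resized: ([], resized)
stub_regress = lambda features: 100.0 + 200.0 * features[0]

results = run_adaptive_scaling([0, 1, 2], stub_resize, stub_detect, stub_regress)
used_scales = [scale for scale, _ in results]
print(used_scales)
```

Because the regressor reuses the features already computed for detection, the only extra per-frame work in this sketch is the regression head itself.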

10 Outline
- Motivation
- From images to scales: a regression problem
- AdaScale methodology
- Results

11 AdaScale on ImageNet VID
- SS/SS: single-scale training, single-scale testing
- MS/SS: multi-scale training, single-scale testing
- MS/AdaScale: multi-scale training, AdaScale testing
[Figure: results for SS/SS, MS/SS, and MS/Ada]

12 Ablation study: multi-scale fine-tuning

Regressed scales       Method   mAP    Runtime (ms)
{600}                  SS       74.2   75
{600}                  Ada      74.2   68
{600, 360}             SS       73.4   75
{600, 360}             Ada      74.8   57
{600, 480, 360}        SS       73.3   75
{600, 480, 360}        Ada      74.8   55
{600, 480, 360, 240}   SS       73.3   75
{600, 480, 360, 240}   Ada      75.5   47
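
As a quick arithmetic check, the ablation numbers can be read as (SS mAP, Ada mAP, SS runtime, Ada runtime) for each scale set. Taking the {600} SS entry as the baseline (an assumption about which row the headline figures refer to), the {600, 480, 360, 240} Ada entry gives the following gain and speedup:

```python
# Values copied from the ablation slide; the choice of baseline is an assumption.
baseline_map, baseline_ms = 74.2, 75   # {600}, SS
adascale_map, adascale_ms = 75.5, 47   # {600, 480, 360, 240}, Ada

map_gain = round(adascale_map - baseline_map, 1)
speedup = round(baseline_ms / adascale_ms, 1)

print(map_gain)  # 1.3
print(speedup)   # 1.6
```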

13 Qualitative analysis: dynamics of AdaScale

14 Qualitative analysis: comparison with baseline
[Figure: SS/SS versus MS/AdaScale]

15 Conclusions
- We propose AdaScale, which uses image scaling to improve both speed and accuracy in video object detection instead of trading one for the other.
- Our results demonstrate 1.3 and 2.7 mAP improvements on the ImageNet VID and mini-YoutubeBB datasets, with 1.6x and 1.8x speedups, respectively.
- Combined with a state-of-the-art video object detection acceleration technique (i.e., Deep Feature Flow), AdaScale pushes the speedup a further 1.25x with slightly better mAP.

Q & A Thank you