AdaScale: Towards Real-time Video Object Detection using Adaptive Scaling
Ting-Wu (Rudy) Chin*, Ruizhuo Ding*, Diana Marculescu
ECE Dept., Carnegie Mellon University
SysML 2019
Video object detection is one of the key tasks in various emerging applications
Autonomous Cars [1], Autonomous Drones [2], Household Robots [3]
1. https://medium.com/udacity/how-the-udacity-self-driving-car-works-575365270a40
2. https://software.intel.com/en-us/articles/object-detection-on-drone-videos-using-caffe-framework
3. Loghmani, Mohammad Reza, Barbara Caputo, and Markus Vincze. "Recognizing objects in-the-wild: Where do we stand?" 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018.
Prior art uses scales to trade speed for accuracy
RetinaNet [5], YOLOv2 [6]
5. Lin, Tsung-Yi, et al. "Focal loss for dense object detection." ICCV. 2017.
6. Redmon, Joseph, and Ali Farhadi. "YOLO9000: better, faster, stronger." CVPR. 2017.
Scaling down an image (resolution) may sometimes help
Down-sampling can reduce noise, which in turn reduces false positives.
How to determine which image to scale, and by how much?
A regression problem.
Outline
Motivation
From images to scales: a regression problem
AdaScale methodology
Results
Regressing scales from input images
To generate target labels:
Choose a set of discrete scales to broadly cover the scales of interest.
For each image in the training set, evaluate every scale with a metric to identify the best scale.
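The label-generation step above can be sketched as a search over the candidate scales. This is a minimal illustration, not the authors' implementation: `metric` is a hypothetical callable, the sketch assumes a lower metric value means a better scale (as with a loss), and the toy images are just dictionaries of made-up per-scale values.

```python
from typing import Callable, List, Sequence


def best_scale(image, scales: Sequence[int],
               metric: Callable[[object, int], float]) -> int:
    """Return the candidate scale with the lowest metric value for one image.

    Assumption: the metric is loss-like, so lower is better.
    """
    return min(scales, key=lambda s: metric(image, s))


def make_scale_labels(images, scales: Sequence[int],
                      metric: Callable[[object, int], float]) -> List[int]:
    """Evaluate every scale on every training image and keep the best one."""
    return [best_scale(img, scales, metric) for img in images]


# Toy data (hypothetical): each "image" maps a scale to a made-up metric value.
toy_images = [
    {600: 0.8, 480: 0.5, 360: 0.4, 240: 0.9},
    {600: 0.2, 480: 0.3, 360: 0.6, 240: 0.7},
]
toy_metric = lambda img, s: img[s]
labels = make_scale_labels(toy_images, [600, 480, 360, 240], toy_metric)
```

The resulting per-image labels are what a scale regressor would then be trained to predict.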
Current loss function favors extreme scales
A predicted bounding box will not introduce regression loss if it is in the background.
[Figure legend: predicted bounding box, ground truth bounding box]
Our proposal: only consider the foreground boxes
Sort by loss.
[Figure legend: foreground bounding box, background bounding box]
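One way to turn this proposal into a per-scale metric is sketched below. The slide only says to keep foreground boxes and sort by loss; comparing the same number of lowest-loss foreground boxes at every scale (`n_min`) is an assumption made here so that scales with different box counts are comparable. The input format (a loss and a foreground flag per predicted box) is also hypothetical.

```python
from typing import Dict, List, Tuple

# For each scale: a list of (loss, is_foreground) pairs, one per predicted box.
BoxLosses = Dict[int, List[Tuple[float, bool]]]


def foreground_scores(per_scale_boxes: BoxLosses) -> Dict[int, float]:
    """Score each scale using only its foreground boxes.

    Assumption: scales are compared on their n_min lowest-loss foreground
    boxes, where n_min is the smallest foreground-box count over all scales.
    """
    fg = {s: sorted(loss for loss, is_fg in boxes if is_fg)
          for s, boxes in per_scale_boxes.items()}
    n_min = min(len(losses) for losses in fg.values())
    return {s: sum(losses[:n_min]) for s, losses in fg.items()}


def pick_scale(per_scale_boxes: BoxLosses) -> int:
    """Return the scale with the lowest foreground-only score.

    If some scale has no foreground box, every score is 0 and the first
    scale in the dictionary is returned.
    """
    scores = foreground_scores(per_scale_boxes)
    return min(scores, key=scores.get)


# Made-up example: scale 240 has mostly background boxes.
example = {
    600: [(0.9, True), (0.2, True), (0.1, False)],
    240: [(0.3, True), (0.05, False), (0.05, False)],
}
chosen = pick_scale(example)  # foreground-only scores: 600 -> 0.2, 240 -> 0.3
```

In this toy case, averaging over all boxes would prefer scale 240 because its cheap background boxes dilute the loss, while the foreground-only score prefers 600.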
The overall flow of AdaScale
Training:
Fine-tune object detectors with multi-scale training.
Generate labels for the scale regressor.
Multi-scale training for the scale regressor (freezing the object detector).
Testing:
[Figure: frame t passes through the backbone CNN, the object detector, and the scale regressor; the regressed scale (a real value) is used to scale frame t+1, and so on through frame t+n]
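The testing flow can be sketched as a loop in which the scale regressed on frame t is applied to frame t+1. Everything named here is hypothetical: `detect_and_regress` and `resize` stand in for the real model and image-resizing code, the regressor is assumed to output an absolute scale, and the initial scale and the clipping range of 240 to 600 are illustrative choices rather than values stated on the slide.

```python
def run_adascale(frames, detect_and_regress, resize,
                 initial_scale=600, min_scale=240, max_scale=600):
    """Detect objects frame by frame, reusing each frame's regressed scale
    to resize the next frame.

    detect_and_regress(resized_frame) -> (detections, predicted_scale)
    resize(frame, scale) -> resized_frame
    """
    scale = initial_scale
    outputs = []
    for frame in frames:
        resized = resize(frame, scale)
        detections, predicted = detect_and_regress(resized)
        outputs.append((scale, detections))
        # The regressed value is real-valued: clip it, then round to pixels.
        scale = int(round(min(max(predicted, min_scale), max_scale)))
    return outputs


# Stub usage: a fake model that always asks for a scale of about 360.
stub_resize = lambda frame, scale: (frame, scale)
stub_model = lambda resized: ([], 360.4)
used_scales = [s for s, _ in run_adascale(["f0", "f1", "f2"], stub_model, stub_resize)]
```

Because the backbone features are shared, the regressor adds little work per frame; the sketch leaves that sharing to whatever `detect_and_regress` wraps.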
AdaScale on ImageNet VID
SS/SS: Single-scale Training, Single-scale Testing
MS/SS: Multi-scale Training, Single-scale Testing
MS/AdaScale: Multi-scale Training, AdaScale Testing
[Figure: results for SS/SS, MS/SS, and MS/Ada]
Ablation study: multi-scale fine-tuning

Regressed scales       Method   mAP    Runtime (ms)
{600}                  SS       74.2   75
                       Ada      74.2   68
{600, 360}             SS       73.4   75
                       Ada      74.8   57
{600, 480, 360}        SS       73.3   75
                       Ada      74.8   55
{600, 480, 360, 240}   SS       73.3   75
                       Ada      75.5   47
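As a quick check, the headline numbers in the conclusions (1.3 mAP and 1.6x on ImageNet VID) can be reproduced from this ablation data. The sketch assumes the flattened slide table reads as (method, mAP, runtime in ms) per scale set, and that the baseline is single-scale testing with the {600} set.

```python
# Values read from the ablation slide, as (mAP, runtime in ms).
baseline_map, baseline_ms = 74.2, 75   # SS with scales {600}
ada_map, ada_ms = 75.5, 47             # Ada with scales {600, 480, 360, 240}

map_gain = ada_map - baseline_map      # absolute mAP improvement
speedup = baseline_ms / ada_ms         # ratio of per-frame runtimes

print(f"mAP gain: {map_gain:.1f}, speedup: {speedup:.1f}x")
```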
Qualitative analysis: dynamics of AdaScale
Qualitative analysis: comparison with baseline
[Figure panels: SS/SS, MS/AdaScale]
Conclusions
We propose AdaScale, which uses image scaling to improve both speed and accuracy in video object detection instead of trading one for the other.
Our results demonstrate 1.3 and 2.7 mAP improvements on the ImageNet VID and mini-YoutubeBB datasets, with 1.6x and 1.8x speedups, respectively.
Combined with a state-of-the-art video object detection acceleration technique (i.e., Deep Feature Flow), AdaScale pushes the speedup a further 1.25x with slightly better mAP.
Q & A Thank you