1. On-going research on Object Detection
*Some modifications after the seminar
Tackgeun YOU
2. Contents
- Baseline Algorithm: Fast R-CNN
- Observations & Proposals
- Fast R-CNN in Microsoft COCO
3. Object Detection
- Definition: predict the locations/labels of objects in a scene
- Traditional pipeline:
  - Approximate a search space via sliding windows or object proposals
  - Evaluate the approximated regions
  - Apply non-maximum suppression to keep the proper regions
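The non-maximum suppression step of the pipeline above can be sketched as follows (a minimal greedy NMS over [x1, y1, x2, y2] boxes; the 0.3 IoU threshold is an assumed default, not taken from the slides):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.3):
    """Greedy non-maximum suppression.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    Returns the indices of the kept boxes, highest score first.
    """
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # evaluate best-scoring boxes first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of box i with every remaining box.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes that overlap box i by more than the threshold.
        order = order[1:][iou <= iou_thresh]
    return keep
```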
4. R-CNN (CVPR 2014)
- Object proposals: approximate the search space
- Fine-tuned CNN features + SVM: score each region
- Bounding-box regression: refine each region
- Non-maximum suppression
5. Training Pipeline of R-CNN
- Supervised pre-training
  - Image-level annotations from ILSVRC 2012
- Domain-specific fine-tuning
  - Mini-batch of 128 samples
  - 32 positive samples: region proposals with IoU ≥ 0.5
  - 96 negative samples: the rest
- Object category classifier (SVM)
  - Positives: ground-truth boxes only
  - Negatives: IoU < 0.3
  - Hard negative mining
- Bounding-box regression
  - Uses nearby samples: proposals with maximum overlap, IoU ≥ 0.6
  - Ridge regression (regularization is important)
  - Iterating the regression does not improve the result
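The IoU-based sample assignment used in fine-tuning can be illustrated with a small sketch (a single ground-truth box is assumed for simplicity; `label_proposal` is a hypothetical helper, not from the slides):

```python
def iou(box, gt):
    """IoU of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a = (box[2] - box[0]) * (box[3] - box[1])
    b = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (a + b - inter)

def label_proposal(box, gt):
    """Mini-batch labeling rule from the slide:
    IoU >= 0.5 against the ground truth -> positive, the rest -> negative."""
    return 'positive' if iou(box, gt) >= 0.5 else 'negative'
```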
6. Fast R-CNN (arXiv 2015)
- Training is single-stage (cf. R-CNN)
- Multi-task loss: L(p, u, t^u, v) = L_cls(p, u) + λ[u ≥ 1] L_loc(t^u, v)
- Cross-entropy loss: L_cls(p, u) = −log p_u
  - u: true class label
  - v: true bounding-box regression target
  - t^u: predicted location for class u
7. Fast R-CNN (arXiv 2015)
- Training is single-stage (cf. R-CNN)
- Smooth regression loss: L_loc(t^u, v) = Σ_{i ∈ {x,y,w,h}} smooth_L1(t_i^u − v_i)
  - u: true class label
  - v: true bounding-box regression target, constructed by whitening the ground-truth bounding box
  - t^u: predicted bounding box (location) for class u
8. Fast R-CNN (arXiv 2015)
- Smooth regression loss: smooth_L1(x) = 0.5x² if |x| < 1, else |x| − 0.5
- Training with an L2 loss instead would require careful tuning of the learning rate to prevent exploding gradients
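The smooth L1 loss and the summed localization term can be written out directly (a minimal NumPy sketch of the formulas above):

```python
import numpy as np

def smooth_l1(x):
    """smooth_L1(x) = 0.5 x^2 if |x| < 1, else |x| - 0.5 (elementwise).

    Quadratic near zero like L2, but linear for large errors, so the
    gradient magnitude is bounded and training is less outlier-sensitive.
    """
    x = np.asarray(x, float)
    return np.where(np.abs(x) < 1, 0.5 * x**2, np.abs(x) - 0.5)

def loc_loss(t_u, v):
    """L_loc(t^u, v): sum of smooth_L1(t_i^u - v_i) over i in {x, y, w, h}."""
    return float(smooth_l1(np.asarray(t_u) - np.asarray(v)).sum())
```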
9. Exploring VOC with Fast R-CNN
- Observation: fails to localize contiguous objects
- Hypothesis: a region covering multiple objects gets a higher confidence than a region tightly covering a single object
- Experiment: check whether the maximum confidence lies on the tight single-object box
  - MCMC iterations starting from the ground-truth box
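The experiment above — perturbing the ground-truth box and checking where the confidence peaks — could be sketched as a simple stochastic search (a greedy random walk rather than full MCMC; `score_fn` is a stand-in for the detector's class confidence on a box, and the step size `sigma` is an assumption):

```python
import numpy as np

def confidence_walk(gt_box, score_fn, n_steps=1000, sigma=5.0, seed=0):
    """Random-walk check: starting from the ground-truth box, propose
    Gaussian perturbations and record the best-scoring box found.

    If the hypothesis holds, the walk drifts away from the tight GT box
    toward higher-confidence regions covering multiple objects.
    """
    rng = np.random.default_rng(seed)
    box = np.asarray(gt_box, float)
    best_box, best_score = box.copy(), score_fn(box)
    for _ in range(n_steps):
        cand = box + rng.normal(0.0, sigma, size=4)
        if score_fn(cand) > score_fn(box):   # greedy uphill move
            box = cand
        if score_fn(box) > best_score:
            best_box, best_score = box.copy(), score_fn(box)
    return best_box, best_score
```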
14. (Figure legend)
- Red: ratio(IoU > 0.5)
- Blue: mean(IoU)
- Magenta: ratio(IoU > 0.5)
- Black: ratio(IoU < 0.3)
15. Hope to achieve the conditions below
- Tailoring confidence for precise localization:
  - Whole body of a single object (highest)
  - Partial body of a single object (positive)
  - Overlapping multiple objects (?)
  - Other classes (lowest)
16. Detailed Plans
- Dealing with multiple-object regions
  - How should a multiple-object region be defined?
- Using Fast R-CNN
  - Fine-tuning with multi-object regions as negative samples
  - Negative samples drawn per mini-batch
  - Possible failure: may decrease overall performance even while lowering the confidence on multiple-object regions
- Adopting a proper loss function
  - e.g., a ranking loss
18. Microsoft COCO
- 80 classes
- Train (82,783), Validation (40,504), Test (81,434)

Split          | #imgs | Submission | Score Reported
Test-Dev       | ~20K  | Unlimited  | Immediately
Test-Standard  | ~20K  | Limited    | Immediately
Test-Challenge | ~20K  | Limited    | Workshop
Test-Reserve   | ~20K  | Limited    | Never
19. ref. Microsoft COCO: Common Objects in Context
20. ref. What makes for effective detection proposals?
21. Fast R-CNN with 1k MCG proposals
- 240k iterations (5.8 epochs on the train set)
22. Fast R-CNN with 1k MCG proposals
- 240k iterations + 130k iterations (6.4 epochs on the val set)
23. Processing Time of Fast R-CNN
- Testing speed (with MCG @1k): 1.872 s/image
  - ~21.06 hours on the validation set
  - ~10 hours on the test-dev set
  - ~42.35 hours on the full test set
- Training speed: 0.564 s/iteration
  - ~6.48 hours per epoch on the training set
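The wall-clock figures above follow from simple arithmetic; a small sketch reproduces them (the 2 images per mini-batch used to convert iterations to epochs is Fast R-CNN's default, assumed here):

```python
def detect_hours(sec_per_image, n_images):
    """Wall-clock hours to run detection over a set of images."""
    return sec_per_image * n_images / 3600.0

def train_hours_per_epoch(sec_per_iter, n_images, imgs_per_batch=2):
    """Hours per training epoch, assuming 2 images per mini-batch."""
    return sec_per_iter * (n_images / imgs_per_batch) / 3600.0
```

At 1.872 s/image this gives ~21.06 h for the 40,504 validation images and ~42.35 h for the 81,434 test images, matching the slide.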
24. End
25. Samples
- http://mscoco.org/explore/?id=407286
- http://mscoco.org/explore/?id=161602
- http://mscoco.org/explore/?id=123835
- http://mscoco.org/explore/?id=242673
26. Label Difference in Fine-tuning & SVM
- Domain-specific fine-tuning
  - Mini-batch of 128 samples
  - 32 positive samples: region proposals with IoU ≥ 0.5
  - 96 negative samples: the rest
- Object category classifier (SVM)
  - Positives: ground-truth boxes only
  - Negatives: IoU below a threshold in {0.0, 0.1, 0.2, 0.3, 0.4, 0.5}
  - mAP on the validation set: threshold 0.0 loses 4%, threshold 0.5 loses 5%
  - Hard negative mining (fitting all of the training set's negatives at once is impossible)
27. Conjecture
- The definition of positive examples used in fine-tuning does not emphasize precise localization.
- The softmax classifier was trained on randomly sampled negative examples rather than on the subset of "hard negatives" used for SVM training.
28. Fast R-CNN (arXiv 2015)
- Training is single-stage (cf. R-CNN)
- Fine-tuning with the multi-task loss: bounding-box regression + detection