
FOCUS - PowerPoint Presentation

Uploaded On 2016-07-13




Presentation Transcript

Slide1

FOCUS: Clustering Crowdsourced Videos by Line-of-Sight

Puneet Jain, Justin Manweiler, Arup Acharya, and Kirk Beaty

Slide2

Clustered by shared subject

Slide3

Challenges

Slide4

CAN IMAGE PROCESSING SOLVE THIS PROBLEM?

Slide5

[Figure labels: Camera 1, Camera 2, Camera 3, Camera 4]

LOGICAL similarity does not imply VISUAL similarity

Slide6

VISUAL similarity does not imply LOGICAL similarity

Slide7

CAN SMARTPHONE SENSING SOLVE THIS PROBLEM?

Slide8

Sensors are noisy, hard to distinguish subjects…

Why not triangulate?

Slide9

GPS-COMPASS Line-of-Sight

Slide10
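The GPS-compass line-of-sight idea can be sketched as the intersection of two rays, each starting at a camera's GPS position and pointing along its compass bearing. This is a minimal illustration, not the presenters' implementation: it assumes positions already projected into a flat local (east, north) frame in meters, and the function names are hypothetical.

```python
import math


def bearing_to_unit(bearing_deg):
    """Convert a compass bearing (degrees clockwise from north) to an (east, north) unit vector."""
    rad = math.radians(bearing_deg)
    return math.sin(rad), math.cos(rad)


def intersect_lines_of_sight(p1, bearing1, p2, bearing2, eps=1e-9):
    """Return the (east, north) point where two lines of sight cross.

    Returns None when the rays are parallel or when the crossing point
    lies behind either camera.
    """
    d1 = bearing_to_unit(bearing1)
    d2 = bearing_to_unit(bearing2)
    # Solve p1 + t1*d1 == p2 + t2*d2 using the 2D cross product.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < eps:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    t2 = (dx * d1[1] - dy * d1[0]) / denom
    if t1 < 0 or t2 < 0:
        return None
    return p1[0] + t1 * d1[0], p1[1] + t1 * d1[1]


# Two cameras 100 m apart, one facing north-east and one facing north-west.
subject = intersect_lines_of_sight((0.0, 0.0), 45.0, (100.0, 0.0), 315.0)
```

With noisy GPS and compass readings the two rays shift and rotate, so the estimated crossing point can move a long way; the sketch makes no attempt to model that error.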

INSIGHT

Slide11

Simplifying Insight 1

Don’t need to visually identify actual SUBJECT, can use background as PROXY

[Figure labels: hard to identify, easy to identify]

Slide12

Simplifying Insight 2

Don’t need to directly match videos, can compare all to a predefined visual MODEL

Same basic structure persists

Slide13

Simplifying Insight 3

Line-of-sight (triangulation) is almost enough, just not via sensing (alone)

Slide14

FOCUS

Fast Optical Clustering of live User Streams

Slide15

Hadoop/HDFS

Failover, elasticity

Image processing

Computer vision

Video Streams (Android, iOS, etc.)

Clustered Videos

FOCUS Cloud

Video Analytics

Video Extraction

Watching Live

home: 2

away: 1

Users Select & Watch

Organized Streams

Change Angle

Change Focus

Slide16

Clustered Videos

FOCUS Cloud

Video Analytics

Video Extraction

Watching Live

home: 2

away: 1

Users Select & Watch

Organized Streams

Change Angle

Change Focus

pre-defined reference “model”

Hadoop/HDFS

Failover, elasticity

Image processing

Computer vision

Slide17

Model construction technique based on Photo Tourism: Exploring Photo Collections in 3D, Snavely et al., SIGGRAPH 2006

multi-view reconstruction

keypoint extraction

estimates camera POSE and content in field-of-view

Multi-view Stereo Reconstruction

Slide18

Visualizing Camera Pose

Slide19

~ 1 second at 90th percentile

~ 18 seconds at 90th percentile

multi-view reconstruction

keypoint extraction

frame-by-frame video to model alignment

sensory inputs

Given a pre-defined 3D model, align incoming video frames to the model

Also known as camera pose estimation

Slide20

multi-view reconstruction

keypoint extraction

integration of sensory inputs

Gyroscope provides a “diff” from the vision-derived initial position
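The gyroscope "diff" can be illustrated with a small sketch: start from an orientation obtained by vision and add the rotation the gyroscope reports between vision fixes. This is a simplified, single-axis (yaw only) illustration under assumed units; the function name and the sample format are hypothetical, not taken from the slides.

```python
def propagate_yaw(vision_yaw_deg, gyro_samples):
    """Propagate a vision-derived yaw angle using gyroscope readings.

    vision_yaw_deg: yaw in degrees from the last vision-based pose estimate.
    gyro_samples: iterable of (angular_rate_deg_per_s, duration_s) pairs
                  recorded since that estimate.
    Returns the updated yaw, wrapped to [0, 360).
    """
    yaw = vision_yaw_deg
    for rate, dt in gyro_samples:
        # Each sample contributes a small rotation "diff".
        yaw += rate * dt
    return yaw % 360.0


# Vision places the camera at 350 degrees; the phone then pans right.
current_yaw = propagate_yaw(350.0, [(10.0, 0.5), (10.0, 1.0)])
```

Integrated gyroscope readings drift over time, which is presumably why the vision-based estimate is used as the starting point and refreshed periodically.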

[Figure labels: 0, 1, 2, 3, 4, t - 1, t - 2]

Filesize ≈ 1/Blur
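One reading of the "Filesize ≈ 1/Blur" note is that, at a fixed encoder setting, blurrier frames compress to smaller files, so encoded size can act as a cheap sharpness proxy when choosing which frame to sample. The sketch below follows that assumption; the function name and the frame tuples are hypothetical.

```python
def pick_sharpest_frame(frames):
    """Pick the frame most likely to be sharp from a window of candidates.

    frames: list of (frame_id, encoded_size_bytes) pairs.
    Returns the frame_id with the largest encoded size, or None if empty.
    """
    if not frames:
        return None
    # Assumption: larger encoded size means more retained detail, i.e. less blur.
    return max(frames, key=lambda frame: frame[1])[0]


# Illustrative sizes only, not measurements from the presentation.
window = [("f0", 41000), ("f1", 58000), ("f2", 36500)]
sampled = pick_sharpest_frame(window)
```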

[Figure labels: Sampled Frame, Gyroscope]

Slide21

Field-of-view

Using POSE + model POINT CLOUD, FOCUS geometrically identifies the set of model points in background of view
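As a rough illustration of selecting model points from a pose, the sketch below keeps the points that fall inside a camera's horizontal viewing angle. It is deliberately reduced to 2D (position, heading, and field-of-view angle), which is an assumption made for brevity; the function and parameter names are hypothetical.

```python
import math


def points_in_view(cam_xy, heading_deg, fov_deg, points, max_range=float("inf")):
    """Return the ids of model points inside a 2D field of view.

    cam_xy: camera (east, north) position.
    heading_deg: viewing direction, degrees clockwise from north.
    fov_deg: total horizontal field-of-view angle in degrees.
    points: dict mapping point id -> (east, north).
    """
    visible = set()
    half_fov = fov_deg / 2.0
    for point_id, (x, y) in points.items():
        dx, dy = x - cam_xy[0], y - cam_xy[1]
        distance = math.hypot(dx, dy)
        if distance == 0 or distance > max_range:
            continue
        bearing = math.degrees(math.atan2(dx, dy))
        # Signed smallest angle between the point's bearing and the heading.
        diff = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        if abs(diff) <= half_fov:
            visible.add(point_id)
    return visible


# Hypothetical model points around a camera facing north with a 60 degree view.
model = {"a": (0.0, 10.0), "b": (10.0, 0.0), "c": (3.0, 10.0), "d": (0.0, -10.0)}
in_view = points_in_view((0.0, 0.0), 0.0, 60.0, model)
```

Here points "a" and "c" lie within 30 degrees of the heading, while "b" (due east) and "d" (behind the camera) are excluded.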

multi-view reconstruction

keypoint extraction

pairwise model image analysis

Slide22

[Figure labels: images 1, 2, 3]

Similarity between image 1 & 2 = 18

Similarity between image 1 & 3 = 13

Finding the similarity across videos as size of point cloud set intersection
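The set-intersection similarity is simple enough to sketch directly: each video is represented by the set of model point ids in its view, and the score for a pair is the number of ids they share. The point ids and video names below are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def similarity(points_a, points_b):
    """Number of model points that appear in both views."""
    return len(set(points_a) & set(points_b))


def pairwise_similarity(views):
    """Compute the similarity score for every pair of videos.

    views: dict mapping video id -> iterable of model point ids in view.
    Returns a dict mapping (id_a, id_b) -> score, with id_a < id_b.
    """
    return {
        (a, b): similarity(views[a], views[b])
        for a, b in combinations(sorted(views), 2)
    }


# Hypothetical point ids, for illustration only.
views = {
    "video1": {101, 102, 103, 104},
    "video2": {103, 104, 105},
    "video3": {104, 200},
}
scores = pairwise_similarity(views)
```

The resulting scores can be treated as edge weights in a graph of videos, which is the form a clustering step would consume.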

multi-view reconstruction

keypoint extraction

pairwise model image analysis

Slide23

Clustering “similar” videos

[Figure labels: Similarity Score; 1, 2, 3]

Application of Modularity Maximization

High modularity implies:

high correlation among the members of a cluster

minor correlation with the members of other clusters

Slide24
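To make the modularity criterion concrete, the sketch below computes the standard weighted modularity score Q for a given assignment of videos to clusters, using similarity scores as edge weights. The slides do not say which maximization algorithm is used, so this only evaluates candidate partitions rather than searching for the best one; the function name and the example graph are hypothetical.

```python
from collections import defaultdict


def modularity(weights, community):
    """Weighted modularity Q of a partition of an undirected graph.

    weights: dict mapping (u, v) -> edge weight, each edge listed once,
             no self-loops.
    community: dict mapping node -> cluster label.
    """
    two_m = 2.0 * sum(weights.values())
    if two_m == 0:
        return 0.0
    degree = defaultdict(float)
    internal = defaultdict(float)  # twice the edge weight inside each cluster
    for (u, v), w in weights.items():
        degree[u] += w
        degree[v] += w
        if community[u] == community[v]:
            internal[community[u]] += 2.0 * w
    cluster_degree = defaultdict(float)
    for node, k in degree.items():
        cluster_degree[community[node]] += k
    # Q = sum over clusters of (internal fraction - expected fraction).
    return sum(
        internal[c] / two_m - (cluster_degree[c] / two_m) ** 2
        for c in cluster_degree
    )


# Two tightly connected triples of videos joined by a single weak link.
edges = {(0, 1): 1, (1, 2): 1, (0, 2): 1, (3, 4): 1, (4, 5): 1, (3, 5): 1, (2, 3): 1}
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {node: "A" for node in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

The split partition scores higher than the merged one, matching the intuition that high modularity means strong ties within a cluster and weak ties across clusters.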

Results

Slide25

Collegiate Football Stadium

Stadium: 33K seats, 56K maximum attendance

Model: 190K points, 412 images (2896 x 1944 resolution)

Android App on Samsung Galaxy Nexus, S3

325 videos captured, 15-30 seconds each

Slide26

Line-of-Sight Accuracy (visual)

Slide27

Line-of-Sight Accuracy

In >80% of the cases, line-of-sight estimation is off by < 40 meters

GPS/Compass LOS estimation is off by < 260 meters for the same percentage

Slide28

FOCUS Performance

75% true positives

Trigger GPS/Compass failover techniques

Slide29

Natural Questions

What if a 3D model is not available?

Online model generation from the first few uploads

Stadiums look very different on a game day?

Rigid structures in the background persist

Where won’t it work?

Natural or dynamic environments are hard

Slide30

Conclusion

Computer vision and image processing are often computation hungry, restricting real-time deployment

Mobile sensing is powerful metadata and can often reduce the computation burden

Computer vision + mobile sensing + geometry, along with the right set of Big Data tools, can enable many real-time applications

FOCUS displays one such fusion, a ripe area for further research

Slide31

Thank You

http://cs.duke.edu/~puneet