(b) The rolling shutter used by sensors in these cameras also produces warping in the output frames (we have exaggerated the effect for illustrative purposes). (c) We use gyroscopes to measure the camera's rotations during video capture. (d) We use the measure
Stanford Tech Report CTSR 2011-03

Figure 2: Pinhole camera model. A ray from the camera center $\mathbf{c}$ to a point in the scene $\mathbf{X}$ will intersect the image plane at $\mathbf{x}$. Therefore the projection of the world onto the image plane depends on the camera's center $\mathbf{c}$, the focal length $f$, and the location of the camera's axis $(o_x, o_y)$ in the image plane.

constant depth). Therefore, translation reconstruction sometimes fails due to unmodeled parallax in the video.

To avoid these problems we do not incorporate translations into our model. Fortunately, camera shake and rolling shutter warping occur primarily from rotations. This is the case because translations attenuate quickly with increasing depth, and objects are typically sufficiently far away from the lens that translational camera jitter does not produce noticeable motion in the image. This conclusion is supported by our stabilization results.

2.1 Camera Model

Our rotational rolling shutter camera model is based on the pinhole camera model. In a pinhole camera the relationship between image point $\mathbf{x}$ in homogeneous coordinates and the corresponding point $\mathbf{X}$ in 3D world coordinates (fig. 2) may be specified by:

$$\mathbf{x} = \mathbf{K}\mathbf{X}, \quad \text{and} \quad \mathbf{X} = \lambda \mathbf{K}^{-1}\mathbf{x}. \tag{1}$$

Here, $\lambda$ is an unknown scaling factor and $\mathbf{K}$ is the intrinsic camera matrix, which we assume has an inverse of the following form:

$$\mathbf{K}^{-1} = \begin{pmatrix} 1 & 0 & -o_x \\ 0 & 1 & -o_y \\ 0 & 0 & f \end{pmatrix}, \tag{2}$$

where $(o_x, o_y)$ is the origin of the camera axis in the image plane and $f$ is the focal length. The camera's focal length is an unknown that we need to recover. We assume that the camera has square pixels by setting the upper diagonal entries to 1. However, it is straightforward to extend this model to take into account non-square pixels or other optical distortions.

2.2 Camera Motion

We set the world origin to be the camera origin. The camera motion can then be described in terms of its orientation $\mathbf{R}(t)$ at time $t$. Thus, for any scene point $\mathbf{X}$, the corresponding image point $\mathbf{x}$ at time $t$ is given by:

$$\mathbf{x} = \mathbf{K}\mathbf{R}(t)\mathbf{X}. \tag{3}$$

The rotation matrices $\mathbf{R}(t) \in SO(3)$ are computed by compounding the changes in camera angle $\Delta\theta(t)$. We use SLERP (Spherical Linear intERPolation) of quaternions [Shoemake 1985] in order to interpolate the camera orientation smoothly and to avoid gimbal lock.¹ $\Delta\theta(t)$ is obtained directly from gyroscope measured rates of rotation $\omega(t)$:

$$\Delta\theta(t) = \left(\omega(t + t_d) + \omega_d\right)\Delta t. \tag{4}$$

Here $\omega_d$ is the gyroscope drift and $t_d$ is the delay between the gyroscope and frame sample timestamps. These parameters are additional unknowns in our model that we also need to recover.

¹ In practice, the change in angle between gyroscope samples is sufficiently small that Euler angles work as well as rotation quaternions.

Figure 3: High-frequency camera rotations while the shutter is rolling from top to bottom cause the output image to appear warped.

2.3 Rolling Shutter Compensation

We now introduce the notion of a rolling shutter into our camera model. Recall that in an RS camera each image row is exposed at a slightly different time. Camera rotations during this exposure will, therefore, determine the warping of the image.² For example, if the camera sways from side to side while the shutter is rolling, then the output image will be warped as shown in fig. 3. The time at which point $\mathbf{x}$ was imaged in frame $i$ depends on how far down the frame it is. More formally, we can say that $\mathbf{x}$ was imaged at time $t(i, y)$:

$$t(i, y) = t_i + t_s \cdot y / h, \quad \text{where } \mathbf{x} = (x, y, 1)^T, \tag{5}$$

where $y$ is the image row corresponding to point $\mathbf{x}$, $h$ is the total number of rows in the frame, and $t_i$ is the timestamp of the $i$-th frame. The $t_s$ term states that the farther down we are in a frame, the longer it took for the rolling shutter to get to that row. Hence, $t_s$ is the time it takes to read out a full frame going row by row from top to bottom. Note that a negative $t_s$ value would indicate a rolling shutter that goes from bottom to top. We will show how to automatically recover the sign and value of $t_s$ in section 3.

² Translational camera jitter during rolling shutter exposure does not significantly impact image warping, because objects are typically far away from the lens.

2.4 Image Warping

We now derive the relationship between image points in a pair of frames for two different camera orientations (see fig. 4). For a scene point $\mathbf{X}$ the projected points $\mathbf{x}_i$ and $\mathbf{x}_j$ in the image plane of two frames $i$ and $j$ are given by:

$$\mathbf{x}_i = \mathbf{K}\mathbf{R}(t(i, y_i))\mathbf{X}, \quad \text{and} \quad \mathbf{x}_j = \mathbf{K}\mathbf{R}(t(j, y_j))\mathbf{X}. \tag{6}$$

If we rearrange these equations and substitute for $\mathbf{X}$, we get a mapping of all points in frame $i$ to all points in frame $j$:

$$\mathbf{x}_j = \mathbf{K}\mathbf{R}(t(j, y_j))\mathbf{R}^T(t(i, y_i))\mathbf{K}^{-1}\mathbf{x}_i. \tag{7}$$
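As an illustration of eqs. 2, 4, 5 and 7, the following Python sketch maps a point from one frame to the next for a camera that rotates about a single axis. It is a minimal sketch under stated assumptions, not the report's implementation: all function names and numeric values are hypothetical, rotation is restricted to the camera's y-axis (so compounding rotations reduces to summing angles, and SLERP reduces to linear interpolation of the angle), and the gyroscope delay is assumed to have been applied to the samples already.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def intrinsics(f, ox, oy):
    """K and K^-1, with K^-1 in the form of eq. 2 (square pixels)."""
    k_inv = [[1.0, 0.0, -ox], [0.0, 1.0, -oy], [0.0, 0.0, f]]
    k = [[1.0, 0.0, ox / f], [0.0, 1.0, oy / f], [0.0, 0.0, 1.0 / f]]
    return k, k_inv

def rot_y(angle):
    """Rotation by `angle` radians about the camera's y-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def integrate_yaw(omega, dt, omega_d=0.0):
    """Eq. 4 for one axis: accumulate (omega + omega_d) * dt into angles.

    Returns the angle at sample times 0, dt, 2*dt, ... (assumption: the
    delay t_d has already been applied to the gyroscope samples).
    """
    angles = [0.0]
    for w in omega:
        angles.append(angles[-1] + (w + omega_d) * dt)
    return angles

def yaw_at(angles, dt, t):
    """Linearly interpolate the angle at time t (single-axis SLERP)."""
    pos = min(max(t / dt, 0.0), len(angles) - 1.0)
    k = min(int(pos), len(angles) - 2)
    return angles[k] + (pos - k) * (angles[k + 1] - angles[k])

def row_time(t_i, t_s, y, h):
    """Eq. 5: exposure time of row y in a frame with timestamp t_i."""
    return t_i + t_s * y / h

def warp_point(k, k_inv, r_j, r_i, x, y):
    """Eq. 7: apply K R_j R_i^T K^-1 to (x, y, 1) and dehomogenize."""
    w = matmul(matmul(k, r_j), matmul(transpose(r_i), k_inv))
    p = [w[r][0] * x + w[r][1] * y + w[r][2] for r in range(3)]
    return p[0] / p[2], p[1] / p[2]

# Hypothetical values: 100 Hz gyro, constant 0.5 rad/s yaw, 720-row frames,
# 30 ms readout, frames i and j stamped at 0 s and 1/30 s.
dt, h, t_s = 0.01, 720, 0.03
angles = integrate_yaw([0.5] * 20, dt)
k, k_inv = intrinsics(f=800.0, ox=640.0, oy=360.0)

# The image center on row 360 of frame i, assumed to land on row 360 of frame j.
t_a = row_time(0.0, t_s, 360, h)
t_b = row_time(1.0 / 30.0, t_s, 360, h)
x_j, y_j = warp_point(k, k_inv, rot_y(yaw_at(angles, dt, t_b)),
                      rot_y(yaw_at(angles, dt, t_a)), 640.0, 360.0)
```

With these values the two rows are exposed 1/30 s apart, so the relative rotation is 0.5/30 rad and the image center moves horizontally by $f \tan(0.5/30)$ pixels while its row is unchanged. The direction of the shift depends on the sign convention chosen in `rot_y`, which is an assumption of this sketch.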
Figure 4: Top view of two camera orientations and their corresponding image planes $i$ and $j$. An image of scene point $\mathbf{X}$ appears in the two frames where the ray (red) intersects their camera plane.

So far we have considered the relationship between two frames of the same video. We can relax this restriction by mapping frames from one camera that rotates according to $\mathbf{R}(t)$ to another camera that rotates according to $\mathbf{R}'(t)$. Note that we assume both camera centers are at the origin. We can now define the warping matrix $\mathbf{W}$ that maps points from one camera to the other:

$$\mathbf{W}(t_1, t_2) = \mathbf{K}\mathbf{R}'(t_1)\mathbf{R}^T(t_2)\mathbf{K}^{-1}. \tag{8}$$

Notice that eq. 7 can now be expressed more compactly as:

$$\mathbf{x}_j = \mathbf{W}(t(j, y_j), t(i, y_i))\,\mathbf{x}_i, \quad \text{where } \mathbf{R}' = \mathbf{R}. \tag{9}$$

Also note that $\mathbf{W}$ depends on both image rows $y_i$ and $y_j$ of image points $\mathbf{x}_i$ and $\mathbf{x}_j$ respectively. This warping matrix can be used to match points in frame $i$ to corresponding points in frame $j$, while taking the effects of the rolling shutter into account in both frames.

Given this formulation of a warping matrix, the algorithm for rolling shutter correction and video stabilization becomes simple. We create a synthetic camera that has a smooth motion and a global shutter. This camera's motion is computed by applying a Gaussian low-pass filter to the input camera's motion, which results in a new set of rotations $\mathbf{R}'$. We set the rolling shutter duration $t_s$ for the synthetic camera to 0 (i.e., a global shutter). We then compute $\mathbf{W}(t_i, t(i, y_i))$ at each image row $y_i$ of the current frame $i$, and apply the warp to that row. Notice that the first term of $\mathbf{W}$ now only depends on the frame time $t_i$. This operation maps all input frames onto our synthetic camera; and as a result, simultaneously removes rolling shutter warping and video shake.

In practice, we do not compute $\mathbf{W}(t_i, t(i, y_i))$ for each image row $y_i$. Instead, we subdivide the input image (fig. 5a) and compute the warp at each vertical subdivision (fig. 5c and 5d). In essence, we create a warped mesh from the input image that is a piecewise linear approximation of the non-linear warp. We find that ten subdivisions are typically sufficient to remove any visible RS artifacts.

Forssén and Ringaby [2010] refer to this sampling approach as inverse interpolation. They also propose two additional interpolation techniques, which they show empirically to perform better on a synthetic video dataset. However, we use inverse interpolation because it is easy to implement
an efficient version on the GPU using vertex shaders. The GPU's fragment shader takes care of resampling the mesh-warped image using bilinear interpolation. We find that RS warping in actual videos is typically not strong enough to produce aliasing artifacts due to bilinear inverse interpolation. As a result, inverse interpolation works well in practice.

Some prior work in rolling shutter correction makes use of global image warps such as the global affine model [Liang et al. 2008] and the global shift model [Chun et al. 2008]. These models assume that camera rotation is more or less constant during rolling shutter exposure. If this is not the case, then a linear approximation will fail to rectify the rolling shutter (fig. 5b). We evaluate the performance of a linear approximation on actual video footage in section 4.

Figure 5: (a) Warped image captured by an RS camera. (b) A global linear transformation of the image, such as the shear shown here, cannot fully rectify the warp. (c) We use a piecewise linear approximation of non-linear warping. (d) We find that 10 subdivisions are sufficient to eliminate visual artifacts.

3 Camera and Gyroscope Calibration

We now present our framework for recovering the unknown camera and gyroscope parameters. This calibration step is necessary to enable us to compute $\mathbf{W}$ directly from the gyroscope data. The unknown parameters in our model are: the focal length of the camera $f$, the duration of the rolling shutter $t_s$, the delay between the gyroscope and frame sample timestamps $t_d$, and the gyroscope drift $\omega_d$. Note that some of these parameters, such as the camera's focal length, might be specified by the manufacturer. It is alternatively possible to measure these parameters experimentally. For example, Forssén and Ringaby [2010] use a quickly flashing display to measure the rolling shutter duration $t_s$. However, these techniques tend to be imprecise and error prone; and they are also too tedious to be carried out by regular users. The duration of the rolling shutter is typically in the millisecond range. As a result, a small misalignment in $t_d$ or $t_s$ would cause rolling shutter rectification to fail.

Our approach is to estimate these parameters from a single video and gyroscope capture. The user is asked to record a video and gyroscope trace where they stand still and shake the camera while pointing at a building. A short clip of about ten seconds in duration is generally sufficient to estimate all the unknowns. Note that this only needs to be done once for each camera and gyroscope arrangement.

In our approach, we find matching points in consecutive video frames using SIFT [Lowe 2004], and we use RANSAC [Fischler and Bolles 1981] to discard outliers. The result is a set of point correspondences $\mathbf{x}_i$ and $\mathbf{x}_j$ for all neighboring frames in the captured video (fig. 6). Given this ground truth, one can formulate calibration as an optimization problem, where we want to minimize the mean-squared re-projection error of all point correspondences:

$$J = \sum_{(i,j)} \left\lVert \mathbf{x}_j - \mathbf{W}(t(j, y_j), t(i, y_i))\,\mathbf{x}_i \right\rVert^2. \tag{10}$$

Figure 6: Point correspondences in consecutive frames. We use SIFT to find potential matches. We then apply RANSAC to discard outliers that do not match the estimated homography.

Note that this is a non-linear optimization problem. A number of non-linear optimizers could be used to minimize our objective function. However, we have found coordinate descent by direct objective function evaluation to converge quickly. Each time we take a step where the objective function $J$ does not decrease, we reverse the step direction and decrease the step size of the corresponding parameter. The algorithm terminates as soon as the step size for all parameters drops below a desired threshold (i.e., when we have achieved a target precision). Our Matlab/C++ implementation typically converges in under 2 seconds for a calibration video of about 10 seconds in duration.

We initialize our optimization algorithm by setting the focal length to be such that the camera has a field of view of 45°. We set all other parameters to 0. We find that with these initial conditions, the optimizer converges to the correct solution for our dataset. More generally, we can avoid falling into a local minimum (e.g., when the delay between the gyro and frame timestamps is large) by restarting our coordinate descent algorithm for a range of plausible parameters, and selecting the best solution. The average re-projection error for correctly recovered parameters is typically around 1 pixel.

An additional unknown in our model is the relative orientation of the gyroscope to the camera. For example, rotations about the gyro's y-axis could correspond to rotations about the camera's x-axis. To discover the gyroscope orientation we permute its 3 rotation axes and run our optimizer for each permutation. The permutation that minimizes the objective best corresponds to the camera's axis ordering. We found re-projection error to be significantly larger for incorrect permutations. Therefore, this approach works well in practice.

Figure 7: Signals $\dot{x}$ (red) and $f \cdot \omega_y(t + t_d)$ (blue). Top: Before calibration the amplitude of the signals does not match, because our initial guess for $f$ is too low. In addition, the signals are shifted since we initialize $t_d$ to 0. Bottom: After calibration the signals are well aligned because we have recovered accurate focal length and gyroscope delay.

In our discussion we have assumed that the camera has a vertical rolling shutter. The RS model could be easily modified to work for image columns instead of rows. Finding the minimum re-projection error for both cases would tell us whether the camera has a horizontal or vertical rolling shutter.

Finally, in order to provide a better sense of the results achieved by calibration, we present a visualization of video and gyroscope signals before and after calibration. If we assume that rotations between consecutive frames are small, then translations in the image can be approximately computed from rotations as follows:

$$\dot{\mathbf{x}}(t) \approx f \cdot \hat{\omega}(t + t_d), \quad \text{where } \dot{\mathbf{x}} = (\dot{x}, \dot{y})^T, \ \hat{\omega} = (\omega_y, \omega_x)^T. \tag{11}$$

Here, we have also assumed no effects due to rolling shutter (i.e., $t_s = 0$), and we ignore rotations about the z-axis (i.e., $\omega_z$). We let $\dot{\mathbf{x}}$ be the average rate of translation along x and y for all point correspondences in consecutive frames. If our optimizer converged to the correct focal length $f$ and gyro delay $t_d$, then the two signals should align. Fig. 7 plots the first dimension of signals $\dot{\mathbf{x}}$ and $f \cdot \hat{\omega}(t + t_d)$ before and after alignment. Note how accurately the gyroscope data matches the image motions. This surprising precision of MEMS gyroscopes is what enables our method to perform well on the video stabilization and rolling shutter correction tasks.

4 Results

In this section we present our dataset and results for video stabilization and rolling shutter correction. We also compare our approach with a number of feature tracker based algorithms.

4.1 Video and Gyroscope Dataset

We use an iPhone 4 to capture video and gyroscope data. The platform has a MEMS gyroscope (see fig. 8), which we run at a (maximum) frequency of 100 Hz. Furthermore, the phone has an RS camera capable of capturing 720p video at 30 frames per second (fps). The frame-rate is variable; and typically adjusts in low-illumination settings to 24 fps. We record the frame timestamps as well as the