Participant Presentations - PowerPoint Presentation

briana-ranney
Uploaded On 2016-12-07


Presentation Transcript

Slide1

Participant Presentations

Draft Schedule Now on Course Web Page:

https://stor893spring2016.web.unc.edu/

When You Present:

Please Load Talk on Classroom Computer Before Class

Slide2

An Interesting Objection:

Should not Study Angles in PCA

Because PC Scores (i.e. projections) Not Consistent

For Scores and Can Show (Random!)

HDLSS Math. Stat. of PCA

Due to Dan Shen

Slide3

PC Scores (i.e. projections) Not Consistent

So how can PCA find Useful Signals in Data?

Key is “Proportional Errors”

Axes have Inconsistent Scales, But Relationships are Still Useful

HDLSS Math. Stat. of PCA

Slide4

PCA Context:

Spike Index α (as above)

Sparsity Index β: # non-0 entries ~

Compare: Conventional Sample PCA

Sparse PCA: Shen & Huang (2008)

Over Parameters α and β

HDLSS & Sparsity

Slide5

[Figure: regions of the plane with axes Spike Index α and Sparsity Index β (ticks 0 to 1). Labels: α > 1; 0 ≤ β ≤ 1; 0 ≤ α < β ≤ 1; 0 ≤ β < α ≤ 1; 0 ≤ α = β ≤ 1; Jung and Marron]

Sparse PCA: Inconsistent & New Consistency Region

Slide6

Sparse PCA Opens Up Whole New Region of Consistency

HDLSS & Sparsity

Slide7

Shen et al (2013)

Explores PCA Consistency under all of:

Classical: d fixed, n → ∞

Portnoy

Random Matrices

HDMSS

HDLSS: d → ∞, n fixed

HDLSS & Other Asymptotics

Slide8

Question:

Which Statistic to Summarize Projections?

2-Sample t statistic

Mean Difference

HDLSS Analysis of DiProPerm

Slide9

Yet both have mean 0

Reason: Less spread for original projection

E.g. Both i.i.d. with t(5) marginal

t-test summary rejects

HDLSS Analysis of DiProPerm

Slide10

Wei et al (2015)

Mathematically Driven Recommendation:

Use Mean Difference Summary to Focus on: vs.

HDLSS Analysis of DiProPerm

Slide11

Cornea Data

Main Point: OODA Beyond FDA

Recall Interplay: Object Space ↔ Descriptor Space

Slide12

Cornea Data

Cornea: Outer surface of the eye

Driver of Vision: Curvature of Cornea

Data Objects: Images on the unit disk

Radial Curvature as “Heat Map”

Special Thanks to K. L. Cohen, N. Tripoli, UNC Ophthalmology

Slide13

Cornea Data

Cornea Data:

Raw Data

Decompose Into Modes of Variation?

Slide14

Cornea Data

Data Representation - Zernike Basis

Pixels as features is large and wasteful

Natural to find more efficient represent’n

Polar Coordinate Tensor Product of:

Fourier basis (angular)

Special Jacobi (radial, to avoid singularities)

See:

Schwiegerling, Greivenkamp & Miller (1995)

Born & Wolf (1980)

Slide15

Cornea Data

Data Representation - Zernike Basis

Descriptor Space is Vector Space of Zernike Coefficients

So Perform PCA There

Slide16

PCA of Cornea Data

Recall: PCA can find (often insightful) direction of greatest variability

Main problem: display of result (no overlays for images)

Solution: show movie of “marching along the direction vector”

Slide17

PCA of Cornea Data

PC1 Movie:

Slide18

PCA of Cornea Data

PC1 Summary:

Mean (1st image): mild vert’l astigmatism

known pop’n structure called “with the rule”

Main dir’n: “more curved” & “less curved”

Corresponds to first optometric measure (89% of variat’n, in Mean Resid. SS sense)

Also: “stronger astig’m” & “no astig’m”

Found corr’n between astig’m and curv’re

Scores (cyan): Apparent Gaussian dist’n

Slide19

PCA of Cornea Data

PC2 Movie:

Slide20

PCA of Cornea Data

PC2 Movie:

Mean: same as above

Common centerpoint of point cloud

Are studying “directions from mean”

Images along direction vector:

Looks terrible???

Why?

Slide21

PCA of Cornea Data

PC2 Movie:

Reason made clear in Scores Plot (cyan):

Single outlying data object drives PC dir’n

A known problem with PCA

Recall finds direction with “max variation”

In sense of variance

Easily dominated by single large observat’n

Slide22

PCA of Cornea Data

Toy Example: Single Outlier Driving PCA

Slide23
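A minimal sketch of how a single outlier can drive a PCA direction. The 2-d data, the outlier location and the helper name `pc1_direction` are illustrative assumptions, not the toy example pictured on the slide:

```python
import math

def pc1_direction(points):
    """Unit vector along the first principal component of 2-d points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(angle), math.sin(angle)

# Hypothetical data: wide spread along x, small wiggle in y.
clean = [(float(x), 0.5 if x % 2 else -0.5) for x in range(-10, 11)]
# Same data plus one far-away observation in the y direction.
contaminated = clean + [(0.0, 60.0)]

dir_clean = pc1_direction(clean)            # essentially the x axis
dir_outlier = pc1_direction(contaminated)   # essentially the y axis
```

One added point out of 22 is enough to rotate PC1 by about 90 degrees here, because the variance in y becomes dominated by the single large observation.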

PCA of Cornea Data

PC2 Affected by Outlier:

How bad is this problem?

View 1: Statistician: Arrggghh!!!!

Outliers are very dangerous

Can give arbitrary and meaningless dir’ns

Slide24

PCA of Cornea Data

PC2 Affected by Outlier:

How bad is this problem?

View 2: Ophthalmologist: No Problem

Driven by “edge effects” (see raw data)

Artifact of “light reflection” data gathering (“eyelid blocking”, and drying effects)

Routinely “visually ignore” those anyway

Found interesting (& well known) dir’n: steeper superior vs steeper inferior

Slide25

Outliers in PCA

PCA for Deeper Toy E.g. Data:

Slide26

Outliers in PCA

What can (should?) be done about outliers?

Context 1: Outliers are important aspects of the population

They need to be highlighted in the analysis

Although could separate into subpopulations

Context 2: Outliers are “bad data”, of no interest

recording errors? Other mistakes?

Then should avoid distorted view of PCA

Slide27

Outliers in PCA

Motivates alternate approach: Robust Statistical Methods

Recall main idea: Downweight (instead of delete) outliers

There is a large literature. Good intro’s (from different viewpoints) are:

Huber (2011)

Hampel, et al (2011)

Staudte & Sheather (2011)

Slide28

Outliers in PCA

Controversy: Is median’s “equal vote” scheme good or bad?

Huber: Outliers contain some information,

So should only control “influence” (e.g. median)

Hampel, et al.: Outliers contain no useful information

Should be assigned weight 0 (not done by median)

Using “proper robust method” (not simply deleted)

Slide29

Outliers in PCA

Robustness Controversy (cont.):

Both are “right” (depending on context)

Source of major (unfortunately bitter) debate!

Application to Cornea data:

Huber’s model more sensible

Already know some useful info in each data point

Thus “median type” methods are sensible

Slide30

Robust PCA

What is multivariate median?

There are several! (“median” generalizes in different ways)

i. Coordinate-wise median

Often worst

Not rotation invariant (2-d data uniform on “L”)

Can lie on convex hull of data (same example)

Thus poor notion of “center”

Slide31

Robust PCA

Coordinate-wise median

Not rotation invariant

Thus poor notion of “center”

Slide32

Robust PCA

Coordinate-wise median

Can lie on convex hull of data

Thus poor notion of “center”

Slide33
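Both weaknesses of the coordinate-wise median can be seen on a tiny example. The sketch below uses three hypothetical points forming an "L" (a simplification of the "2-d data uniform on L" picture), and the helper names are illustrative:

```python
import math
from statistics import median

def coordwise_median(points):
    """Median taken separately in each coordinate."""
    return tuple(median(coord) for coord in zip(*points))

def rotate(point, angle):
    """Rotate a 2-d point about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

# Hypothetical "L"-shaped data: the corner and the two ends of the arms.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
angle = math.pi / 4

# The coordinate-wise median is the corner itself, a vertex of the convex hull.
center = coordwise_median(points)

# Rotating the data first and then taking the median ...
rotated_then_median = coordwise_median([rotate(p, angle) for p in points])
# ... differs from taking the median first and then rotating it.
median_then_rotated = rotate(center, angle)
```

Here `median_then_rotated` stays at the origin, while `rotated_then_median` moves up to roughly (0, 0.707), so the estimate depends on the choice of coordinate axes.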

Robust PCA

What is multivariate median (cont.)?

ii. Simplicial depth (a.k.a. “data depth”): Liu (1990)

“Paint Thickness” of dim “simplices” with corners at data

Nice idea

Good invariance properties

Slow to compute

Slide34

Robust PCA

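What is multivariate median (cont.)?

iii. Huber’s M-estimate:

Given data $X_1, \dots, X_n \in \mathbb{R}^d$,

Estimate “center of population” by

$\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{n} \|X_i - \theta\|_2^p$

Where $\|\cdot\|_2$ is the usual Euclidean norm

Here: use only $p = 1$ (minimal impact by outliers)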

 Slide35

Robust PCA

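Huber’s M-estimate (cont):

Estimate “center of population” by

$\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{n} \|X_i - \theta\|_2^p$

Case $p = 2$: Can show $\hat{\theta} = \bar{X}$ (sample mean)

(also called “Fréchet Mean”, …)

Again Here: use only $p = 1$ (minimal impact by outliers)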
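The sample-mean case (squared Euclidean distances) can be checked in one line by setting the gradient of the objective to zero. This is a sketch; the notation $X_1, \dots, X_n$ for the data and $\theta$ for the candidate center is an assumption, since the slide's own symbols did not survive in the transcript:

```latex
\frac{\partial}{\partial \theta} \sum_{i=1}^{n} \|X_i - \theta\|_2^2
  = -2 \sum_{i=1}^{n} (X_i - \theta) = 0
\quad\Longleftrightarrow\quad
\theta = \frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X}
```

Because the squared-distance objective is strictly convex, this stationary point is the unique minimizer.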

 Slide36

Robust PCA

M-estimate (cont.):

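A view of minimizer (data $X_1, \dots, X_n$, candidate center $\theta$): solution of

$0 = \frac{\partial}{\partial \theta} \sum_{i=1}^{n} \|X_i - \theta\|_2 = -\sum_{i=1}^{n} \frac{X_i - \theta}{\|X_i - \theta\|_2}$

A useful viewpoint is based on:

$P_{Sph(\theta,1)} X_i$ = “Proj’n of data onto sphere centered at $\theta$ with radius 1”

And representation:

$P_{Sph(\theta,1)} X_i = \theta + \frac{X_i - \theta}{\|X_i - \theta\|_2}$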

 Slide37

Robust PCA

M-estimate (cont.):

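Thus the solution of

$0 = \sum_{i=1}^{n} \frac{X_i - \theta}{\|X_i - \theta\|_2}$

is the solution of:

$0 = \frac{1}{n} \sum_{i=1}^{n} \left( P_{Sph(\theta,1)} X_i - \theta \right)$

(with $P_{Sph(\theta,1)} X_i$ the projection of $X_i$ onto the unit sphere centered at $\theta$)

So $\hat{\theta}$ is location where projected data are centered

“Slide sphere around until mean (of projected data) is at center”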
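One standard way to find such a center numerically is Weiszfeld's iteration, sketched below. This is an assumption on my part: the slides cite Gower (1974) for an iterative algorithm, which may differ in detail. The data and function names are hypothetical. The second helper checks the "mean of projected data is at center" condition directly.

```python
import math

def geometric_median(points, n_iter=1000, tol=1e-12):
    """Weiszfeld iteration for the minimizer of the summed Euclidean distances."""
    dim = len(points[0])
    # Start from the sample mean.
    theta = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    for _ in range(n_iter):
        # Weight each point by 1 / distance (guarding against division by zero).
        weights = [1.0 / max(math.dist(p, theta), 1e-15) for p in points]
        total = sum(weights)
        new_theta = [
            sum(w * p[j] for w, p in zip(weights, points)) / total
            for j in range(dim)
        ]
        shift = math.dist(new_theta, theta)
        theta = new_theta
        if shift < tol:
            break
    return theta

def mean_projection_offset(points, theta):
    """Mean of the unit-sphere projections of the data, minus the sphere center."""
    offset = [0.0] * len(theta)
    for p in points:
        dist = math.dist(p, theta)
        for j in range(len(theta)):
            offset[j] += (p[j] - theta[j]) / dist / len(points)
    return offset

# Hypothetical data: four corners of a unit square plus one distant outlier.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
center = geometric_median(data)
offset = mean_projection_offset(data, center)
```

For this data the sample mean is (20.4, 20.4), far outside the square, while the iteration settles near (0.79, 0.79) and the offset is numerically zero.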

 Slide38

Robust PCA

M-estimate (cont.):

Data are + signs

Slide39

Robust PCA

M-estimate (cont.):

Data are + signs

Sample Mean, outside “hot dog” of data

Slide40

Robust PCA

M-estimate (cont.):

Candidate Sphere Center

Slide41

Robust PCA

M-estimate (cont.):

Candidate Sphere Center

Projections Of Data

Slide42

Robust PCA

M-estimate (cont.):

Candidate Sphere Center

Projections Of Data

Mean of Projections

Slide43

Robust PCA

M-estimate (cont.):

“Slide sphere around until mean (of projected data) is at center”

Slide44

Robust PCA

M-estimate (cont.):

Additional literature:

Called “geometric median” (long before Huber) by: Haldane (1948)

Shown unique for by: Milasevic and Ducharme (1987)

Useful iterative algorithm: Gower (1974) (see also Sec. 3.2 of Huber (2011)).

Cornea Data experience: works well for

Slide45

Robust PCA

M-estimate for Cornea Data:

Sample Mean, M-estimate

Definite improvement

But outliers still have some influence

Improvement? (will suggest one soon)

Slide46

Robust PCA

Now have robust measure of “center”, how about “spread”?

I.e. how can we do robust PCA?

Slide47

Robust PCA

Now have robust measure of “center”, how about “spread”?

Parabs e.g. from above

With an “outlier” (???) Added in

Slide48

Robust PCA

Now have robust measure of “center”, how about “spread”?

Small Impact on Mean

Slide49

Robust PCA

Now have robust measure of “center”, how about “spread”?

Small Impact on Mean

More on PC1 Dir’n

Slide50

Robust PCA

Now have robust measure of “center”, how about “spread”?

Small Impact on Mean

More on PC1 Dir’n

Dominates Residuals

Thus PC2 Dir’n & PC2 scores

Slide51

Robust PCA

Now have robust measure of “center”, how about “spread”?

Small Impact on Mean

More on PC1 Dir’n

Dominates Residuals

Thus PC2 Dir’n & PC2 scores

Tilt now in PC3

Visualization is very Useful diagnostic

Slide52

Robust PCA

Now have robust measure of “center”, how about “spread”?

How can we do robust PCA?

Slide53

Robust PCA

Approaches to Robust PCA:

1. Robust Estimation of Covariance Matrix

2. Projection Pursuit

3. Spherical PCA

Slide54

Robust PCA

Robust PCA 1: Robust Estimation of Covariance Matrix

A. Component-wise Robust Covariances:

Major problem: Hard to get non-negative definiteness

B. Minimum Volume Ellipsoid: Rousseeuw & Leroy (2005)

Requires (in available software)

Needed for simple definition of affine invariant

Slide55

Important Aside

Major difference between FDA (OODA) & Classical Multivariate Analysis

High Dimension, Low Sample Size Data (sample size n < dimension d)

Classical Multivariate Analysis:

start with “sphering data” (multiply by $\hat{\Sigma}^{-1/2}$)

but $\hat{\Sigma}^{-1/2}$ doesn’t exist for HDLSS data

Slide56

Important Aside

Classical Approach to HDLSS data: “Don’t have enough data for analysis, get more”

Unworkable (and getting worse) for many modern settings:

Medical Imaging (e.g. Cornea Data)

Micro-arrays & gene expression

Chemometric spectra data

Slide57

Robust PCA

Robust PCA 2: Projection Pursuit

Idea: focus on “finding direction of greatest variability”

Reference: Li and Chen (1985)

Problems:

Robust estimates of “spread” are nonlinear

Results in many local optima

Slide58

Robust PCA

Robust PCA 2: Projection Pursuit (cont.)

Problems:

Results in many local optima

Makes search problem very challenging

Especially in very high dimensions

Most examples have

Guoying Li: “I’ve heard of , but 60 seems too big”

Slide59

Robust PCA

Robust PCA 3: Spherical PCA

Locantore et al (1999)

Slide60

Robust PCA

Robust PCA 3: Spherical PCA

Idea: use “projection to sphere” idea from M-estimation

In particular project data to centered sphere

“Hot Dog” of data becomes “Ice Caps”

Easily found by PCA (on proj’d data)

Outliers pulled in to reduce influence

Radius of sphere unimportant

Slide61

Robust PCA

Robust PCA 3: Spherical PCA

Independent Derivation & Alternate Name: PCA of Spatial Signs

(think: multivariate extension of “sign test”)

Idea: test using #(+) & #(-)

Slide62

Robust PCA

Robust PCA 3: Spherical PCA

Independent Derivation & Alternate Name: PCA of Spatial Signs

1st Paper: Möttönen & Oja (1995)

Complete Description: Oja (2010)

Slide63

Robust PCA

Spatial Signs

Interesting Variation: Spatial Ranks

Idea: Keep Track of “Depth” Via Ranks of Radii

Slide64

Robust PCA

Spherical PCA for Toy Example:

Curve Data With an Outlier

First recall Conventional PCA

Slide65

Robust PCA

Spherical PCA for Toy Example:

Now do Spherical PCA

Better result?

Slide66

Robust PCA

Spherical PCA for Toy Data:

Mean looks “smoother”

PC1 nearly “flat” (unaffected by outlier)

PC2 is nearly “tilt” (again unaffected by outlier)

PC3 finally strongly driven by outlier

OK, since all other directions “about equal in variation”

Energy Plot, no longer ordered (outlier drives SS, but not directions)

Slide67

Robust PCA

Spherical PCA for Toy Example:

Check out Later Components

Slide68

Aside On Visualization

Recall Multivariate Data Visualization Tool: Parallel Coordinates

E.g. Fisher Iris Data

Named Variables (thanks to Wikipedia)

Slide69

Aside On Visualization

Recall Multivariate Data Visualization Tool: Parallel Coordinates

E.g. Fisher Iris Data

Named Variables

Curves are Data Objects (4-vectors)

Inselberg (1985, 2009)

Slide70

Robust PCA

Useful View: Parallel Coordinates Plot

X-axis: Zernike Coefficient Number

Y-axis: Coefficient

Slide71

Robust PCA

Cornea Data, Parallel Coordinates Plot:

Top Plot: Zernike Coefficients

Slide72

Robust PCA

Cornea Data, Parallel Coordinates Plot:

Top Plot: Zernike Coefficients

All n = 43 very Similar.

Slide73

Robust PCA

Cornea Data, Parallel Coordinates Plot:

Top Plot: Zernike Coefficients

All n = 43 very Similar

Most Action in few Low Freq. Coeffs.

Slide74

Robust PCA

Cornea Data, Parallel Coordinates Plot

Middle Plot: (Zernike Coefficients – median)

Most Variation in lowest frequencies

E.g. as in Fourier compression of smooth signals

Projecting on sphere will destroy this

By magnifying high frequency behavior

Bottom Plot: discussed later

Slide75

Robust PCA

Spherical PCA

Problem: Magnification of High Freq. Coeff’s

Solution: Elliptical Analysis

Main idea: project data onto suitable ellipse, not sphere

Which ellipse? (in general, this is problem that PCA solves!)

Simplification: Consider ellipses parallel to coordinate axes

Slide76

Robust PCA

Spherical PCA

Problem: Magnification of High Freq. Coeff’s

Solution: Elliptical Analysis

Background (Univariate):

MAD = Median Absolute Deviation

MAD = $\mathrm{median}_i \, | x_i - \mathrm{median}_j \, x_j |$

Simple, High Breakdown, Outlier Resistant, Measure of “Scale”

Slide77

Robust PCA

[Figure labels: Rescale Coords; Unscale Coords; Spherical PCA]

Slide78

Robust PCA

Elliptical Analysis (cont.):

Simple Implementation, via coordinate axis rescaling

Divide each axis by MAD

Project Data to sphere (in transformed space)

Return to original space (mul’ply by orig’l MAD) for analysis

Do PCA on Projections

Slide79
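The rescale, project and unscale steps of the elliptical analysis can be sketched as below. The data, the center passed in and the function names are hypothetical, and the final step (ordinary PCA on the returned points) is left out to keep the sketch short.

```python
from statistics import median

def mad(values):
    """Median absolute deviation from the median."""
    values = list(values)
    med = median(values)
    return median(abs(v - med) for v in values)

def elliptical_projection(points, center):
    """Divide axes by MAD, project to the unit sphere, then undo the rescaling."""
    scales = [mad(coord) for coord in zip(*points)]
    projected = []
    for p in points:
        # Step 1: divide each (centered) axis by its MAD.
        z = [(pj - cj) / s for pj, cj, s in zip(p, center, scales)]
        norm = sum(zj * zj for zj in z) ** 0.5
        if norm == 0.0:
            projected.append(tuple(center))  # a point at the center stays there
            continue
        # Step 2: project to the unit sphere in the transformed space.
        # Step 3: multiply by the original MAD to return to the original space.
        projected.append(
            tuple(cj + s * zj / norm for cj, s, zj in zip(center, scales, z))
        )
    return projected, scales

# Hypothetical data: first coordinate on a much larger scale than the second,
# with one outlying observation at the end.
data = [(10.0, 1.0), (-20.0, 3.0), (30.0, -2.0), (-5.0, -1.0), (400.0, 50.0)]
center = (0.0, 0.0)  # assumed robust center

projected, scales = elliptical_projection(data, center)
```

Every returned point lies on the axis-parallel ellipse with semi-axes equal to the per-coordinate MADs (20 and 2 here), so the large-scale coordinate keeps its larger share of the variation instead of being flattened onto a sphere.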

Robust PCA

Elliptical Estimate of “center”:

Do M-estimation in transformed space (then transform back)

Results for cornea data:

Sample Mean, Spherical Center, Elliptical Center

Elliptical clearly best

Nearly no edge effect

Slide80

Robust PCA

Elliptical PCA for cornea data:

Original PC1, Elliptical PC1

Slide81

Robust PCA

Elliptical PCA for cornea data:

Original PC1, Elliptical PC1

Still finds overall curvature & correlated astigmatism

Minor edge effects almost completely gone

Slide82

Robust PCA

Elliptical PCA for cornea data:

Original PC2, Elliptical PC2

Slide83

Robust PCA

Elliptical PCA for cornea data:

Original PC1, Elliptical PC1

Still finds overall curvature & correlated astigmatism

Minor edge effects almost completely gone

Original PC2, Elliptical PC2

Huge edge effects dramatically reduced

Still finds steeper superior vs. inferior

Slide84

Robust PCA

Elliptical PCA for cornea data:

Original PC3, Elliptical PC3

Slide85

Robust PCA

Elliptical PCA for Cornea Data (cont.):

Original PC3, Elliptical PC3

Edge effects greatly diminished

But some of against the rule astigmatism also lost

Price paid for robustness

Slide86

Robust PCA

Elliptical PCA for cornea data:

Original PC4, Elliptical PC4

Slide87

Robust PCA

Elliptical PCA for Cornea Data (cont.):

Original PC3, Elliptical PC3

Edge effects greatly diminished

But some of against the rule astigmatism also lost

Price paid for robustness

Original PC4, Elliptical PC4

Now looks more like variation on astigmatism???

Slide88

Robust PCA

Current state of the art:

Spherical & Elliptical PCA are a kludge

Put together by Robustness Amateurs

To solve this HDLSS problem

Good News: Robustness Pros are now in the game:

Maronna, et al (2006), Sec. 6.10.2

Slide89

Robust PCA

Disclaimer on robust analy’s of Cornea Data:

Critical parameter is “radius of analysis”

Shown above: Elliptical PCA very effective

Stronger edge effects: Elliptical PCA less useful

Edge effects weaker: don’t need robust PCA

Slide90

Big Picture View of PCA

Above View:

PCA finds optimal directions in point cloud

Slide91

Big Picture View of PCA

Above View:

PCA finds optimal directions in point cloud

Slide92

Big Picture View of PCA

Above View:

PCA finds optimal directions in point cloud

Maximize projected variation

Minimize residual variation

(same by Pythagorean Theorem)

Notes:

Get useful insights about data

Can compute for any point cloud

But there are other views.

Slide93
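The equivalence of "maximize projected variation" and "minimize residual variation" can be written out explicitly. As a sketch, assume mean-centered data $x_1, \dots, x_n$ and a unit direction vector $u$ (this notation is not from the slides):

```latex
\sum_{i=1}^{n} \|x_i\|_2^2
  = \underbrace{\sum_{i=1}^{n} (u^\top x_i)^2}_{\text{projected variation}}
  + \underbrace{\sum_{i=1}^{n} \|x_i - (u^\top x_i)\,u\|_2^2}_{\text{residual variation}}
```

The left side does not depend on $u$, so any direction that maximizes the first term on the right also minimizes the second.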

Big Picture View of PCA

Alternate Viewpoint: Gaussian Likelihood

Slide94

Big Picture View of PCA

Alternate Viewpoint: Gaussian Likelihood

Slide95

Big Picture View of PCA

Alternate Viewpoint: Gaussian Likelihood

When data are multivariate Gaussian

PCA finds major axes of ellipt’al contours of Probability Density

Maximum Likelihood Estimate

Slide96

Big Picture View of PCA

Alternate Viewpoint: Gaussian Likelihood

Maximum Likelihood Estimate

Slide97

Big Picture View of PCA

Alternate Viewpoint: Gaussian Likelihood

When data are multivariate Gaussian

PCA finds major axes of ellipt’al contours of Probability Density

Maximum Likelihood Estimate

Mistaken idea: PCA only useful for Gaussian data

Slide98

Big Picture View of PCA

Simple check for Gaussian distribution:

Standardized parallel coordinate plot

Subtract coordinate wise median (robust version of mean)

(not good as “point cloud center”, but now only looking at coordinates)

Divide by MAD / MAD(N(0,1)) (put on same scale as “standard deviation”)

See if data stays in range –3 to +3

Slide99
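The robust standardization used for this Gaussianity check can be sketched for a single coordinate as follows. The data values and names are hypothetical. MAD(N(0,1)) is the 0.75 quantile of the standard normal, about 0.6745.

```python
from statistics import NormalDist, median

# MAD of a standard normal distribution: its 0.75 quantile, about 0.6745.
MAD_STD_NORMAL = NormalDist().inv_cdf(0.75)

def robust_standardize(values):
    """(value - median) / (MAD / MAD(N(0,1))) for one coordinate."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = mad / MAD_STD_NORMAL  # comparable to a standard deviation
    return [(v - med) / scale for v in values]

# Hypothetical coordinate with one wild value.
coordinate = [1.0, 2.0, 3.0, 4.0, 100.0]
z = robust_standardize(coordinate)

# Values outside the -3 to +3 band would be surprising under a Gaussian model.
outside = [v for v in z if abs(v) > 3.0]
```

In a parallel coordinate plot this is applied to every coordinate separately, so each axis ends up on a common robust scale.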

Big Picture View of PCA

E.g. Cornea Data: Standardized Parallel Coordinate Plot

Shown before

Slide100

Big Picture View of PCA

Raw Cornea Data:

Data – Median

(Data – Median) / MAD

Slide101

Big Picture View of PCA

Check for Gaussian dist’n: Stand’zed Parallel Coord. Plot

E.g. Cornea data (recall image view of data)

Several data points > 20 “s.d.s” from the center

Distribution clearly not Gaussian

Strong kurtosis (“heavy tailed”)

But PCA still gave strong insights

Slide102

Big Picture View of PCA

Mistaken idea: PCA only useful for Gaussian data

Toy Example:

Each Marginal Binary

Clearly NOT Gaussian

n = 100, d = 4000

Slide103

Big Picture View of PCA

Mistaken idea: PCA only useful for Gaussian data

But PCA Reveals Trimodal Structure

Slide104

GWAS Data Analysis

Genome Wide Association Study (GWAS)

Data Objects: Vectors of Genetic Variants, at known chromosome locations (Called SNPs)

Discrete (takes on 2 or 3 values)

Dimension as large as ~5 million

(can be reduced, e.g.

Slide105

GWAS Data Analysis

Genome Wide Association Study (GWAS)

Cystic Fibrosis Study: Wright et al (2011)

Interesting Feature: Some Subjects are Close Relatives (e.g. ~half SNPs are same)

Slide106

GWAS Data Analysis

PCA View

Clear Ethnic Groups

Slide107

GWAS Data Analysis

PCA View

Clear Ethnic Groups

And Several Outliers!

Eliminate With Spherical PCA?

Slide108

GWAS Data Analysis

Spherical PCA

Looks Same?!?

What is going on?

Slide109

GWAS Data Analysis

Explanation:

HDLSS geometric representation

Recall in limit as d → ∞ with n fixed,

Data lie near surface of sphere

Data tend to be ~orthogonal

Family members are half the same

Thus relatively small angle

Enough for families to dominate PCs

Spherical PC doesn’t change anything!

Slide110
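The HDLSS geometric representation can be illustrated with a small simulation. This is a sketch under a strong simplifying assumption: independent standard Gaussian coordinates stand in for the discrete SNP data, and a "relative" is modeled as a vector sharing half of its coordinates with a "parent". All names are hypothetical.

```python
import math
import random

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

rng = random.Random(0)
d = 5000  # dimension, much larger than the number of vectors

parent = [rng.gauss(0.0, 1.0) for _ in range(d)]
stranger = [rng.gauss(0.0, 1.0) for _ in range(d)]
# A "relative" copies the first half of the parent's coordinates.
child = parent[: d // 2] + [rng.gauss(0.0, 1.0) for _ in range(d // 2)]

radius_ratio = norm(parent) / math.sqrt(d)  # near 1: data near a sphere
cos_unrelated = cosine(parent, stranger)    # near 0: roughly orthogonal
cos_family = cosine(parent, child)          # near 0.5: an angle of about 60 degrees
```

All vectors already have nearly the same length, so projecting them to a sphere changes little, while the smaller angle within a family remains and can still dominate the leading PCs.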

GWAS Data Analysis

Alternate Approach: L1 PCA

Idea: replace $L^2$ norm, $\|x\|_2 = \left( \sum_j x_j^2 \right)^{1/2}$,

By $L^1$ norm: $\|x\|_1 = \sum_j |x_j|$

More robust, since no square

 Slide111

L1 Statistics

E.g. Simple Linear Regression

Replace Best L2 Fit

Slide112

L1 Statistics

E.g. Simple Linear Regression

Replace Best L2 Fit

With Best L1 Fit

Slide113

L1 Statistics

E.g. Simple Linear Regression

Best L1 Fit

Advantages:

Robust Against Outliers

Good “Sparsity” Properties

Slide114
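The contrast between the best L2 fit and the best L1 fit in simple linear regression can be sketched on a small hypothetical data set. The L1 fit below relies on the known property that some least-absolute-deviation line passes through two of the data points, so a brute-force search over pairs is enough for small n. This is an illustrative method, not the algorithm used for the slides.

```python
from itertools import combinations

def l1_line(points):
    """Least-absolute-deviation line, searched over lines through two data points."""
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # vertical line, not of the form y = a + b x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(abs(y - (intercept + slope * x)) for x, y in points)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

def l2_line(points):
    """Ordinary least-squares line."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: six points on y = 1 + 2x and one outlier.
data = [(float(x), 1.0 + 2.0 * x) for x in range(6)] + [(6.0, 100.0)]

l1_intercept, l1_slope = l1_line(data)  # recovers the line through the six points
l2_intercept, l2_slope = l2_line(data)  # pulled strongly toward the outlier
```

The L1 fit ignores the size of the outlier's residual beyond its absolute value, while the squared residual in the L2 criterion lets the single outlier dominate.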

L1 PCA

Calculation:

Clever “backwards” algorithm

Brooks, Dulá, Boone (2013)

Slide115

L1 PCA

Challenge: L1 Projections Hard to Interpret

2-d Toy Example

Note Outlier

Slide116

L1 PCA

Challenge: L1 Projections Hard to Interpret

Parallel Coordinate View

Slide117

L1 PCA

Conventional L2 PCA

Outlier Pulls Off PC1 Direction

Slide118

L1 PCA

L1 PCA

Much Better PC1 Direction

Slide119

L1 PCA

L1 PCA

Much Better PC1 Direction

But Very Strange Projections (i.e. Little Data Insight)

Slide120

L1 PCA

L1 PCA

Reason: SVD Rotation Before L1 Computation

Note: L1 Methods Not Rotation Invariant