mrsimon
Uploaded On 2020-06-16



Presentation Transcript

Slide1

Environmental Data Analysis with MatLab

Lecture 4:

Multivariate Distributions

Slide2

SYLLABUS

Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectra
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps

Slide3

purpose of the lecture

understanding propagation of error

from many data to several inferences

Slide4

probability

with several variables

Slide5

example

100 birds live on an island

30 tan pigeons
20 white pigeons
10 tan gulls
40 white gulls

treat the species and color of the birds as random variables

Slide6

Joint Probability, P(s,c)

              color, c
species, s    tan, t    white, w
pigeon, p     30%       20%
gull, g       10%       40%

probability that a bird has a species, s, and a color, c.

Slide7

probabilities must add up to 100%

Slide8

P(s,c)

              color, c
species, s    tan, t    white, w
pigeon, p     30%       20%
gull, g       10%       40%

sum rows: P(s)

species, s
pigeon, p     50%
gull, g       50%

sum columns: P(c)

color, c
tan, t        40%
white, w      60%

Univariate probabilities can be calculated by summing the rows and columns of P(s,c)

P(s): probability of species, irrespective of color

P(c): probability of color, irrespective of species
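The row and column sums can be checked with a few lines of code. This is a minimal sketch in Python rather than MatLab; the dictionary layout and the names `joint` and `marginals` are illustrative assumptions, not part of the lecture.

```python
# Joint probability P(s,c) from the bird example, keyed by (species, color).
joint = {
    ("pigeon", "tan"): 0.30, ("pigeon", "white"): 0.20,
    ("gull", "tan"): 0.10, ("gull", "white"): 0.40,
}

def marginals(joint):
    """Return (P(s), P(c)) by summing the joint table over the other variable."""
    p_species, p_color = {}, {}
    for (s, c), p in joint.items():
        p_species[s] = p_species.get(s, 0.0) + p  # sum along a row
        p_color[c] = p_color.get(c, 0.0) + p      # sum down a column
    return p_species, p_color

p_species, p_color = marginals(joint)
total = sum(joint.values())  # should be 1, i.e. 100%

print(p_species)
print(p_color)
print(total)
```

Up to floating-point rounding, the sums reproduce the 50%/50% species split and the 40%/60% color split.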

Slide9

probability of species, irrespective of color: P(s) = Σ_c P(s,c)

probability of color, irrespective of species: P(c) = Σ_s P(s,c)

Slide10

Conditional probabilities: probability of one thing, given that you know another

P(s,c): probability of color and species

              color, c
species, s    tan, t    white, w
pigeon, p     30%       20%
gull, g       10%       40%

divide by row sums

P(c|s): probability of color, given species

              color, c
species, s    tan, t    white, w
pigeon, p     60%       40%
gull, g       20%       80%

divide by column sums

P(s|c): probability of species, given color

              color, c
species, s    tan, t    white, w
pigeon, p     75%       33%
gull, g       25%       67%

Slide11

calculation of conditional probabilities

divide each species by fraction of birds of that color: P(s|c) = P(s,c) / P(c)

divide each color by fraction of birds of that species: P(c|s) = P(s,c) / P(s)
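The two divisions can be sketched in code as follows. This is an illustrative Python version (the lecture itself uses MatLab); the function names are assumptions, and both results are keyed by (species, color) for convenience.

```python
# Joint probability P(s,c) from the bird example, keyed by (species, color).
joint = {
    ("pigeon", "tan"): 0.30, ("pigeon", "white"): 0.20,
    ("gull", "tan"): 0.10, ("gull", "white"): 0.40,
}

def color_given_species(joint):
    """P(c|s) = P(s,c) / P(s): divide each entry by its row (species) sum."""
    p_s = {}
    for (s, c), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
    return {(s, c): p / p_s[s] for (s, c), p in joint.items()}

def species_given_color(joint):
    """P(s|c) = P(s,c) / P(c): divide each entry by its column (color) sum."""
    p_c = {}
    for (s, c), p in joint.items():
        p_c[c] = p_c.get(c, 0.0) + p
    return {(s, c): p / p_c[c] for (s, c), p in joint.items()}

p_c_given_s = color_given_species(joint)
p_s_given_c = species_given_color(joint)

print(p_c_given_s[("pigeon", "tan")])  # probability of tan, given pigeon
print(p_s_given_c[("pigeon", "tan")])  # probability of pigeon, given tan
```

Each row of P(c|s) and each column of P(s|c) sums to 100%, which is a quick sanity check on the division.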

Slide12

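Bayes Theorem

P(s,c) = P(s|c) P(c) and P(s,c) = P(c|s) P(s)

same, so solving for P(s,c): P(s|c) P(c) = P(c|s) P(s)

rearrange: P(s|c) = P(c|s) P(s) / P(c) and P(c|s) = P(s|c) P(c) / P(s)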

Slide13

3 ways to write P(c) and P(s)

P(s) = Σ_c P(s,c) = Σ_c P(s|c) P(c)

P(c) = Σ_s P(s,c) = Σ_s P(c|s) P(s)

Slide14

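so 3 ways to write Bayes Theorem

P(s|c) = P(c|s) P(s) / P(c)

P(s|c) = P(c|s) P(s) / Σ_s P(s,c)

P(s|c) = P(c|s) P(s) / Σ_s P(c|s) P(s)

the last way seems the most complicated, but it is also the most useful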

Slide15

Beware!

P(s|c) is not the same as P(c|s)

confusing the two is a major cause of error, both among scientists and the general public

Slide16

example

probability that a dead person succumbed to pancreatic cancer (as contrasted to some other cause of death)

P(cancer|death) = 1.4%

probability that a person diagnosed with pancreatic cancer will die of it in the next five years

P(death|cancer) = 90%

vastly different numbers

Slide17

Bayesian Inference

“updating information”

An observer on the island has sighted a bird.

We want to know whether it’s a pigeon.

When the observer says, “bird sighted”, the probability that it’s a pigeon is:

P(s=p) = 50%

since pigeons comprise half of the birds on the island.

Slide18

Now the observer says, “the bird is tan”.

The probability that it’s a pigeon changes.

We now want to know P(s=p|c=t)

The conditional probability that it’s a pigeon, given that we have observed its color to be tan.

Slide19

we use the formula

P(s=p|c=t) = P(s=p,c=t) / P(c=t) = 30% / 40% = 75%

numerator: % of tan pigeons

denominator: % of tan birds

Slide20

observation of the bird’s color changed the probability that it was a pigeon

from 50% to 75%

thus Bayes Theorem offers a way to assess the value of an observation
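The update from 50% to 75% can be reproduced with the form of Bayes Theorem whose denominator is a sum over species. The sketch below is an illustrative Python version; the names `prior`, `p_tan_given` and `posterior` are assumptions, and the input numbers are the island values P(s) and P(c=t|s).

```python
# Prior: probability of each species before the color is known, P(s).
prior = {"pigeon": 0.5, "gull": 0.5}

# Probability of observing a tan bird, given the species, P(c=t|s).
p_tan_given = {"pigeon": 0.6, "gull": 0.2}

def posterior(prior, likelihood):
    """Bayes Theorem: P(s|c) = P(c|s) P(s) / sum over s of P(c|s) P(s)."""
    evidence = sum(likelihood[s] * prior[s] for s in prior)  # P(c), here 40%
    return {s: likelihood[s] * prior[s] / evidence for s in prior}

post = posterior(prior, p_tan_given)
print(post["pigeon"])  # probability of pigeon after hearing "the bird is tan"
print(post["gull"])
```

The same function can be applied again with a new observation, using `post` as the next prior, which is the sense in which the information is "updated".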

Slide21

continuous variables

joint probability density function,

p(d1, d2)

Slide22

[Figure: the p.d.f. p(d1,d2) plotted on axes d1 and d2, with a box spanning d1L to d1R and d2L to d2R]

if the probability density function, p(d1,d2), is thought of as a cloud made of water vapor

the probability that (d1,d2) is in the box is given by the total mass of water vapor in the box
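The "mass in the box" is the integral of p(d1,d2) over the box. The sketch below estimates it with a simple midpoint rule in Python. The uncorrelated unit-variance Normal p.d.f. and the box from -1 to 1 on both axes are assumed examples, chosen only because the exact answer is known from the error function.

```python
import math

def normal_pdf_2d(d1, d2, mean=(0.0, 0.0), sigma=(1.0, 1.0)):
    """Uncorrelated bivariate Normal p.d.f. (an assumed example p.d.f.)."""
    z1 = (d1 - mean[0]) / sigma[0]
    z2 = (d2 - mean[1]) / sigma[1]
    return math.exp(-0.5 * (z1 * z1 + z2 * z2)) / (2.0 * math.pi * sigma[0] * sigma[1])

def box_probability(pdf, d1_left, d1_right, d2_left, d2_right, n=200):
    """Midpoint-rule estimate of the integral of pdf over the box."""
    h1 = (d1_right - d1_left) / n
    h2 = (d2_right - d2_left) / n
    total = 0.0
    for i in range(n):
        d1 = d1_left + (i + 0.5) * h1
        for j in range(n):
            d2 = d2_left + (j + 0.5) * h2
            total += pdf(d1, d2)
    return total * h1 * h2  # sum of densities times the area of one cell

p_box = box_probability(normal_pdf_2d, -1.0, 1.0, -1.0, 1.0)

# For this uncorrelated example the exact value factors into two 1-D integrals.
exact = math.erf(1.0 / math.sqrt(2.0)) ** 2

print(p_box, exact)
```

Making the box very large drives the estimate toward 1, which is the normalization to unit total probability.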

Slide23

normalized to unit total probability

Slide24

univariate p.d.f.’s

“integrate away” one of the variables

the p.d.f. of d1 irrespective of d2: p(d1) = ∫ p(d1,d2) dd2

the p.d.f. of d2 irrespective of d1: p(d2) = ∫ p(d1,d2) dd1

Slide25

[Figure: p(d1,d2) on axes d1 and d2; integrating over d2 gives p(d1), integrating over d1 gives p(d2)]

Slide26

mean and variance calculated in usual way

Slide27

correlation

tendency of random variable d1 to be large/small when random variable d2 is large/small

Slide28

positive correlation: tall people tend to weigh more than short people …

negative correlation: long-time smokers tend to die young …

Slide29

shape of p.d.f.

[Figure: three p.d.f.'s on axes d1 and d2, showing positive correlation, negative correlation, and uncorrelated]

Slide30

quantifying correlation

[Figure: three panels on axes d1 and d2: the p.d.f. p(d1,d2); the 4-quadrant function s(d1,d2), with signs + and - in alternating quadrants; the product s(d1,d2) p(d1,d2)]

now multiply and integrate

Slide31

covariance quantifies correlation
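For a finite set of samples, the analogue of "multiply and integrate" is to average the product of the deviations from the means. The sketch below is an illustrative Python version; the data values are made up, and the division by N (rather than N-1) is an assumption chosen to mirror the expectation, not something stated in the lecture.

```python
def covariance(d1, d2):
    """Average of (d1 - mean1) * (d2 - mean2) over the samples (divides by N)."""
    n = len(d1)
    mean1 = sum(d1) / n
    mean2 = sum(d2) / n
    # The product is positive when both deviations have the same sign
    # and negative when they have opposite signs.
    return sum((a - mean1) * (b - mean2) for a, b in zip(d1, d2)) / n

# Made-up samples: d2 rises with d1 in the first pair and falls in the second.
d1 = [1.0, 2.0, 3.0, 4.0]
d2_up = [2.0, 4.0, 6.0, 8.0]
d2_down = [8.0, 6.0, 4.0, 2.0]

cov_pos = covariance(d1, d2_up)    # 2.5
cov_neg = covariance(d1, d2_down)  # -2.5
var_d1 = covariance(d1, d1)        # the variance is the covariance of d1 with itself

print(cov_pos, cov_neg, var_d1)
```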

Slide32

combine variance and covariance into a matrix, C

C = [ σ1²    σ1,2 ]
    [ σ1,2   σ2²  ]

note that C is symmetric

Slide33

many random variables

d1, d2, d3 … dN

write d’s as a vector

d = [d1, d2, d3 … dN]T

Slide34

the mean is then a vector, too

d̄ = [d̄1, d̄2, d̄3 … d̄N]T

Slide35

and the covariance is an N×N matrix, C

C = [ σ1²    σ1,2   σ1,3   … ]
    [ σ1,2   σ2²    σ2,3   … ]
    [ σ1,3   σ2,3   σ3²    … ]
    [ …      …      …      … ]

variance on the main diagonal

Slide36

multivariate Normal p.d.f.

p(d) = (2π)^(-N/2) |C|^(-1/2) exp{ -½ (d - d̄)T C^(-1) (d - d̄) }

|C|^(1/2): square root of determinant of covariance matrix

C^(-1): inverse of covariance matrix

(d - d̄): data minus its mean

Slide37

compare with univariate Normal p.d.f.

p(d) = (2π)^(-1/2) σ^(-1) exp{ -½ (d - d̄)² / σ² }

Slide38

corresponding terms
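One way to see the correspondence is to evaluate both p.d.f.'s numerically. The sketch below writes out the two-variable (N = 2) Normal p.d.f. with an explicit 2×2 determinant and inverse, in Python; the function names and the test values are assumptions. When the covariance is zero, the bivariate p.d.f. should equal the product of two univariate ones.

```python
import math

def normal_pdf_1d(d, mean, var):
    """Univariate Normal p.d.f. with the given mean and variance."""
    return math.exp(-0.5 * (d - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def normal_pdf_2x2(d, mean, C):
    """Bivariate Normal p.d.f. with covariance C = [[var1, cov], [cov, var2]]."""
    (v1, c12), (_, v2) = C
    det = v1 * v2 - c12 * c12                  # determinant of C
    inv = [[v2 / det, -c12 / det],
           [-c12 / det, v1 / det]]             # inverse of the symmetric 2x2 matrix
    r = [d[0] - mean[0], d[1] - mean[1]]       # data minus its mean
    quad = (r[0] * (inv[0][0] * r[0] + inv[0][1] * r[1])
            + r[1] * (inv[1][0] * r[0] + inv[1][1] * r[1]))
    # (2 pi)^(N/2) equals 2 pi when N = 2
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

d = (16.0, 23.0)
mean = (15.0, 25.0)

joint_uncorrelated = normal_pdf_2x2(d, mean, [[5.0, 0.0], [0.0, 5.0]])
product = normal_pdf_1d(d[0], mean[0], 5.0) * normal_pdf_1d(d[1], mean[1], 5.0)
joint_correlated = normal_pdf_2x2(d, mean, [[5.0, 3.0], [3.0, 5.0]])

print(joint_uncorrelated, product, joint_correlated)
```

With a nonzero covariance the value at the same point changes, since the cloud is then tilted relative to the axes.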

Slide39

error propagation

p(d) is Normal with mean d̄ and covariance, Cd.

given model parameters m

where m is a linear function of d

m = Md

Q1. What is p(m)?

Q2. What is its mean m̄ and covariance Cm?

Slide40

Answer

Q1: What is p(m)?

A1: p(m) is Normal

Q2: What is its mean m̄ and covariance Cm?

A2: m̄ = M d̄ and Cm = M Cd MT
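The rule in A2 is two matrix products, sketched below in plain Python (in MatLab it would be written with the built-in matrix operators). The helper names and the averaging example are assumptions: three uncorrelated measurements with variance 3 are averaged, so M is the single row [1/3, 1/3, 1/3]. M is not square here; the rule for the mean and covariance still applies.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]

def propagate(M, d_mean, C_d):
    """Return (mean of m, covariance of m) for m = M d."""
    m_mean = [sum(M[i][k] * d_mean[k] for k in range(len(d_mean)))
              for i in range(len(M))]
    C_m = matmul(matmul(M, C_d), transpose(M))  # M Cd MT
    return m_mean, C_m

# Assumed example: the model parameter is the average of three measurements.
M = [[1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]]
d_mean = [10.0, 11.0, 12.0]
C_d = [[3.0, 0.0, 0.0],
       [0.0, 3.0, 0.0],
       [0.0, 0.0, 3.0]]

m_mean, C_m = propagate(M, d_mean, C_d)
print(m_mean)  # close to [11.0]
print(C_m)     # close to [[1.0]]: averaging 3 data reduces the variance from 3 to 1
```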

Slide41

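where the answer comes from

transform p(d) to p(m)

starting with a Normal p.d.f. for p(d):

p(d) = (2π)^(-N/2) |Cd|^(-1/2) exp{ -½ (d - d̄)T Cd^(-1) (d - d̄) }

and the multivariate transformation rule:

p(m) = p(d(m)) J(m)

where J(m) is the Jacobian determinant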

Slide42

this is not as hard as it looks

because the Jacobian determinant J(m) is constant: J(m) = |M^(-1)|

so, starting with p(d), replace every occurrence of d with M^(-1)m and multiply the result by |M^(-1)|. This yields:

Slide43

Normal p.d.f. for model parameters

p(m) = (2π)^(-N/2) |Cm|^(-1/2) exp{ -½ (m - m̄)T Cm^(-1) (m - m̄) }

where m̄ = M d̄ and Cm = M Cd MT

rule for error propagation
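The substitution can be written out step by step. The following is a sketch of the algebra, assuming M is square and invertible (so that d = M^(-1) m) and using the definitions m̄ = M d̄ and Cm = M Cd MT; the intermediate steps are filled in here and are not taken from the slides.

```latex
% Exponent: substitute d = M^{-1} m and \bar{d} = M^{-1} \bar{m}
\begin{aligned}
(d-\bar{d})^{T} C_d^{-1} (d-\bar{d})
  &= \left(M^{-1}(m-\bar{m})\right)^{T} C_d^{-1} \left(M^{-1}(m-\bar{m})\right) \\
  &= (m-\bar{m})^{T} \, M^{-T} C_d^{-1} M^{-1} \, (m-\bar{m}) \\
  &= (m-\bar{m})^{T} \left(M C_d M^{T}\right)^{-1} (m-\bar{m})
   = (m-\bar{m})^{T} C_m^{-1} (m-\bar{m})
\end{aligned}

% Normalization: the Jacobian determinant combines with |C_d|^{-1/2}
\left|\det M^{-1}\right| \, |C_d|^{-1/2}
  = \left( (\det M)^{2} \, |C_d| \right)^{-1/2}
  = \left| M C_d M^{T} \right|^{-1/2}
  = |C_m|^{-1/2}

% Result
p(m) = (2\pi)^{-N/2} \, |C_m|^{-1/2}
       \exp\left\{ -\tfrac{1}{2} (m-\bar{m})^{T} C_m^{-1} (m-\bar{m}) \right\}
```

The result has exactly the form of a Normal p.d.f., which is why p(m) is Normal with mean m̄ and covariance Cm.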

Slide44

example

measurement 1, d1: weight of A

measurement 2, d2: combined weight of A and B

suppose the measurements are uncorrelated and that both have the same variance, σd²

Slide45

model parameters

m1: weight of B

m1 = d2 - d1

m2: weight of B minus weight of A

m2 = d2 - 2d1

[Figure: diagrams of the weights A and B being added and subtracted to form m1 and m2]

Slide46

linear rule relating model parameters to data

m = M d

with

M = [ -1   1 ]
    [ -2   1 ]

Slide47

so the means of the model parameters are

m̄1 = d̄2 - d̄1 and m̄2 = d̄2 - 2d̄1

Slide48

and the covariance matrix is

Cm = M (σd² I) MT = σd² [ 2   3 ]
                        [ 3   5 ]

Slide49

model parameters are correlated, even though data are uncorrelated: bad

variance of model parameters different than variance of data: bad if bigger, good if smaller

Slide50

example with specific values of d̄ and σd²

The data, (d1, d2), have mean (15, 25), variance (5, 5) and zero covariance.

The model parameters, (m1, m2), have mean (10, -5), variance (10, 25) and covariance 15.

[Figure: p(d1,d2) on axes d1 and d2, each running from 0 to 40; p(m1,m2) on axes m1 and m2, each running from -20 to 20]
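These numbers can be reproduced directly from the error propagation rule, with m1 = d2 - d1 and m2 = d2 - 2d1. The sketch below is an illustrative Python check (variable names are assumptions); the correlation coefficient at the end is an added diagnostic, not a quantity quoted in the lecture.

```python
import math

# Rows of M encode m1 = d2 - d1 and m2 = d2 - 2 d1.
M = [[-1.0, 1.0],
     [-2.0, 1.0]]
d_mean = [15.0, 25.0]  # mean of the data
var_d = 5.0            # common variance of the uncorrelated data

# Mean of the model parameters: M times the mean of the data.
m_mean = [M[i][0] * d_mean[0] + M[i][1] * d_mean[1] for i in range(2)]

# Covariance: M Cd MT reduces to var_d * M MT because Cd = var_d * I.
C_m = [[var_d * sum(M[i][k] * M[j][k] for k in range(2)) for j in range(2)]
       for i in range(2)]

# Correlation coefficient: covariance scaled by the two standard deviations.
corr = C_m[0][1] / math.sqrt(C_m[0][0] * C_m[1][1])

print(m_mean)  # [10.0, -5.0]
print(C_m)     # [[10.0, 15.0], [15.0, 25.0]]
print(corr)    # about 0.95
```

A correlation coefficient near 1 indicates that the two model parameters are strongly correlated, even though the two measurements are not.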