Temporal and Spatial Models of Moth Distribution at the HJ Andrews Experimental Forest Erin Childs Pomona College Andrew Calderon Heritage University Evan Goldman Bard College Boston University Molly ONeill Lehigh University Clay Showalter Evergreen University with the hel ID: 816176
Slide1
Mothematical Modeling: Temporal and Spatial Models of Moth Distribution at the H.J. Andrews Experimental Forest
Erin Childs (Pomona College), Andrew Calderon (Heritage University), Evan Goldman (Bard College, Boston University), Molly O’Neill (Lehigh University), Clay Showalter (Evergreen University), with the help of Olivia Poblacion (Oregon State University)
Slide2Acknowledgements
Dr. Dietterich, CS Professor
Dr. Wong, CS Professor
Steven Highland, Geosciences PhD Candidate
Jorge Ramirez, Math Professor
Dan Sheldon, CS Post-doc
Julia Jones, Geosciences Professor
Rebecca Hutchinson, CS Post-doc
Javier Illan, PhD, Post-doc
Slide3Studying Climate Change: Lepidoptera
Why are Lepidoptera good indicators of climate change?
Past studies on Lepidoptera
Woiwod 1996: Detecting the effects of climate change on Lepidoptera
Dewar and Watt 1992: Predicted changes in the synchrony of larval emergence and budburst under climatic warming
Slide4Research Questions
How is vegetation related to moth species distribution and composition?
How does climate affect moth phenology?
Slide5Study Site
H.J. Andrews Experimental Forest
http://andrewsforest.oregonstate.edu/about.cfm?topnav=2
Slide6How is vegetation related to moth species distribution and composition?
Slide7Vegetation Surveying: Methods
GPS coordinates
Walked out to radii of 30 m and 100 m in all directions
Presence/absence of 71 species of known host plants
Slide8Slide9Moth Trapping: Methods
Moth Trapping
9 sites selected
Equipment used
Moth preservation
Slide10Methods
Moth Identification
Slide11Moth Trapping Results
Semiothisa signaria
Pero occidentalis
Slide12Overview: Is vegetation a good predictor of moth species presence/absence?
Develop software tools for exploring/analyzing data
Run generalized boosted regression models (GBMs) for each moth species
Create GIS layers for the predicted locations of each moth species
Slide13Software Tasks for Data Exploration
Format data
Compare the similarities and differences between sites, moths and vegetation
Discover correlations between vegetation and moth species
Calculate marginal probabilities of plant occurrences
Visualize results
Slide14Measuring Similarity: Hamming Distance
Hamming distance is the number of covariates that differ between two sample sets
A smaller number means the sets are more similar
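A minimal sketch of the Hamming distance for two presence/absence vectors. The site vectors below are made-up illustrative values, not survey data, and the function name is hypothetical.

```python
import numpy as np

def hamming_distance(a, b):
    """Count the positions at which two equal-length presence/absence vectors differ."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("vectors must have the same length")
    return int(np.sum(a != b))

# Hypothetical presence/absence of five host plants at two sites.
site_a = [1, 0, 1, 1, 0]
site_b = [1, 1, 1, 0, 0]
distance = hamming_distance(site_a, site_b)  # the sites differ at two plants
```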
Marginal Probabilities
Using the vegetation data collected at 20 sites, generate marginal probabilities for plant occurrences
If huckleberry (VAHU) is found at a site, what is the probability of finding thimbleberry (RUPA) but not licorice root (!LIGR) at that site?
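The huckleberry question is a conditional probability, and one way to estimate it is to count sites in a presence/absence table. The sketch below assumes one row per site and one column per plant code; the records are invented for illustration and the function name is hypothetical.

```python
import numpy as np

def conditional_probability(presence, columns, given, present=(), absent=()):
    """Estimate P(all `present` found and all `absent` missing | `given` found)."""
    presence = np.asarray(presence, dtype=bool)
    index = {name: i for i, name in enumerate(columns)}
    has_given = presence[:, index[given]]
    if not has_given.any():
        return float("nan")  # the conditioning plant was never observed
    match = has_given.copy()
    for name in present:
        match &= presence[:, index[name]]
    for name in absent:
        match &= ~presence[:, index[name]]
    return match.sum() / has_given.sum()

columns = ["VAHU", "RUPA", "LIGR"]
# Hypothetical survey records: rows are sites, 1 means the plant was found.
records = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
]
p = conditional_probability(records, columns, given="VAHU",
                            present=["RUPA"], absent=["LIGR"])
```

With these made-up records, four sites have VAHU and two of those have RUPA without LIGR.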
Slide17Canonical Correlation Analysis (CCA)
Canonical correlation analysis highlights correlations between two data sets
Gives us a way of making sense of cross-covariance matrices
Allows ecologists to relate the abundance of species to environmental variables
Using CCA, we analyzed our vegetation data and moth data
Slide18X-correlation: highlights any correlations among only moth species (422x422)
Y-correlation: highlights any correlations among only plant species (71x71)
Cross-correlation: highlights any correlations between both data sets (71x422)
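The three matrices can be read off as blocks of one joint correlation matrix. This sketch computes only those blocks, not the full CCA, and uses small random site-by-species matrices as stand-ins for the real data; the function name is hypothetical.

```python
import numpy as np

def correlation_blocks(moths, plants):
    """Return (moth-moth, plant-plant, plant-moth) correlation blocks.

    Both inputs are site-by-species matrices with the same number of rows.
    """
    moths = np.asarray(moths, dtype=float)
    plants = np.asarray(plants, dtype=float)
    n_moths = moths.shape[1]
    # rowvar=False treats each column (species) as a variable.
    full = np.corrcoef(np.hstack([moths, plants]), rowvar=False)
    x_corr = full[:n_moths, :n_moths]   # moth species only
    y_corr = full[n_moths:, n_moths:]   # plant species only
    cross = full[n_moths:, :n_moths]    # plants (rows) against moths (columns)
    return x_corr, y_corr, cross

rng = np.random.default_rng(0)
moths = rng.random((20, 6))    # hypothetical: 20 sites, 6 moth species
plants = rng.random((20, 4))   # hypothetical: 20 sites, 4 plant species
x_corr, y_corr, cross = correlation_blocks(moths, plants)
```

With 422 moth species and 71 plant species, the same slicing would give the 422x422, 71x71, and 71x422 shapes listed above.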
Slide19Generalized Boosted Regression Models (GBMs)
Regression analysis allows us to explore the relationships between individual moth presence/absence (dependent variable) and various characteristics of each site (independent variables)
The goal is to minimize the loss function, which represents the loss associated with an estimate being different from the true value
Basis functions are elements of a set of vectors that, in linear combination, can represent every vector in a given vector space
Every function can be represented as a linear combination of basis functions
Boosting is the process of iteratively adding basis functions in a greedy fashion, so that each additional basis function further reduces the selected loss function
The model is run several times with different values for the tuning parameters to determine the best values
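To make the greedy step concrete, here is a toy boosting loop that uses one-split decision stumps as basis functions. It is a sketch under simplifying assumptions, not the GBM software used in the study: it minimizes squared error rather than a presence/absence loss, and the data, function names, and tuning values are hypothetical.

```python
import numpy as np

def fit_stump(x, residual):
    """Find the one-feature threshold split that best fits the residual (least squares)."""
    best = None
    for feature in range(x.shape[1]):
        # Drop the largest value so that both sides of the split are non-empty.
        for threshold in np.unique(x[:, feature])[:-1]:
            left = x[:, feature] <= threshold
            left_value = residual[left].mean()
            right_value = residual[~left].mean()
            fitted = np.where(left, left_value, right_value)
            error = np.sum((residual - fitted) ** 2)
            if best is None or error < best[0]:
                best = (error, feature, threshold, left_value, right_value)
    return best[1:]

def predict_stump(stump, x):
    feature, threshold, left_value, right_value = stump
    return np.where(x[:, feature] <= threshold, left_value, right_value)

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Greedily add stumps; each one is fit to the current residual."""
    intercept = y.mean()
    prediction = np.full(len(y), intercept)
    stumps, losses = [], []
    for _ in range(n_rounds):
        stump = fit_stump(x, y - prediction)
        prediction = prediction + learning_rate * predict_stump(stump, x)
        stumps.append(stump)
        losses.append(float(np.mean((y - prediction) ** 2)))
    return intercept, stumps, losses

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(200, 5)).astype(float)  # hypothetical site covariates
y = ((x[:, 0] + x[:, 1]) > 1).astype(float)          # hypothetical presence/absence
intercept, stumps, losses = boost(x, y)
```

The number of rounds and the learning rate play the role of the tuning parameters mentioned above; in practice they would be chosen by rerunning the model with different values.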
Slide20Validating the GBM
All available regressors are used in the model, meaning that the choice of independent variables is not supported by theory
The standard approach to validating models is to split the data into a training set and a test set
The model is fit on the training data, then used to make predictions on the test data
This checks that the model is generalizable and not overfit
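The hold-out procedure can be sketched as follows. Ordinary least squares stands in for the GBM purely to keep the example short, and the data are synthetic; the helper names and the 25% test fraction are assumptions.

```python
import numpy as np

def train_test_split(x, y, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as a test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = order[:n_test], order[n_test:]
    return x[train], x[test], y[train], y[test]

def mean_squared_error(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def add_intercept(x):
    return np.column_stack([np.ones(len(x)), x])

# Synthetic data: a linear signal plus a little noise.
rng = np.random.default_rng(1)
x = rng.normal(size=(80, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=80)

x_train, x_test, y_train, y_test = train_test_split(x, y)

# Fit on the training rows only, then score on both sets.
coef, *_ = np.linalg.lstsq(add_intercept(x_train), y_train, rcond=None)
train_error = mean_squared_error(y_train, add_intercept(x_train) @ coef)
test_error = mean_squared_error(y_test, add_intercept(x_test) @ coef)
```

A test error far above the training error would be the signal of overfitting.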
Slide21Running the Model
Ran the model for individual moth species using all 256 trap sites at HJA, using moth trapping data collected from 2004 to 2008
Did not include vegetation data, since we only collected it at 20 sites
The GBM lays a grid over the Andrews forest and calculates the predicted probability of the moth species being present for each grid cell
Slide22Visualizing GBM Results
Slide23How does climate affect moth phenology?
Slide24Thermal Climate of the H.J. Andrews Experimental Forest
PRISM estimated mean monthly maximum and minimum temperature maps showing topographic effects of radiation and sky view factors. Provided by Jonathan W. Smith
Slide25Slide26Slide27Degree Day Curve
Use a linear regression model to interpolate the degree days for a given trap site for specific days of a year
Parameterize temperature so that it can later be included in the temporal model
Produce degree day curves for any trap site
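One common way to build a degree day curve from daily minimum and maximum temperatures is the simple averaging method, sketched below. The slides do not say which method or base temperature was used, so the 10 °C base and the temperatures here are assumptions for illustration.

```python
import numpy as np

def degree_day_curve(t_min, t_max, base_temp=10.0):
    """Cumulative degree days using the daily-average method.

    Each day contributes max(0, (t_min + t_max) / 2 - base_temp).
    base_temp is an assumed threshold, not a value from the study.
    """
    t_min = np.asarray(t_min, dtype=float)
    t_max = np.asarray(t_max, dtype=float)
    daily = np.maximum((t_min + t_max) / 2.0 - base_temp, 0.0)
    return np.cumsum(daily)

# Hypothetical four-day temperature record in degrees Celsius.
t_min = [4.0, 6.0, 9.0, 11.0]
t_max = [12.0, 18.0, 21.0, 25.0]
curve = degree_day_curve(t_min, t_max)  # daily means are 8, 12, 15, 18
```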
Slide28Find Coefficients
Each Trap_ID will have two sets of coefficients (maximum and minimum)
Multiple linear regression analysis
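A sketch of how two coefficient sets per trap might be fit by least squares. The slides do not state which predictors were used, so the two predictor columns below are placeholders, and the temperatures are constructed to be exactly linear so the recovered coefficients are easy to check.

```python
import numpy as np

def fit_coefficients(predictors, response):
    """Least-squares coefficients (intercept first) for response ~ predictors."""
    predictors = np.asarray(predictors, dtype=float)
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coef, *_ = np.linalg.lstsq(design, np.asarray(response, dtype=float), rcond=None)
    return coef

def fit_trap(predictors, trap_max, trap_min):
    """Return one coefficient set for maximum and one for minimum temperature."""
    return {
        "max": fit_coefficients(predictors, trap_max),
        "min": fit_coefficients(predictors, trap_min),
    }

# Hypothetical predictors: two placeholder columns, one row per day.
predictors = np.array([[10.0, 5.0], [12.0, 7.0], [15.0, 6.0], [20.0, 11.0], [18.0, 9.0]])
trap_max = 1.0 + 0.5 * predictors[:, 0] + 0.25 * predictors[:, 1]
trap_min = -2.0 + 0.8 * predictors[:, 0] + 0.1 * predictors[:, 1]
coefficients = fit_trap(predictors, trap_max, trap_min)
```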
Slide29Predicting Daily Temp
Linear Interpolation
Fill gaps in the daily temperature data
In goes the trap_ID, start_date and end_date
Out comes the min and max temperature for the given day(s)
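Gap filling by linear interpolation can be sketched with `np.interp`. Days are represented as plain day-of-year numbers here rather than the trap_ID/start_date/end_date interface described above, and the temperatures are invented.

```python
import numpy as np

def fill_gaps(days, values):
    """Linearly interpolate missing (NaN) entries from the observed neighbors.

    Days outside the observed range take the nearest observed value,
    which is how np.interp treats points beyond its end points.
    """
    days = np.asarray(days, dtype=float)
    values = np.asarray(values, dtype=float)
    known = ~np.isnan(values)
    return np.interp(days, days[known], values[known])

days = np.arange(1, 7)
t_max = [10.0, np.nan, 14.0, np.nan, np.nan, 20.0]  # hypothetical record with gaps
filled = fill_gaps(days, t_max)
```

The same function would be applied separately to the minimum and maximum series for a trap.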
Slide30Temporal Distribution of Moths
Slide31The Problem
Year-round distribution of moths
Limited observation points
Unseen, unmeasurable data
Catching probabilities
Total moth population
Slide32Example: Flight times
Consider 3 trapping times and 4 associated intervals, and moths with flight times as follows
[Figure: timeline with trapping times t1, t2, t3 and intervals I0, I1, I2, I3]
Slide33Example: Distribution
This gives us a distribution table:
      I0  I1  I2  I3
I0     0   0   0   0
I1     0   0   0   0
I2     0   0   0   0
I3     0   0   0   0
[Figure: timeline with trapping times t1, t2, t3 and intervals I0, I1, I2, I3]
Slide34Example: Distribution
This gives us a distribution table:
      I0  I1  I2  I3
I0     0   0   0   0
I1     0   1   0   0
I2     0   0   0   0
I3     0   0   0   0
[Figure: timeline with trapping times t1, t2, t3 and intervals I0, I1, I2, I3]
Slide35Example: Distribution
This gives us a distribution table:
      I0  I1  I2  I3
I0     0   1   0   0
I1     0   1   0   0
I2     0   0   0   0
I3     0   0   0   0
[Figure: timeline with trapping times t1, t2, t3 and intervals I0, I1, I2, I3]
Slide36Example: Distribution
This gives us a distribution table:
      I0  I1  I2  I3
I0     0   1   0   0
I1     0   1   0   1
I2     0   0   0   0
I3     0   0   0   0
[Figure: timeline with trapping times t1, t2, t3 and intervals I0, I1, I2, I3]
Slide37Example: Distribution
This gives us a distribution table:
      I0  I1  I2  I3
I0     0   1   1   0
I1     0   1   0   1
I2     0   0   0   0
I3     0   0   0   0
[Figure: timeline with trapping times t1, t2, t3 and intervals I0, I1, I2, I3]
Slide38Example: Distribution
This gives us a distribution table:
      I0  I1  I2  I3
I0     1   2   4   1
I1     0   2   3   3
I2     0   0   1   2
I3     0   0   0   1
Slide39Example cont’d
This gives us a distribution table
      I0  I1  I2  I3
I0     1   2   4   1
I1     0   2   3   3
I2     0   0   1   2
I3     0   0   0   1
… and flight counts
f1 = 7
Slide40Example cont’d
This gives us a distribution table
      I0  I1  I2  I3
I0     1   2   4   1
I1     0   2   3   3
I2     0   0   1   2
I3     0   0   0   1
… and flight counts
f1 = 7, f2 = 11
Slide41Example cont’d
This gives us a distribution table
      I0  I1  I2  I3
I0     1   2   4   1
I1     0   2   3   3
I2     0   0   1   2
I3     0   0   0   1
… and flight counts
f1 = 7, f2 = 11, f3 = 6
Slide42Example: Flight Counts
When trapping moths, all we see is flight counts
Given flight counts, we want to predict moth distribution
f1 = 7, f2 = 11, f3 = 6
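The flight counts in the example can be reproduced from the distribution table. The sketch below assumes rows index the emergence interval and columns the death interval, so a moth is in flight at trapping time t_i when it emerged in an interval before t_i and dies in an interval after it. The function name is hypothetical.

```python
import numpy as np

def flight_counts(table):
    """Flight count at each trapping time from an emergence-by-death count table.

    table[j][k] is the number of moths emerging in interval I_j and dying in I_k.
    Trapping time t_i separates intervals I_(i-1) and I_i.
    """
    table = np.asarray(table)
    n_times = table.shape[0] - 1
    # Alive at t_i: emerged in rows 0..i-1, dies in columns i..end.
    return [int(table[:i, i:].sum()) for i in range(1, n_times + 1)]

# The completed distribution table from the example.
table = [
    [1, 2, 4, 1],
    [0, 2, 3, 3],
    [0, 0, 1, 2],
    [0, 0, 0, 1],
]
counts = flight_counts(table)  # [7, 11, 6], matching f1, f2, f3
```

The reverse direction is the hard part: many different tables produce the same three counts, which is why the model that follows is needed.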
Slide43Maximum Likelihood Model
Maximize Prob(Data | Parameters)
Data = moth trapping
Moths trapped: f = (f1, f2, …, fT)
Times trapped: t = (t1, t2, …, tT)
Slide44Maximum Likelihood Model
Parameters = probability distributions of emergence time and life span
Emergence and life span assumed to be Gaussian with parameters µE, σE, µS, σS
Emergence ~ N(µE, σE)
Life Span ~ N(µS, σS)
Slide45Moth Distribution
Use distributions to calculate p(j,k), the probability of a moth emerging in interval j and dying in interval k
[Figure: timeline marking tj, tj+1, tk, and tk+1, the intervals Ij and Ik, and points labeled r, s, and d]
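One way to compute p(j,k) numerically is to integrate over the emergence time. The sketch below assumes emergence time and life span are independent, that death time is emergence time plus life span, and that the first and last intervals are open-ended; none of these details is spelled out in the slides, and the trapping times and parameter values are invented.

```python
import math
import numpy as np

def normal_cdf(x, mu, sigma):
    """Gaussian CDF; math.erf accepts +/-inf, so open-ended intervals work."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probabilities(trap_times, mu_e, sigma_e, mu_s, sigma_s, n_grid=4001):
    """Approximate p(j, k) with a Riemann sum over the emergence time r."""
    edges = [-math.inf] + list(trap_times) + [math.inf]
    n_intervals = len(edges) - 1
    r = np.linspace(mu_e - 8.0 * sigma_e, mu_e + 8.0 * sigma_e, n_grid)
    dr = r[1] - r[0]
    emergence_pdf = np.exp(-0.5 * ((r - mu_e) / sigma_e) ** 2) / (
        sigma_e * math.sqrt(2.0 * math.pi)
    )
    span_cdf = np.vectorize(lambda s: normal_cdf(s, mu_s, sigma_s))
    p = np.zeros((n_intervals, n_intervals))
    for j in range(n_intervals):
        in_j = (r >= edges[j]) & (r < edges[j + 1])
        for k in range(n_intervals):
            # Probability that r + life span lands in I_k, given emergence at r.
            dies_in_k = span_cdf(edges[k + 1] - r) - span_cdf(edges[k] - r)
            p[j, k] = np.sum(emergence_pdf[in_j] * dies_in_k[in_j]) * dr
    return p

# Hypothetical trapping days and parameters (mu_e, sigma_e, mu_s, sigma_s).
p = interval_probabilities([10.0, 20.0, 30.0], mu_e=15.0, sigma_e=5.0,
                           mu_s=10.0, sigma_s=2.0)
```

Because a Gaussian life span can be negative, entries below the diagonal are tiny but not exactly zero.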
Slide46Calculating Probabilities
Slide47Probability Table
Rows: emergence interval; columns: death interval
      I0      I1      …   IT
I0    P(0,0)  P(0,1)  …   P(0,T)
I1    P(1,0)  P(1,1)  …   P(1,T)
…     …       …       …   …
IT    P(T,0)  P(T,1)  …   P(T,T)
Slide48Multinomial Distribution
Every moth falls into exactly one cell of the probability table
The moth counts across cells therefore follow a multinomial distribution
Approximate this with a multivariate Gaussian (or normal)
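A standard way to approximate a multinomial with a multivariate Gaussian is to match its mean and covariance. The sketch below does that for a small made-up probability vector; whether the study used exactly this moment-matched form is an assumption.

```python
import numpy as np

def gaussian_approximation(n_moths, cell_probs):
    """Mean and covariance of the Gaussian matched to Multinomial(n_moths, cell_probs).

    mean = n * p and covariance = n * (diag(p) - p p^T).
    """
    p = np.asarray(cell_probs, dtype=float).ravel()
    mean = n_moths * p
    covariance = n_moths * (np.diag(p) - np.outer(p, p))
    return mean, covariance

# Hypothetical cell probabilities (a flattened probability table) and population size.
cell_probs = [0.1, 0.2, 0.3, 0.4]
mean, covariance = gaussian_approximation(100, cell_probs)
```

Each row of the covariance sums to zero because the cell counts must add up to the total number of moths.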
Slide49Approximation Error
What is the error associated with this approximation?
m! is approximated as s(m)
[Equation: error of approximating m! by s(m)]
Slide50Likelihood
Parameters = {µE, σE, µS, σS}
Slide51Likelihood surface
[Figure: log loss surface over µE and µS]
Slide52Results
Semiothisa signaria, Trap 38B, 2005
Slide53Results
R² = 0.23, p < 0.01
Slide54Results
Slide55Synthetic Data
Slide56Model Limitations: The “hidden” population and sample size
Trap 13B
n=9
n=28
n=87
Slide57Model Limitations: Sample Size
Slide58Estimating “Hidden” Moth Population
Slide59How is vegetation related to moth species distribution and composition?
CCA and Hamming distance show a strong correlation between vegetation and moth species
For the future: vegetation surveys at other trap sites would help improve the performance of the model
Slide60How does climate affect moth phenology?
Moth emergence shows a strong correlation with the local temperature
For the future: incorporating the degree day curves we calculated for each site will make the model more robust
Slide61Questions?