Reference Data Collection for Global Cropland Mapping
Russell G. Congalton & Kamini Yadav
Department of Natural Resources & the Environment
University of New Hampshire
Presented at EROS, June 24, 2014

Sources of Reference Data
- From existing collections by others
  - Must be evaluated carefully to see if appropriate (often not useful)
- From higher-resolution imagery
  - Must be assessed (error matrix) to determine if accurate enough
- From field collection
  - Expensive and time-consuming, but very necessary

Exploration and Analysis of Existing Ground Reference Data (393,429 Locations)
Spatial Distribution

Attribute Information
- Important attributes: Land Cover Type, Plot Size, Latitude-Longitude, Crop Type, Intensity, Irrigation
- Plot Size is recorded in several forms: area ≤ 0.5 ha, 0.5 ≤ area ≤ 1 ha, 1 ≤ area ≤ 10 ha, area ≤ 10 ha, 500, and 90*90

Spatial Distribution of a Single Attribute in the Continents
- 2,055/393,429 (0.52%) locations have Crop Type
- 1,936/393,429 (0.49%) locations have Crop Intensity
- 270,363/393,429 (68.72%) locations have Plot Size
- 127,379/393,429 (32.38%) locations have Irrigation

Continent-wise Availability of Ground Reference Data
Asia (17,699 ground reference locations, 4.5%)
- Crop Type: 1,470/17,699 (8.31%) (1,104 have Crop Type as Rice, Cotton, Soybeans, Wheat, Paddy, or Potato)
- Crop Intensity: 1,499/17,699 (8.47%)
- Irrigation: 1,970/17,699 (11.13%)
- Plot Size: 1,998/17,699 (11.29%)

Africa (83,710 ground reference locations, 21.28%)
- Crop Type: 580/83,710 (0.69%) (187 have Crop Type as Rice, Maize, Cotton, or Soybeans)
- Crop Intensity: 432/83,710 (0.52%)
- Irrigation: 393/83,710 (0.47%)
- Plot Size: 1/83,710 (<0.01%)

Europe (277,608 ground reference locations, 70.56%)
- Crop Type: 1/277,608 (<0.01%)
- Crop Intensity: 1/277,608 (<0.01%)
- Irrigation: 124,656/277,608 (44.9%)
- Plot Size: 266,892/277,608 (96.14%)

Australia (1,784 ground reference locations, 0.45%)
Crop Type: NA; Crop Intensity: NA; Irrigation: NA; Plot Size: NA

North America (5,157 ground reference locations, 1.31%)
Crop Type: NA; Crop Intensity: NA; Irrigation: NA; Plot Size: NA

South America (5,399 ground reference locations, 1.37%)
Crop Type: NA; Crop Intensity: NA; Irrigation: NA; Plot Size: NA

Spatial Distribution for Combinations of Attributes in the Continents
- 1,715/393,429 (0.44%) locations have Crop Type, Intensity, and Irrigation
- 412/393,429 (0.1%) locations have Crop Type, Intensity, Irrigation, and Plot Size
- 453/393,429 (0.12%) locations have Crop Type and Plot Size
- 1,840/393,429 (0.47%) locations have Crop Type and Irrigation
- 1,749/393,429 (0.44%) locations have Crop Type and Intensity

Continent         Reference Data  Crop Type  Crop Intensity  Irrigation  Plot Size
World                    393,429      2,055           1,936     127,379    270,363
Asia                      17,699      1,470           1,499       1,970      1,998
Africa                    83,710        580             432         393          1
Australia                  1,784          0               0           0          0
Europe                   277,608          1               1     124,656    266,892
North America              5,157          0               0           0          0
South America              5,399          0               0           0          0
Total (6 Cont.)          391,357      2,051           1,932     127,019    268,891

Summary
- Europe and Africa account for approximately 71% and 21% of the ground reference data, respectively.
- Europe has the Crop Type and Crop Intensity attributes for only 1 of its 277,608 ground locations.
- In Europe and Africa, anomalies in certain areas can cause spatial autocorrelation in the reference data.
- In Asia, the rice crop type is recorded as both 'Rice' and 'Paddy'.
- Only 0.1% of the data have all four required attributes (Crop Type, Intensity, Irrigation, and Plot Size).
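The percentages quoted on the preceding slides can be rechecked from the counts in the summary table. The sketch below is only an illustration: the counts are copied from the table, while the names `coverage` and `column_totals` are hypothetical helpers introduced here, not part of any project tooling.

```python
# Counts copied from the summary table. Each row is
# (reference locations, crop type, crop intensity, irrigation, plot size).
ATTRIBUTES = ("crop_type", "crop_intensity", "irrigation", "plot_size")

WORLD = (393429, 2055, 1936, 127379, 270363)
CONTINENTS = {
    "Asia": (17699, 1470, 1499, 1970, 1998),
    "Africa": (83710, 580, 432, 393, 1),
    "Australia": (1784, 0, 0, 0, 0),
    "Europe": (277608, 1, 1, 124656, 266892),
    "North America": (5157, 0, 0, 0, 0),
    "South America": (5399, 0, 0, 0, 0),
}


def coverage(row):
    """Percentage of the locations in `row` that carry each attribute."""
    total = row[0]
    return {name: round(100.0 * count / total, 2)
            for name, count in zip(ATTRIBUTES, row[1:])}


def column_totals(rows):
    """Column-wise sums over an iterable of equally long rows."""
    return tuple(sum(column) for column in zip(*rows))


six_continent_total = column_totals(CONTINENTS.values())
# Locations in the World row that are not assigned to any of the six continents.
unassigned = WORLD[0] - six_continent_total[0]

if __name__ == "__main__":
    print("World coverage (%):", coverage(WORLD))
    for name, row in CONTINENTS.items():
        share = round(100.0 * row[0] / WORLD[0], 2)
        print(f"{name}: {share}% of all locations, coverage {coverage(row)}")
    print("Six-continent totals:", six_continent_total)
    print("Locations outside the six continents:", unassigned)
```

The column sums reproduce the "Total (6 Cont.)" row, and subtracting that row from the World row shows how many locations fall outside the six listed continents.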

Themes
- Land Resources
- Agro-climatic Resources
- Suitability and Potential Yield
- Actual Yield and Production
- Yield and Production Gaps

Agro-climatic Resources
- Thermal Regimes
- Moisture Regimes
- Growing Period
Annual moisture conditions are based on the Length of Growing Period, i.e., the number of days during the year when both moisture and temperature are conducive to crop growth (reference length of growing period zones).

Suitability and Potential Yield
- Agro-climatic yield
- Climate yield constraints
- Crop calendar
- Agro-ecological suitability and productivity (suitability distribution and aggregate crop production potential)

http://gaez.fao.org/

Agro-ecological Map of Africa

Ground Data Collection
Collection of valid reference data is key to:
- Successful classification through representative TRAINING data
- Effective accuracy assessment through representative VALIDATION data
Training data and validation data must be INDEPENDENT of each other.
Sufficient reference data must be collected for both training and validation.
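One simple way to keep training and validation data independent is to assign each sampling unit to exactly one of the two sets before any classification work begins. The sketch below assumes every sampling unit has a unique identifier; the function name `split_reference_data`, the 50/50 default, and the fixed seed are illustrative assumptions, not values from the presentation.

```python
import random


def split_reference_data(sample_ids, validation_fraction=0.5, seed=0):
    """Randomly assign each sampling unit to training OR validation, never both.

    `validation_fraction` and `seed` are illustrative defaults (assumptions).
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    ids = sorted(set(sample_ids))      # drop duplicate IDs, fix the order
    rng = random.Random(seed)          # seeded so the split is repeatable
    rng.shuffle(ids)
    n_validation = round(len(ids) * validation_fraction)
    validation = set(ids[:n_validation])
    training = set(ids[n_validation:])
    return training, validation


if __name__ == "__main__":
    training, validation = split_reference_data(range(1, 21), validation_fraction=0.3)
    print("training units:", sorted(training))
    print("validation units:", sorted(validation))
```

A purely random split guarantees that no unit is shared, but it does not by itself address spatial autocorrelation between neighboring units, which still has to be handled in the sampling design.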

Goal
- To obtain the data necessary for completing the training and validation components of the project as efficiently and effectively as possible.
- Requires thorough documentation of the entire process.

Two Important Terms
- Sampling unit: the area covered by a single reference data sample
  - Must account for positional error
  - Must be homogeneous (same land cover class)
  - NEVER referred to as a point
- Minimum Mapping Unit (MMU): the smallest area that will be mapped
  - Represents the required level of detail
  - Usually larger than a single pixel

Field Guide Procedures
- We need to agree on the procedures that will be used to collect the reference data.
- The rest of the presentation will document what we have agreed upon to date and highlight ideas for discussion.
- It is important that we agree to these methods very soon, as they need to be implemented this summer.

Mapping Objectives
- Crop Type (8): wheat, maize, rice, barley, soybeans, pulses, cotton, and potatoes
- Crop Intensity (4): single, double, triple, and continuous cropping
- Irrigation (2): irrigated or rainfed

Crop Type Definitions
- Definitions are a critical component of reference data collection.
- It is imperative that everyone agree on the definitions and then apply them, rather than relying on their own ideas of what a land cover type is.
- Definitions of the 8 crop types are presented in the hardcopy paper.

Crop Intensity Definition
- Crop intensity is expressed as the number of cropping cycles in the same area in a single year.
- The choices are single, double, triple, and continuous.

Irrigation Definitions
- Irrigated crops: areas which are irrigated one or more times during the crop growing season. Irrigation is the artificial application of any amount of water to overcome crop water stress.
- Rainfed crops: areas which have no irrigation whatsoever and are purely precipitation dependent.
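The three mapping objectives lend themselves to a fixed vocabulary, so that every field record uses the agreed class names rather than free text. The sketch below is one possible encoding, not the project's actual data model: the class and field names are hypothetical, and the only alias included is 'Paddy' for rice, which the existing reference data are reported to use.

```python
from dataclasses import dataclass
from enum import Enum


class CropType(Enum):
    WHEAT = "wheat"
    MAIZE = "maize"
    RICE = "rice"
    BARLEY = "barley"
    SOYBEANS = "soybeans"
    PULSES = "pulses"
    COTTON = "cotton"
    POTATOES = "potatoes"


class CropIntensity(Enum):
    SINGLE = "single"
    DOUBLE = "double"
    TRIPLE = "triple"
    CONTINUOUS = "continuous"


class Irrigation(Enum):
    IRRIGATED = "irrigated"
    RAINFED = "rainfed"


# Alias table (assumption): only 'paddy' is taken from the existing data.
_CROP_ALIASES = {"paddy": "rice"}


def parse_crop_type(label):
    """Map a free-text label to a CropType; raises ValueError if unknown."""
    key = label.strip().lower()
    return CropType(_CROP_ALIASES.get(key, key))


@dataclass(frozen=True)
class ReferenceSample:
    """One sampling unit; field names are hypothetical."""
    latitude: float
    longitude: float
    crop_type: CropType
    intensity: CropIntensity
    irrigation: Irrigation

    def __post_init__(self):
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")


if __name__ == "__main__":
    sample = ReferenceSample(23.5, 88.1, parse_crop_type("Paddy"),
                             CropIntensity.DOUBLE, Irrigation.IRRIGATED)
    print(sample)
```

Rejecting unknown labels at entry time keeps naming inconsistencies out of the combined data set.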

Pre-field Activity
- Stratification of the continent
  - Requires geospatial information
    - Imagery
    - Thematic maps
    - Other geospatial data (roads, elevation, slope, aspect, ecological regions, etc.)
  - Want as few strata as possible, but enough to adequately represent the diversity of the continent
  - I had asked each of you to look into this

Flow Chart of Procedure

Stratification
- Based on local knowledge and geospatial data
- Implement unsupervised clustering or Hseg to see distinct spectral groupings
- Guide the field sampling

Stratified Sampling
Determine the sample size in each stratum.
- n: the total sample size
- n_h: the sample size of stratum h
Note: If some crop types are rare, make sure to take a minimum number of samples in those crop types.
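The allocation formula from this slide did not survive in the text, so the sketch below substitutes a common choice, proportional allocation (n_h = n × N_h / N, where N_h is the size of stratum h and N the total), with a floor for rare strata as the note asks. The symbols N_h and N, the function name `allocate_samples`, and the `min_per_stratum` parameter are assumptions introduced for illustration.

```python
def allocate_samples(stratum_sizes, total_samples, min_per_stratum=0):
    """Proportional allocation with a minimum per stratum (an assumed rule).

    stratum_sizes: dict mapping stratum name -> size (e.g. area or pixel count).
    Returns a dict mapping stratum name -> n_h.
    """
    total_size = sum(stratum_sizes.values())
    if total_size <= 0:
        raise ValueError("stratum sizes must sum to a positive value")
    allocation = {}
    for name, size in stratum_sizes.items():
        proportional = round(total_samples * size / total_size)
        # Rare strata are raised to the floor so they are not under-sampled.
        allocation[name] = max(min_per_stratum, proportional)
    return allocation


if __name__ == "__main__":
    sizes = {"stratum A": 700, "stratum B": 250, "stratum C": 50}
    print(allocate_samples(sizes, total_samples=100, min_per_stratum=10))
```

Because of the floor and of rounding, the allocated counts can add up to more than n; that is usually acceptable when the floor exists to protect rare classes.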

General Principles – Field Survey
- The total number of reference data sampling units collected should be equal to or greater than n_h.
- Stop and sample at places where more than one crop type (spectral cluster) is located, and collect one or more sampling units for each, provided:
  - Only one sampling unit is taken in a single cluster.
  - Sampling units are not collected beyond the maximum distance from the road (D_max).
  - The distance between sampling units is larger than the minimum distance (D_min), to minimize spatial autocorrelation.
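The two distance rules can be checked mechanically once candidate sampling units have coordinates. The sketch below is a simplified illustration that covers only the maximum road distance and the minimum spacing, not the one-unit-per-cluster rule. It assumes coordinates are in a projected system measured in metres and that each candidate already carries its distance to the nearest road; the function name and dictionary keys are hypothetical.

```python
import math


def select_sampling_units(candidates, d_max, d_min):
    """Keep candidates within d_max of a road and more than d_min apart.

    candidates: list of dicts with keys 'id', 'x', 'y' (metres, projected)
    and 'road_distance' (metres). Candidates are considered in list order,
    so earlier candidates take priority (a greedy choice, an assumption).
    """
    selected = []
    for unit in candidates:
        if unit["road_distance"] > d_max:
            continue  # beyond the maximum distance from the road
        too_close = any(
            math.dist((unit["x"], unit["y"]), (kept["x"], kept["y"])) <= d_min
            for kept in selected
        )
        if too_close:
            continue  # spacing must be larger than the minimum distance
        selected.append(unit)
    return selected


if __name__ == "__main__":
    candidates = [
        {"id": "A", "x": 0.0, "y": 0.0, "road_distance": 50.0},
        {"id": "B", "x": 100.0, "y": 0.0, "road_distance": 50.0},
        {"id": "C", "x": 1000.0, "y": 0.0, "road_distance": 2000.0},
        {"id": "D", "x": 1000.0, "y": 0.0, "road_distance": 100.0},
    ]
    kept = select_sampling_units(candidates, d_max=1000.0, d_min=500.0)
    print([unit["id"] for unit in kept])
```

The values of D_max and D_min are left as parameters because they are not given here.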

Diagram of Sampling Approach

Field Form

Using Higher-Resolution Imagery: Some Thoughts
- Our plan is to do this.
- Must assess our ability to interpret the map classes in selected areas using ground reference data.
- If the interpretation is not accurate, then we cannot use it.
- Hoping that the Hseg process can help here.
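The assessment described here is typically summarized in an error matrix, as noted under Sources of Reference Data. The sketch below builds one from paired labels, with rows as the image interpretation and columns as the ground reference (a common convention), and derives overall, user's, and producer's accuracy. The function names and the sample labels are illustrative assumptions, not project data.

```python
def error_matrix(reference, interpreted, classes):
    """Rows = interpreted (image) class, columns = ground reference class."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref_label, int_label in zip(reference, interpreted):
        matrix[index[int_label]][index[ref_label]] += 1
    return matrix


def overall_accuracy(matrix):
    """Correctly interpreted units divided by all units."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("error matrix is empty")
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def users_accuracy(matrix, i):
    """Diagonal cell over its row total (reflects commission error)."""
    row_total = sum(matrix[i])
    return matrix[i][i] / row_total if row_total else float("nan")


def producers_accuracy(matrix, j):
    """Diagonal cell over its column total (reflects omission error)."""
    column_total = sum(row[j] for row in matrix)
    return matrix[j][j] / column_total if column_total else float("nan")


if __name__ == "__main__":
    classes = ["irrigated", "rainfed"]
    # Made-up labels for illustration only.
    reference = ["irrigated", "irrigated", "rainfed", "rainfed", "rainfed"]
    interpreted = ["irrigated", "rainfed", "rainfed", "rainfed", "irrigated"]
    m = error_matrix(reference, interpreted, classes)
    print(m, overall_accuracy(m))
```

What accuracy counts as "accurate enough" is a project decision and is not set by this sketch.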

Discussion
- Time to make plans and discuss issues
- THIS IS INCREDIBLY IMPORTANT!!!