A global land-cover validation data set, part I: fundamental design principles


This article was downloaded by: [European Space Agency] On: 25 April 2012, At: 01:59 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

International Journal of Remote Sensing Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tres20

A global land-cover validation data set, part I: fundamental design principles

Pontus Olofsson (a), Stephen V. Stehman (b), Curtis E. Woodcock (a), Damien Sulla-Menashe (a), Adam M. Sibley (a), Jared D. Newell (a), Mark A. Friedl (a) & Martin Herold (c)

(a) Department of Geography and Environment, Boston University, Boston, MA, 02215, USA
(b) Department of Forest and Natural Resources Management, State University of New York, Syracuse, NY, 13210, USA
(c) Laboratory of Geo-Information Science and Remote Sensing, Wageningen University, 6708, Wageningen, The Netherlands

Available online: 29 Mar 2012

To cite this article: Pontus Olofsson, Stephen V. Stehman, Curtis E. Woodcock, Damien Sulla-Menashe, Adam M. Sibley, Jared D. Newell, Mark A. Friedl & Martin Herold (2012): A global land-cover validation data set, part I: fundamental design principles, International Journal of Remote Sensing, 33:18, 5768-5788

To link to this article: http://dx.doi.org/10.1080/01431161.2012.674230

PLEASE SCROLL DOWN FOR ARTICLE

Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.

International Journal of Remote Sensing Vol. 33, No. 18, 20 September 2012, 5768–5788


A global land-cover validation data set, part I: fundamental design principles

PONTUS OLOFSSON*†, STEPHEN V. STEHMAN‡, CURTIS E. WOODCOCK†, DAMIEN SULLA-MENASHE†, ADAM M. SIBLEY†, JARED D. NEWELL†, MARK A. FRIEDL† and MARTIN HEROLD§

†Department of Geography and Environment, Boston University, Boston, MA 02215, USA
‡Department of Forest and Natural Resources Management, State University of New York, Syracuse, NY 13210, USA
§Laboratory of Geo-Information Science and Remote Sensing, Wageningen University, 6708 Wageningen, The Netherlands

(Received 28 April 2011; in final form 25 September 2011)

A number of land-cover products, both global and regional, have been produced and more are forthcoming. Assessing their accuracy would be greatly facilitated by a global validation database of reference sites that allows for comparative assessments of uncertainty for multiple land-cover data sets. We propose a stratified random sampling design for collecting reference data. Because the global validation database is intended to be applicable to a variety of land-cover products, the stratification should be implemented independently of any specific map to facilitate general utility of the data. The stratification implemented is based on the Köppen climate/vegetation classification and population density. A map of the Köppen classification was manually edited and intersected with two layers of population density and a land-water mask. A total of 21 strata were defined and an initial global sample of 500 reference sites was selected, with each site being a 5 × 5 km block. The decision of how to allocate the sample size to strata was informed by examining, for two global products, the distribution of the sample area among land-cover classes that results from different sample size allocations to the 21 strata. The initial global sample of 500 sites selected from the Köppen-based stratification indicates that these strata can be used effectively to distribute sample sites among rarer land-cover classes of the two global maps examined, although the strata were not constructed using these maps. This is the first article of two, with the second paper presenting details of how the sampling design can be readily augmented to increase the sample size in targeted strata for the purpose of increasing the sample sizes for rare classes of a particular map being evaluated.

1. Introduction

Land cover is a fundamental property of landscapes, and an accurate characterization of land cover is essential for a number of different scientific areas. Several global land-cover products have been produced (Ledwith 2000, Loveland et al. 2000, Friedl et al. 2002, Bicheron et al. 2008) and more are forthcoming (Jung et al. 2006). Numerous regional land-cover products exist as well (Herold et al. 2006). Documentation of the accuracy of these land-cover maps is necessary to allow users to evaluate the utility of a map for their particular applications. Typically, each land-cover map has been subject to an independent accuracy assessment, but this is an expensive and inefficient approach to validation. Further, the lack of consistency in the accuracy assessment methodology implemented and in the reporting of results hinders accuracy comparisons among maps. A coordinated, comparable and regularly updated global land-cover validation database would be a cost-efficient approach to validate large-area land-cover products, and it would enhance the ability to compare different maps.

*Corresponding author. Email: [email protected]
International Journal of Remote Sensing ISSN 0143-1161 print/ISSN 1366-5901 online © 2012 Taylor & Francis http://www.tandfonline.com http://dx.doi.org/10.1080/01431161.2012.674230

1.1 History of global land-cover assessment

It has long been recognized that anyone can make a map of any surface phenomenon. The real question is how accurate the map is for a specific purpose. One perspective on this problem is that a map is nothing more than a hypothesis until it is tested against higher quality reference data. It is this perspective that drives the global scientific community to demand robust and continuous accuracy assessment of existing and future land-cover products.

Rigorous accuracy assessment of maps at high resolutions has been the norm for decades (see the book by Congalton and Green (1999)), but cost and practical problems posed by the assessment of global maps have limited the efforts to assess products at these scales. A thorough randomized accuracy assessment was completed for the International Geosphere-Biosphere Programme (IGBP) DISCover land-cover map (Scepan 1999).
This effort involved expert interpretation of individual 1 km pixels on high-resolution imagery and used many of the methods identified in the ‘best practices’ document (Strahler et al. 2006), resulting in the production of confusion matrices and estimates of class-specific accuracies and their associated standard errors. One lesson learnt from the IGBP exercise is that accuracy assessment is expensive, thereby highlighting the imperative to develop global validation schemes that support the assessment of more than one map at a time. For example, the accuracy assessment data collected (at significant expense) for the IGBP/DISCover land-cover map could not be used to produce statistically defensible accuracy estimates for a University of Maryland (UMD) land-cover map despite the fact that both maps were based on AVHRR data of the same year. Differences in the legends of the two maps prevented application of the DISCover validation data to the UMD map. As a result, to date there is no such rigorous global assessment of the accuracy of the UMD land-cover map.

An accuracy assessment of the Global Land Cover 2000 (GLC 2000, Ledwith 2000) map was done using a randomized sample and image interpreters from around the world (Mayaux et al. 2006). The primary weakness of this effort relates to budget constraints that limited the amount of training and the resulting consistency between interpreters.

GlobCover, based on 300 m MERIS (Medium Resolution Imaging Spectrometer) data from ENVISAT (Environmental Satellite) from the year 2005, was subjected to a validation effort in 2008 (Bicheron et al. 2008). A total of 4258 sample points were selected using stratified random sampling. Certain areas of the world were unsampled or poorly sampled, including Pakistan, Afghanistan, Iran, Japan, Colombia, Central America and Eastern Brazil. Of the original 4258 sample points, 1091 were discarded as separate interpreters did not agree on the sample land cover and another 1052 were discarded because the interpreters failed to identify a single unique land cover. The reported accuracy is therefore based on unambiguous, easily identified sample points. The exclusion of certain geographic areas further weakens the validation effort.

The IGBP MODIS (Moderate Resolution Imaging Spectroradiometer) land-cover product (Friedl et al. 2002, 2010) has not been subjected to the rigors of comparison with a probability sample of accuracy assessment sites, so statistically defensible estimates of accuracy are not available. More limited efforts have been made using post-classification probabilities and opportunistic samples from around the world based on interpretation primarily of Landsat imagery. These results are available at www-modis.bu.edu/landcover and are documented in Herold et al. (2006).

Direct comparison of the accuracies of the various land-cover products is difficult. Problems stem from the lack of consistent legends (e.g. IGBP vs Land Cover Classification System (LCCS)), lack of consistent sample designs and intensities and, in the case of MODIS, the absence of a probability sampling design. Despite this gloomy situation, recent progress on two fronts is encouraging with respect to the future of accuracy assessment of land cover. First, the international effort to define consensus ‘best available’ methods for accuracy assessment has been completed and published (Strahler et al. 2006). Second, international progress towards acceptance of LCCS as the standard for defining legends for land-cover maps points to a time when direct comparison of alternative land-cover products will be possible.

The main limitation of the efforts described above lies in the inability to make significant use of accuracy assessment data for the validation of multiple maps. This raises the cumulative cost of performing rigorous accuracy assessments on all of these global land-cover maps.
If, instead, we had a global validation database available and applicable to more than one land-cover map at a particular time, the overall cost invested in accuracy assessments of regional and global maps would be greatly reduced and different maps could be compared more readily.

1.2 Constructing a reference database for validating multiple maps

The overwhelming majority of the research and practical experience with accuracy assessment accumulated over the past 30 years applies to ‘one at a time’ assessment of land-cover maps. The question of how to assess the accuracy of several maps in an integrated, cohesive manner has received very little attention. The fundamental components of the accuracy assessment methodology remain the same whether a single map or multiple maps are being assessed. That is, the methodology must include specification of the sampling design, response design and analysis protocols (Stehman and Czaplewski 1998).

The sampling design and response design protocols become more complicated when the goal is to develop a methodology that is applicable to more than one land-cover map. The challenge for the sampling design is how to accommodate the multiple accuracy estimation objectives arising from assessing several maps. Typically, the sampling design can be tailored to achieve a fixed set of accuracy objectives for a particular map. However, the sampling design considerations change when the objectives encompass the assessment of several maps (e.g. tailoring the stratification to a specific map will likely reduce the benefit of stratification for other maps), so the typical sampling designs applied to assess a particular map may not be effective when several maps must be assessed from a common sampling design. Similarly, when developed to assess a single map, the choice of the response design protocol needs to address only the land-cover legend and spatial resolution of that particular map. If the goal is to create a database applicable to multiple land-cover maps, the response design considerations become more complex in an attempt to accommodate different legends and spatial resolutions.

This article is the first of a two-part series developing the conceptual framework and specific methodological details for constructing a global validation database that, over an extended period of time, could be used to assess the accuracy of multiple land-cover maps. In this article, the key elements of the accuracy assessment protocol required to select an initial baseline global validation sample are developed. Of particular interest is the specification of a practical and generally applicable stratification scheme and a mechanism to evaluate different options for allocating the initial baseline validation sample to these strata. In the companion paper (part II; Stehman et al. in press) the details of the sampling methodology for augmenting the baseline global sample over time are presented.

The response design for the reference class labelling protocol uses a legend based on the United Nations’ Food and Agriculture Organization LCCS (Di Gregorio and Jansen 2000) for classifying high-resolution imagery. This legend has been developed to allow the analyst to provide additional distinctions of ecology, hydrologic regimes and land use within each primary class label. It is based on the comparison of a reference map to the land cover of the map being evaluated, with pixels matched using fuzzy membership according to the methods in Ahlqvist (2005).

2. Sampling design

The global validation database will be based on a stratified random sample of 5 × 5 km blocks at which the ‘reference’ land-cover data for the locations are determined from very high resolution satellite imagery (e.g. QuickBird). Within each sample block, a ‘map’ of the land cover is derived from the high-resolution imagery.
To create the sampling frame for implementing the sampling design, the land surface of the earth was partitioned into 5 × 5 km blocks. The sampling design for selecting the sample blocks was constructed to satisfy several criteria: (1) it satisfies the definition of a probability sampling design; (2) it provides adequate sample sizes for rare land-cover classes; (3) it allows flexibility to change sample size in response to unpredictable funding or revised accuracy assessment objectives; (4) it focuses sample sites in the areas most difficult for land-cover mapping. Rather than stratifying by a single land-cover map, which would diminish the advantage of stratification when evaluating other maps, strata were constructed based on a combination of Köppen climate classes (Peel et al. 2007) and population density. Stratified random sampling provides a probability sampling design that can readily be modified to increase or decrease the sample size within a stratum or targeted geographic region (Stehman et al. in press).

2.1 Stratification

Stratification is commonly incorporated in sampling designs for accuracy assessment, because stratification allows specification of the sample size allocated to each stratum. Typically, the strata represent land-cover classes, so stratified sampling allows for increasing the sample size from rare land-cover classes, which in turn decreases the standard errors of the accuracy estimates for these rare classes. Within strata, sample units can be selected via a simple random or systematic protocol. Stratified random sampling in which simple random sampling is used within each stratum is the proposed option because of the ease of increasing or decreasing the sample size within strata afforded by simple random selection (Stehman et al. in press).

As we wanted to keep the selection of sampling sites independent of any of the existing land-cover products, an external source of information was sought as a basis for stratification. The fundamental assumption is that current land cover worldwide is influenced by climate as the natural driver and human disturbance as the anthropogenic driver. A stratification using representative information for both variables should offer a suitable foundation to allocate samples independent of existing land-cover maps yet relevant for future land-cover data sets.

The Köppen climate system classifies the world into five major climatic groups based on annual and monthly air temperature and precipitation. Each of these groups is further divided into subgroups based on seasonal patterns of temperature and precipitation. As the climate groups are intended to correspond to vegetation groups (Trewartha 1968), the Köppen system provides a suitable basis for stratification. We used an updated Köppen version in which 32 classes are identified at 0.1° resolution or 11 km at the equator (Peel et al. 2007; figure 1 and table 1). The classification is based on the data from the National Oceanic and Atmospheric Administration’s (NOAA’s) Global Historical Climatology Network and is independent of existing land-cover maps.

Figure 1. World map of Köppen–Geiger climate classification. The original Köppen map from Peel et al. (2007). The classes are represented by a two- or three-letter symbol, where ‘A’ is tropical, ‘B’ is arid, ‘C’ is temperate, ‘D’ is cold and ‘E’ is polar. The second letter represents the seasonal precipitation levels and the third temperature. Note that classes ‘Csc’ (temperate, cold and dry summers), ‘EFH’ and ‘ETH’ (the high-altitude types of the frost and tundra classes) are not included in this figure. Figure reproduced with permission.
[Map legend: Af, Am, Aw, BWh, BWk, BSh, BSk, Csa, Csb, Cwa, Cwb, Cwc, Cfa, Cfb, Cfc, Dsa, Dsb, Dsc, Dsd, Dwa, Dwb, Dwc, Dwd, Dfa, Dfb, Dfc, Dfd, ET, EF. Data source: GHCN v2.0 station data, temperature (N = 4 844) and precipitation (N = 12 396). Period of record: all available. Min length: ≥30 days for each month. Resolution: 0.1° lat/long. Contact: Murray C. Peel ([email protected]) for further information.]

2.1.1 Modifying the Köppen map. As 32 climatic groups were considered too many for use as strata, these 32 classes were collapsed to 13. In some cases, entire classes were merged, for example, classes 4 and 5 (BWh: ‘Desert, hot’ and BWk: ‘Desert, cold’) were merged into one desert class. In other cases, classes were manually merged into different classes for different geographic regions; for example, class 17 (Dsa: ‘Continental, dry and hot summer’) was merged with the steppe class in Asia, whereas
in North America, it was merged with the continental forest class. Table 2 summarizes the collapsing of the initial 32 classes to the final 13 climate classes. The workflow for the creation of the final strata map from the original Köppen map is described in the three subsections below. The workflow is also outlined in the block diagram in figure 2.

In addition, as elevation was not taken into account when interpolating the climatic station data (Peel et al. 2007), certain climatic borders were not evident in the original Köppen map. For example, the Andes in Patagonia, South America, create a distinct climatic border between a moist temperate climate and the more arid areas east of the Andes (Trewartha 1968), whereas the Cfb zone (temperate, no dry season, warm summer) clearly stretches too far east of the mountain range (figure 3). As a final editing task, small islands of climate zones within larger zones without obvious climatic basis were removed (figure 4; compare with figure 3). Trewartha (1968) and Strahler and Strahler (2004) were the primary sources of information for this editing step, and the resulting map is shown in figure 5.

Table 1. The climatic groups of the original Köppen map.

No.  Group                                           Code  Distribution (%)
1    Tropical     Rainforest                         Af    3.2
2    Tropical     Monsoon                            Am    2.6
3    Tropical     Savanna                            Aw    9.1
4    Arid         Desert         Hot                 BWh   12.0
5    Arid         Desert         Cold                BWk   4.2
6    Arid         Steppe         Hot                 BSh   4.8
7    Arid         Steppe         Cold                BSk   6.1
8    Temperate    Dry summer     Hot summer          Csa   1.1
9    Temperate    Dry summer     Warm summer         Csb   0.7
10   Temperate    Dry summer     Cold summer         Csc   0.0
11   Temperate    Dry winter     Hot summer          Cwa   2.8
12   Temperate    Dry winter     Warm summer         Cwb   1.4
13   Temperate    Dry winter     Cold summer         Cwc   0.0
14   Temperate    No dry season  Hot summer          Cfa   4.0
15   Temperate    No dry season  Warm summer         Cfb   2.1
16   Temperate    No dry season  Cold summer         Cfc   0.1
17   Continental  Dry summer     Hot summer          Dsa   0.3
18   Continental  Dry summer     Warm summer         Dsb   0.4
19   Continental  Dry summer     Cold summer         Dsc   0.6
20   Continental  Dry summer     Very cold winter    Dsd   0.0
21   Continental  Dry winter     Hot summer          Dwa   0.8
22   Continental  Dry winter     Warm summer         Dwb   1.5
23   Continental  Dry winter     Cold summer         Dwc   2.2
24   Continental  Dry winter     Very cold winter    Dwd   0.4
25   Continental  No dry season  Hot summer          Dfa   1.7
26   Continental  No dry season  Warm summer         Dfb   8.7
27   Continental  No dry season  Cold summer         Dfc   16.6
28   Continental  No dry season  Very cold winter    Dfd   1.7
29   Polar        Tundra                             ET    7.7
30   Polar        Frost                              EF    3.1
31   Polar        Tundra         >1500 m             ETH   0.3
32   Polar        Frost          >1500 m             EFH   0.0

Note: The rightmost column gives the global distribution of the groups.

Figure 2. Block diagram outlining the workflow for creating the final strata map.
[Diagram content, in workflow order:
Section 2.1.1: Original Köppen map (32 classes; table 1; figure 1) -> merging of classes -> Collapsed Köppen map (13 classes; table 2) -> manual editing -> Edited Köppen map (13 classes; table 2; figure 4).
Section 2.1.2: Edited Köppen map -> intersection with population density (>5 persons/km²; figure 6) -> Strata map v. 1 (26 classes: 13 populated and 13 unpopulated) -> merging of classes -> Strata map v. 2 (20 classes: 7 populated and 13 unpopulated) -> intersection with population density (>1000 persons/km²; figure 7) -> Strata map v. 3 (21 classes: 7 pop., 13 unpop., 1 urban; table 3).
Section 2.1.3: Strata map v. 3 -> masking with water mask (>25% water) -> Final strata map (21 classes with water pixels removed; figure 8).]

Table 2. Modified classes (a superscript on an original number, written here as ^1, ^2 or ^3, marks a regional merge: 1 = in Asia; 2 = in North America; 3 = around the Himalayas).

New no.  Group                       Original numbers
1        Tropical rainforest         1
2        Tropical seasonal forest    2
3        Savanna                     3
4        Desert                      4 + 5
5        Steppe                      6 + 7 + 16 + 17^1 + 18^1 + 22
6        Mediterranean               8:13
7        Temperate evergreen forest  14
8        Marine west coast           15 + 16
9        Continental forest          17^2 + 18^2 + 21 + 22 + 25 + 26
10       Boreal forest               19 + 23 + 27
11       Cold boreal forest          20 + 24 + 28
12       Tundra                      23^3 + 29 + 31
13       Frost                       30 + 32
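The merging step summarized in table 2 amounts to a lookup from original to collapsed class numbers. The following is a minimal sketch in Python with NumPy, not the authors' procedure: it covers only the merges that table 2 lists without a regional superscript and under a single group, it assumes that ‘8:13’ means classes 8 through 13, and the `UNASSIGNED` code and the function name `collapse` are hypothetical. Classes 16, 17, 18, 22 and 23 are deliberately left unassigned because their target depends on region or is listed under two groups.

```python
import numpy as np

# Original Köppen class number (table 1) -> collapsed class number (table 2).
# Only merges without a regional superscript and listed under one group.
# Assumption: "8:13" in table 2 is read as classes 8 through 13.
UNAMBIGUOUS = {
    1: 1, 2: 2, 3: 3,
    4: 4, 5: 4,
    6: 5, 7: 5,
    8: 6, 9: 6, 10: 6, 11: 6, 12: 6, 13: 6,
    14: 7,
    15: 8,
    21: 9, 25: 9, 26: 9,
    19: 10, 27: 10,
    20: 11, 24: 11, 28: 11,
    29: 12, 31: 12,
    30: 13, 32: 13,
}
UNASSIGNED = 0  # hypothetical code: cell still needs a regional or manual rule


def collapse(koppen):
    """Map an array of original class numbers (1..32) to collapsed classes."""
    lut = np.full(33, UNASSIGNED, dtype=np.uint8)  # index 0 is unused
    for old, new in UNAMBIGUOUS.items():
        lut[old] = new
    return lut[np.asarray(koppen)]
```

For instance, `collapse([4, 5, 17])` maps both desert classes to class 4 and leaves class 17 unassigned, since table 2 sends it to steppe in Asia and to continental forest in North America.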

Figure 3. The original Köppen climate zones for lower South America.
[Legend: Af, tropical rainforest; Am, tropical monsoon; Aw, tropical savanna; BWh, arid desert hot; BWk, arid desert cold; BSh, arid steppe hot; BSk, arid steppe cold; Csa, temperate dry and hot summer; Csb, temperate dry and warm summer; Cwa, temperate dry winter hot summer; Cwb, temperate dry winter warm summer; Cwc, temperate dry winter cold summer; Cfa, temperate no dry season hot summer; Cfb, temperate no dry season warm summer; Cfc, temperate no dry season cold summer; ET, polar tundra.]

2.1.2 Intersection by population data. Although climate groups are likely to be related to natural land covers, areas with a high degree of human activity have land cover with little or no relation to climate. For example, Western Europe lies mainly within the ‘temperate, warm summer, no dry season’ zone, yet its human-shaped landscape is highly heterogeneous, with a mix of different land covers. To reflect this dimension of land-cover complexity, data on population density were intersected with the climate map. A global population density map from 2000 (CIESIN 2005) at a resolution of 2.5 arc minutes (4.6 km at the equator) was used for this purpose. A threshold of 5 persons/km² was chosen to separate populated from unpopulated areas (figure 6), so that each climate type was split into populated and unpopulated categories. This yielded a total of 26 classes. Of these, the populated versions of the groups cold boreal forest, tundra and frost (11, 12 and 13) were merged with the unpopulated versions as they contained only a handful of grid cells. The unpopulated versions of Mediterranean and marine west coast were also small and thus merged with the populated versions. Finally, populated boreal forest was merged with the unpopulated version because of a relatively small number of populated pixels. A total of 20 strata, 7 populated and 13 unpopulated, were the result of the merging process.
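The split and the subsequent merges described above can be sketched as two small array operations. This is an illustrative sketch only: the provisional coding (1 to 13 for unpopulated cells, 14 to 26 for their populated counterparts) and the function names are assumptions introduced here, and the merged Mediterranean and marine west coast classes simply keep their base codes 6 and 8, as they carry no ‘p’ prefix in table 3.

```python
import numpy as np

POP_THRESHOLD = 5.0  # persons per km², the threshold given in the text
N_CLIMATE = 13       # collapsed climate classes (table 2)

# Hypothetical provisional codes: populated version of class k is k + 13.
# Merges described in the text: populated 10, 11, 12, 13 fold into the
# unpopulated versions; Mediterranean (6) and marine west coast (8) become
# single classes.
MERGE = {19: 6, 21: 8, 23: 10, 24: 11, 25: 12, 26: 13}


def split_by_population(climate, density, threshold=POP_THRESHOLD):
    """Return 1..13 for unpopulated cells and 14..26 for populated cells."""
    populated = np.asarray(density) > threshold
    return np.asarray(climate) + N_CLIMATE * populated


def merge_small_classes(codes):
    """Fold the small classes named in the text into their counterparts."""
    lut = np.arange(2 * N_CLIMATE + 1)  # identity lookup for codes 0..26
    for src, dst in MERGE.items():
        lut[src] = dst
    return lut[np.asarray(codes)]
```

Applying `merge_small_classes` to all 26 provisional codes leaves 20 distinct values, matching the 20 strata reported in the text; renumbering them to the order of table 3 is left out of the sketch.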

Figure 4. Same as figure 3 but after the manual editing of the climate zones.
[Legend: tropical rainforest, tropical seasonal forest, tropical savanna, desert, steppe, Mediterranean, temperate evergreen forest, marine west coast, tundra.]

Figure 5. The Köppen map collapsed to 13 classes.
[Legend: tropical rainforest, tropical seasonal forest, tropical savanna, desert, steppe, Mediterranean, temperate evergreen forest, marine west coast, continental forest, boreal forest, tundra, snow and ice.]

Figure 6. Areas with a population density above 5 persons/km².

Figure 7. Areas with a population density above 1000 persons/km² for parts of Europe, Africa and Asia.

None of the 20 strata effectively captured cities, so a final stratum was defined based solely on population density. This was accomplished by reclassifying all areas with a population density over 1000 persons/km2 to an urban stratum (figure 7). This gave a final number of 21 strata (table 3).
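The urban reclassification is a simple override of the existing stratum wherever the density threshold is exceeded. A minimal sketch, assuming the 20 strata are already numbered 1 to 20 as in table 3 and using the hypothetical function name `add_urban_stratum`:

```python
import numpy as np

URBAN_THRESHOLD = 1000.0  # persons per km², the threshold given in the text
URBAN = 21                # urban is listed as stratum no. 21 in table 3


def add_urban_stratum(strata, density, threshold=URBAN_THRESHOLD):
    """Overwrite the stratum code with URBAN where density exceeds threshold."""
    return np.where(np.asarray(density) > threshold, URBAN, np.asarray(strata))
```

Because the override is applied last, a dense cell is assigned to the urban stratum regardless of its climate class.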

Table 3. The final strata and their global distribution and allocation.

No.  Stratum                       Distribution (%)  Proportional allocation  Final allocation
1    Tropical rainforest           2.4               12                       10
2    Tropical seasonal forest      2.0               10                       10
3    Savanna                       5.0               25                       15
4    Desert                        14.4              72                       20
5    Steppe                        8.3               41                       20
6    Mediterranean                 1.6               8                        25
7    Temperate evergreen forest    1.2               6                        25
8    Marine west coast             1.6               8                        25
9    Continental forest            4.3               22                       30
10   Boreal forest                 12.7              63                       50
11   Cold boreal forest            1.2               6                        10
12   Tundra                        3.3               17                       10
13   Frost                         1.2               6                        0
14   pTropical rainforest          2.2               11                       15
15   pTropical seasonal forest     1.9               10                       10
16   pTropical savanna             11.0              55                       40
17   pDesert                       6.0               30                       25
18   pSteppe                       7.0               35                       35
19   pTemperate evergreen forest   5.2               26                       40
20   pContinental forest           6.7               34                       50
21   Urban                         0.6               3                        35

Note: A ‘p’ in front of a stratum name denotes populated version.
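The proportional allocation column of table 3 can be approximately reproduced by multiplying each stratum's area share by the 500 blocks of the baseline sample. The sketch below copies the table values and computes the unrounded proportional share; the names `TABLE_3` and `proportional_blocks` are introduced here for illustration. The computed values may differ from the published column by up to about half a block, presumably because the published percentages are themselves rounded.

```python
TOTAL_BLOCKS = 500  # size of the initial baseline sample

# (stratum, area share in %, proportional allocation, final allocation),
# copied from table 3.
TABLE_3 = [
    ("Tropical rainforest", 2.4, 12, 10),
    ("Tropical seasonal forest", 2.0, 10, 10),
    ("Savanna", 5.0, 25, 15),
    ("Desert", 14.4, 72, 20),
    ("Steppe", 8.3, 41, 20),
    ("Mediterranean", 1.6, 8, 25),
    ("Temperate evergreen forest", 1.2, 6, 25),
    ("Marine west coast", 1.6, 8, 25),
    ("Continental forest", 4.3, 22, 30),
    ("Boreal forest", 12.7, 63, 50),
    ("Cold boreal forest", 1.2, 6, 10),
    ("Tundra", 3.3, 17, 10),
    ("Frost", 1.2, 6, 0),
    ("pTropical rainforest", 2.2, 11, 15),
    ("pTropical seasonal forest", 1.9, 10, 10),
    ("pTropical savanna", 11.0, 55, 40),
    ("pDesert", 6.0, 30, 25),
    ("pSteppe", 7.0, 35, 35),
    ("pTemperate evergreen forest", 5.2, 26, 40),
    ("pContinental forest", 6.7, 34, 50),
    ("Urban", 0.6, 3, 35),
]


def proportional_blocks(share_pct, total=TOTAL_BLOCKS):
    """Unrounded number of blocks under allocation proportional to area."""
    return total * share_pct / 100.0


if __name__ == "__main__":
    # Print computed proportional share next to the published columns.
    for name, share, prop, final in TABLE_3:
        print(f"{name:28s} {proportional_blocks(share):6.1f} {prop:4d} {final:4d}")
```

The comparison makes the motivation for the final allocation visible: the urban stratum would receive about 3 blocks under proportional allocation but receives 35 in the final allocation, whereas desert drops from 72 to 20.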

2.1.3 Completing the strata map. Next, blocks dominated by water were removed. A water mask based on MODIS data at 250 m resolution was aggregated to 5 km resolution to give the water percentage in each sample block. A threshold was set so that all blocks with 25% or more water present were screened out. The strata map was reprojected into an equal-area projection (Goode Homolosine) and resampled to 5 km resolution. The final strata for the area in figures 3 and 4 are shown in figure 8, and the complete (global) map of the final strata is shown in figure 9.

The purpose of the stratification is to create the option to allocate the sample size to strata so that the sample size can be increased for targeted land-cover classes. Although the strata do not correspond exactly to the land-cover classes of any particular map, rare land-cover classes will generally be associated with one or more strata. So, to target a particular land-cover class, the sample size would be increased in those strata in which the targeted class is found. The choice of stratification and sample size allocation will affect the standard errors of the accuracy estimators, but the accuracy estimators from stratified sampling are unbiased regardless of these choices.

Figure 8. The final strata for the area shown in figures 3 and 4.
[Legend: tropical rainforest, tropical seasonal forest, tropical savanna, desert, steppe, Mediterranean, temperate evergreen forest, marine west coast, tundra, pTropical rainforest, pTropical seasonal forest, pTropical savanna, pDesert, pSteppe, pTemperate evergreen forest, urban.]

2.2 Stratified sampling and sample size allocation

The sample size for the initial global baseline sample was set to 500 based on approximate cost and precision considerations. For an average cost per sample block of €200, the initial baseline sample could be completed at a cost of €100 000. From the standpoint of precision, a rough approximation based on treating the 500 blocks as a simple random sample would yield a variance for overall estimated accuracy of p × (1 − p) ÷ 500, where p is the true (but unknown) overall accuracy (the standard error of overall accuracy is the square root of the variance). For example, if the true overall global accuracy is 0.6, the standard error is √(0.6 × 0.4 ÷ 500) ≈ 0.02. The actual standard error may be smaller depending on the success of the stratification.

The baseline sample is not intended to serve as an all-purpose, stand-alone set of validation sites, but rather to provide a base to which additional sample units can be added to produce precise accuracy estimates. Although the baseline sample could be used as a default sample, the intended use is for the baseline sample to be augmented (Stehman et al. in press) as subsequent land-cover mapping projects add reference sample data to the existing sample.

Each sample unit is a 5 × 5 km block, so for a land-cover map at 1 km, there would be as many as 25 pixels per block for comparison with the reference map. As the resolution of the land-cover maps becomes finer, the sample size within each block increases. Figure 10 illustrates the use of a 5 × 5 km block as a sample assessment unit. The high-resolution image and reference data are shown as well as the corresponding subsets of the MODIS IGBP land-cover product at 463 m resolution (Friedl et al. 2002) and GlobCover at 300 m resolution (Bicheron et al. 2008). As illustrated by the
figure, for each block, a whole set of pixels from the land-cover map to be validated will be available for comparison with the reference map.

Figure 9. Final strata.
[Legend: tropical rainforest, tropical seasonal forest, tropical savanna, desert, steppe, Mediterranean, temperate evergreen forest, marine west coast, continental forest, boreal forest, tundra, snow and ice, cold boreal forest, pTropical rainforest, pTropical seasonal forest, pTropical savanna, pDesert, pSteppe, pTemperate evergreen forest, pContinental forest, urban.]

The next step following the stratification was to allocate these 500 sample blocks among the different strata. Allocating an equal sample size to each stratum (yielding about 24 sample blocks per stratum) implies that all strata are considered equally important. In reality, strata differ in importance (see below). If the sample is allocated proportional to stratum area, strata considered important such as urban would end up with very few sample blocks whereas large homogeneous strata such as desert would have an unnecessarily large sample size (table 3). Instead, we wanted to target the sample allocation to certain strata and, in turn, certain land-cover classes.

The basis for the allocation was that the complex landscapes which are more difficult to map should have relatively high sample densities. Heterogeneous landscapes with a mix of different land-cover classes (such as mosaics of cropland and natural vegetation and built-up areas) were considered complex whereas homogeneous land covers such as desert and forest were not. This strategy was based on the notion that homogeneous landscapes are easier to classify than heterogeneous landscapes (Jung et al. 2006, Friedl et al. 2010). As it is our objective to assess the accuracy of a variety of different land-cover products, land-cover classes more likely to be misclassified were of higher interest.

A problem with using a stratification that is independent of land-cover products is the lack of one-to-one correspondence between a stratum and a land-cover class. Accordingly, we did not have the means to target individual land-cover classes without increasing the sample density in other classes.
Figure 10. (a) 6 × 6 km QuickBird image over an area in Sugarland, TX (the 5 × 5 km sample block with a 1 km buffer). (b) Land-cover classification of the image – red: artificial areas; dark green: trees; light green: herbaceous/grasslands; blue: water; yellow: herbaceous/croplands; grey: barren. (c) Corresponding subset of the MODIS IGBP land-cover product, and (d) GlobCover land-cover product. The grey rectangle shows the extent of the QuickBird image.

For example, if we wanted to increase the number of sample blocks in the class ‘Urban & Built-up Land’ in the MODIS land-cover product by allocating more sample blocks to the urban stratum, we would, in addition to boosting the sample size for the urban class, also boost the sample size for land-cover classes that tend to appear in proximity to urban areas, such as cropland and cropland mosaic.

An evaluation of the one-to-many relationship between strata and classes was conducted using the MODIS IGBP land-cover product and GlobCover (both global maps reprojected to Goode Homolosine). The distribution of sample area among the land-cover classes of each map was cross-tabulated by strata for different allocations of sample size to strata (i.e. we calculated the area of each MODIS IGBP land class in the sample for a specified sample size allocation of the 500 sample blocks to the 21 strata). The results are shown in figures 11 and 12. The final decision on the allocation of the sample size to strata (table 3) was based on a subjective assessment of which allocation provided an effective distribution of the sample area among land-cover classes for the two global land-cover products examined (i.e. an allocation that increased the sample area for some of the rarer classes at the expense of sample area for more common classes was desirable).
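The cross-tabulation of sample area by stratum and land-cover class can be illustrated with a small sketch. It is not the authors' code: the three strata, the within-stratum class proportions and both allocations are made-up numbers, and `class_share_of_sample` is a hypothetical helper. The sketch assumes that all blocks have equal area, so that the expected share of sample area in a class is the allocation-weighted mean of the within-stratum class proportions.

```python
# Hypothetical within-stratum class proportions (each row sums to 1).
# In practice these would come from overlaying the strata map on a
# land-cover product such as MODIS IGBP or GlobCover.
CLASS_PROPORTIONS = {
    "Desert": {"Barren": 0.90, "Cropland": 0.08, "Urban": 0.02},
    "Steppe": {"Barren": 0.30, "Cropland": 0.65, "Urban": 0.05},
    "Urban": {"Barren": 0.05, "Cropland": 0.45, "Urban": 0.50},
}


def class_share_of_sample(allocation, proportions):
    """Expected share of sample area per class for a given allocation.

    allocation:  stratum -> number of sample blocks (equal-area blocks assumed)
    proportions: stratum -> {class -> share of stratum area in that class}
    """
    total_blocks = sum(allocation.values())
    shares = {}
    for stratum, n_blocks in allocation.items():
        for cls, p in proportions[stratum].items():
            shares[cls] = shares.get(cls, 0.0) + n_blocks * p / total_blocks
    return shares


# Two illustrative allocations of 100 blocks (assumed values).
proportional = {"Desert": 70, "Steppe": 28, "Urban": 2}
targeted = {"Desert": 40, "Steppe": 40, "Urban": 20}

prop_shares = class_share_of_sample(proportional, CLASS_PROPORTIONS)
targ_shares = class_share_of_sample(targeted, CLASS_PROPORTIONS)

for cls in sorted(prop_shares):
    print(f"{cls:9s} proportional={prop_shares[cls]:.3f} targeted={targ_shares[cls]:.3f}")
```

With these assumed numbers, moving blocks to the urban stratum raises the urban share of the sample from about 3.8% to about 12.8%, but it also raises the cropland share, which mirrors the one-to-many relationship between strata and classes discussed in the text.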


Figure 11. Percentage of samples in the different MODIS IGBP land-cover classes. The blue bars show the distribution of samples if they were allocated in proportion to the area of the strata. The red bars show the sample distribution according to the final allocation. The proportional and final sample allocations are listed in table 3. Classes shown on the x-axis: Evergreen NL, Evergreen BL, Deciduous NL, Deciduous BL, Mixed, Closed shrubland, Open shrubland, Woody savanna, Savanna, Grassland, Perm wetland, Cropland, Urban, Cropland/veg mos, Snow and ice, Barren.

Figure 12. As figure 11 but for GlobCover. The numbers on the x-axis refer to GlobCover class codes: 11. Irrigated croplands; 14. Rainfed croplands; 20. Mosaic croplands/veg; 30. Mosaic veg/croplands; 40. Closed-open BL evergreen; 50. Closed BL deciduous; 60. Open BL deciduous; 70. Closed NL evergreen; 90. Open NL deciduous or evergreen; 100. Closed to open mixed BL/NL; 110. Mosaic forest–shrub-/grassland; 120. Mosaic grassland/forest–shrubland; 130. Closed-open shrubland; 140. Closed-open grassland; 150. Sparse veg; 160. Closed-open BL forest regularly flooded; 170. Closed BL forest or shrubland perm. flooded; 180. Closed-open veg regularly flooded; 190. Artificial area; 200. Bare areas; 220. Snow/ice.

Figure 13. Final sample distribution.

The native resolution of the final strata map in figure 9 is 5 × 5 km. The land area of the world (Antarctica excluded) is our sampling frame, and every grid cell in the strata map constitutes a potential sample block. All grid cells in the strata map were assigned a unique ID, which was extracted to a list together with the stratum label, and 500 sample blocks were randomly selected from this list according to the sample allocation to strata shown in table 3. The location of the global sample of reference sites is shown in figure 13.

A total of 35 blocks globally were selected from the urban stratum (table 3). These 35 blocks represented the largest increase in sample size among all strata relative to the sample size resulting from proportional allocation. Urban is a relatively rare class, and urban areas are assumed to be difficult to classify because they may vary significantly in complexity and may consist of a range of different land covers. The three urban sample blocks that would have resulted from proportional allocation were considered too few, and the sample size was increased to 35. Mediterranean, temperate evergreen forest and marine west coast were each boosted from 6–8 sample blocks to 25. These strata represent areas with a high degree of heterogeneity and human impact and include Western Europe, the west coast of North America, the Mediterranean and the east coast of Australia. The sample size for ice-covered areas (Frost) was set to zero. The sample size for Desert was decreased the most (not counting Frost), with a final allocation of 20 sample blocks compared with 72 from a proportional allocation. As both Frost and Desert represent large homogeneous barren areas, they are considered easier to classify and hence are less intensively sampled. For the same reason, the sample size for Steppe was decreased by over 50% relative to proportional allocation. The allocation to Populated Desert was decreased by only 17% relative to proportional allocation.

The purpose of the initial sample of 500 blocks is to provide a global baseline that can be augmented by increasing the sample size in targeted strata to tailor the sample to a specific global product. Although the stratification and sample size allocation chosen will not be ideal for any particular land-cover map, the baseline sample provides better coverage of rare classes than would be obtained by forgoing stratification entirely. The sample size allocation to strata used to select the baseline sample can be modified at the sample augmentation stage to increase the sample size for targeted land-cover classes of a specific map (Stehman et al. in press).
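The selection of sample blocks from the list of grid-cell IDs and stratum labels, described earlier in this section, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run for the study: it assumes simple random sampling without replacement within each stratum, the frame of 1000 cells and the allocation are toy values, and `select_blocks` is a hypothetical helper.

```python
import random


def select_blocks(cells, allocation, seed=0):
    """Draw a stratified random sample of grid cells.

    cells:      iterable of (cell_id, stratum_label) pairs (the sampling frame)
    allocation: stratum_label -> number of blocks to select
    Returns stratum_label -> sorted list of selected cell IDs.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group the frame by stratum.
    by_stratum = {}
    for cell_id, stratum in cells:
        by_stratum.setdefault(stratum, []).append(cell_id)

    sample = {}
    for stratum, n_blocks in allocation.items():
        candidates = by_stratum.get(stratum, [])
        if n_blocks > len(candidates):
            raise ValueError(f"stratum {stratum!r} has fewer than {n_blocks} cells")
        # Simple random sampling without replacement within the stratum.
        sample[stratum] = sorted(rng.sample(candidates, n_blocks))
    return sample


# Toy sampling frame: 1000 grid cells in three strata (assumed values).
cells = (
    [(i, "Desert") for i in range(0, 600)]
    + [(i, "Steppe") for i in range(600, 900)]
    + [(i, "Urban") for i in range(900, 1000)]
)
allocation = {"Desert": 4, "Steppe": 6, "Urban": 10}

sample = select_blocks(cells, allocation, seed=42)
for stratum, ids in sample.items():
    print(stratum, ids)
```

Recording the frame size and the sample size of each stratum alongside the selected IDs is useful, because both are needed later to compute inclusion probabilities.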

3. Response design

The response design defines the protocol for determining the ground condition (i.e. ‘reference classification’) at the selected sample sites. The reference data for the sample sites will be based on interpretation, by regional experts, of very high-resolution imagery (e.g. QuickBird imagery pan-sharpened to 0.6 m). The interpretation of the imagery will be done manually on standardized image products. The precise nature of the image products remains to be determined and may include some degree of processing of the data to ease interpretation for the regional experts. We plan to use automated image segmentation to define polygons, so that the interpreter’s task is reduced to labelling each polygon (e.g. following approaches developed at the Joint Research Centre (JRC, Ispra, Italy) as part of the TREES project (Achard et al. 2002)).

A key element of the response design protocol is that the reference maps can be used to assess different land-cover maps. The LCCS (Di Gregorio and Jansen 2000) provides the basis for a consistently applied classification legend. This system allows flexibility to define land-cover classes within a general framework. The labelling of the polygons in the very high-resolution imagery will follow the hierarchical conventions of LCCS, with the expectation that the regional interpreters will be able to provide higher levels of thematic detail than is typically included in the legends of global land-cover maps. A legend based on LCCS has been developed for classifying the very high-resolution imagery and is designed to map objects/segments that have a minimum size of 4 m². With a minimum mapping unit of 4 m², comparison with pixel sizes on the order of 300–500 m (GlobCover and MODIS) will be possible following aggregation of the reference data. The interpretation protocol will also support assessment of Landsat (or similar)-based analyses with larger minimum mapping units.
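The aggregation of fine-scale reference data to the pixel size of a coarser map can be illustrated with a short sketch. It rests on assumptions not stated in the article: the labelled polygons are taken to have been rasterized onto a regular fine grid, the coarse pixel size is an exact multiple of the fine cell size, and the function names and the 4 × 4 toy grid are hypothetical.

```python
from collections import Counter


def aggregate_to_proportions(fine_labels, factor):
    """Aggregate a fine grid of class labels to coarse-cell class proportions.

    fine_labels: list of equal-length rows of class labels
    factor:      number of fine cells per coarse cell along each axis
    Returns a grid (list of rows) of {class -> proportion} dictionaries.
    """
    n_rows, n_cols = len(fine_labels), len(fine_labels[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")

    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            # Count the labels of all fine cells inside this coarse cell.
            counts = Counter(
                fine_labels[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            )
            row.append({cls: k / factor**2 for cls, k in counts.items()})
        coarse.append(row)
    return coarse


def majority_class(proportions):
    """Class with the largest proportion (ties broken alphabetically)."""
    return max(sorted(proportions), key=proportions.get)


# Toy reference classification on a 4 x 4 fine grid (assumed labels).
fine = [
    ["trees", "trees", "water", "water"],
    ["trees", "crop", "water", "crop"],
    ["crop", "crop", "urban", "urban"],
    ["crop", "crop", "urban", "trees"],
]

coarse = aggregate_to_proportions(fine, factor=2)
labels = [[majority_class(cell) for cell in row] for row in coarse]
print(coarse)
print(labels)
```

Keeping the class proportions of each coarse cell, rather than only the majority label, retains the information needed for area-based or soft comparisons with the assessed map.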
The LCCS-based legend distinguishes between a set of 12 required classes and several levels of optional subclasses. The more general required classes allow for consistency across all the sample block classifications and should be identifiable with a single date of very high-resolution imagery. The subclasses allow the analyst to provide additional distinctions of ecology, hydrologic regimes and land use within each primary class label. These additional labels often require some ancillary information including, but not restricted to, multi-date imagery, existing land-use maps or statistics, field photographs and expert knowledge. Within each vegetated class, there are up to five levels of detail that may be provided within the database, including species-level distinctions that are further specified in the scene’s metadata. The non-vegetated land-cover classes have fewer optional subclass levels. Although the additional class distinctions allow for a more rigorous assessment of classification accuracy and uncertainty of global or regional land-cover maps, they will not affect the recommended standard accuracy analysis procedures (§4).

The basic spatial assessment unit of the 5 × 5 km sample block can be viewed as a cluster of pixels for any land-cover map. The response design protocol is constructed to provide the reference data required for a per-pixel assessment of accuracy. This block assessment unit also provides a richer spatial context for understanding accuracy and differences in accuracy among maps than would be available from an assessment based on single, isolated pixels. Obtaining a ‘map’ of the reference classification for the 5 × 5 km blocks affords the opportunity to evaluate the accuracy of landscape pattern metrics (e.g. patch size and shape distributions, fragmentation and edge metrics) calculated from each map and also the land-cover composition accuracy of the maps. Thus, the response design provides the necessary data to extend analyses beyond the basic error matrix description to include the assessment of more general features and patterns of maps (Dungan 2006) and follows the recommendation in the ‘Best Practices’ document (Strahler et al. 2006) that encourages use of advanced methods above and beyond the traditional baseline set of descriptive accuracy measures.

A guiding principle for constructing the response design is that the protocols must be operationally practical and consistently implemented, given that a large number of interpreters dispersed across the globe will likely be involved. Evaluation of the appropriate time-frame of the reference data will be an important feature of the response design so that, if land-cover change occurs within a sample unit, the reference classification will be updated.
Development of the methodology for screening for potential change in the reference data and updating the reference classification is ongoing. The current plan is to assess the degree of change by analysing changes in tasselled cap brightness, greenness and wetness from a time series of Landsat data.

4. Analysis

The global validation database will permit a variety of descriptive and comparative analyses of accuracy. The traditional approach of estimating an error matrix for a per-pixel, site-specific assessment expresses the comparison between the map to be assessed and the reference map in terms of specific parameters such as overall accuracy and user’s and producer’s accuracies. Stehman et al. (in press) present the estimation theory for these error matrix-based analyses and the proposed stratified sampling design. Because the reference data will be obtained for the 5 × 5 km blocks according to the spatial resolution of the assessed map, the area of each land-cover class will be available according to both the map and the reference classifications. These areas represent continuous variables, and therefore, accuracy measures constructed for continuous variables (Ji and Gallo 2006, Pontius and Cheuk 2006, Riemann et al. 2010) may be used to quantify accuracy of the area mapped as each land-cover class.

A special procedure is recommended to compare the spatially aggregated reference classes with the class legend for the map to be analysed, following Ahlqvist (2005). This approach focuses on estimating the semantic similarity between classes of different legends according to specific land-cover dimensions. The method is especially relevant for the reference data that will be aggregated from the object scale to the landscape scale, where each grid cell contains proportions of different objects. By generating a semantic similarity matrix between the two legends, it becomes possible to distinguish between apparent errors caused by class ambiguities (semantics) and actual uncertainty within the map caused by the mapping algorithm, the choice of the classifier and the limitations of the input data to distinguish between the classes.

The purpose of the global validation database is to provide reference data for accuracy assessment that can be used in a variety of analyses. Although we do not propose a standardized analysis protocol for the global validation database, any analysis of these data will need to take into account the stratified sampling design. Specifically, the stratified design has unequal inclusion probabilities for different sample units (e.g. the probability that a 5 × 5 km block from one stratum is included in the sample is not necessarily the same as the probability of including a 5 × 5 km block from a different stratum). These unequal inclusion probabilities must be incorporated in the analysis by appropriate weighting of each sample block in the accuracy estimates. Consequently, the database will include the information allowing computation of these weights (i.e. either the estimation weight or the inclusion probability associated with each sample unit) and also instructions on how to incorporate the weights in the analysis.
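The role of the weights can be shown with a small sketch. It is a generic textbook-style ratio estimator for a stratified sample of blocks, offered as an assumption rather than as the estimator derived in the companion article (Stehman et al. in press); the frame sizes, sample sizes and pixel counts are invented, and the function names are hypothetical. Under simple random sampling within strata, a block in stratum h has inclusion probability n_h / N_h and estimation weight N_h / n_h.

```python
def estimation_weights(frame_sizes, sample_sizes):
    """Weight N_h / n_h per stratum, the inverse of the inclusion probability n_h / N_h.

    Strata with no sampled blocks are skipped; they cannot be represented
    in the estimates.
    """
    return {
        h: frame_sizes[h] / n_h
        for h, n_h in sample_sizes.items()
        if n_h > 0
    }


def weighted_overall_accuracy(blocks, weights):
    """Ratio estimator of overall accuracy from sampled blocks.

    blocks: list of (stratum, correctly_classified_pixels, total_pixels)
    """
    numerator = sum(weights[h] * correct for h, correct, _ in blocks)
    denominator = sum(weights[h] * total for h, _, total in blocks)
    return numerator / denominator


# Toy example (assumed values): a large, easy stratum and a small, hard one,
# each sampled with two blocks, so their inclusion probabilities differ.
frame_sizes = {"Desert": 1000, "Urban": 100}
sample_sizes = {"Desert": 2, "Urban": 2}
blocks = [
    ("Desert", 90, 100),
    ("Desert", 100, 100),
    ("Urban", 60, 100),
    ("Urban", 50, 100),
]

weights = estimation_weights(frame_sizes, sample_sizes)
weighted = weighted_overall_accuracy(blocks, weights)
unweighted = sum(c for _, c, _ in blocks) / sum(t for _, _, t in blocks)
print(f"weighted={weighted:.3f} unweighted={unweighted:.3f}")
```

In this toy case the unweighted figure (0.75) understates the weighted estimate (about 0.91) because the intensively sampled urban stratum is over-represented in the sample relative to its share of the frame.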

5. Discussion and conclusions

Methods for assessing accuracy of land-cover maps have not been developed to address the objective of assessing accuracy of several land-cover maps from the same validation data set. In this article, we developed some of the fundamental structures of the methodology needed to construct a global reference land-cover database that would serve as the basis for validating multiple global and regional land-cover products. A critical feature of this methodology is the use of the LCCS to provide a generally applicable land-cover legend that will allow the reference classification to be used across a broad range of maps. An additional important feature of the response design protocol is that the reference data obtained from the 5 × 5 km sampling unit can be used to assess maps of different resolutions. The 5 × 5 km unit was also chosen because it creates efficiency for reference labelling, as interpreters will be able to spend more time focusing on each block and taking advantage of spatial context information provided with the block than would be possible for a more broadly distributed sample of smaller units.

Another critical feature of the global validation database is that the reference data should be collected from a common underlying sampling design. This feature will allow reference data obtained from different validation efforts to be integrated in a cohesive and statistically rigorous framework. The sampling design must have the capacity to increase the sample size for rare land-cover classes. This capacity is readily achieved by stratified sampling, but the use of the validation data to assess multiple land-cover maps militates against constructing the stratification from any single land-cover map. The stratification based on the Köppen climate/vegetation classes and population density is independent of any existing land-cover map. By judiciously choosing the sample size allocation to strata when selecting the global baseline sample of 500 blocks, we demonstrated that this stratification was effective for the purpose of increasing the sample size from targeted rare classes of different land-cover maps.

Creating a global validation database that can be used to assess a variety of global and regional land-cover products presents many challenges not typically encountered when designing an assessment for a single map. In this article, we have provided an overview of the three components of a successful methodology (sampling design, response design and analysis) and addressed one of the key challenges of the sampling design, which is to develop a stratification that is independent of any specific land-cover map yet effective for increasing the sample size from rare land-cover classes. The extension of the sampling and estimation protocols to assess the accuracy of a specific land-cover map is developed in the companion article (Stehman et al. in press), which also discusses the option of combining the baseline global validation sample with another probability sample to provide an opportunity to take advantage of existing reference data. Although the response design and analysis protocols for the global land-cover validation data set will initially follow common best practice recommendations (Strahler et al. 2006), research is in progress to develop and test improved response design and analysis protocols. Constructing a broadly applicable land-cover validation data set presents new challenges for all three components of accuracy assessment protocol, and we expect new methods to arise as researchers address these challenges.

References

ACHARD, F., STIBIG, H.J., EVA, H. and MAYAUX, P., 2002, Tropical forest cover monitoring in the humid tropics – TREES project. Tropical Ecology, 43, pp. 9–20.
AHLQVIST, O., 2005, Using uncertain conceptual spaces to translate between land cover categories. International Journal of Geographical Information Science, 19, pp. 831–857.
BICHERON, P., DEFOURNY, P., BROCKMANN, C., SCHOUTEN, L., VANCUTSEM, C., HUC, M., BONTEMPS, S., LEROY, M., ACHARD, F., HEROLD, M., RANERA, F. and ARINO, O., 2008, GLOBCOVER Products Report Description and Validation (Toulouse: MEDIAS-France).
CIESIN, 2005, Gridded Population of the World Version 3 (GPWv3): Population Density Grids. Center for International Earth Science Information Network (CIESIN), Columbia University; and Centro Internacional de Agricultura Tropical (CIAT) (Palisades, NY: Socioeconomic Data and Applications Center (SEDAC), Columbia University). Available online at: http://sedac.ciesin.columbia.edu/gpw (accessed May 2009).
CONGALTON, R.G. and GREEN, K., 1999, Assessing the Accuracy of Remotely Sensed Data: Principles and Practices (Boca Raton, FL: CRC Press).
DI GREGORIO, A. and JANSEN, L.J.M., 2000, Land Cover Classification System (LCCS): Classification Concepts and User Manual (Rome: United Nations’ Food and Agriculture Organization).
DUNGAN, J.L., 2006, Focusing on feature-based differences in map comparison. Journal of Geographical Systems, 8, pp. 131–143.
FRIEDL, M., MCIVER, D., HODGES, J., ZHANG, X., MUCHONEY, D., STRAHLER, A.H., WOODCOCK, C., GOPAL, S., SCHNEIDER, A., COOPER, A., BACCINI, A., GAO, F. and SCHAAF, C., 2002, Global land cover mapping from MODIS: algorithms and early results. Remote Sensing of Environment, 83, pp. 287–302.
FRIEDL, M.A., SULLA-MENASHE, D., TAN, B., SCHNEIDER, A., RAMANKUTTY, N. and SIBLEY, A., 2010, MODIS Collection 5 Global Land Cover: algorithm refinements and characterization of new datasets. Remote Sensing of Environment, 114, pp. 168–182.
HEROLD, M., WOODCOCK, C.E., DI GREGORIO, A., MAYAUX, P., BELWARD, A.S., LATHAM, J. and SCHMULLIUS, C.C., 2006, A joint initiative for harmonization and validation of land cover datasets. IEEE Transactions on Geoscience and Remote Sensing, 44, pp. 1719–1727.
JI, L. and GALLO, K., 2006, The agreement coefficient for image comparison. Photogrammetric Engineering and Remote Sensing, 72, pp. 823–833.


JUNG, M., HENKEL, K., HEROLD, M. and CHURKINA, G., 2006, Exploiting synergies of global land cover products for carbon cycle modeling. Remote Sensing of Environment, 101, pp. 534–553.
LEDWITH, M., 2000, The Land Cover Map for North East Europe in the Year 2000. M. Ledwith. GLC2000 database, European Commission Joint Research Centre, 2003. Available online at: http://bioval.jrc.ec.europa.eu/products/glc2000/glc2000.php
LOVELAND, T., REED, B., BROWN, J., OHLEN, D., ZHU, Z., YANG, L. and MERCHANT, J., 2000, Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data. International Journal of Remote Sensing, 21, pp. 1303–1330.
MAYAUX, P., EVA, H., GALLEGO, J., STRAHLER, A.H., HEROLD, M., AGRAWAL, S., NAUMOV, S., DE MIRANDA, E.E., DI BELLA, C.M., ORDOYNE, C., KOPIN, Y. and ROY, P.S., 2006, Validation of the global land cover 2000 map. IEEE Transactions on Geoscience and Remote Sensing, 44, pp. 1728–1739.
PEEL, M.C., FINLAYSON, B.L. and MCMAHON, T.A., 2007, Updated world map of the Köppen–Geiger climate classification. Hydrology and Earth System Sciences, 11, pp. 1633–1644.
PONTIUS, R.G. and CHEUK, C.M., 2006, A generalized cross-tabulation matrix to compare soft-classified maps at multiple resolutions. International Journal of Geographical Information Science, 20, pp. 1–30.
RIEMANN, R., WILSON, B.T., LISTER, A. and PARKS, S., 2010, An effective assessment protocol for continuous geospatial datasets of forest characteristics using USFS Forest Inventory and Analysis (FIA) data. Remote Sensing of Environment, 114, pp. 2337–2352.
SCEPAN, J., 1999, Thematic validation of high-resolution global land-cover data sets. Photogrammetric Engineering and Remote Sensing, 65, pp. 1051–1060.
STEHMAN, S.V. and CZAPLEWSKI, R.L., 1998, Design and analysis of thematic map accuracy assessment: fundamental principles. Remote Sensing of Environment, 64, pp. 331–344.
STEHMAN, S.V., OLOFSSON, P., WOODCOCK, C.E., HEROLD, M. and FRIEDL, M.A., in press, A global land cover validation dataset, part II: augmenting a stratified sampling design to estimate accuracy by region and land-cover class. International Journal of Remote Sensing.
STRAHLER, A.H., BOSCHETTI, L., FOODY, G.M., FRIEDL, M.A., HANSEN, M.C., HEROLD, M., MAYAUX, P., MORISETTE, J.T., STEHMAN, S.V. and WOODCOCK, C.E., 2006, Global Land Cover Validation: Recommendations For Evaluation And Accuracy Assessment of Global Land Cover Maps. EUR 22156 EN – DG (Luxembourg: Office for Official Publications of the European Communities), 48 pp.
STRAHLER, A.H. and STRAHLER, A.N., 2004, Physical Geography: Science & Systems of the Human Environment, 3rd ed. (New York: Wiley).
TREWARTHA, G.T., 1968, An Introduction to Climate, 4th ed. (New York: McGraw-Hill).
