This study uses the major GLC products derived from satellite data for three epochs (2000–2005, 2008–2010, and 2013–2015): the GLC2000 dataset from the European Commission's Joint Research Centre, the land cover dataset based on Moderate Resolution Imaging Spectroradiometer (MODIS) data, the GLC map (GlobCover) from the European Space Agency (ESA), the GLC by National Mapping Organizations (GLCNMO) dataset from the International Steering Committee for Global Mapping, and the Climate Change Initiative Land Cover dataset (CCI-LC) from ESA. These products were developed by different national or international initiatives for different scientific purposes and have diverse classification characteristics. They are widely used as a baseline for GLC conditions and dynamics over the past decades. The characteristics of these products are summarized in Table 1.
| GLC product | GLC2000 | MODIS2001/2010/2015 | GlobCover2005/2009 | GLCNMO2003/2008/2013 | CCI-LC2001/2010/2015 |
| --- | --- | --- | --- | --- | --- |
| Satellite sensor | SPOT VGT | MODIS Terra and Aqua | Envisat MERIS | MODIS Terra and Aqua | Envisat MERIS, SPOT VGT |
| Input data | Daily mosaic of 4 spectral bands | 16-day nadir BRDF adjusted reflectance (1–7), EVI, LST | MERIS: bi-monthly reflectance from 10-day composites | 16-day nadir BRDF adjusted reflectance (1–7), NDVI, DMSP-OLS, Landsat ETM+ | MERIS: 7-day composite reflectance, SPOT VGT time series data |
| Spatial resolution | 1 km | 500 m | 300 m | 1 km/500 m | 300 m |
| Time of data collection | 1999.11–2000.12 | Calendar year 2001/2010/2015 | 2004.12–2006.6; calendar year 2009 | Calendar year 2003/2008/2013 | Calendar year 2001/2010/2015 |
| Classification scheme | LCCS (22 classes) | Multi-scheme including the IGBP (17 classes) | LCCS (22 classes) | LCCS (20 classes) | LCCS (22 classes) |
| Classification method | General unsupervised, depending on the partner | Supervised decision tree boosting | Unsupervised/supervised spatio-temporal clustering, expert-based labeling | Supervised decision tree | Combination of supervised and unsupervised; machine learning classification |
| Overall accuracy (%) | Globally 68.6 | Globally 74.8 | Globally 73.1/67.5 | Globally 81.2/82.6 | Globally 73.2 |
| Reference | Bartholomé and Belward (2005), Mayaux et al. (2006) | Friedl et al. (2010) | Bontemps et al. (2011) and Defourny et al. (2011) | Tateishi et al. (2011, 2014) | Defourny et al. (2016) |

Notes: SPOT VGT: Systeme Probatoire d'Observation de la Terre Vegetation; BRDF: Bidirectional Reflectance Distribution Function; EVI: Enhanced Vegetation Index; LST: Land Surface Temperature; IGBP: International Geosphere–Biosphere Programme; MERIS: Medium Resolution Imaging Spectrometer; NDVI: Normalized Difference Vegetation Index; DMSP-OLS: Defense Meteorological Satellite Program Operational Linescan System; ETM+: Enhanced Thematic Mapper Plus.
Table 1. Description of major GLC products
We clipped all the GLC products to their common global extent, then reprojected and co-registered them to the same projection and spatial resolution to reduce geo-registration errors. Because the native resolution of the products ranges from 300 m to 1 km, the target resolution differs among the three epochs: the products are resampled to 1 km for 2000–2005 and to 500 m for both 2008–2010 and 2013–2015, using the nearest neighbor rule.
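The nearest neighbor rule suits categorical land cover rasters because it only copies existing class codes and never averages them. A minimal sketch is given below, assuming the source and target grids cover exactly the same extent; the function name `resample_nearest` and the toy array are hypothetical and are not part of the processing chain used in this study.

```python
import numpy as np

def resample_nearest(src, out_shape):
    """Nearest neighbor resampling of a categorical raster.

    Assumption: source and target grids share the same extent, so each
    target cell center falls inside exactly one source cell.
    """
    src = np.asarray(src)
    # Position of each target cell center, expressed in source cell units.
    rows = np.floor((np.arange(out_shape[0]) + 0.5) * src.shape[0] / out_shape[0]).astype(int)
    cols = np.floor((np.arange(out_shape[1]) + 0.5) * src.shape[1] / out_shape[1]).astype(int)
    # Copy the class code of the source cell that contains each target center.
    return src[np.ix_(rows, cols)]

# Toy example: a 4 x 4 grid coarsened to 2 x 2.
fine = np.array([[1, 1, 2, 2],
                 [1, 1, 2, 2],
                 [3, 3, 4, 4],
                 [3, 3, 4, 4]])
coarse = resample_nearest(fine, (2, 2))
```

Because class codes are only copied, the output never contains a code that is absent from the input, which is the property that makes this rule preferable to bilinear or cubic resampling for thematic maps.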
These GLC products have different classification schemes (Table 1). To quantify the uncertainty among these products for climate models, the legend of each GLC product is translated into a common scheme with 14 land cover classes (Table 2), using the Land Cover Classification System (LCCS) legend translation protocols developed by Herold et al. (2008).
| Common scheme | GLC2000 | MODIS | GlobCover | GLCNMO | CCI-LC |
| --- | --- | --- | --- | --- | --- |
| 1. Evergreen needleleaf forest | 4 | 1 | 70 | 3 | 70, 71, 72 |
| 2. Evergreen broadleaf forest | 1 | 2 | 40 | 1 | 50 |
| 3. Deciduous needleleaf forest | 5 | 3 | 90 | 4 | 80, 81, 82 |
| 4. Deciduous broadleaf forest | 2, 3 | 4 | 50, 60 | 2 | 60, 61, 62 |
| 5. Mixed forest | 6, 9, 10 | 5, 8 | 100, 110 | 5, 6 | 90, 100 |
| 6. Shrublands | 11, 12 | 6, 7 | 130 | 7 | 120, 121, 122 |
| 7. Grassland | 13 | 9, 10 | 120, 140 | 8, 9 | 110, 130, 140 |
| 8. Croplands | 16 | 12 | 11, 14 | 11, 12 | 10, 11, 12, 20 |
| 9. Permanent wetlands | 7, 8, 15 | 11 | 160, 170, 180 | 14, 15 | 160, 170, 180 |
| 10. Urban and built-up | 22 | 13 | 190 | 18 | 190 |
| 11. Cropland/natural vegetation mosaic | 17, 18 | 14 | 20, 30 | 13 | 30, 40 |
| 12. Snow and ice | 21 | 15 | 220 | 19 | 220 |
| 13. Barren or sparsely vegetated | 14, 19 | 16 | 150, 200 | 10, 16, 17 | 150, 151, 152, 153, 200, 201, 202 |
| 14. Inland water | 20 | 0 | 210 | 20 | 210 |
Table 2. Translation of individual legend into a 14-class common scheme. Legend descriptions corresponding to the GLC2000, MODIS, GlobCover, GLCNMO, and CCI-LC can be found in the supplementary tables
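The legend translation in Table 2 amounts to a lookup from native class codes to common class numbers. The sketch below illustrates this for the MODIS column only, with the codes copied from Table 2. The names `COMMON_TO_MODIS`, `build_lookup`, and `translate`, as well as the use of 0 for unmapped codes, are assumptions made for illustration.

```python
import numpy as np

# Common class number -> native MODIS codes, copied from the MODIS column of Table 2.
COMMON_TO_MODIS = {
    1: [1], 2: [2], 3: [3], 4: [4], 5: [5, 8], 6: [6, 7], 7: [9, 10],
    8: [12], 9: [11], 10: [13], 11: [14], 12: [15], 13: [16], 14: [0],
}

def build_lookup(common_to_native, size=256, unmapped=0):
    """Build a lookup array indexed by native code (assumes codes < size).

    Assumption: 0 marks native codes that have no entry in the table;
    common classes are numbered 1-14, so 0 is otherwise unused.
    """
    lut = np.full(size, unmapped, dtype=np.uint8)
    for common, natives in common_to_native.items():
        for code in natives:
            lut[code] = common
    return lut

def translate(raster, lut):
    """Translate an integer raster of native codes into the common scheme."""
    return lut[np.asarray(raster)]

modis_lut = build_lookup(COMMON_TO_MODIS)
toy_modis = np.array([[0, 8], [12, 16]])
toy_common = translate(toy_modis, modis_lut)
```

The same two functions would apply to the other products once their columns of Table 2 are entered as dictionaries.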
Several existing GLC reference datasets are used in this paper, including the consolidated GLC2000 reference dataset (Schultz et al., 2015), the GLCNMO2008 training dataset (Tateishi et al., 2011), the consolidated GlobCover2005 reference dataset (Bicheron et al., 2008), the System for Terrestrial Ecosystem Parameterization (STEP) reference dataset (Friedl et al., 2010), the Visible Infrared Imaging Radiometer Suite (VIIRS) Surface Type validation database (Olofsson et al., 2012), the GLC Ground Truth (GLCGT) database, the Global Forest Resources Assessment (Global FRA) reference (Potapov et al., 2011), Geo-Wiki validations (Fritz et al., 2009), a common validation sample set for the Finer Resolution Observation and Monitoring (FROM) of GLC project (Zhao et al., 2014), and Global Flux sites. These datasets were generated by different international organizations through enormous interpretation efforts. All of them are publicly accessible through the Global Observation of Forest Cover and Land Dynamics (GOFC-GOLD) reference data portal, Geo-Wiki, and the Internet, and can be freely used by any researcher. The time periods of these reference datasets differ, but errors caused by land cover change over time can be neglected compared to misclassification errors, because most of the reference points and scientific sites are located in homogeneous and stable areas. The land cover class at the centroid of each sample site is taken as the reference land cover class. The legends of all the reference datasets mentioned above are translated into the 14-class common scheme listed in Table 2. In total, 17,361 reference sites across the globe are used in this study. The spatial distribution and size of each reference dataset, as well as the reference land cover classes, are shown in Fig. 1.
In this study, both area and spatial comparisons are applied. The area comparison shows the differences in the total area of each land cover class in the common scheme among the GLC products. The spatial comparison is a per-pixel comparison among GLC products; the overall spatial agreement (Ao) and the class-specific spatial agreement (As) are calculated to quantify the differences in spatial patterns, using the following equations.
where Xi is the number of pixels of class i in GLC product X, and Yi is the number of pixels of class i in GLC product Y. XYii is the number of pixels of class i in both GLC products X and Y. N is the number of land cover classes (14 in this study), which is used to calculate Ao, and M is the total number of pixels in the entire global area.
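The quantities defined above can be computed directly from two co-registered class rasters. The sketch below is illustrative only. It assumes that Ao is the summed per-class agreement XYii divided by M, and that As for class i takes the form 2·XYii/(Xi + Yi), both expressed in percent. The exact form of As is an assumption here and should be checked against the equations of this section; the function name `spatial_agreement` is hypothetical.

```python
import numpy as np

def spatial_agreement(x, y, n_classes=14):
    """Overall (Ao) and class-specific (As) agreement between two class maps.

    Assumed forms (not confirmed by the text):
      Ao   = 100 * sum_i XYii / M
      As_i = 100 * 2 * XYii / (Xi + Yi)
    """
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    m = x.size                                   # M: total number of pixels
    classes = np.arange(1, n_classes + 1)        # common classes 1..N
    xi = np.array([(x == c).sum() for c in classes])
    yi = np.array([(y == c).sum() for c in classes])
    xyii = np.array([((x == c) & (y == c)).sum() for c in classes])
    ao = 100.0 * xyii.sum() / m
    denom = xi + yi
    # Classes absent from both maps have no defined agreement (NaN).
    as_ = np.where(denom > 0, 200.0 * xyii / np.maximum(denom, 1), np.nan)
    return ao, as_

# Toy example with two classes and four pixels.
ao, as_ = spatial_agreement([1, 1, 2, 2], [1, 2, 2, 2], n_classes=2)
```

In the toy example three of four pixels agree, so Ao is 75%; class 2 agrees on two pixels out of 2 + 3 labeled pixels, which gives 80% under the assumed form of As.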
The thematic similarity is defined as the affinity between different land cover classes based on a set of common land cover attributes from the LCCS translation in previous studies (Ahlqvist, 2005; Neumann et al., 2007; Fritz and See, 2008; Pérez-Hoyos et al., 2012). It allows us to examine the agreement between land cover products that have different classification schemes. The overlap property is first computed for each categorical or continuous attribute separately, and the values for all common attributes are then combined to yield an overlap metric, ranging from 0 (unrelated classes) to 1 (coincidence of all attributes). The thematic similarity indicates the degree of agreement between two land cover classes, and the thematic uncertainty represents the discrepancies in land cover definitions and positional errors. The thematic uncertainty of GLC products is obtained through the following steps.
Step 1: Following the method described by Pérez-Hoyos et al. (2012), the land cover classes of the GLC products are translated into LCCS, a set of land cover attributes is defined, and the overlap property is calculated for each common attribute. Then, the overlap property values for all common attributes are aggregated into a single quantity by using the following equations to indicate the thematic similarity for any pair of GLC products.
The overlap metric is a similarity score of the attributes that two land cover classes have in common, ranging from 0 (total disagreement between the two land cover classes) to 1 (the two land cover classes are considered to be identical). Oij is the overlap metric, Wk represents the weight of each common attribute (each attribute is assumed to have the same contribution to the overlap metric), and Ok(Ci, Xj) is the overlap property for each common attribute between classes i and j.
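Given the definitions above, the overlap metric can be read as a weighted sum of the per-attribute overlap properties. A minimal sketch under that reading follows. It assumes the per-attribute values Ok are already available and lie in [0, 1], and it uses equal weights that sum to 1 by default; the function name `overlap_metric` and the example values are hypothetical.

```python
import numpy as np

def overlap_metric(attribute_overlaps, weights=None):
    """Weighted sum of per-attribute overlap properties O_k for one class pair.

    Assumptions: each O_k is in [0, 1]; weights W_k sum to 1; equal
    weights are used when none are supplied.
    """
    o = np.asarray(attribute_overlaps, dtype=float)
    if np.any((o < 0) | (o > 1)):
        raise ValueError("overlap properties must lie in [0, 1]")
    if weights is None:
        w = np.full(o.size, 1.0 / o.size)
    else:
        w = np.asarray(weights, dtype=float)
        if w.shape != o.shape or not np.isclose(w.sum(), 1.0):
            raise ValueError("weights must match the attributes and sum to 1")
    return float(np.dot(w, o))

# Hypothetical pair of classes compared on three attributes:
# identical on the first, half overlapping on the second, disjoint on the third.
o_ij = overlap_metric([1.0, 0.5, 0.0])
```

With equal weights the result is the plain mean of the attribute overlaps, so the score stays within the stated range of 0 to 1.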
Step 2: The thematic uncertainty index (TUI) is defined as below.
where ats is the accuracy, i.e., the difference between the true and mean thematic similarities, and pts is the precision, i.e., the standard deviation (STD) of the thematic similarities. Truets is set to 1. For a given pixel, if all attributes of two land cover classes are the same, the thematic similarity is 1; otherwise, it is less than 1. The thematic similarity is computed for every pair of GLC products, so each pixel has several pairwise thematic similarities. For example, four GLC products yield six pair-combinations, and thus six thematic similarities are obtained for a given pixel. Meants and STDts are the mean and STD of these six thematic similarities for a given pixel in the same epoch, respectively.
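The pairwise bookkeeping described above can be sketched as follows for a single pixel. The sketch returns only the two components (ats and pts) and does not combine them into TUI. It simplifies by indexing one similarity matrix with common class numbers, and it uses the population STD; both are assumptions. The function name `thematic_components` and the toy similarity matrix are hypothetical.

```python
from itertools import combinations
import numpy as np

def thematic_components(labels, similarity, true_ts=1.0):
    """Pairwise thematic similarities at one pixel, with accuracy and precision.

    labels: class index assigned to the pixel by each GLC product.
    similarity: square matrix of class-to-class thematic similarity
                (assumed symmetric, with 1 on the diagonal).
    """
    pairs = list(combinations(range(len(labels)), 2))   # 4 products -> 6 pairs
    sims = np.array([similarity[labels[a]][labels[b]] for a, b in pairs])
    a_ts = true_ts - sims.mean()   # accuracy: distance of the mean from Truets
    p_ts = sims.std()              # precision: population STD (assumption)
    return sims, a_ts, p_ts

# Toy similarity matrix for three classes (values are illustrative only).
toy_similarity = np.array([[1.0, 0.5, 0.0],
                           [0.5, 1.0, 0.2],
                           [0.0, 0.2, 1.0]])
sims, a_ts, p_ts = thematic_components([0, 0, 1, 2], toy_similarity)
```

When all products assign the same class, every pairwise similarity equals 1, so both components are 0 and the pixel carries no thematic uncertainty.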
The accuracy of GLC products is often expressed as a single global accuracy, which gives no information about the spatial variation in classification accuracy. The local classification accuracy denotes the local probability that a specific GLC product is correct. Previous studies have modeled it from large GLC reference datasets to assess the local classification accuracy of global-scale products (See et al., 2015; Tsendbazar et al., 2015; Comber et al., 2017). We followed the method elaborated by Tsendbazar et al. (2015) to calculate the local classification accuracy in this work. The local classification accuracy uncertainty of GLC products is calculated through the following two steps.
Step 1: Correspondences between GLC products and reference data are indicator-coded. If the land cover class of a reference site matches that of a GLC product, an indicator code of 1 is assigned to the reference site; otherwise, an indicator code of 0 is assigned. Then, the spatial autocorrelation of the indicator-coded data is analyzed by using indicator semivariograms. More detailed information can be found in Tsendbazar et al. (2015). Semivariogram examples of the spatial correspondence for GLC in 2008–2010, together with the fitted models used for the indicator kriging, are shown in Fig. 2. Finally, the local classification accuracy is generated by indicator kriging for each GLC product. The local classification accuracy ranges from 0 to 1 and denotes the local probability that a specific GLC product is correct. Here, a correct land cover class has a value of 1, while a probably wrong land cover class has a value of less than 1.
Figure 2. Semivariogram examples of the spatial correspondence and fitted models for (a) MODIS, (b) GlobCover, (c) GLCNMO, and (d) CCI-LC in 2008–2010 (model parameters: partial sills, range, and nugget).
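The first part of Step 1 (indicator coding and the empirical indicator semivariogram) can be sketched as follows. This is not the implementation of Tsendbazar et al. (2015), and the model fitting and indicator kriging are omitted. The sketch assumes planar site coordinates with Euclidean distances, which a global analysis would have to replace with an appropriate distance measure; the function names and the toy sites are hypothetical.

```python
import numpy as np

def indicator_code(reference_classes, product_classes):
    """1 where the product class matches the reference class, else 0."""
    return (np.asarray(reference_classes) == np.asarray(product_classes)).astype(int)

def empirical_semivariogram(coords, values, bin_edges):
    """Empirical semivariogram: gamma(h) = mean((z_i - z_j)^2) / 2 per lag bin.

    Assumption: coords are planar (x, y) positions and Euclidean distance
    is adequate. Bins without any site pair are returned as NaN.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # every unordered site pair
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        mask = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if mask.any():
            gamma[b] = 0.5 * sqdiff[mask].mean()
    return gamma

# Three toy reference sites on a line; the product is wrong at the last one.
z = indicator_code([1, 2, 3], [1, 2, 4])
gamma = empirical_semivariogram([(0, 0), (1, 0), (2, 0)], z, [0.5, 1.5, 2.5])
```

A model (with nugget, partial sill, and range) would then be fitted to such empirical values before kriging the indicator data.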
Step 2: The local classification accuracy uncertainty index (AUI) is defined as follows.
where Truela is set as 1, and Meanla and STDla are the mean value and STD of the local classification accuracy among all GLC products in the same epoch, respectively.
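The per-pixel statistics named above can be computed from a stack of local classification accuracy maps, one per GLC product. The sketch below returns the deviation of Meanla from Truela and STDla, and does not combine them into AUI. Treating the deviation as Truela minus Meanla and using the population STD are assumptions; the function name `local_accuracy_components` and the toy stack are hypothetical.

```python
import numpy as np

def local_accuracy_components(accuracy_stack, true_la=1.0):
    """Per-pixel deviation from Truela and STD across GLC products.

    accuracy_stack: array of shape (n_products, rows, cols) holding the
    local classification accuracy (0-1) of each product in one epoch.
    """
    stack = np.asarray(accuracy_stack, dtype=float)
    mean_la = stack.mean(axis=0)   # Meanla: mean across products
    std_la = stack.std(axis=0)     # STDla: population STD (assumption)
    return true_la - mean_la, std_la

# Two toy products on a 1 x 2 grid.
toy_stack = np.array([[[1.0, 0.5]],
                      [[1.0, 0.7]]])
dev_la, std_la = local_accuracy_components(toy_stack)
```

In the toy stack the first pixel is fully accurate in both products, so both components are 0 there, whereas the second pixel has a mean accuracy of 0.6 and a spread of 0.1.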
In this study, an integrated uncertainty index (IUI), which combines the thematic uncertainty and the local classification accuracy uncertainty, is defined as below.
where IUI is the GLC integrated uncertainty index, TUI is the thematic uncertainty index, and AUI is the local classification accuracy uncertainty index. Wi and Wj denote the weights of TUI and AUI, respectively. In this study, identical weights are assigned to TUI and AUI (Wi and Wj both equal 0.5), but different weights can be applied to reflect the relative importance of the two uncertainty indexes for particular land cover characteristics.
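Read together with the weights defined above, IUI can be sketched as a weighted linear combination of the two index maps. The linear form and the requirement that the weights sum to 1 are assumptions consistent with the description, not a confirmed reproduction of the equation; the function name `integrated_uncertainty` and the toy values are hypothetical.

```python
import numpy as np

def integrated_uncertainty(tui, aui, w_tui=0.5, w_aui=0.5):
    """Weighted combination of TUI and AUI (assumed linear form).

    Assumption: IUI = Wi * TUI + Wj * AUI with Wi + Wj = 1.
    """
    if not np.isclose(w_tui + w_aui, 1.0):
        raise ValueError("weights are assumed to sum to 1")
    return w_tui * np.asarray(tui, dtype=float) + w_aui * np.asarray(aui, dtype=float)

# Toy index values for two pixels, combined with the equal weights used in this study.
iui_equal = integrated_uncertainty([0.2, 0.4], [0.6, 0.0])
# The same pixels with more weight on the thematic uncertainty.
iui_thematic = integrated_uncertainty([0.2, 0.4], [0.6, 0.0], w_tui=0.75, w_aui=0.25)
```

Changing the weights shifts the ranking of pixels: with equal weights the first pixel is the more uncertain one, whereas weighting TUI more heavily narrows the gap because its uncertainty comes mainly from AUI.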