Development and Evaluation of Hourly and Kilometer Resolution Retrospective and Real-Time Surface Meteorological Blended Forcing Dataset (SMBFD) in China

  • Corresponding author: Chunxiang SHI, shicx@cma.gov.cn
  • Funds:

    Supported by the National Key Research and Development Project (2018YFC1506601), National Science Foundation Key Research Project (91437220), and National Public Welfare Special Project (GYHY201306045 and GYHY201506002)

  • doi: 10.1007/s13351-019-9042-9


  • Abstract: A real-time, long-term surface meteorological blended forcing dataset (SMBFD) has been developed for China based on station observations, satellite retrievals, and reanalysis products. The observations are collected at national and regional automatic weather stations, the satellite data are retrievals from the Fengyun (FY) series satellites, and the reanalysis products are obtained from the ECMWF. The 90-m resolution digital terrain elevation data over China, obtained from the Shuttle Radar Topography Mission (SRTM), are used for elevation adjustment of temperature and humidity. The dataset includes 2-m air temperature and humidity, 10-m zonal and meridional winds, downward shortwave radiation, surface pressure, and precipitation. The spatial resolution is 1 km, and the temporal resolution is 1 h. Data processing employs several data fusion techniques, including the space–time multiscale variational analysis, the discrete ordinates radiative transfer (DISORT) model, the hybrid radiation estimation model, and a terrain correction algorithm. Dependent and independent evaluations of the dataset are performed against observations. The SMBFD is also compared with similar datasets produced at other major meteorological operational centers around the world. The results are as follows. (1) All variables show reasonable geographic distributions and realistic spatial and temporal variations. (2) Both dependent and independent evaluations indicate that the gridded SMBFD is close to the observations, with the dependent evaluation yielding better results than the independent evaluation. (3) Compared with similar datasets produced at other meteorological operational centers, the real-time and retrospective surface meteorological fusion data are of higher quality.
The dataset introduced in the present study is generally stable and accurate, and can be applied in fields such as meteorology, agriculture, ecology, and environmental protection. It has also been used as the atmospheric forcing to drive the operational High-resolution Land Data Assimilation System of the China Meteorological Administration. The dataset is stored in the network Common Data Form (NETCDF) format, which can be decoded by various programming languages, and it is freely available to non-commercial users.
  • Fig. 1.  Spatial distribution map of elevation in China (m).

    Fig. 2.  Distributions of (a) the national automatic weather stations (more than 2380) and (b) the regional weather stations (about 38,000) in China.

    Fig. 3.  Spatial distributions of the monthly mean (a1–d1) 2-m temperature (°C), (a2–d2) surface pressure (hPa), and (a3–d3) humidity (g kg−1); and (e1–h1) 10-m wind (m s−1), (e2–h2) incident solar radiation (W m−2), and (e3–h3) precipitation (mm) for (a1–a3 and e1–e3) January, (b1–b3 and f1–f3) April, (c1–c3 and g1–g3) July, and (d1–d3 and h1–h3) October 2015 from the blended forcing dataset


    Fig. 4.  Frequency distributions of the (a, d) Bias (°C), (b, e) RMSE (°C), and (c, f) correlation coefficient (COR) from (a–c) independent and (d–f) dependent evaluations of 2-m air temperature for May–October 2015.

    Fig. 5.  As in Fig. 4, but for 2-m humidity during May–October 2015.

    Fig. 6.  As in Fig. 4, but for 10-m wind speed during May–October 2015.

    Fig. 7.  As in Fig. 4, but for surface pressure during May–October 2015.

    Fig. 8.  Scatter plot of incident solar radiation between the fusion dataset (SWDN; W m−2) and the station observations (OBS; W m−2) for January–December 2015.

    Fig. 9.  Frequency distributions of (a) RMSE (W m−2) and (b) COR from the evaluation of incident solar radiation for January–December 2015.

    Fig. 10.  Spatial distribution of mean Bias (W m−2) from the evaluation of incident solar radiation for January–December 2015.

    Fig. 11.  Frequency distributions of (a) Bias (mm), (b) RMSE (mm), and (c) COR from the EMSIP and CMORPH data merged independent and dependent evaluations of precipitation for January–December 2015.

    Fig. 12.  Comparison of (a) COR, (b) RMSE (°C), and (c) Bias (°C) from SMBFD (red color) and ERA5 (blue color) for evaluations of 2-m air temperature for 1–8 June 2015.

    [1] Chen, F., X. C. Yang, C. X. Li, et al., 2019: Establishment and assessment of hourly high-resolution gridded air temperature data sets in Zhejiang, China. Meteorological Applications, 26, 396–408. doi: 10.1002/met.1770.
    [2] Dee, D. P., S. M. Uppala, A. J. Simmons, et al., 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597. doi: 10.1002/qj.828.
    [3] Fan, Y., and H. van den Dool, 2008: A global monthly land surface air temperature analysis for 1948–present. J. Geophys. Res. Atmos., 113, D01103. doi: 10.1029/2007JD008470.
    [4] Feng, J. M., T. B. Zhao, and Y. J. Zhang, 2004: Intercomparison of spatial interpolation based on observed precipitation data. Climatic Environ. Res., 9, 261–277. doi: 10.3878/j.issn.1006-9585.2004.02.04. (in Chinese)
    [5] He, J., and K. Yang, 2011: China Meteorological Forcing Dataset. Cold and Arid Regions Science Data Center at Lanzhou. doi: 10.3972/westdc.002.2014.db. (in Chinese)
    [6] Hersbach, H., 2016: The ERA5 atmospheric reanalysis. AGU Fall Meeting Abstracts, American Geophysical Union, Washington, DC, USA.
    [7] Jia, B. H., Z. H. Xie, A. G. Dai, et al., 2013: Evaluation of satellite and reanalysis products of downward surface solar radiation over East Asia: Spatial and seasonal variations. J. Geophys. Res. Atmos., 118, 3431–3446. doi: 10.1002/jgrd.50353.
    [8] Joyce, R. J., J. E. Janowiak, P. A. Arkin, et al., 2004: CMORPH: A method that produces global precipitation estimates from passive microwave and infrared data at high spatial and temporal resolution. J. Hydrometeor., 5, 487–503. doi: 10.1175/1525-7541(2004)005<0487:CAMTPG>2.0.CO;2.
    [9] Kanamitsu, M., W. Ebisuzaki, J. Woollen, et al., 2002: NCEP-DOE AMIP-II reanalysis (R-2). Bull. Amer. Meteor. Soc., 83, 1631–1644. doi: 10.1175/BAMS-83-11-1631.
    [10] Kunkel, K. E., 1989: Simple procedures for extrapolation of humidity variables in the mountainous western United States. J. Climate, 2, 656–669. doi: 10.1175/1520-0442(1989)002<0656:SPFEOH>2.0.CO;2.
    [11] Li, Y. H., C. L. Zhao, T. J. Zhang, et al., 2018: Impacts of land-use data on the simulation of surface air temperature in northwest China. J. Meteor. Res., 32, 896–908. doi: 10.1007/s13351-018-7151-5.
    [12] Luo, Z., G. Wahba, and D. R. Johnson, 1998: Spatial–temporal analysis of temperature using smoothing spline ANOVA. J. Climate, 11, 18–28. doi: 10.1175/1520-0442(1998)011<0018:STAOTU>2.0.CO;2.
    [13] Meng, X. G., J. J. Guo, and Y. Q. Han, 2018: Preliminarily assessment of ERA5 reanalysis data. J. Mar. Meteor., 38, 91–99. doi: 10.19513/j.cnki.issn2096-3599.2018.01.011. (in Chinese)
    [14] Molteni, F., R. Buizza, T. N. Palmer, et al., 1996: The ECMWF ensemble prediction system: Methodology and validation. Quart. J. Roy. Meteor. Soc., 122, 73–119. doi: 10.1002/qj.49712252905.
    [15] Rolland, C., 2003: Spatial and seasonal variations of air temperature lapse rates in alpine regions. J. Climate, 16, 1032–1046. doi: 10.1175/1520-0442(2003)016<1032:SASVOA>2.0.CO;2.
    [16] Sesnie, S. E., P. E. Gessler, B. Finegan, et al., 2008: Integrating Landsat TM and SRTM-DEM derived variables with decision trees for habitat classification and change detection in complex neotropical environments. Remote Sens. Environ., 112, 2145–2159. doi: 10.1016/j.rse.2007.08.025.
    [17] Shi, C. X., Z. H. Xie, H. Qian, et al., 2011: China land soil moisture EnKF data assimilation based on satellite remote sensing data. Sci. China Earth Sci., 54, 1430–1440. doi: 10.1007/s11430-010-4160-3.
    [18] Shi, Y., Z. H. Jiang, L. P. Dong, et al., 2017: Statistical estimation of high-resolution surface air temperature from MODIS over the Yangtze River Delta, China. J. Meteor. Res., 31, 448–454. doi: 10.1007/s13351-017-6073-y.
    [19] Stamnes, K., S. C. Tsay, W. Wiscombe, et al., 1988: Numerically stable algorithm for discrete-ordinate-method radiative transfer in multiple scattering and emitting layered media. Appl. Opt., 27, 2502–2509. doi: 10.1364/AO.27.002502.
    [20] Sun, X. L., H. Q. Song, P. Li, et al., 2015: Analysis of drought monitoring in Inner Mongolia based on CLDAS Data. Meteor. Mon., 41, 1245–1254. (in Chinese)
    [21] Thornton, P. E., S. W. Running, and M. A. White, 1997: Generating surfaces of daily meteorological variables over large regions of complex terrain. J. Hydrol., 190, 214–251. doi: 10.1016/S0022-1694(96)03128-9.
    [22] Xie, Y., S. Koch, J. McGinley, et al., 2011: A space–time multiscale analysis system: A sequential variational analysis approach. Mon. Wea. Rev., 139, 1224–1240. doi: 10.1175/2010MWR3338.1.
    [23] Xu, B., C. X. Shi, L. P. Jiang, et al., 2015: Multi-satellite integrated operational system of East Asian precipitation. Meteor. Sci. Technol., 43, 1007–1014. doi: 10.3969/j.issn.1671-6345.2015.06.001. (in Chinese)
    [24] Xu, J. M., Q. Guo, Q. F. Lu, et al., 2014: Innovations in the data processing algorithm for Chinese FY meteorological satellites. J. Meteor. Res., 28, 948–964. doi: 10.1007/s13351-014-4034-2.
    [25] Yang, F., H. Lu, K. Yang, et al., 2017: Evaluation of multiple forcing data sets for precipitation and shortwave radiation over major land areas of China. Hydrol. Earth Syst. Sci., 21, 5805–5821. doi: 10.5194/hess-21-5805-2017.
    [26] Yang, K., G. W. Huang, and N. Tamai, 2001: A hybrid model for estimating global solar radiation. Solar Energy, 70, 13–22. doi: 10.1016/S0038-092X(00)00121-3.
    [27] Zhai, P. M., B. Q. Zhou, and Y. Chen, 2018: A review of climate change attribution studies. J. Meteor. Res., 32, 671–692. doi: 10.1007/s13351-018-8041-6.
    [28] Zhang, G., G. S. Zhou, and F. Chen, 2017: Analysis of parameter sensitivity on surface heat exchange in the Noah land surface model at a temperate desert steppe site in China. J. Meteor. Res., 31, 1167–1182. doi: 10.1007/s13351-017-7050-1.
    [29] Zhang, Q., X. Ruan, and A. Y. Xiong, 2009: Establishment and assessment of the grid air temperature data sets in China for the past 57 years. J. Appl. Meteor. Sci., 20, 385–393. doi: 10.3969/j.issn.1001-7313.2009.04.001. (in Chinese)

  • National Meteorological Information Center, China Meteorological Administration, Beijing 100081

1.   Introduction
  • Observations collected at surface weather stations are timely and accurate, and they are the major source of meteorological data. However, weather stations cannot be infinitely dense, and there is no guarantee that sufficient observations are available at every point of a fine grid, especially in complex terrain, where stations are sparsely distributed because of the high cost of station setup and maintenance. As a result, the quality of observational data in complex terrain is often poorer than that in plain areas (Kunkel, 1989; Rolland, 2003). Scientists in meteorology, hydrology, geography, and ecology have been paying increasing attention to how information from local, small-scale observations can be extended to a larger area, such as a given grid cell. Producing gridded datasets from station observations has therefore become a hot research topic in recent years (Thornton et al., 1997; Luo et al., 1998; Zhai et al., 2018; Chen et al., 2019).

    Since the 1990s, multiple global gridded datasets have been produced by research institutes and meteorological operational centers around the world. Examples include the NCEP Reanalysis II produced by the NCEP of NOAA (Kanamitsu et al., 2002), the ERA-Interim reanalysis produced by the ECMWF (Dee et al., 2011), and the global monthly gridded climate dataset produced by the Climate Prediction Center (CPC) of NOAA (Fan and van den Dool, 2008). These gridded datasets have been widely used in meteorological studies. However, low resolution is a major weakness of these global datasets. Furthermore, their lack of consideration of complex terrain leads directly to inaccurate descriptions of weather and climate in China. As a result, these datasets are not fine enough for regional applications in China (Zhang et al., 2009).

    Researchers in China have been exploring methods for producing gridded meteorological and climatological datasets that draw on more of the available station observations and take the complex terrain into account. Feng et al. (2004) conducted quality control of the observations at weather stations in China and, using the Cressman interpolation technique, produced a multi-variable gridded dataset that includes temperature and precipitation on 1° × 1° grids for the period of 1951–2000. Zhang et al. (2009) produced a gridded temperature dataset with 1° × 1° resolution over China for the period of 1951–2007 using the ordinary Kriging interpolation method. He and Yang (2011) produced a high-resolution (0.1° × 0.1°) temperature dataset at 3-h intervals over China for the period of 1981–2015 by merging reanalysis products with conventional meteorological observations archived at the China Meteorological Administration (CMA). All these datasets contain observations collected at more than 700 national weather stations in China and can thus represent the spatial and temporal variability over China well, although the dataset with 0.1° spatial resolution describes local weather and climate characteristics in more detail than the 1° datasets.

    As discussed above, spatial interpolation is the major method used to produce these datasets. However, many important factors that can affect the quality of a gridded dataset, such as the density of the station network, terrain, and land use/land cover, receive little consideration during the interpolation procedure. Coarse spatial and temporal resolutions are a further weakness of the datasets mentioned above, which makes it hard for them to meet the requirements of refined operational weather forecasting and other practical meteorological applications. Therefore, it is of great importance to take advantage of the higher-density national weather station network and to combine station observations, satellite retrievals, and reanalysis products into a surface meteorological blended forcing dataset (SMBFD) with high spatiotemporal resolution over China (Shi et al., 2011; Zhang et al., 2017; Li et al., 2018; Chen et al., 2019).

2.   Dataset summary
  • The name of the dataset, its spatial and temporal resolutions, developers, data domain, length of the dataset, and other relevant information are listed in Table 1.

    Information | Description
    Dataset name | The real-time and retrospective SMBFD at 1-h and 1-km resolution
    Leading developer | Shuai Han, the National Meteorological Information Center (NMIC), hans@cma.gov.cn
    Responsible official | Chunxiang Shi, NMIC, shicx@cma.gov.cn
    Developers | Bin Xu, NMIC, xubin@cma.gov.cn; Shuai Sun, NMIC, sunshuai@cma.gov.cn; Tao Zhang, NMIC, zhangt@cma.gov.cn; Lipeng Jiang, NMIC, jianglp@cma.gov.cn; Xiao Liang, NMIC, liangx@cma.gov.cn
    Dataset domain | 15°–60°N, 70°–140°E
    Dataset period | 2015–present
    Spatial resolution | 1 km × 1 km
    Temporal resolution | 1 h
    Data format | The network Common Data Form (NETCDF); the ASCII, BIN, and GRIB format will be added in future study
    Variable | Hourly data are saved. The dataset size is 120 M. Variable names are LAT (latitude), LON (longitude), TAIR (2-m air temperature), PAIR (pressure), RHU (2-m humidity), UIND (U-component), VIND (V-component), PRCP (precipitation), and SWDN (radiation). The dataset can be processed by NCL, Fortran, and MATLAB and visualized by Panoply and ARCGIS
    Data dissemination | China Integrated Meteorological Information Service System (CIMISS) (http://data.cma.cn/), ftp/http website Data Center of the China Meteorological Administration
    Policy for data usage | (1) The dataset can be downloaded through China meteorological information network; a written agreement must be signed with the NMIC for commercial use. (2) The source of the dataset must be cited following the specific reference format
    Sponsorship | National Key Research and Development Program (2018YFC1506601); National Meteorological Science and Technology Innovation Project; National Natural Science Foundation Key Research Project (91437220)

    Table 1.  Information related to the real-time and retrospective surface meteorological forcing dataset

3.   Data sources and methods
  • The ERA-Interim dataset is the third-generation reanalysis product of ECMWF, succeeding the ERA-40 (Table 2). It uses a four-dimensional variational data assimilation system (4DVAR) and assimilates more satellite and surface observations than its predecessor. Compared to the ERA-40, the ERA-Interim dataset has higher quality and finer spatial and temporal resolutions. It is one of the high-quality reanalysis datasets in the world (Molteni et al., 1996; Dee et al., 2011).

    Data type | Data format | Temporal resolution | Spatial resolution and coverage | Data source
    Hourly surface observations in China | ASCII | 1 h | National automatic stations (2421), regional stations (38,000) | NMIC
    EMSIP | NETCDF | 1 h | 6.25 km × 6.25 km/entire China | NMIC
    FY2 satellite full disc nominal map | HDF | 1 h | 1.25 km × 1.25 km/entire China | NMIC
    DEM dataset | HDF | NA | 90 m/entire China | NASA
    ERA-Interim | GRIB2 | 3 h | 12.5 km × 12.5 km/entire China | ECMWF

    Table 2.  Data sources and their descriptions

  • The 1-km resolution terrain elevation data over China (Table 2) are derived from the 90-m resolution terrain data of the Shuttle Radar Topography Mission (SRTM; http://srtm.csi.cgiar.org/) provided by NASA. At 1-km resolution, most mountains and rivers, as well as the orientations of mountain ranges, are well represented, as shown in Fig. 1 (Sesnie et al., 2008).

    Figure 1.  Spatial distribution map of elevation in China (m).

  • Observations of surface automatic weather stations (Table 2) are provided by the National Meteorological Information Center (NMIC) of China, including observations collected at more than 2380 national automatic weather stations (Fig. 2a) and more than 38,000 regional weather stations (Fig. 2b). These automatic weather stations are regularly calibrated and maintained by CMA. In the present study, observations from about 40,400 stations are used for data fusion, and more than 2380 stations are used for evaluation of the SMBFD dataset.

    Figure 2.  Distributions of (a) the national automatic weather stations (more than 2380) and (b) the regional weather stations (about 38,000) in China.

  • EMSIP is developed by the NMIC, CMA (Table 2; Xu et al., 2015). The geostationary satellite information is extracted and calculated from the hourly nominal disc map data of the Fengyun-2E (FY-2E) geostationary satellite. The integrated microwave precipitation information includes the Fengyun-3B Microwave Radiation Imager (FY-3B/MWRI) precipitation rate provided by the China National Satellite Meteorological Center, the Tropical Rainfall Measuring Mission Microwave Imager (TRMM/TMI) precipitation rate provided by NASA, the NOAA-18/Microwave Humidity Sounder (NOAA-18/MHS) and NOAA-19/MHS precipitation rates provided by NOAA, and the Meteorological Operational satellite-A MHS (MetOp-A/MHS) precipitation rate and the Defense Meteorological Satellite Program (DMSP)-F16/F17/F18 Special Sensor Microwave Imager/Sounder (SSMIS) precipitation rates provided by the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and DMSP (Xu et al., 2015).

  • The FY-2 geostationary satellite scanning radiometer has one visible and four infrared channels. Every hour it acquires a full disc image covering about 1/3 of the Earth’s surface, with a spatial resolution of 1.25 km in the visible channel (Table 2; Xu et al., 2014).

  • The multi-grid variational analysis technique (Xie et al., 2011) and the terrain correction algorithm (Shi et al., 2017) are used for the fusion of 2-m temperature and humidity, 10-m wind speed, and surface pressure. In the multi-grid variational analysis, the ERA-Interim background data are first placed on coarse grids, and the observations inside each coarse grid cell are used to correct the background data in that cell. The corrected coarse-grid data are then divided onto finer grids, and the data on these fine grids are corrected again based on the observations. This procedure is repeated until the optimal results are obtained.
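    The coarse-to-fine idea described above can be illustrated with a minimal sketch. This is not the variational scheme of Xie et al. (2011): as a simplifying assumption, each level here just adds the mean observation-minus-analysis increment within each block, and the field is one-dimensional. The function name `coarse_to_fine_correct` and the `levels` parameter are hypothetical.

```python
def coarse_to_fine_correct(background, obs, levels=(8, 4, 2, 1)):
    """Correct a 1-D background field with observations, coarse to fine.

    background: list of floats on a regular grid (list index = grid cell).
    obs: list of (grid_index, observed_value) pairs.
    levels: block widths in grid cells, ordered from coarse to fine.
    Simplified stand-in for a multi-grid analysis (assumption, not STMAS).
    """
    analysis = list(background)
    for width in levels:
        # Accumulate observation-minus-analysis increments per block.
        sums, counts = {}, {}
        for idx, value in obs:
            block = idx // width
            sums[block] = sums.get(block, 0.0) + (value - analysis[idx])
            counts[block] = counts.get(block, 0) + 1
        # Apply each block's mean increment to every cell in that block.
        for block, total in sums.items():
            increment = total / counts[block]
            start = block * width
            for i in range(start, min(start + width, len(analysis))):
                analysis[i] += increment
    return analysis


field = coarse_to_fine_correct([0.0] * 4, [(0, 1.0), (1, 3.0)], levels=(4, 2, 1))
print(field)
```

    In this sketch the coarse levels spread the large-scale correction to cells without observations, while the finest level restores the local detail at the observed cells.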

  • The Discrete Ordinates Radiative Transfer (DISORT) model (Stamnes et al., 1988) is used to retrieve the incident solar radiation data in China. In this process, the ozone, precipitable water vapor (PWV), and air pressure from the Global Forecast System (GFS) numerical products (https://www.emc.ncep.noaa.gov/index.php?branch=GFS) are taken as input parameters, and the 1-km resolution visible information is derived from FY-2 geostationary satellite L1 full disc nominal image measurements. At the same time, temperature, air pressure, relative humidity, and sunshine duration observed at the national automatic stations are used to calculate the incident solar radiation at the locations of 2380 stations with a hybrid model (Yang et al., 2001). The multi-grid variational analysis technique (Xie et al., 2011) is then used to fuse the incident solar radiation retrieved from the FY-2 satellite with that calculated at the stations, producing incident solar radiation fusion data at hourly and kilometer resolutions.

  • The precipitation fusion data are generated from automatic station observations and background data. The station observations come from both the national and the regional automatic weather stations (Figs. 2a, b). The background data are the EMSIP product, obtained from NMIC, CMA (Xu et al., 2015). After the EMSIP dataset is preprocessed, the multi-grid variational analysis technique (Xie et al., 2011) is used to fuse the background data with the automatic station observations, producing precipitation fusion data at hourly and kilometer resolutions.

  • To evaluate the dataset objectively and comprehensively, two blended datasets are produced for the evaluation. The two datasets share the same fusion algorithm and background, but differ in the ground observation stations used in the fusion. In the first dataset, only the observations from the regional stations (Fig. 2b, about 38,000 stations) are fused. In the second dataset, the observations from both the national automatic stations (Fig. 2a, more than 2380 stations) and the regional automatic stations (Fig. 2b, about 38,000 stations) are fused, which is consistent with the final published dataset. In the validation, the observations of the national automatic stations (Fig. 2a, more than 2380 stations) are used as the “true value” (validation data) to evaluate both blended datasets. Because the national stations are withheld from the first dataset but included in the second, the evaluation of the first dataset is called the independent evaluation, and the evaluation of the second dataset is called the dependent evaluation.

    Bilinear interpolation is used in the present study to interpolate the gridded data to the station locations. A linear interpolation is first conducted along the J-direction at rows $I_1$ and $I_2$:

    $$Z(I_1,J) = \frac{J - J_2}{J_1 - J_2}Z(I_1,J_1) + \frac{J - J_1}{J_2 - J_1}Z(I_1,J_2),$$ (1)
    $$Z(I_2,J) = \frac{J - J_2}{J_1 - J_2}Z(I_2,J_1) + \frac{J - J_1}{J_2 - J_1}Z(I_2,J_2).$$ (2)

    A linear interpolation is further conducted along the I-direction:

    $$Z(I,J) = \frac{I - I_2}{I_1 - I_2}Z(I_1,J) + \frac{I - I_1}{I_2 - I_1}Z(I_2,J),$$ (3)

    where $Z(I_1,J_1)$, $Z(I_1,J_2)$, $Z(I_2,J_1)$, and $Z(I_2,J_2)$ are values of the variable on the corresponding grids; $Z(I_1,J)$ and $Z(I_2,J)$ are the results at latitudes $I_1$ and $I_2$ after the first linear interpolation; and $Z(I,J)$ is the value at the specific station after the interpolation.
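    As a minimal sketch of Eqs. (1)–(3), the function below interpolates the four surrounding grid values to a station location. The function name `bilinear` and its argument order are illustrative assumptions, not part of the dataset's processing code.

```python
def bilinear(i, j, i1, i2, j1, j2, z11, z12, z21, z22):
    """Interpolate to (i, j) from the four surrounding grid points.

    z11 = Z(I1, J1), z12 = Z(I1, J2), z21 = Z(I2, J1), z22 = Z(I2, J2).
    Requires i1 != i2 and j1 != j2.
    """
    # Eqs. (1) and (2): linear interpolation along J at rows I1 and I2.
    z1j = (j - j2) / (j1 - j2) * z11 + (j - j1) / (j2 - j1) * z12
    z2j = (j - j2) / (j1 - j2) * z21 + (j - j1) / (j2 - j1) * z22
    # Eq. (3): linear interpolation along I between the two row values.
    return (i - i2) / (i1 - i2) * z1j + (i - i1) / (i2 - i1) * z2j


# A station at the centre of a unit grid cell receives the mean of the corners.
print(bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0))  # 2.5
```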

    The evaluation metrics are the mean bias (Bias), root mean square error (RMSE), and correlation coefficient (COR), calculated as:

    $${\rm{Bias}} = \frac{1}{N}\sum_{i = 1}^N \left(G_i - O_i\right),$$ (4)
    $${\rm{RMSE}} = \sqrt{\frac{1}{N}\sum_{i = 1}^N \left(G_i - O_i\right)^2},$$ (5)
    $${\rm{COR}} = \frac{\sum_{i = 1}^N \left(G_i - \overline{G}\right)\left(O_i - \overline{O}\right)}{\sqrt{\sum_{i = 1}^N \left(G_i - \overline{G}\right)^2}\,\sqrt{\sum_{i = 1}^N \left(O_i - \overline{O}\right)^2}},$$ (6)

    where $O_i$ is the observation at weather station $i$ (taken as the true value), $G_i$ is the forcing data interpolated to station $i$, $\overline{G}$ and $\overline{O}$ are the means of $G_i$ and $O_i$ over all stations, and $N$ is the total number of stations used for verification.
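    The three metrics in Eqs. (4)–(6) can be computed as in the following sketch, which uses only the Python standard library. The function name `evaluate` and the sample values are illustrative assumptions, not data from the study.

```python
import math


def evaluate(gridded, observed):
    """Return (Bias, RMSE, COR) for paired gridded and observed values.

    gridded: forcing data interpolated to the stations (G_i).
    observed: station observations taken as the true values (O_i).
    COR is undefined if either series is constant.
    """
    n = len(gridded)
    diffs = [g - o for g, o in zip(gridded, observed)]
    bias = sum(diffs) / n                              # Eq. (4)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)    # Eq. (5)
    g_mean = sum(gridded) / n
    o_mean = sum(observed) / n
    cov = sum((g - g_mean) * (o - o_mean) for g, o in zip(gridded, observed))
    g_ss = sum((g - g_mean) ** 2 for g in gridded)
    o_ss = sum((o - o_mean) ** 2 for o in observed)
    cor = cov / (math.sqrt(g_ss) * math.sqrt(o_ss))    # Eq. (6)
    return bias, rmse, cor


# Hypothetical values: the gridded data are uniformly 1 unit too high.
bias, rmse, cor = evaluate([1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 2.0, 3.0])
print(bias, rmse, round(cor, 6))
```

    A uniform offset such as this one produces a nonzero Bias and RMSE while leaving COR at 1, which is why the three metrics are reported together.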

4.   Verification results
  • Spatial patterns of monthly mean meteorological variables from the blended forcing dataset are displayed in Fig. 3. The 2-m air temperature, surface pressure, and 2-m specific humidity in January, April, July, and October 2015 are shown in Figs. 3a–d, and the 10-m wind speed, downward shortwave radiation, and precipitation in January, April, July, and October 2015 are shown in Figs. 3e–h.

    Figure 3.  Spatial distributions of the monthly mean (a1–d1) 2-m temperature (°C), (a2–d2) surface pressure (hPa), and (a3–d3) humidity (g kg−1); and (e1–h1) 10-m wind (m s−1), (e2–h2) incident solar radiation (W m−2), and (e3–h3) precipitation (mm) for (a1–a3 and e1–e3) January, (b1–b3 and f1–f3) April, (c1–c3 and g1–g3) July, and (d1–d3 and h1–h3) October 2015 from the blended forcing dataset


    As expected, the 2-m air temperature is low in northeastern and northwestern China and high in southeastern China. Throughout the year, the 2-m air temperature in the Qinghai–Tibetan region is lower than that in other regions because of the higher elevation, which also leads to low surface pressure and humidity, as shown in Fig. 3. Overall, these spatial distributions and patterns are well captured. In terms of seasonal variation, the 2-m air temperature in January is clearly lower than in the other three months, and the temperature in the Tibetan Plateau and Northeast China is generally below −10°C. The 2-m air temperature in July is high over all of China, with values above 3 °C in Xinjiang and Southeast China.

    Surface pressure shows little seasonal variability. Its spatial distribution exhibits a “three-step staircase” pattern, with the lowest surface pressure of less than 700 hPa in the Tibetan Plateau, the highest pressure of around 1000 hPa in Southeast China, and intermediate values of 800–900 hPa in other regions (Figs. 3a–d, middle panels).

    The 2-m humidity increases from northwest to southeast. It also shows a clear seasonal variation: the surface humidity gradually increases from January to July and decreases from July to October (Figs. 3a–d, right panels).

    The 10-m wind speed is easily affected by terrain and local meteorological conditions. Therefore, there are no distinct distribution patterns in different months. Note that the 10-m wind speed is slightly larger in the Inner Mongolia Autonomous Region and the northwestern Tibetan Plateau than in other regions, probably because the terrain is relatively flat in these two regions. In addition, the anthropogenic influence on surface wind speed is small there because of the low population density (Figs. 3e–h, left panels).

    The incident solar radiation is smaller in January and October than in April and July. It varies continuously across regions, and the value in the Tibetan–Qinghai Plateau region is slightly larger than in other regions (Figs. 3e–h, middle panels).

    The spatial distribution of monthly mean accumulated precipitation reflects the seasonal rainfall variability in China well, i.e., precipitation is large in summer and relatively small in winter (Figs. 3e–h, right panels).

  • The 2-m temperature and humidity, 10-m wind, and surface pressure are evaluated for the period of May–October 2015, whereas hourly incident solar radiation and precipitation are evaluated for the period of January–December 2015. The details are discussed below.

  • For the independent evaluation of 2-m temperature (Figs. 4a–c), the Bias is between −0.5 and 0.5°C at 52.6% of the total stations (Fig. 4a), the RMSE is smaller than 1°C at 42.6% of the total stations and smaller than 1.5°C at 73.5% of the total stations (Fig. 4b), and the COR is larger than 0.98 at 73.4% of the total stations (Fig. 4c). For the dependent evaluation (Figs. 4d–f), 94.1% of the total stations have a Bias between −0.5 and 0.5°C (Fig. 4d), 85.7% have an RMSE smaller than 0.5°C (Fig. 4e), and 96.4% have a COR above 0.99 (Fig. 4f). Overall, both the dependent and independent evaluations show that the 2-m air temperature produced in this study has a high COR and small Bias and RMSE.

    Figure 4.  Frequency distributions of the (a, d) Bias (°C), (b, e) RMSE (°C), and (c, f) correlation coefficient (COR) from (a–c) independent and (d–f) dependent evaluations of 2-m air temperature for May–October 2015.
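Statements such as "the Bias is between −0.5 and 0.5°C at 52.6% of the total stations" summarize a per-station statistic as a percentage of stations within a range. A minimal sketch of that summary is given below; the function name `percent_of_stations` is hypothetical, the sample values are made up for illustration, and the use of open (strict) bounds is an assumption because the text does not say whether the limits are inclusive.

```python
def percent_of_stations(values, lower=None, upper=None):
    """Percentage of stations whose statistic lies strictly between the bounds.

    values: one statistic (e.g., Bias, RMSE, or COR) per station.
    lower/upper: bounds; None means unbounded on that side.
    """
    def inside(v):
        return (lower is None or v > lower) and (upper is None or v < upper)

    count = sum(1 for v in values if inside(v))
    return 100.0 * count / len(values)

# Made-up per-station biases (deg C), for illustration only.
station_bias = [-0.7, -0.2, 0.1, 0.4, 0.9]
share = percent_of_stations(station_bias, lower=-0.5, upper=0.5)
```

The same helper covers one-sided criteria, for example `percent_of_stations(rmse_values, upper=1.0)` for "RMSE smaller than 1°C", where `rmse_values` would hold one RMSE per station.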

  • The evaluation results of 2-m humidity are presented in Fig. 5. For the independent evaluations (Figs. 5a–c), the Bias is distributed between −4% and 4% at 73.9% of the total stations (Fig. 5a), the RMSE is smaller than 10% at 69.3% of the total stations (Fig. 5b), and the COR is larger than 0.9 at 61.9% of the total stations (Fig. 5c). For the dependent evaluations (Figs. 5d–f), the Bias is within the range of −4% to 2% at 96.3% of the total stations (Fig. 5d), the RMSE is smaller than 5% at 88.1% of the total stations (Fig. 5e), and the COR is larger than 0.98 at 85.2% of the total stations (Fig. 5f).

    Figure 5.  As in Fig. 4, but for 2-m humidity during May–October 2015.

  • The evaluation results of 10-m wind speed are shown in Fig. 6. For the independent evaluations (Figs. 6a–c), the Bias is distributed between −1 and 1 m s−1 at 83.0% of the total stations (Fig. 6a), the RMSE is smaller than 1.5 m s−1 at 75.3% of the total stations (Fig. 6b), and the COR is larger than 0.5 at 65.5% of the total stations (Fig. 6c). For the dependent evaluations (Figs. 6d–f), the Bias is within the range of −0.5 to 0.5 m s−1 at 94.9% of the total stations (Fig. 6d), the RMSE is smaller than 0.5 m s−1 at 77.6% of the total stations (Fig. 6e), and the COR is larger than 0.9 at 87.7% of the total stations (Fig. 6f).

    Figure 6.  As in Fig. 4, but for 10-m wind speed during May–October 2015.

  • The results of surface pressure evaluation are displayed in Fig. 7. For the independent evaluations (Figs. 7a–c), the Bias is between −5 and 5 hPa at 63.0% of the total stations (Fig. 7a), the RMSE is smaller than 5 hPa at 61.9% of the total stations (Fig. 7b), and the COR is above 0.98 at 68.1% of the total stations (Fig. 7c). For the dependent evaluations (Figs. 7d–f), the Bias is between −5 and 5 hPa at 63.1% of the total stations (Fig. 7d), the RMSE is smaller than 5 hPa at 62.4% of the total stations (Fig. 7e), and the COR is larger than 0.98 at 67.5% of the total stations (Fig. 7f). Overall, the evaluation statistics for surface pressure are similar in the dependent and independent cases.

    Figure 7.  As in Fig. 4, but for surface pressure during May–October 2015.

  • The scatter plot of daily mean incident solar radiation from the fusion dataset against the observations at weather stations during January–December 2015 is displayed in Fig. 8. The two agree well, with a COR above 0.9. Further analysis indicates that in areas of low solar radiation, the value in the dataset is slightly larger than the observation, and the opposite is true in areas of high solar radiation. One possible reason is the spatial scale mismatch that generally exists when gridded data are validated against station data; the accuracy of the gridded data is expected to improve further as the spatial resolution increases. In addition, errors in the generated radiation data also come from the DISORT radiative transfer model, the satellite retrievals, and the GFS products. Further analysis and investigation are needed in the future.

    Figure 8.  Scatter plot of incident solar radiation between the fusion dataset (SWDN; W m−2) and the station observations (OBS; W m−2) for January–December 2015.
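The pattern noted for Fig. 8 (overestimation at low radiation, underestimation at high radiation) corresponds to a least-squares line of gridded against observed values with a slope below 1 and a positive intercept. The sketch below illustrates this with synthetic numbers that are not taken from the paper; the helper `least_squares_line` and all values are assumptions made for illustration.

```python
def least_squares_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Synthetic observed radiation (W m-2) and a gridded product that is
# too high at low values and too low at high values (illustrative only).
obs = [50.0, 100.0, 150.0, 200.0, 250.0, 300.0]
grid = [20.0 + 0.9 * v for v in obs]

slope, intercept = least_squares_line(obs, grid)
# Value at which the fitted line crosses the 1:1 line (gridded == observed).
crossover = intercept / (1.0 - slope)
```

In this synthetic case the gridded value exceeds the observation below the crossover and falls short of it above, which is the qualitative behavior described in the text.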

    To compare the accuracy of the radiation data at different observation sites in more detail, the evaluation is also conducted at individual stations, and the overall results are given in Fig. 9. The COR is above 0.9 at 80% of the stations (Fig. 9b). The RMSE frequency is largest in the range of 30–35 W m−2, which accounts for 28.6% of the total stations, followed by the ranges of 25–30 and 35–40 W m−2, which account for 18.7% and 19.8% of the total stations, respectively (Fig. 9a). In general, about 67% of the stations have an RMSE within 25–40 W m−2. Figure 10 shows the spatial distribution of the Bias. There are negative biases in Northwest, Southwest, and Northeast China, and positive biases in central China and the coastal regions of Southeast China. The reason remains unclear and needs to be investigated further in the future.

    Figure 9.  Frequency distributions of (a) RMSE (W m−2) and (b) COR from the evaluation of incident solar radiation for January–December 2015.

    Figure 10.  Spatial distribution of mean Bias (W m−2) from the evaluation of incident solar radiation for January–December 2015.
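The RMSE frequency distribution discussed for Fig. 9a amounts to binning one RMSE value per station into fixed-width intervals and expressing each count as a percentage. A minimal sketch follows; the function name `frequency_by_bin`, the half-open bin convention, and the sample RMSE values are assumptions for illustration and are not data from the paper.

```python
def frequency_by_bin(values, start, width, n_bins):
    """Percentage of values in each bin [start + k*width, start + (k+1)*width).

    Values outside the covered range are counted in the total but in no bin.
    """
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    total = len(values)
    return [100.0 * c / total for c in counts]

# Made-up per-station RMSE values (W m-2), for illustration only.
station_rmse = [22.0, 27.5, 31.0, 33.0, 34.9, 36.0, 38.5, 41.0, 29.0, 32.0]
# Bins: 20-25, 25-30, 30-35, 35-40, 40-45 W m-2.
freq = frequency_by_bin(station_rmse, start=20.0, width=5.0, n_bins=5)
# Combined share of the three bins spanning 25-40 W m-2.
share_25_40 = sum(freq[1:4])
```

Applied to the percentages reported in the text, the three adjacent bins sum to 28.6 + 18.7 + 19.8 = 67.1%, consistent with the stated "about 67%".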

    In addition, radiation datasets from the FY-satellite retrievals, the Fast Longwave and Shortwave Radiative Flux (FLASHFlux) satellite retrievals, and the ERA-Interim and NCEP-Department of Energy (DOE) reanalyses have been evaluated (Jia et al., 2013). The results indicate that the FY-satellite retrieved radiation data are close to the FLASHFlux satellite retrieval product. Both products have smaller RMSEs than the ERA-Interim and NCEP-DOE reanalysis products when compared with the observations. Moreover, the FY-satellite retrievals have higher spatial and temporal resolutions and are available in a more timely manner than the other datasets (Jia et al., 2013).

  • In the present study, the CPC Morphing Technique precipitation product (CMORPH; Joyce et al., 2004) and the EMSIP precipitation product (Xu et al., 2015) are each used as background, and the multi-grid variational analysis technique is applied to fuse them with precipitation observations collected at the weather stations. Four experiments have been conducted. The first experiment fuses EMSIP precipitation with observations from the regional automatic stations (Fig. 2b, about 38,000 sites) and is named the EMSIP-merged independent experiment. The second fuses EMSIP precipitation with observations from both the national (Fig. 2a, more than 2380 sites) and regional automatic stations (Fig. 2b, about 38,000 sites) and is named the EMSIP-merged dependent experiment. The third fuses CMORPH precipitation with observations from the regional automatic stations (Fig. 2b, about 38,000 sites) and is named the CMORPH-merged independent experiment. The last fuses CMORPH precipitation with observations from both the national (Fig. 2a, more than 2380 sites) and regional automatic stations (Fig. 2b, about 38,000 sites) and is named the CMORPH-merged dependent experiment. The observations from the national automatic stations (Fig. 2a, more than 2380 stations) are used as the “true value” to evaluate the results of the four experiments.
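The design of the four experiments can be summarized in a small data structure. This is only an organizational sketch: the dictionary layout, key names, and the helper `is_independent` are hypothetical, while the experiment names, backgrounds, and station networks follow the description above.

```python
# Station network whose observations serve as the "true value".
EVALUATION_NETWORK = "national"

# Background field and station networks fused in each experiment.
EXPERIMENTS = {
    "EMSIP-merged independent": {"background": "EMSIP", "fused": {"regional"}},
    "EMSIP-merged dependent": {"background": "EMSIP", "fused": {"national", "regional"}},
    "CMORPH-merged independent": {"background": "CMORPH", "fused": {"regional"}},
    "CMORPH-merged dependent": {"background": "CMORPH", "fused": {"national", "regional"}},
}

def is_independent(experiment):
    """True when the evaluation network was withheld from the fusion."""
    return EVALUATION_NETWORK not in experiment["fused"]

independent_names = sorted(
    name for name, exp in EXPERIMENTS.items() if is_independent(exp)
)
```

The distinction matters for interpreting the statistics: in the dependent experiments the national station observations enter both the fusion and the evaluation, so smaller errors are expected there.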

    The results are presented in Fig. 11. Overall, the merged precipitation generated with CMORPH as background is similar to that generated with EMSIP as background; the Bias, RMSE, and COR are quite close. As expected, the Bias and RMSE from the dependent evaluation are smaller than those from the independent evaluation, because the evaluation data are also used in the fusion procedure. For example, for the fused EMSIP dataset, the dependent evaluation indicates that the Bias is between −0.01 and 0.01 mm at 97.1% of the total stations (Fig. 11a), the RMSE is smaller than 0.4 mm at 92.5% of the total stations (Fig. 11b), and the COR is above 0.9 at 96.7% of the total stations (Fig. 11c). In contrast, the independent evaluation shows that the Bias is between −0.03 and 0.03 mm at 87.2% of the total stations (Fig. 11a), the RMSE is smaller than 0.6 mm at 64.8% of the total stations (Fig. 11b), and the COR is above 0.7 at 67% of the total stations (Fig. 11c). Furthermore, the 6.25-km merged precipitation produced with the same method and data has been evaluated against independent precipitation observations from the Ministry of Water Resources of China, and the results show better performance than the NASA Global Land Data Assimilation System (GLDAS) precipitation (Yang et al., 2017). This further illustrates the reliability and applicability of the precipitation fusion products in this paper.

    Figure 11.  Frequency distributions of (a) Bias (mm), (b) RMSE (mm), and (c) COR from the EMSIP and CMORPH data merged independent and dependent evaluations of precipitation for January–December 2015.

5.   Discussion
  • In this study, incident solar radiation and precipitation are compared not only with station observations but also with reanalysis products from around the world. The 2-m temperature and humidity, 10-m wind, and surface pressure are evaluated dependently and independently against station observations only; they have not been compared with the reanalysis products. The major reason is that no other operational center, academic group, or private enterprise produces an hourly, 1-km resolution real-time product specifically for China.

    ERA5 is a new-generation reanalysis dataset produced by the Copernicus Climate Change Service, which is operated by ECMWF and sponsored by the European Union. It covers the period from 1950 to the present. The data are updated in near-real-time mode with a three-month lag. As the 5th-generation ECMWF global reanalysis product, ERA5 has finer spatial and temporal resolutions (regular latitude–longitude grids at 0.25° × 0.25° resolution) and higher quality (Meng et al., 2018) than the ERA-15, ERA-40, and the most recent ERA-Interim products of ECMWF (Hersbach, 2016).

    Hourly observations of 2-m temperature collected at more than 2380 national weather stations in China during 1–8 June 2015 are used to evaluate the ERA5 dataset and the SMBFD hourly surface air temperature dataset on 1 km × 1 km grids. The RMSE, Bias, and COR between the observed 2-m air temperature and that from SMBFD and ERA5 are presented in Fig. 12. The CORs are around 0.99 for SMBFD and 0.9 for ERA5, and the RMSEs are around 0.5 and 2–3°C, respectively. The Bias of SMBFD is close to 0, whereas ERA5 shows a large Bias that varies with time. There are two likely reasons for these differences. First, this is a dependent evaluation for SMBFD, because the observations from these stations have been integrated into the dataset, but an independent evaluation for ERA5, because they have not. Second, SMBFD has a much higher spatial resolution than ERA5, so many local orographic and vegetation effects (e.g., elevation, slope, aspect, and vegetation type) are taken into account. Although the comparison is therefore not on equal terms, 2-m air temperature at 1-km resolution still has advantages in practical applications (e.g., assessment of summer heat events, winter cold events, and drought).

    Figure 12.  Comparison of (a) COR, (b) RMSE (°C), and (c) Bias (°C) from SMBFD (red color) and ERA5 (blue color) for evaluations of 2-m air temperature for 1–8 June 2015.
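A comparison like the one in Fig. 12 can be produced by computing the statistics across stations separately at each time step for each product. The sketch below assumes that layout (it is not stated explicitly in the text) and uses synthetic values; `stats_per_time`, `fine`, and `coarse` are hypothetical names and do not represent SMBFD or ERA5 data.

```python
import math

def stats_per_time(product, obs):
    """Return (bias, rmse) lists with one entry per time step.

    product, obs: lists of rows; each row holds one value per station
    for a single time step, with the same station order in both inputs.
    """
    biases, rmses = [], []
    for p_row, o_row in zip(product, obs):
        diffs = [p - o for p, o in zip(p_row, o_row)]
        biases.append(sum(diffs) / len(diffs))
        rmses.append(math.sqrt(sum(d * d for d in diffs) / len(diffs)))
    return biases, rmses

# Synthetic temperatures (deg C): 2 time steps x 3 stations, illustrative only.
obs = [[20.0, 22.0, 18.0], [21.0, 23.0, 19.0]]
fine = [[20.5, 22.5, 18.5], [21.0, 23.0, 19.0]]
coarse = [[22.0, 24.0, 20.0], [19.0, 21.0, 17.0]]

fine_bias, fine_rmse = stats_per_time(fine, obs)
coarse_bias, coarse_rmse = stats_per_time(coarse, obs)
```

In this toy case the second product has a bias that changes sign between time steps, which is the kind of temporal variation a per-time-step plot reveals and a single period-mean statistic would hide.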

    Since the release of the hourly, 1-km resolution real-time SMBFD dataset in December 2016, many researchers in Jiangsu, Zhejiang, Hubei, and Sichuan provinces and in Inner Mongolia have evaluated the dataset against their local observations. They have reached the consensus that the data quality is high compared with commonly used reanalysis products. As a result, this dataset has been widely applied in studies of numerical weather prediction, agricultural drought monitoring, high temperature warning, environmental ecology, etc. (Sun et al., 2015).

    It should be noted that the data quality over Southwest China, particularly over Sichuan and Chongqing, is still not as good as that over East China. One possible reason for the relatively low data quality over Southwest China is the complex terrain there; gridded data remain limited in their ability to describe local conditions in areas of complex terrain. In addition, the sparse distribution of weather stations in Southwest China leads to smoothed extreme maximum and minimum temperatures, so that the extreme values cannot be well represented. In the future, various types of observations will be collected from local agencies and short-term or temporary projects and integrated into SMBFD to further improve the data quality over areas of complex terrain, where weather stations are only sparsely distributed. We will also calculate the density of observation information in each grid cell as an auxiliary indicator of the quality of the gridded data.
