The Per-Minute Precipitation Dataset was produced from data (R01) extracted from self-recording graph paper. The resulting self-recording per-minute precipitation dataset covers 31 provinces and includes data from 2253 national-level weather stations, spanning the period from 1951, when self-recording observations first became available, to 2012. The metadata for this dataset are listed in Table 1.
Item: Description
Dataset name: China Surface Self-Recording Per-Minute Precipitation Dataset (V1.0)
Dataset code: SURF_CLI_CHN_PRE_MIN*
Geographic region: All regions of China except Taiwan, Hong Kong, and Macau
Time period: 1951–2012
Data format: Text file (.txt)
Size: 116 GB (uncompressed)
Composition: Per-minute precipitation data from individual stations. Each file contains monthly data from a single station, including parameters and observations. The parameters include station number, latitude, longitude, elevation, year and month of the data, data source, and so on. The data for a particular time are listed on each line, including station number, time of the data (Beijing time, year-month-day-hour), minute-by-minute precipitation during this hour (the accuracy of the data is 0.01 mm), and the quality control code. To save storage space, entire days or months without precipitation, or with missing precipitation values, are abbreviated. More details can be found in the dataset format documentation.
Quality control code: In total, there are 3 quality control codes: “0” indicates that the data are correct, “3” indicates that the data are corrected per-minute-averaged precipitation values over a period of abnormal curves, and “8” indicates a missing value or no measurement. No measurements are conducted during sub-freezing periods in high-latitude or high-elevation regions.
Website for data: Offline storage, with metadata available at http://data.cma.cn/
Accessibility: User support is provided by the China Meteorological Data Network (http://data.cma.cn/)
Confidentiality: In accordance with “Measures for the Management of Meteorological Information Services” issued by the China Meteorological Administration
*Code definitions: SURF_CLI denotes the data category, which is surface climatic data; PRE indicates precipitation; MIN represents per-minute data; and CHN indicates the area covered by the data, which is China.
Table 1. Metadata information for China Surface Self-Recording Per-Minute Precipitation Dataset (V1.0)
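As an illustration of how the three quality control codes in Table 1 might be summarized, the following Python sketch tallies the share of records carrying each code. The function name `qc_rates` and the assumption that the codes have already been read into a list of one-character strings are hypothetical; the actual file layout is defined in the dataset format documentation and is not reproduced here.

```python
from collections import Counter

# Quality control codes as described in Table 1.
QC_MEANING = {
    "0": "correct",
    "3": "corrected",
    "8": "missing or no measurement",
}


def qc_rates(qc_codes):
    """Return the fraction of records carrying each quality control code.

    qc_codes is assumed to be an iterable of one-character strings
    ("0", "3", or "8"); how they are parsed from a file is not shown.
    """
    counts = Counter(qc_codes)
    unknown = set(counts) - set(QC_MEANING)
    if unknown:
        raise ValueError(f"unexpected quality control codes: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no records supplied")
    return {QC_MEANING[code]: counts.get(code, 0) / total for code in QC_MEANING}


# Hypothetical sample: 8 correct, 1 corrected, 1 missing record.
sample = ["0"] * 8 + ["3"] + ["8"]
print(qc_rates(sample))
```

The sketch rejects any code outside the three listed in Table 1 rather than silently ignoring it, so that a parsing mistake shows up early.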
The spatial distribution of the weather stations whose observations were used to develop the China Surface Self-Recording Per-Minute Precipitation Dataset (V1.0) is shown in Fig. 9. According to the report entitled Fundamental Work on the Development and Reform of Basic Meteorological Data (2011–12) provided by the National Meteorological Information Center, there were 2481 national-level surface stations in China during 1951–2012, among which 2253 stations (90.8%) provided self-recording precipitation observations. The remaining 9.2% of the weather stations were not required to make self-recording precipitation observations. These stations are mainly distributed in remote western provinces such as Qinghai, Tibetan Region, and Xinjiang Region (denoted by black dots in Fig. 9), where the stations that did make self-recording precipitation observations typically have records shorter than 30 yr (denoted by orange dots in Fig. 9). In contrast, most of the stations in eastern China have more than 30 yr of self-recording precipitation observations, although only a small proportion of these have more than 45 yr of records.
The interannual variation of the number of stations with self-recording precipitation observations that were used in the development of the China Surface Self-Recording Per-Minute Precipitation Dataset (V1.0) for 1951–2012 is shown in Fig. 10 (solid line). The dashed line in this figure depicts the interannual variation of the number of stations that conducted precipitation observations during the same period. In the early stages of the establishment of national-level stations during 1951–60, the number of stations with self-recording precipitation observations was only about half of the total number of stations. From then until 1980, the number of stations with self-recording precipitation observations increased year by year. After 1980, the number stabilized at approximately 2200, reaching its highest level in 2000. Even then, slightly fewer than 200 stations had no self-recording precipitation records. After 2003, automatic weather stations began to appear, and self-recording precipitation observations started to be replaced by automatic observations. The number of stations with self-recording precipitation observations then decreased sharply. Automatic observations were implemented at all stations around 2011, thereby reducing the number of self-recording precipitation stations to 0.
Figure 10. Interannual variation of the number of stations with self-recording precipitation observations.
Since self-recording measurements do not occur during sub-freezing periods in high-latitude and high-elevation regions, the rate of data availability during June–August is used to represent data integrity. The rate of data availability is the ratio of effective observational data to total observational data. The effective observational data are the non-missing data, and the total observational data are the data that should be observed during the operational period of the self-recording measurement. The rate of data availability from June to August for per-minute precipitation is presented in Fig. 11, which reveals that the rate is > 99% over most (79.8%) of the stations in summer, indicating that the data integrity is very good. The rate is between 80% and 90% at 8.9% of the stations. Only 29 stations (1.3% of the total) have rates < 95%. During 1951–2012, there are a total of approximately 1.163 × 10¹⁰ per-minute precipitation records and 1.156 × 10¹⁰ effective records, reflecting an overall data availability rate of approximately 99.42%.
Figure 11. Availability of self-recording per-minute precipitation data in summer months of June–August.
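The definition of the data availability rate given above can be written as a short calculation. The following Python sketch is only an illustration: the function name `availability_rate` is hypothetical, and the inputs are the rounded national record counts quoted in the text, not the exact counts used by the authors.

```python
def availability_rate(effective_records, total_records):
    """Data availability rate in percent.

    effective_records: number of non-missing records.
    total_records: number of records expected during the operational
    period of the self-recording measurement.
    """
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    if not 0 <= effective_records <= total_records:
        raise ValueError("effective_records must lie between 0 and total_records")
    return 100.0 * effective_records / total_records


# Rounded national totals for 1951-2012 as quoted in the text.
rate = availability_rate(1.156e10, 1.163e10)
print(f"{rate:.2f}%")
```

With these rounded totals the result is about 99.4%, which agrees with the reported 99.42% to within the rounding of the record counts.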
A comparison with the data availability rate of the China Hourly Precipitation Dataset at National-Level Stations (Zhang et al., 2016) shows that the two are roughly equivalent, with the exception of a few stations for which the data availability rates for per-minute precipitation are slightly lower. This is attributed to the fact that the self-recording graph paper strips at these stations were damaged by insects or water, were lost, and so on, and therefore could not be scanned. In contrast, the hourly precipitation data in weather reports were often recorded immediately after the measurements, allowing them to be well maintained. For example, self-recording precipitation measurements began at Jieshou station (ID 58108) of Anhui Province in 1958 and continued until 2005, when the measurement method was converted to automatic. All of the self-recording graph papers for 1961 at this station were lost, leading to missing per-minute precipitation data for the entire year of 1961. The missing rate of the per-minute precipitation data at this station is 2.56% in 1958–2004, while the missing rate of the hourly precipitation data during the same period is 0.44%. Note that in general, hourly precipitation is derived from the accumulation of per-minute precipitation; when per-minute data are missing, data from A6 are used to preserve data integrity. This result indicates that the preservation of paper files has certain limitations, and digitization of these paper records should be performed as soon as possible.
The accuracy rates, corrected-data rates, and missing-data rates of the national self-recorded per-minute precipitation data for the summer period June–August are shown in Fig. 12. Corrected data are per-minute values calculated as the per-minute average of the hourly total precipitation when some minutes within the hour are missing. The accuracy rate of the per-minute precipitation over most of the stations in China (73%) in summer is > 99%, while the rates at 25.8% of the stations fall between 95% and 99%, and only 29 stations (1.3% of the total) have rates < 95%. In total, there are approximately 1.154 × 10¹⁰ accurate per-minute precipitation records, and the overall accuracy rate is 99.22%.
Figure 12. Spatial distributions of (a) accuracy rate (%) and (b) corrected-data rate (%) across China.
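One possible reading of the correction described above is that the part of the hourly total not accounted for by the readable minutes is spread evenly over the missing minutes, which are then flagged with quality control code “3”. The following Python sketch illustrates that reading only. The function name `fill_missing_minutes`, the use of `None` for missing minutes, and the even-spreading rule itself are assumptions, not the authors' documented procedure, and rounding to the 0.01 mm resolution of the dataset is not handled.

```python
def fill_missing_minutes(minute_values, hourly_total):
    """Spread the unaccounted hourly precipitation evenly over missing minutes.

    minute_values: list of 60 per-minute amounts in mm, with None for
    minutes that could not be read from the graph paper (assumed encoding).
    hourly_total: total precipitation for the hour in mm.
    Returns (filled_values, qc_codes), where filled minutes carry code "3"
    and observed minutes carry code "0".
    """
    if len(minute_values) != 60:
        raise ValueError("expected 60 per-minute values")
    missing = [i for i, v in enumerate(minute_values) if v is None]
    observed_sum = sum(v for v in minute_values if v is not None)
    residual = hourly_total - observed_sum
    if residual < -1e-9:
        raise ValueError("observed minutes exceed the hourly total")
    residual = max(residual, 0.0)  # absorb floating-point noise

    filled = list(minute_values)
    qc_codes = ["0"] * 60
    if missing:
        share = residual / len(missing)  # even share per missing minute
        for i in missing:
            filled[i] = share
            qc_codes[i] = "3"
    return filled, qc_codes


# Hypothetical hour: 50 readable minutes of 0.1 mm, 10 missing, 6.0 mm in total.
values, codes = fill_missing_minutes([0.1] * 50 + [None] * 10, 6.0)
print(round(values[-1], 2), codes.count("3"))
```

In this hypothetical hour, the 1.0 mm not covered by the readable minutes is divided over the 10 missing minutes, giving about 0.1 mm each.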
In the process of extracting data from self-recording graph paper, corrections are made for data extracted from vague curves and for data recorded when instrumental failure occurred. The corrected data account for only a small part of the total data. The corrected-data rate is < 0.1% over Xinjiang Region, Qinghai Province, northwestern Gansu Province, Ningxia Region, western Inner Mongolia, southwestern Heilongjiang Province, and some other locations. The corrected-data rate over most of the stations in central and eastern China ranges from 0.1% to 0.5%, and is > 0.5% but < 1.8% over the provinces of Guizhou, Hunan, Hubei, Henan, and other locales (Fig. 12b).
Figure 13 shows the annual variations of accuracy rate, corrected-data rate, and missing-data rate averaged across China. The missing-data rate is the highest and the accuracy rate the lowest in the 1950s, although the accuracy rate still exceeds 85%. The lower accuracy rate in the 1950s is attributed to the following: (1) when self-recording precipitation measurements were first introduced, operational practice was poorly standardized, and missing data occurred frequently; (2) the self-recording strips were damaged or the curves became vague after long storage, so special procedures were needed to extract data from these graph strips. From the 1960s onward, the accuracy rate gradually increased, and it has remained stable since 1980. The lower rates before 1980 arise because the management of self-recording precipitation observations was relatively loose prior to 1979, and there were no strict requirements for remedial measures when instrument failure occurred, so missing data occurred frequently. Since 1980, the CMA’s Surface Meteorological Observation Specification has been implemented, which has improved both instrument maintenance and data evaluation of self-recording precipitation measurements and has led to a significant decrease in the missing-data rate. The corrected-data rate exhibits an increasing trend before 1995 but decreases rapidly after 1995, remaining at or below 0.3% in all years.
Figure 13. Annual variations of accuracy rate, corrected-data rate, and missing-data rate averaged across China.
The accuracy rate of the per-minute precipitation dataset is slightly lower than that of the China Hourly Precipitation Dataset at National-Level Stations (Zhang et al., 2016). This is because, when abnormal conditions or instrumental failure occurred during self-recording graph paper measurements, hourly data could still be calculated, whereas per-minute data were taken as either missing or corrected.