-
Variations in surface soil moisture (SM) reflect the processes of surface energy exchange, land surface evapotranspiration, the carbon cycle, and the energy cycle, among others. For agriculturalists, monitoring surface SM is necessary for predicting crop growth, soil degradation, and vegetation coverage (Yan and Zhou, 2017). However, obtaining SM data across a large area at the desired spatial–temporal resolution and accuracy remains a challenge because traditional measurement methods have limitations, such as the sparse distribution of observation sites (Yu and Zhao, 2011).
Remote sensing technology has been recently used for an increasing number of applications as it can rapidly obtain large area land surface coverage information with high spatial and temporal resolutions and low costs. Information on SM can be extracted from remote sensing images by using the relationship between band data and SM (Yang et al., 2010). Electromagnetic wave information from the surface contained in remote sensing images can be analyzed to obtain a larger range of SM data (Liu H. et al., 2012). Different remote sensing technologies, e.g., optical remote sensing, multi/hyperspectral remote sensing, and microwave remote sensing, each have their own advantages and are suitable for a certain type of surface (Mao et al., 2007). Owing to the highly complicated nonlinear relationship between remote sensing band data and SM, obtaining SM from remote sensing imagery requires an appropriate and efficient inversion method (Wang et al., 2019).
Because microwaves can penetrate soil and are sensitive to SM content, they are often used to retrieve surface SM (Zhao et al., 2010). When active microwave images are used to retrieve surface SM, the input is typically the backscattering coefficient. When passive microwave images are used for inversion, the input is typically the brightness temperature. The low- and high-frequency backscattering coefficients of active microwave remote sensing are highly sensitive to SM and vegetation, respectively. The L- and C-bands are usually used in inversion methods (Yu et al., 2012). Inversion methods using microwaves have a clear physical meaning, but they are mainly affected by surface roughness and vegetation cover (Liu et al., 2011). Although SM retrieval models, such as the Michigan microwave canopy scattering model, can eliminate the effect of vegetation, their complex parameter structure hinders model application (Yu and Zhao, 2011). Active microwaves have a high spatial and low temporal resolution, whereas passive microwaves have a high temporal and low spatial resolution (Liu Y. Y. et al., 2012). Therefore, the selection of the microwave type should be based on the research/application target.
When multispectral or hyperspectral images are used to construct the inversion method, previous studies have commonly used principal component analysis (PCA) to determine the bands that have a strong correlation with SM (Zhang and Sun, 2009); the fundamental aspect of PCA is the ability to establish a relationship between the channel data and SM data (Li et al., 2015). However, owing to the complex nonlinear relationship between image channel and SM data, linear regressions (LRs) typically lead to inaccurate inversion results (Paloscia et al., 2013). Considering that deep learning technology, such as a neural network, can learn and mine the information hidden in massive amounts of data, the neural network technique can be used to construct an inversion method (Chen et al., 2018). Training processes based on relevant training data can reveal the important relationship between band data and SM, resulting in significant improvements to the accuracy of the inversion results using deep learning technology (Feng et al., 2018).
A deep belief network (DBN) is a branch of deep learning that has more advantages than the traditional back propagation (BP) neural network with respect to mining deep information hidden in data (Larochelle and Bengio, 2008). A DBN can be defined as a stack of restricted Boltzmann machines (RBMs; Larochelle et al., 2012). By introducing binary random variables into the hidden layer, RBMs simulate the joint probability distribution of the input and output data. Previous studies have demonstrated the effectiveness of DBN-based methods for data simulation or predicting various types of data (Yin et al., 2015; Liu et al., 2019; Peng et al., 2019).
When using a DBN to invert SM from multispectral images, the key is to select appropriate input variables. Both temperature and vegetation indices are highly correlated with surface SM, and the physical significance of this correlation is relatively clear; they are therefore suitable as input variables (Wang et al., 2019).
Because the heat capacity of water is larger than that of soil and the temperature of water is typically lower than that of soil, the SM content affects the thermal infrared characteristics of soil. Therefore, SM can be indirectly monitored by using these thermal infrared characteristics (Abu-Hamdeh, 2003). As land surface temperature (LST) can reflect the thermal infrared characteristics of soil, LSTs are a suitable input variable to obtain SM (Patel et al., 2009; Tian et al., 2014). However, because of the influence that various factors have on surface temperature, an SM inversion directly based on LSTs and its derived index has uncertainties and limitations (Sandholt et al., 2002).
Based on the thermal characteristic differences between vegetation and soil, Moran et al. (1994) divided the surface covered by vegetation into vegetation and surface layers, constructed a trapezoid theory of vegetation index and temperature, and established the water deficit index model (WDI). The scope of application of the WDI can be extended to bare land, fully covered vegetation, and partially covered ground. In addition, there are numerous regional models. For example, Kondo et al. (1998) established the surface temperature–modified soil adjusted vegetation index (Ts–MSAVI) by replacing the normalized vegetation index with MSAVI. Kogan (1998) proposed a vegetation health index based on the vegetation condition and temperature condition indices (Kondo et al., 1998), where the determination of the coefficient mainly depends on human experience. Based on the temperature vegetation dryness index (TVDI), Qi et al. (2005) established the difference temperature vegetation dryness index (DTVDI) by using the Moderate Resolution Imaging Spectroradiometer (MODIS) data.
The amount of SM affects vegetation growth. By capturing the changes in the vegetation canopy structure, vegetation health status, and photosynthesis, the vegetation index (VI) can reflect the response of the vegetation to changes in SM (Liu et al., 1997; Sun et al., 2007; Liu S. S. et al., 2012). The normalized difference vegetation index (NDVI), which is now a commonly used index throughout the world, was proposed based on the difference in pigment absorption characteristics between red and near-infrared bands (Jackson et al., 1983; Kogan, 1990; Ratana et al., 2005; Yang et al., 2007). Existing studies have proven that the NDVI has a strong correlation with SM (Adegoke and Carleton, 2002; Mallick et al., 2009). As the NDVI is prone to saturation in areas with a high vegetation cover, it is mainly suitable for estimating the SM in areas with a low vegetation cover (Brusca, 2002; Didan and Huete, 2004). The enhanced vegetation index (EVI) was established by using blue, red, and near-infrared bands; this model can reduce the impact of atmosphere and vegetation canopy backgrounds, effectively improve the sensitivity of vegetation information in areas with higher vegetation coverage, and is suitable for estimating SM in areas with higher vegetation coverage (Liu and Huete, 1995). In addition, the vegetation condition index (VCI; Feng et al., 2004), anomaly vegetation index (AVI; Chen et al., 1994; Song et al., 2017), modified perpendicular drought index (MPDI; Zhang et al., 2015), and TVDI (Sandholt et al., 2002; Qi et al., 2003; Li et al., 2008; Chen et al., 2011) can be used in the SM inversion model. Because of the time lag between SM and water on vegetation, the accuracy of using vegetation indices alone also has uncertainties and limitations (Sandholt et al., 2002).
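As a rough illustration of the two indices described above, the following sketch computes the NDVI and EVI from band reflectances. It uses the widely published textbook forms of the indices; the EVI coefficients (2.5, 6, 7.5, 1) are the commonly used MODIS-style values and are an assumption here, as are the function names and the sample reflectances. The band mapping and any sensor-specific adjustments used for a particular imager are not covered.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return 0.0  # assumption: treat a zero denominator as "no signal"
    return (nir - red) / denom


def evi(nir: float, red: float, blue: float) -> float:
    """Enhanced vegetation index using the commonly cited coefficients (assumed)."""
    denom = nir + 6.0 * red - 7.5 * blue + 1.0
    if denom == 0:
        return 0.0  # assumption: same zero-denominator convention as ndvi()
    return 2.5 * (nir - red) / denom


# Hypothetical surface reflectances (0-1) for a moderately vegetated pixel.
sample_ndvi = ndvi(nir=0.40, red=0.10)
sample_evi = evi(nir=0.40, red=0.10, blue=0.05)
print(round(sample_ndvi, 3), round(sample_evi, 3))
```

Because the EVI denominator includes the blue band and a constant offset, it grows more slowly than the NDVI as the canopy thickens, which is one way to see why it saturates less in dense vegetation.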
The effect that SM has on the crop growth state can only be observed after a certain phase delay. Zhang et al. (2016) analyzed the correlation between the vegetation index and SM by using data from the central and western regions of China and observed that this delay was between 5 and 10 days. The time delay must therefore be considered when using the VI for inversion. An effective approach is to interpolate the incomplete VI time series to obtain a complete time series, from which the VI at the appropriate date can then be selected. The artificial neural network (ANN) and Savitzky–Golay methods can be used to construct a complete VI time series (Liu et al., 2018; Patel et al., 2019; Zhao et al., 2019).
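The idea of completing a VI time series can be sketched as follows. This is a minimal illustration, not the reconstruction method of any of the cited studies: gaps are filled by linear interpolation (an assumption), and the result is smoothed with the standard 5-point quadratic Savitzky–Golay weights (-3, 12, 17, 12, -3)/35. The function names are hypothetical.

```python
from typing import List, Optional

# 5-point quadratic Savitzky-Golay smoothing weights; they sum to 35.
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)


def fill_gaps(series: List[Optional[float]]) -> List[float]:
    """Linearly interpolate missing (None) values; hold edge values constant."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no valid observations")
    filled = []
    for i, v in enumerate(series):
        if v is not None:
            filled.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(series[right]))
        elif right is None:
            filled.append(float(series[left]))
        else:
            w = (i - left) / (right - left)
            filled.append((1 - w) * series[left] + w * series[right])
    return filled


def savgol5(series: List[float]) -> List[float]:
    """Smooth interior points; the two points at each end are left unchanged."""
    out = list(series)
    for i in range(2, len(series) - 2):
        window = series[i - 2:i + 3]
        out[i] = sum(c * x for c, x in zip(SG5, window)) / 35.0
    return out


# Hypothetical NDVI series with cloud-contaminated days marked as None.
raw = [0.21, None, 0.25, 0.31, None, None, 0.42, 0.44]
complete = savgol5(fill_gaps(raw))
```

A quadratic Savitzky–Golay filter leaves any locally quadratic trend unchanged, so it suppresses noise without flattening the seasonal growth curve as strongly as a plain moving average would.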
The Fengyun-3D (FY-3D) meteorological satellite is a second-generation Chinese polar-orbiting meteorological satellite whose aim is to provide satellite observation data for mesoscale numerical weather forecasting as well as for monitoring the ecological environment and large-scale natural disasters. The Medium Resolution Spectral Imager-II (MERSI-II) is one of the main payloads of the FY-3D satellite; it is equipped with 25 channels, including 16 visible near-infrared channels, 3 shortwave infrared channels, and 6 medium–long-wave infrared channels. Of the 25 channels, 6 have a 250-m spatial resolution and 19 have a 1000-m spatial resolution. Using the FY-3D data to obtain high-accuracy SM data, to reduce the operating costs of large-scale monitoring, and to improve the stability, security, and performance of monitoring systems are all important goals in China’s strategies to optimize the use of national satellites.
Previous studies have examined the relationship between surface temperatures and vegetation indexes at various spatial scales and temporal resolutions and have determined that there is an important negative correlation between the LST and NDVI at different scales (Price, 1990; Carlson et al., 1994). The combination of the VI and LST improves the surface SM inversion effect (Yao et al., 2004). Several previous studies have investigated the potential to obtain information for energy and water status based on the relationship between a remotely sensed LST and the VI (Carlson et al., 1995; Gillies et al., 1997; Goetz, 1997; Nemani and Running, 1997).
Based on the above analysis, we used the DBN to build a model to invert SM from FY-3D MERSI-II images, referred to as SM-DBN. To use the SM-DBN to extract high-quality SM data, we selected the LST and VI with a clear physical meaning and a strong correlation with SM as the input variables to retrieve SM. Considering the vegetation coverage differences in the different regions, we used both the NDVI and EVI. We first obtained LST, NDVI, and EVI from the FY-3D MERSI-II images and used them to obtain large scale SM coverage.
-
The Ningxia Hui Autonomous Region of China was selected as the study area. Ningxia is located in the transitional zone between the Loess Plateau and Inner Mongolia Plateau. It extends over 35°14'–39°14'N, 104°17'–109°39'E, with a total area of approximately 66,400 km2. The terrain is high in the south and low in the north and is characterized by runoff-eroded loess in the south as well as drought denudation and wind erosion in the middle and northern regions. The area experiences a typical continental climate, with semi-humid areas in the southernmost Liupanshan area, arid areas to the north of Weining Plain, and a semi-arid climate elsewhere. According to the climatic conditions, the distribution of agriculture and animal husbandry, the ecological environment, and traditional customs of the inhabitants, Ningxia is typically divided from north to south into the irrigation area of the Yellow River, an arid zone in the middle of the region, and a mountainous area in the south. Figure 1a shows the geographical location of the Ningxia Hui Autonomous Region on a Fengyun-3D image. The meteorological and topographic surface conditions of the Ningxia Hui Autonomous Region are representative of Northwest China, rendering it an appropriate study area.
-
Ningxia has an established ecological and agrometeorological observation network consisting of 37 in-situ SM observation stations and 36 in-situ temperature observation stations. Figure 1b shows the distribution of the in-situ SM and temperature observation stations.
The observation items of each SM observation station included air humidity and SM. An SM sensor was placed every 10 cm from the surface to a depth of 1 m. The observation items of each temperature observation station included the air temperature and LST. The observed hourly data of SM and temperature at all the in-situ observation stations were collected and transmitted to a server through a wireless network.
Based on Fig. 1b, these observation stations encompassed the entire Ningxia region. We collected data from all 37 in-situ SM observation stations and all 36 temperature observation stations from January 2018 to December 2019 to create datasets for comparison experiments.
-
We collected 863 cloudless or minimal cloud FY-3D MERSI-II images of the Ningxia region from the satellite receiving station at the Ningxia Meteorological Bureau of China. All the FY-3D MERSI-II images had a 12-h temporal resolution and a 250-m spatial resolution. All images were captured from 2018 to 2019.
The FY-3D MERSI-II data were preprocessed by using the image processing software developed by the National Satellite Meteorological Center, China. The pre-processing procedures included multi-channel calibration, geographic positioning, splicing, and projection conversion. After pre-processing, all images had a spatial resolution of 250 m and were stored in the Tiff format.
-
The Environment for Visualizing Images (ENVI) is a remote sensing image processing software that integrates numerous mainstream image processing tools, and can improve the efficiency of image processing and the value of image use. Specifically, ENVI can use the interactive data language (IDL) to develop image processing programs according to the user’s requirements, which can further improve the work efficiency.
We developed a program in IDL, referred to as the band value of point extractor (BVPE), to extract band values at the point coordinates of the in-situ observation stations.
Each sample in the temperature dataset consisted of a band value vector and LST observation value. The band value vector was used as the input for the temperature subnetwork of the SM-DBN while the temperature observation value was used as the expected output when training the temperature subnetwork. According to the data illustration from the FY-3D MERSI-II imagery, we extracted band values station by station from bands 1, 2, 3, 4, 24, and 25 by using the BVPE to form a band value vector. For each sample, according to the time at which the image was captured, we selected the LST observation value with an observation time nearest the captured time from the associated station.
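The nearest-in-time matching between an image and the hourly station record can be sketched as follows. The function name, the data layout (a list of time/value pairs per station), and the sample values are assumptions made for illustration only.

```python
from datetime import datetime
from typing import List, Tuple


def nearest_observation(capture_time: datetime,
                        observations: List[Tuple[datetime, float]]) -> Tuple[datetime, float]:
    """Return the (time, value) pair whose time is closest to the capture time."""
    if not observations:
        raise ValueError("station has no observations")
    return min(observations,
               key=lambda obs: abs((obs[0] - capture_time).total_seconds()))


# Hypothetical hourly LST record (degrees Celsius) for one station.
hourly_lst = [
    (datetime(2019, 5, 23, 5, 0), 14.2),
    (datetime(2019, 5, 23, 6, 0), 16.8),
    (datetime(2019, 5, 23, 7, 0), 19.5),
]
capture = datetime(2019, 5, 23, 6, 20)
matched_time, matched_lst = nearest_observation(capture, hourly_lst)
print(matched_time, matched_lst)
```

With hourly observations, the matched value is never more than 30 minutes away from the capture time, provided the station record has no gaps.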
Overall, there were 17,300 samples in the temperature dataset, which was stored by using a text format.
-
(1) We extracted the NDVI from the FY-3D MERSI-II images by using the methods reported in Ghulam et al. (2007) and extracted the EVI by using the methods reported in Wang et al. (2003). We used the Tiff format to save the extracted NDVI and EVI.
(2) We geo-corrected the extraction results.
(3) We extracted the NDVI by using the BVPE according to location and arranged the NDVI values of the same location in each year to form a time series. We used the same procedure to process the EVI.
(4) We used the reconstruction method proposed by Liu et al. (2018) to process the NDVI time series and EVI time series.
(5) Similar to the temperature dataset, each sample in the SM dataset consisted of an LST–NDVI–EVI vector and an SM observation value. The LST–NDVI–EVI vector was used as the input for the SM subnetwork of the SM-DBN, and the SM observation value was used as the expected output when training the subnetwork. The SM observation value used in our study represented land surface SM, and the measurement was relative SM (%). Each LST–NDVI–EVI vector consisted of LST, NDVI, and EVI. The LST was generated by the trained temperature subnetwork while the NDVI and EVI were selected from the reconstructed time series. For each sample, according to the time at which the image was captured, we selected the SM observation value with an observation time nearest the captured time from the associated station. Considering the delay associated with soil water impact on vegetation, we used the NDVI and EVI with a delay of 5 days (Zhang et al., 2016). Overall, there were 11,710 samples in the SM dataset.
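Step (5) can be sketched as follows. The sketch assumes that "a delay of 5 days" means the vegetation indices are read 5 days after the SM observation date, since vegetation responds to soil water with a lag; the text does not spell out the direction, so this is an interpretation. The function name and the dictionary-based time series are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List

VI_LAG_DAYS = 5  # delay taken from the text (Zhang et al., 2016)


def build_sm_input(obs_date: date, lst: float,
                   ndvi_series: Dict[date, float],
                   evi_series: Dict[date, float],
                   lag_days: int = VI_LAG_DAYS) -> List[float]:
    """Assemble [LST, NDVI, EVI], reading both VIs lag_days after obs_date."""
    vi_date = obs_date + timedelta(days=lag_days)
    if vi_date not in ndvi_series or vi_date not in evi_series:
        raise KeyError(f"no reconstructed VI value for {vi_date}")
    return [lst, ndvi_series[vi_date], evi_series[vi_date]]


# Hypothetical reconstructed daily series for one station.
start = date(2019, 5, 20)
ndvi_ts = {start + timedelta(days=d): 0.30 + 0.01 * d for d in range(10)}
evi_ts = {start + timedelta(days=d): 0.20 + 0.01 * d for d in range(10)}

vector = build_sm_input(date(2019, 5, 23), lst=21.4,
                        ndvi_series=ndvi_ts, evi_series=evi_ts)
```

Reading the indices from the reconstructed (gap-free) series is what makes a fixed lag practical: the lagged date always has a value even if no cloud-free image was acquired that day.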
-
Figure 2 shows the overall procedure for SM-DBN modeling, which consists of the following six parts: input, temperature subnetwork, NDVI extraction reconstruction, EVI extraction reconstruction, SM subnetwork, and the output. In the model training process, the FY-3D images and their corresponding observation data were used as inputs, while during the model testing process, only the FY-3D images were used as the input. The temperature subnetwork was used to extract LSTs from the FY-3D images. The extracted LST, together with the NDVI and EVI, were input into the SM subnetwork. Then, the SM data were generated from the SM subnetwork. More details on the two subnetworks are provided in the following sections.
-
The temperature subnetwork consisted of 11 RBM layers. The final RBM layer had only one output node. Except for the last layer, the output from each layer served as the input of the next layer. In the temperature subnetwork, the LST dataset described in Section 2.4 was used as the training dataset. Figure 3 shows the structure of each RBM, which was composed of a visible layer (the upper layer) and a hidden layer (the lower layer). There were no connections between elements in the same layer, whereas each element in one layer had a bidirectional connection to every element in the other layer.
In the RBM, let w_{i,j} be the weight indicating the connection strength between the i-th neuron in the visible layer and the j-th neuron in the hidden layer, and let b and c be the bias coefficients of the visible and hidden neurons, respectively. The energy function of the RBM can then be expressed as follows:
$$ E = - \sum_{i=1}^{n} b_i v_i - \sum_{j=1}^{m} c_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i w_{i,j} h_j, $$ (1)
where E is the energy, i and j are the indices of the nodes in the visible and hidden layers, respectively, n and m are the numbers of nodes in the visible and hidden layers, respectively, and v_i and h_j are the values of the i-th visible node and j-th hidden node, respectively.
The probability that the hidden layer neuron h_j is activated is as follows:
$$ P\left( h_j = 1 \mid v \right) = \sigma \left( c_j + \sum_{i=1}^{n} w_{i,j} v_i \right). $$ (2)
Because the connections are bidirectional, the neurons in the visible layer can also be activated by the neurons in the hidden layer:
$$ P\left( v_i = 1 \mid h \right) = \sigma \left( b_i + \sum_{j=1}^{m} w_{i,j} h_j \right), $$ (3)
where σ denotes the sigmoid function.
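The three expressions above can be checked numerically with a small sketch. It follows the convention stated alongside Eq. (1), with b for the visible biases and c for the hidden biases; the function names and the toy parameter values are assumptions for illustration.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def energy(v: List[float], h: List[float], w: List[List[float]],
           b: List[float], c: List[float]) -> float:
    """Eq. (1): E = -sum_i b_i v_i - sum_j c_j h_j - sum_ij v_i w_ij h_j."""
    visible_term = sum(bi * vi for bi, vi in zip(b, v))
    hidden_term = sum(cj * hj for cj, hj in zip(c, h))
    coupling = sum(v[i] * w[i][j] * h[j]
                   for i in range(len(v)) for j in range(len(h)))
    return -visible_term - hidden_term - coupling


def p_hidden(v: List[float], w: List[List[float]], c: List[float]) -> List[float]:
    """Eq. (2): activation probability of each hidden node given v."""
    return [sigmoid(c[j] + sum(w[i][j] * v[i] for i in range(len(v))))
            for j in range(len(c))]


def p_visible(h: List[float], w: List[List[float]], b: List[float]) -> List[float]:
    """Eq. (3): activation probability of each visible node given h."""
    return [sigmoid(b[i] + sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


# Toy RBM with two visible nodes and one hidden node (hypothetical values).
w = [[0.5], [-0.3]]
b = [0.1, 0.2]
c = [-0.4]
print(energy([1.0, 0.0], [1.0], w, b, c))
print(p_hidden([1.0, 0.0], w, c))
print(p_visible([1.0], w, b))
```

Configurations with lower energy are more probable under the model, which is why a positive weight between two active nodes lowers E.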
The RBM was trained by the contrastive divergence (CD) algorithm. First, the state of the hidden layer was obtained from the data in the visible layer. Then, the visible layer was reconstructed from the hidden layer. Subsequently, a new hidden layer vector was generated from the state of the visible layer and used as the input for the next RBM to train the next layer. In this way, the RBM layers were trained sequentially to extract deep information from the input features. Table 1 lists the training parameters of the temperature subnetwork. The main parameters of the RBM structure were the number of input nodes and the number of RBM layers. The number of nodes in each layer of the first half was larger than the number of input nodes, so that high-level features could be extracted from the input features. The number of nodes gradually decreased in the second half, which reduced redundant features and improved the fitting results.
Parameter | Value |
Momentum | 0.1 |
Structure | [n, n × 3, n × 5, n × 7, n × 9, n × 10, n × 8, n × 6, n × 4, n × 2, 1] |
cd_k | 1 |
RBM learning rate | e–4 |
RBM epoch | 600 |
BP learning rate | e–4 |
Dropout | 0.0005 |
Batch size | 50 |
Note: structure denotes the number of RBM layers and the number of neurons in each layer; n denotes the number of input parameters; cd_k denotes the number of sampling steps; learning rate denotes the parameter in the optimization algorithm that determines the step size at each iteration while moving toward the minimum of the loss function; RBM epoch denotes the number of training passes for each RBM layer; dropout denotes the probability that a neuron is dropped; batch size denotes the number of samples per training step.
Table 1. Training parameters for the temperature subnetwork
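A single contrastive divergence update with cd_k = 1 can be sketched as follows. This is a generic CD-1 step for one sample, not the authors' implementation: it assumes inputs scaled to [0, 1], uses probabilities rather than sampled states for the reconstruction (a common practical choice), and omits momentum, dropout, and mini-batching. All names are hypothetical.

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v: List[float], w: List[List[float]], c: List[float]) -> List[float]:
    return [sigmoid(c[j] + sum(w[i][j] * v[i] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h: List[float], w: List[List[float]], b: List[float]) -> List[float]:
    return [sigmoid(b[i] + sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0: List[float], w: List[List[float]], b: List[float],
               c: List[float], lr: float, rng: random.Random) -> List[float]:
    """One CD-1 step; updates w, b, c in place and returns P(h | v0)."""
    ph0 = hidden_probs(v0, w, c)                          # positive phase
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]  # Gibbs sample of h
    v1 = visible_probs(h0, w, b)                          # reconstruction
    ph1 = hidden_probs(v1, w, c)                          # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            w[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return ph0


# Toy RBM: 3 visible nodes, 2 hidden nodes, all parameters start at zero.
rng = random.Random(0)
w = [[0.0, 0.0] for _ in range(3)]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0]
features = cd1_update([1.0, 0.0, 1.0], w, b, c, lr=0.1, rng=rng)
```

In greedy layer-wise pre-training, the returned hidden probabilities would serve as the visible input of the next RBM once the current layer has been trained.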
-
The SM subnetwork consisted of 13 RBM layers, where each of the RBM layers was similar to the corresponding component in the temperature subnetwork. The SM dataset described in Section 2.5 was used as the input to the SM subnetwork. Table 2 lists the training parameters of the subnetwork.
Parameter | Value |
Momentum | 0.1 |
Structure | [n, n × 8, n × 14, n × 16, n × 17, n × 18, n × 12, n × 11, n × 10, n × 9, n × 6, n × 2, 1] |
cd_k | 1 |
RBM learning rate | e–4 |
RBM epoch | 200 |
BP learning rate | e–4 |
Dropout | 0.0005 |
Batch size | 50 |
Table 2. Training parameters for the SM subnetwork
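The "Structure" rows of Tables 1 and 2 can be expanded into concrete layer widths once n is fixed. The sketch below assumes n = 6 for the temperature subnetwork (the six extracted bands) and n = 3 for the SM subnetwork (the LST–NDVI–EVI vector); these values are inferred from the dataset descriptions rather than stated in the tables, and the names are hypothetical.

```python
from typing import List

# Multipliers copied from the "Structure" rows; the final single output node
# is appended separately.
TEMPERATURE_MULTIPLIERS = [1, 3, 5, 7, 9, 10, 8, 6, 4, 2]
SM_MULTIPLIERS = [1, 8, 14, 16, 17, 18, 12, 11, 10, 9, 6, 2]


def layer_sizes(n_inputs: int, multipliers: List[int]) -> List[int]:
    """Expand [n, n x k1, n x k2, ..., 1] into concrete node counts."""
    return [n_inputs * m for m in multipliers] + [1]


temperature_layers = layer_sizes(6, TEMPERATURE_MULTIPLIERS)  # assumed n = 6
sm_layers = layer_sizes(3, SM_MULTIPLIERS)                    # assumed n = 3
print(len(temperature_layers), temperature_layers)
print(len(sm_layers), sm_layers)
```

Both expansions widen over the first six layers and then narrow to a single output node, matching the 11-layer and 13-layer descriptions of the two subnetworks.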
-
The loss function for the temperature and SM subnetworks was defined as the mean squared error:
$$ {\rm{loss}} = \frac{1}{t_{\rm{s}}} \sum_{i=1}^{t_{\rm{s}}} |p_i - t_i|^2, $$ (4)
where t_s represents the total number of samples, p_i represents the value predicted by the model for the i-th sample, and t_i represents the corresponding observation value. When the temperature subnetwork was trained, t_i was the LST observation value, whereas when the SM subnetwork was trained, t_i was the SM observation value.
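Eq. (4) amounts to the following computation. The function name and the sample values are illustrative assumptions.

```python
from typing import Sequence


def mse_loss(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean of squared differences between predictions and observations."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, observed)) / len(predicted)


# Hypothetical relative SM values (as fractions) for two samples.
loss = mse_loss([0.2, 0.3], [0.1, 0.5])
print(loss)
```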
-
As previously mentioned, there were two training processes for the SM-DBN, i.e., one for the temperature subnetwork and the other for the SM subnetwork, and the procedures for training were identical as follows:
(1) Determine the hyper-parameters in the training process and initialize the weights and bias in the training subnetwork.
(2) Input the FY-3D images and corresponding ground observations to generate the LST and SM datasets. Notably, as the vegetation index was the main input to the SM-DBN, site data for non-vegetated areas could not be used. Furthermore, for areas covered by vegetation, observation data for non-vegetative growth seasons could not be used.
(3) Use an unsupervised method to train the two layers of each RBM: input all pre-processed sample data to the visible layer of the RBM, transmit the data to the hidden layer through the activation function, and train each RBM by using greedy layer-wise training. The hidden and visible states were sampled by using the Gibbs sampling method, and the weights and biases were updated by using the contrastive divergence algorithm to ensure that the feature mapping was optimal. As previously mentioned, the output from one hidden layer was used as the input for the visible layer of the next RBM. When all RBMs were trained following the above steps, the pre-training process was complete.
(4) Use the supervised learning method to train the last layer of the BP neural network in the SM-DBN. Compare the model results with the measured data and use the gradient descent method to back-propagate the error based on the preset learning rate for each layer to fine-tune the weights of the DBN.
-
The LR and BP neural network models are the most widely used models for SM inversion, and we constructed an SM LR model (SM-LR) and an SM BP neural network model (SM-BP) for comparison with the SM-DBN model. The structures of SM-LR and SM-BP are similar to that of SM-DBN, as both consist of a temperature subnetwork and an SM subnetwork.
In this study, a graphics workstation with a 12-GB Nvidia graphics card was used to conduct the comparison, with a Linux Ubuntu 16.04 operating system.
We used cross validation techniques in the comparison experiments. All samples of the LST and the SM datasets were used in the training procedure. As described in Section 2, the time period of the samples was January 2018 to December 2019.
First, we used the LST dataset to train the temperature subnetworks of the SM-DBN, SM-LR, and SM-BP. Each subnetwork was trained for five rounds. In each round, we selected 80% of the samples of the LST dataset as training samples and used the remaining 20% as test samples. When selecting the samples, we followed two principles. First, we ensured that each sample would be tested at least once. Second, because the temporal variation of SM is usually of interest, the test samples selected from the same site were required to be continuous in time.
Second, the trained temperature subnetwork was used to generate temperature. The generated temperatures were used in the SM dataset for SM-DBN, SM-LR, and SM-BP.
Third, the SM subnetworks of the SM-DBN, SM-LR, and SM-BP were trained with the same strategy on their respective SM datasets.
Table 3 lists the number of samples used in each round of the comparison experiments.
Type | Test sample | Training sample |
Temperature | 3460 | 13,840 |
SM | 2342 | 9368 |
Table 3. Number of samples used in each round of the comparison experiments
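One way to satisfy both sampling principles (every sample tested once, and test samples from a site contiguous in time) is to cut each station's time-ordered samples into five consecutive blocks and hold out one block per round. The sketch below illustrates this scheme; it is an assumption about how such a split could be built, not a description of the exact procedure used, and the names are hypothetical.

```python
from typing import Dict, List, Tuple


def contiguous_folds(samples_by_station: Dict[str, List[int]],
                     n_rounds: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return (train_ids, test_ids) per round.

    Each station's sample ids must already be sorted by time; in round k the
    k-th consecutive block of every station is held out for testing.
    """
    rounds = []
    for k in range(n_rounds):
        train: List[int] = []
        test: List[int] = []
        for ids in samples_by_station.values():
            start = k * len(ids) // n_rounds
            stop = (k + 1) * len(ids) // n_rounds
            test.extend(ids[start:stop])
            train.extend(ids[:start] + ids[stop:])
        rounds.append((train, test))
    return rounds


# Two hypothetical stations with ten time-ordered samples each.
stations = {"A": list(range(0, 10)), "B": list(range(10, 20))}
folds = contiguous_folds(stations)
for train_ids, test_ids in folds:
    print(len(train_ids), len(test_ids))
```

With five rounds, each round holds out one fifth of the data, which is consistent with the 80%/20% split and with the per-round counts in Table 3 (for example, 17,300 / 5 = 3460 temperature test samples).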
-
Figure 4 shows the SM distributions produced by the three models on five days: 6 March, 3 April, 23 May, 29 October, and 31 October 2019. In the SM-DBN results, the variation in SM was relatively smooth, which was more consistent with reality than the results of the other two models. After five experimental rounds, each model had produced 11,710 test samples of SM. Figure 5 shows the correlation between the measured data and the inversion results for each SM sample; the coefficient of determination (R2) and the root mean square error (RMSE) for the overall accuracy of the SM prediction based on each model are listed in Table 4. R2 values closer to 1 indicate a better fit, while smaller RMSE values indicate a more accurate model. The SM-DBN results exhibited the strongest correlation with the measured data and the highest accuracy, with R2 and RMSE values of 0.913 and 0.032, respectively. Thus, the SM-DBN clearly outperformed the two comparison models.
Figure 4. Distributions of SM over the Ningxia Hui Autonomous Region of China based on the (a) SM-DBN, (b) SM-LR, and (c) SM-BP models for five days: 6 March, 3 April, 23 May, 29 October, and 31 October 2019 (from the left to right panels sequentially).
Model | R2 | RMSE |
SM-DBN | 0.913 | 0.032 |
SM-LR | 0.638 | 0.101 |
SM-BP | 0.813 | 0.083 |
Table 4. Accuracy of the SM prediction from three models
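The two accuracy measures in Table 4 can be computed as sketched below. The RMSE definition is standard; for R2 the sketch assumes the common form 1 - SS_res / SS_tot, whereas some studies instead report the squared Pearson correlation, and the text does not say which was used. The function names and sample values are illustrative.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean square error between predictions and observations."""
    n = len(observed)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, observed)) / n)


def r_squared(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((t - p) ** 2 for p, t in zip(predicted, observed))
    ss_tot = sum((t - mean_obs) ** 2 for t in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(pred, obs), 3), round(rmse(pred, obs), 3))
```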
Figure 5. Correlation between the observed and predicted SM based on the (a) SM-DBN, (b) SM-LR, and (c) SM-BP models. The dashed line in each panel denotes the fitting line from all testing points.
Because the temporal variation of SM is usually of interest, we selected two typical stations, i.e., Tongxin station and Ligang station, as experimental stations and conducted an SM time series experiment. Tongxin station is located in the connecting zone between the Ordos platform and the northern Loess Plateau, in the core of the arid zone in the middle of Ningxia. Ligang station is located in the central area of the Ningxia Plain. Figure 6 presents the SM time series at Tongxin and Ligang. The prediction data used in Fig. 6 were generated by the trained SM-DBN, and all the data were independent of the training data. The trained SM-DBN model can also be used to generate time series data at other locations for which the required basic data are available. Figure 6 demonstrates that the time series obtained by the SM-DBN are highly consistent with the observed time series.
-
The LR, BP neural network, and DBN models are all machine learning methods and can be applied to establish a regional model to predict SM. All these models employ a hierarchical inversion in which the input features are directly related to the output data, ensuring that the machine learning algorithm has practical significance with respect to mining deep information from the input features. However, the three models differ considerably in accuracy; these differences are likely caused by their different learning abilities, which are discussed below.
-
In the temperature subnetwork, the number of RBM nodes of layers 2–6 increased; this structure has an advantage in mining the deep correlation among the input variables. The number of nodes in layers 7–11 gradually decreased, thereby reducing redundant information and facilitating better surface temperature retrievals. The aim of the SM subnetwork is to determine the relationships among the surface temperature, NDVI, and EVI, which are all closely related to SM. Similar to the temperature subnetwork, the number of nodes of layers 2–6 in the SM-subnetwork also increased. Therefore, we successfully mined the deep correlation among the input data, which then decreased layer by layer, leading to the removal of unnecessary information and improvements to the efficiency and accuracy of the inversion.
The SM-DBN model could initialize the weight for each connection by using the contrastive divergence (CD) algorithm and determine the opening or closing of the corresponding hidden elements, which was followed by the use of the least mean square error criterion to back propagate the errors. Using the gradient descent algorithm, the weight and bias of the entire network were continuously updated, so that we could obtain the correlation between the input variables and reflect the physical meaning. Although the number of layers in the network and number of nodes in each RBM were still subjective, the relationship between the extracted data was relatively stable, which was proven by the successful inversion of the multi-phase FY-3D data images by using the SM-DBN model.
-
The experimental results showed that the accuracy of the SM-DBN was the highest among the three tested models, for the following reasons. First, there was a complex, nonlinear correlation between the output data and input data, regardless of whether the output data were SM or surface temperature. Although the LR method had a self-learning ability, it was more suitable for linear cases and is unlikely to perform well for a complex nonlinear correlation. In contrast, the BP neural network and DBN models were significantly better at modeling these complex cases. Second, the BP neural network is a multi-layer feedforward neural network trained by an error BP algorithm, which directly performs BP after random initialization without pre-training. Consequently, its efficiency was high with a single hidden layer. However, with an increasing number of hidden layers, the efficiency of BP decreased, the training time increased, and the model tended to fall into a local optimum. In contrast, the DBN model pre-trains the RBMs layer by layer and then inputs the trained features into a single-layer BP neural network. This method overcomes the shortcomings of the traditional BP neural network, thus leading to superior inversion results.
-
To obtain high spatial–temporal resolution and high accuracy SM data at a large scale, we proposed an approach known as the SM-DBN model, which extracted relevant information from Chinese domestic FY-3D MERSI-II weather satellite images. Based on the characteristics of the MERSI-II images, a hierarchical inversion strategy was adopted for an SM inversion. In this method, a DBN was used to construct a complete network for the inversion of LST and SM data to directly extract SM. Data for experiments used to validate the new model and compare between the LR and BP models were collected from the Ningxia Hui Autonomous Region of China. Our results show that the SM-DBN model significantly outperformed the other two models in terms of their accuracies. These results are promising for ensuring the optimal use of data from the Chinese domestic satellites, meaning that the dependence of the meteorological research and applications in China on foreign data can be mitigated. This can reduce operating costs and improve the safety of the monitoring systems built on national data.
As the vegetation index is the main input for the SM-DBN, it cannot be used in non-vegetative growing seasons or in areas without vegetation coverage. Future studies will focus on a determination of more suitable supplementary parameters to address this issue.