1. Faculty of Information Science and Engineering, Ocean University of China, Qingdao 266100
2. Institute of Urban Meteorology, China Meteorological Administration, Beijing 100089
3. Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters (CIC-FEMD), Nanjing University of Information Science & Technology, Nanjing 210044
Supported by the National Key Research and Development Program of China (2022YFC3004103), Beijing Natural Science Foundation (8222051), China Meteorological Administration Key Innovation Team (CMA2022ZD04 and CMA2022ZD07), and Nanjing Joint Institute for Atmospheric Sciences Beijige Open Research Fund (BJG202407).
Thunderstorm gusts are a common and hazardous type of severe convective weather, characterized by a small spatial scale, short duration, and significant destructive power. They often lead to severe disasters, which makes their accurate forecasting critically important. Previous studies have explored the environmental factors and spatiotemporal distribution characteristics of thunderstorm gusts and have pointed to the need for improved forecasting methods. In recent years, artificial intelligence techniques have shown promise in enhancing the accuracy of thunderstorm gust forecasting, and various machine learning algorithms and models have been developed. This paper proposes a multiscale feature fusion module called Thunderstorm Gusts Block (TG-Block) and a deep learning model named Thunderstorm Gusts net (TG-net) based on the Attention U-net and TG-TransUnet models, and employs interpretability methods such as Integrated Gradient, Deep Learning Important FeaTures, and Shapley Additive exPlanations to validate the model’s practical relevance and reliability. The analysis of feature importance underscores the model’s ability to capture key thermodynamic and multiscale weather characteristic information for thunderstorm gust nowcasting. It is, however, worth emphasizing that these conclusions are based on only a limited number of thunderstorm gust examples, and the evaluation results may be affected by specific weather types and sample sizes. Nonetheless, TG-net has been put into real-time operation at the Institute of Urban Meteorology, and we will continue to rigorously validate its performance and make any necessary optimizations and enhancements based on feedback to ensure the robustness and stability of the model.
Thunderstorm gusts are a major hazardous weather phenomenon in severe convective weather, posing one of the most challenging targets for forecasters when issuing thunderstorm warnings. Thunderstorm gusts can have serious impacts on people’s lives, agricultural production, and infrastructure (Doswell III, 2003; Ashley and Mote, 2005; Ashley et al., 2007; Brown et al., 2023).
Previous studies have shown that thunderstorm gusts may be caused by various physical mechanisms and can occur in various atmospheric environments (Wakimoto, 2001), exhibiting significant regional and temporal variability (Brown et al., 2023). Although the accuracy of numerical weather forecasts has improved with the continuous development of high-resolution numerical models, forecasting extreme convective weather remains challenging. In particular, forecast skill for thunderstorm gusts is significantly lower than that for thunderstorms and short-term heavy rainfall (Tang et al., 2017; Zhang et al., 2020). Convective nowcasting methods for thunderstorm gusts based on radar and other products are limited by the sparsity of observational data and are mainly applicable at nowcasting lead times. Potential forecasting methods that rely on various meteorological data or numerical weather prediction data face the challenge of information overload, making it difficult for forecasters to subjectively extract key information.
In recent years, artificial intelligence methods have shown new application prospects in extracting effective spatiotemporal information from different scales of multisource heterogeneous observations and model data, as well as in short-term forecasting of thunderstorm gusts. Utilizing machine learning methods for predicting thunderstorm gusts can eliminate the subjectivity of manual forecasting and make effective use of latent information in radar data to achieve more accurate predictions. Recently, many researchers have explored the application of artificial intelligence in this field (Karpatne et al., 2019).
The NOAA and the University of Wisconsin’s Cooperative Institute for Meteorological Satellite Studies (CIMSS) have developed the NOAA/CIMSS ProbSevere model system for severe weather prediction based upon years of research in satellite, radar, lightning, numerical forecasting, and image science fields. This system combines dense local observational datasets with numerical model data. Its primary aim is to predict the probability of severe convective weather, including hail, high winds, and tornadoes, within 0–60 min for any given thunderstorm over the contiguous United States (Cintineo et al., 2014, 2018, 2020). Lagerquist et al. (2021) developed an hourly tornado prediction system using convolutional neural networks, and reported skill levels comparable to the ProbSevere system. Guastavino et al. (2022) combined convolutional neural networks with long short-term memory artificial neural network algorithms to achieve forecasting and warning of approaching thunderstorm weather using radar data. Xiao et al. (2023) proposed a deep learning (DL) method called CGsNet for quantitative forecasting of thunderstorm gusts within a lead time of 0–2 h. The forecasting results are superior to traditional numerical weather prediction methods such as the Integrated Nowcasting through Comprehensive Analysis (INCA; Haiden et al., 2011). Liu Y. Q. et al. (2024) proposed a convolutional neural network (CNN) and transformer-based DL method that can be used to forecast thunderstorm gusts with a lead time of 1–6 h in the Beijing–Tianjin–Hebei region based on multisource data such as radar, lightning, and automatic weather stations (AWSs), and the forecasting results are better than traditional methods.
While significant strides have been made in the prediction of severe convective weather through systems like NOAA/CIMSS ProbSevere, which leverage a combination of dense observational data and numerical models, there remains a need for more targeted and interpretable models. The existing algorithms, though effective in identifying severe weather, have not been specifically tailored to focus on thunderstorm gusts, and the high dimensionality of atmospheric data presents challenges for the predictive capabilities of DL models, which are often criticized for being black boxes. By employing techniques such as Garson’s algorithm (Ramseyer and Mote, 2016), Shapley Additive exPlanations, and guided backpropagation (Dikshit and Pradhan, 2021a, b; Silva et al., 2022), researchers are making progress in interpreting the inner workings of DL models and identifying the key features that influence their predictions. These studies not only enhance the transparency and credibility of the models but also empower decision-makers with a clearer understanding of the model’s predictions, thereby improving their ability to process information effectively (Gagne II et al., 2019; Liu S. J. et al., 2024).
In this study, we combined the thermodynamic and microphysical parameters from numerical weather forecasts with real-time observational data and high-resolution ensemble forecasts, integrating the advantages of high-frequency and novel multisource observations and predictions. Additionally, a multiscale feature fusion module named Thunderstorm Gusts Block (TG-Block) is proposed. This study concentrated on two main objectives. First, the structure of the Thunderstorm Gusts net (TG-net) network was developed and its benefits for forecasting thunderstorm gusts were examined. Second, the Integrated Gradient (IG; Sundararajan et al., 2017), Deep Learning Important FeaTures (DeepLIFT; Shrikumar et al., 2017), and Shapley Additive exPlanations (SHAP) techniques (Lundberg and Lee, 2017) were employed to validate the practical relevance of features extracted by the TG-net model. The feature importance analysis explains, to some extent, how effectively the model captures key thermodynamic and multiscale weather information for the nowcasting of thunderstorm gusts.
The rest of this paper is structured as follows: Section 2 provides an overview of the study area and dataset. Section 3 presents the DL models that we used for forecasting thunderstorm gusts. The outcomes of estimation, along with their interpretation and analysis, are detailed in Sections 4 and 5, respectively. Lastly, Section 6 offers discussion and conclusions.
2. Input data
2.1 Data sources
We used three types of input data. Radar data, as well as products from the Rapid-refresh Multi-scale Analysis and Prediction System-Short Time (RMAPS-ST) and the Rapid-refresh Integrated Seamless Ensemble system (RMAPS-RISE), were used to create predictors for thunderstorm gust events. AWS data, radar data, and lightning data were used to determine when and where thunderstorm gust events occurred. The input data used in this study are listed in Table 1 (Liu Y. Q. et al., 2024). We centered circles with a 20-km radius on the location of each lightning strike, and selected all the grid points within these circles that had a radar reflectivity factor exceeding 30 dBZ, designating them as “thunderstorm grid points.” Next, the AWSs with observed instantaneous wind speeds of 17.2 m s−1 or higher were identified and used as the centers of circles with a radius of 2 km, with all the grid points within these circles being designated as “gust grid points.” Finally, all grid points identifiable as both thunderstorm and gust grid points were selected. If a contiguous area contained 50 or more such grid points, all thunderstorm grid points within this region were classified as a “thunderstorm gusts area” (Liu Y. Q. et al., 2024).
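The labeling rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the grid spacing `dx_km`, the function names, and the reading of "contiguous area" as a 4-connected region of thunderstorm grid points are all hypothetical choices made for the example. Only the thresholds (20 km, 30 dBZ, 17.2 m s−1, 2 km, 50 points) come from the text.

```python
from collections import deque

def disk_mask(shape, centers, radius_km, dx_km):
    """Mark every grid point within radius_km of any (row, col) center."""
    rows, cols = shape
    r = radius_km / dx_km  # radius expressed in grid units
    mask = [[False] * cols for _ in range(rows)]
    for ci, cj in centers:
        for i in range(max(0, int(ci - r)), min(rows, int(ci + r) + 1)):
            for j in range(max(0, int(cj - r)), min(cols, int(cj + r) + 1)):
                if (i - ci) ** 2 + (j - cj) ** 2 <= r * r:
                    mask[i][j] = True
    return mask

def label_gust_area(refl, lightning, aws, dx_km=1.0, min_points=50):
    """Return a boolean grid marking the 'thunderstorm gusts area'.

    refl      : 2-D list of radar reflectivity (dBZ)
    lightning : list of (row, col) lightning locations
    aws       : list of (row, col, wind_speed) station observations
    dx_km     : grid spacing in km (assumed value, not given in the text)
    """
    rows, cols = len(refl), len(refl[0])
    near_lightning = disk_mask((rows, cols), lightning, 20.0, dx_km)
    # Thunderstorm grid points: near lightning AND reflectivity above 30 dBZ.
    storm = [[near_lightning[i][j] and refl[i][j] > 30.0 for j in range(cols)]
             for i in range(rows)]
    # Gust grid points: within 2 km of a station reporting >= 17.2 m/s.
    strong = [(i, j) for i, j, speed in aws if speed >= 17.2]
    gust = disk_mask((rows, cols), strong, 2.0, dx_km)

    label = [[False] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for si in range(rows):
        for sj in range(cols):
            if not storm[si][sj] or seen[si][sj]:
                continue
            # Breadth-first search over one connected thunderstorm region.
            region, queue = [], deque([(si, sj)])
            seen[si][sj] = True
            while queue:
                i, j = queue.popleft()
                region.append((i, j))
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if (0 <= ni < rows and 0 <= nj < cols
                            and storm[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            # Count points that are both thunderstorm and gust grid points.
            overlap = sum(1 for i, j in region if gust[i][j])
            if overlap >= min_points:
                for i, j in region:
                    label[i][j] = True
    return label
```

On a 1-km grid, a single 2-km circle covers only 13 grid points, so under this reading several nearby gust-reporting stations are needed before a region reaches the 50-point threshold.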
Table 1. Description of the input data used in this study
The input data used in the training process were gridded data, analogous to images composed of pixels. The original data were partitioned into multiple subgraphs of the same size, and the model was trained on these subgraphs. That is, each subgraph was fed into the model, and the corresponding output formed part of the final forecast, with the entire forecast being stitched together from multiple outputs. Following numerous experiments, the optimal subgraph size was determined to be 48 × 48. Once the model had been trained, it could be used to forecast thunderstorm gusts at future times. For example, when the processed RMAPS-ST, RMAPS-RISE, and radar data were assembled into a thunderstorm gusts dataset and fed into the model for 1-h forecasting, these data were divided into 775 subgraphs, which produced 775 forecast subresults; the final 1-h forecast was obtained by splicing all the subresults together. The process is illustrated in Fig. 1.
Fig. 1. Flowchart of thunderstorm gust forecasting based on DL approaches.
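The split-and-stitch step can be illustrated with a minimal sketch. It assumes non-overlapping 48 × 48 tiles taken in row-major order, with any remainder at the domain edge discarded; the text does not state this explicitly, but the assumption is consistent with the numbers reported: for the 1521 × 1221 grid given in Section 2.2, it yields 31 × 25 = 775 subgraphs. The function names are hypothetical.

```python
TILE = 48  # subgraph size reported in the text

def count_tiles(rows, cols, tile=TILE):
    """Number of whole, non-overlapping tiles that fit in the grid."""
    return (rows // tile) * (cols // tile)

def split_into_tiles(grid, tile=TILE):
    """Cut a 2-D list into tiles keyed by their upper-left (row, col)."""
    rows, cols = len(grid), len(grid[0])
    tiles = {}
    for r0 in range(0, rows - tile + 1, tile):
        for c0 in range(0, cols - tile + 1, tile):
            tiles[(r0, c0)] = [row[c0:c0 + tile] for row in grid[r0:r0 + tile]]
    return tiles

def stitch_tiles(tiles, shape, tile=TILE, fill=0):
    """Place per-tile outputs back at their original positions."""
    rows, cols = shape
    out = [[fill] * cols for _ in range(rows)]
    for (r0, c0), block in tiles.items():
        for di, row in enumerate(block):
            out[r0 + di][c0:c0 + tile] = row
    return out

print(count_tiles(1521, 1221))  # 775
```

In an operational setting the per-tile model output would replace each tile before stitching; here the tiles are passed through unchanged to show the bookkeeping only.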
2.2 Training and testing period
This study added 20 thunderstorm gust days from 2023 that had not been used previously. In total, 54 thunderstorm gust days from May to September of 2021–2023 were selected. These data were divided into three parts: a training set, a validation set, and a testing set. The testing set consisted of the data from four specific days with significant thunderstorm gust processes [20210731 (yyyymmdd): 0800–1700 UTC; 20220612: 1100–1900 UTC; 20220818: 1300–1900 UTC; 20230816: 0600–1600 UTC].
The thunderstorm gusts on 31 July 2021 were observed by a total of 221 AWSs recording winds of magnitude 8 or higher, and the maximum extreme wind speed within 1 h reached 37.4 m s−1, in Fengfeng Mining District, Handan City, Hebei Province. In terms of the weather pattern, this event was primarily influenced by an upper-level trough and a low-level shear line. The extreme wind process from 0800 to 1200 UTC in the area where Hebei and Henan meet was mainly caused by a supercell. An intense convective cell first developed in the southwestern part of Hebei Province at 0700 UTC, starting at a very small scale, and over time evolved into a supercell, with the center of the strong echo exceeding 70 dBZ. At 1000 UTC, the supercell split into two, which gradually moved towards the southwest and exited across the border of Hebei. A large-scale gale in northeastern Hebei, produced by a bow echo, lasted from 1200 UTC 31 July to 0200 UTC 1 August. A total of 32 AWSs recorded strong surface winds of 25 m s−1 and above. In this case, potential forecasts based on numerical models could predict the convective gusts 6 h in advance, but could not forecast the damaging severe winds. Likewise, proximity-based forecasts from observation systems could not issue early warnings for the extreme wind events.
On 12 June 2022, thunderstorm gusts were observed by 116 AWSs, recording wind speeds of magnitude 8 or higher. The highest wind speed recorded within 1 h was 25.7 m s−1, located at Gualanyu Station in Chengde City, Hebei Province. The process was mainly influenced by the Mongolian cold vortex occurring at that time, and this low-pressure vortex induced a northward shift of warm and moist air, ultimately leading to the formation of a shear line over Beijing, which had the potential to amplify the level of convective instability. The radar echo moved from northwest to southeast from Beijing, forming a belt-like distribution.
The thunderstorm gusts event on 18 August 2022 was influenced by an upper-level trough, causing the radar echoes to move from west to east and exhibit a scattered distribution. By 1300 UTC, the radar echo had moved to an area along the Bohai Gulf, evolving into a linear arrangement of multiple cells, corresponding to the time when thunderstorms and strong winds occurred in the Bohai Gulf area. The intensity and range of strong winds in this process were significantly weaker than the intensity of strong winds caused by the supercell on 31 July 2021. A total of 38 AWSs observed winds of magnitude 8 or higher, and the maximum extreme wind speed within 1 h was also recorded at Gualanyu Station, reaching 25.8 m s−1.
The thunderstorm gusts on 16 August 2023 were influenced by a low-level converging system (weak system with good energy conditions). The radar echoes for this process moved from northeast to southwest, while the other three processes moved from northwest to southeast. The radar echo morphology was primarily characterized by multiple cells distributed along a band, as well as locally triggered new strong cells. Wind speeds exceeding magnitude 8 were observed at 122 stations, with the maximum extreme wind speed within 1 h recorded at Chenjialiu in Xiong’an New Area reaching 28.8 m s−1.
The remaining 50 days of data were then randomly divided into a training set and a validation set in a ratio of 9:1 (i.e., 90% for training and 10% for validation). The specific times of occurrence of the added thunderstorm gust events in 2023 are shown in detail in Table 2. Finally, the original 1521 × 1221-sized grid point data were sequentially cropped into multiple 48 × 48-sized subgraphs to enrich the thunderstorm gusts dataset (Liu Y. Q. et al., 2024). The size of the enhanced thunderstorm gusts dataset is shown in Table 3.
Table 2. Occurrence times of the added thunderstorm gust events in 2023
Based on the established thunderstorm gusts dataset, this paper proposes a multiscale feature fusion module (TG-Block) that integrates sub-pixel convolution and the coordinate attention (CA) mechanism. Additionally, two new DL networks for thunderstorm gust forecasting, TG-Attention U-net and TG-net, were designed. Table 4 presents the characteristics and advantages of seven different DL models.
Table 4. Characteristics of the different DL models
Attention U-net adds a spatial attention mechanism to the feature splicing part of U-net. It remains essentially a CNN, built from convolution, pooling, and activation function modules. The spatial attention mechanism mimics the way humans process information: it weights the input by suppressing its unimportant parts, thereby highlighting the features in specific areas and extracting the key information in the data (Oktay et al., 2018). The gate signal of Attention U-net is obtained from the feature map after deep sampling, while the features to be spliced come from the sampled feature map of the preceding layer; both pass through a convolution operation that changes the number of channels before being merged. The merged result sequentially passes through ReLU (rectified linear unit) activation, convolution, and Sigmoid activation modules, ultimately yielding the attention weights. These weights are multiplied by the features to be spliced, giving the final concatenated features after attention processing.
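The sequence of operations in the attention gate (channel-changing convolutions, merge, ReLU, convolution, Sigmoid, multiplication) can be sketched for a single pixel. The sketch below assumes 1 × 1 convolutions, so that each convolution reduces to a matrix-vector product at that pixel, and assumes the gate signal has already been resampled to the resolution of the features to be spliced. The weight names `w_x`, `w_g`, and `psi` are hypothetical, and the merge is taken to be an element-wise sum as in Oktay et al. (2018).

```python
import math

def matvec(weights, vector):
    """Apply a 1x1 convolution at one pixel (a matrix-vector product)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def attention_gate(x, g, w_x, w_g, psi, bias=0.0):
    """Additive attention gate at a single pixel.

    x   : feature vector to be spliced (skip connection)
    g   : gate-signal vector from the deeper layer
    w_x : weights mapping x to the intermediate channels
    w_g : weights mapping g to the intermediate channels
    psi : weights mapping the intermediate channels to one score
    """
    merged = [a + b for a, b in zip(matvec(w_x, x), matvec(w_g, g))]
    hidden = [max(0.0, m) for m in merged]                   # ReLU
    score = sum(p * h for p, h in zip(psi, hidden)) + bias   # convolution
    alpha = 1.0 / (1.0 + math.exp(-score))                   # Sigmoid
    return alpha, [alpha * xi for xi in x]                   # reweighted x

alpha, gated = attention_gate(
    x=[1.0, 2.0], g=[0.5],
    w_x=[[1.0, 0.0], [0.0, 1.0]], w_g=[[1.0], [-3.0]], psi=[1.0, 1.0],
)
```

Because the attention weight lies between 0 and 1, the gate can only attenuate the spliced features; pixels the gate signal deems unimportant are scaled towards zero.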
Therefore, in the final design of the thunderstorm gust DL forecasting network, a multiscale feature fusion module named TG-Block is proposed for the up-sampling recovery of thunderstorm gust features at different levels. This module combines sub-pixel convolution with the CA mechanism (Hou et al., 2021; Liu Y. Q. et al., 2024) to address the information loss and insufficient resolution of the traditional CNN up-sampling process, and thus to further improve the model’s feature recovery ability. The TG-Block flowchart is shown in Fig. 2. To correspond to the name of the underlying network, the DL thunderstorm gust forecasting model proposed here is named “Thunderstorm Gusts Attention U-net” (TG-Attention U-net). Its network structure is shown in Fig. 3.
Fig. 3. Architecture of TG-Attention U-net (encoder–decoder on the left; TG-Block and CA on the right).
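Two building blocks of TG-Block can be illustrated in isolation. The first function below is the rearrangement at the heart of sub-pixel convolution: a tensor with C·r² channels is reshuffled into C channels at r times the resolution, so the up-sampled values are learned by the preceding convolution rather than interpolated. The second shows the two directional average pools with which CA encodes positional information. Both are generic sketches with hypothetical function names; how TG-Block orders and connects these operations is not specified here, and the remaining CA steps (shared convolution and Sigmoid gates) are omitted.

```python
def pixel_shuffle(x, r):
    """Rearrange [C*r*r][H][W] nested lists into [C][H*r][W*r]."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]  # channel feeding offset (i, j)
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out

def coordinate_pool(channel):
    """Average one channel along the width and along the height."""
    h, w = len(channel), len(channel[0])
    pool_h = [sum(row) / w for row in channel]  # one value per row
    pool_w = [sum(channel[i][j] for i in range(h)) / h for j in range(w)]
    return pool_h, pool_w

up = pixel_shuffle([[[1.0]], [[2.0]], [[3.0]], [[4.0]]], r=2)
print(up)  # [[[1.0, 2.0], [3.0, 4.0]]]
```

Unlike global average pooling, the two directional pools keep one spatial axis each, which is what allows the resulting attention weights to depend on location.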
3.2 TG-net
U-net and Attention U-net cannot achieve good results without feature splicing, and multidimensional feature splicing at different scales can integrate the feature maps after shallow sampling and deep sampling. Shallow sampling can extract the lower-level features of the input data to quickly obtain some common information, while deep sampling will extract complex features from the output of the previous layers to learn a more abstract and advanced representation of the input data. Nevertheless, U-net, Attention U-net, and TG-Attention U-net undergo only four down-sampling stages, limiting their capacity to extract intricate features. This is somewhat insufficient for extracting features from nonlinear meteorological data and conducting in-depth studies of the local bursts and rapid evolution of severe convective weather over the complex terrain of the Beijing–Tianjin–Hebei region.
Since transformers excel in processing global information and CNNs are better at processing details (Si et al., 2022), integrating transformers with CNNs can capitalize on their individual strengths while mitigating their respective limitations, thereby yielding a more effective and robust model. Therefore, in this paper, we propose a DL thunderstorm gust forecasting model, named “Thunderstorm Gusts-net” (TG-net), based on the foundation of TG-TransUnet (Liu Y. Q. et al., 2024). The encoder of TG-net consists of two parts; namely, a CNN (ResNet-50; He et al., 2016) and transformer (ViT; Dosovitskiy et al., 2020). The decoder still retains the traditional up-sampling mechanism and cascades the downstream information through feature splicing to minimize information loss. In addition, Strudel et al. (2021) pointed out that different size patches in the transformer will also have an impact on the final effect of the network. In this study, the input data size is 48 × 48, so the size of the patch is relatively small, which is also favorable for the processing of the edge details.
Unlike TG-TransUnet, TG-net incorporates the CA module in the feature splicing part. This module plays a pivotal role in feature fusion, allowing for seamless integration of information from different sources. Between the encoder of the CNN and the transformer, the incorporation of the CA module ensures that the feature maps undergo preliminary selection before entering the transformer module. This pre-filtering step optimizes the input for the subsequent stages of processing, enhancing the model’s ability to extract meaningful insights from the data. The network structure diagram of TG-net is shown in Fig. 4.
Due to the extremely unbalanced proportions of positive and negative samples in thunderstorm gusts, the loss function is defined as a combination of Dice loss (Milletari et al., 2016) and Focal loss (Lin et al., 2020), which are commonly used for tasks with imbalanced proportions of positive and negative samples. Training the model by simply adding Dice loss and Focal loss may result in bias towards one loss function. Common methods used to balance multiple losses include adding fixed weights and grid search techniques. However, experimental findings indicate that simply adding fixed weights does not give good results. Therefore, an uncertainty loss (Liebel and Körner, 2018) is used to balance Dice loss and Focal loss to optimize the model’s performance. The loss function in this paper is defined as follows:
where σ1 and σ2 are the dynamic weights of the Dice loss and the Focal loss, respectively, which are continuously updated with the training of the thunderstorm gusts DL forecasting model to reach an optimal solution. Dice loss and Focal loss can be expressed respectively as:
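$$\mathrm{loss}_{\mathrm{Dice}} = 1 - \frac{2\sum_{i=1}^{N} y_i y'_i + \varepsilon}{\sum_{i=1}^{N} \left(y_i + y'_i\right) + \varepsilon}, \tag{2}$$

$$\mathrm{loss}_{\mathrm{Focal}} = -\alpha \left(1 - y_i\right)^{\gamma} \log\left(y_i\right) - \left(1 - \alpha\right) y_i^{\gamma} \log\left(1 - y_i\right). \tag{3}$$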
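A minimal sketch of how the two losses and their dynamic weighting might be computed is given below. Several points are assumptions rather than statements of the method used here: the labels are taken to be binary with the model output a probability, the two Focal terms are applied to positive and negative pixels respectively and averaged over pixels (following Lin et al.), and the combination uses the form proposed by Liebel and Körner (2018), in which each loss is scaled by 1/(2σ²) and regularized by log(1 + σ²). The function names are hypothetical, and in practice σ1 and σ2 would be trainable parameters of the network.

```python
import math

def dice_loss(y_true, y_pred, eps=1e-6):
    """Dice loss over flattened pixels, matching the form of Eq. (2)."""
    inter = sum(t * p for t, p in zip(y_true, y_pred))
    total = sum(t + p for t, p in zip(y_true, y_pred))
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss, averaged over pixels (averaging is an assumption)."""
    loss = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep the logarithms finite
        if t == 1:
            loss += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            loss += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return loss / len(y_true)

def uncertainty_weighted(loss_dice, loss_focal, sigma1, sigma2):
    """Assumed combination, after Liebel and Korner (2018)."""
    return (loss_dice / (2.0 * sigma1 ** 2) + math.log(1.0 + sigma1 ** 2)
            + loss_focal / (2.0 * sigma2 ** 2) + math.log(1.0 + sigma2 ** 2))

y_true = [1, 0, 0, 0, 0, 0, 0, 1]
y_pred = [0.8, 0.1, 0.2, 0.05, 0.1, 0.3, 0.1, 0.4]
total = uncertainty_weighted(dice_loss(y_true, y_pred),
                             focal_loss(y_true, y_pred), 1.0, 1.0)
```

With γ = 2, a confidently correct pixel contributes far less to the Focal loss than an uncertain one, which keeps the abundant negative pixels from dominating training.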
After several experiments and comparisons, γ was set to 2 and α to 0.25. In addition, the batch size was set to 256, the number of epochs to 50, and the initial learning rate to 0.01. During training, if the skill scores in the testing set do not decline in two consecutive rounds of scoring, the learning rate is halved.
4. Experiments and analyses
4.1 Evaluation metrics
To evaluate the performance of the forecasting, the following metrics are used: the probability of detection (POD), false alarm ratio (FAR), critical success index (CSI), equitable threat score (ETS), and Heidke skill score (HSS) (Schaefer, 1990; Mesinger, 2008; Hyvärinen, 2014).
In addition, performance diagrams are often used to assess forecasts of thunderstorm gust events, as shown in Fig. 5. A performance diagram summarizes the relationships among the categorical statistics, with the horizontal axis representing the success ratio (1 − FAR) and the vertical axis representing the POD. Different background colors represent different CSI values, while lines in different shades of gray radiating from the origin represent different frequency biases. The 45° diagonal line indicates an unbiased forecast (frequency bias of 1).
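For reference, the metrics above follow from the 2 × 2 contingency table of hits, misses, false alarms, and correct negatives. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical, and the case of an empty denominator (for example, no observed events) is not handled.

```python
def skill_scores(hits, misses, false_alarms, correct_negatives):
    """Standard categorical scores from a 2x2 contingency table."""
    h, m, f, c = hits, misses, false_alarms, correct_negatives
    n = h + m + f + c
    hits_random = (h + m) * (h + f) / n  # hits expected by chance
    return {
        "POD": h / (h + m),
        "FAR": f / (h + f),
        "CSI": h / (h + m + f),
        "BIAS": (h + f) / (h + m),
        "ETS": (h - hits_random) / (h + m + f - hits_random),
        "HSS": 2.0 * (h * c - m * f)
               / ((h + m) * (m + c) + (h + f) * (f + c)),
    }

scores = skill_scores(hits=30, misses=20, false_alarms=10,
                      correct_negatives=940)
success_ratio = 1.0 - scores["FAR"]
# On a performance diagram, CSI is fixed by the two axes:
csi_from_axes = 1.0 / (1.0 / success_ratio + 1.0 / scores["POD"] - 1.0)
```

The last line shows why the CSI contours can be drawn as a background field on the diagram: CSI is a function of the success ratio and the POD alone, and the frequency bias is simply their ratio (POD divided by the success ratio).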
For the 4-day testing set selected for this study, the forecast verification results are compared in Table 5 (the best results are formatted in bold). The forecasting of thunderstorm gusts is more challenging than that of other severe convective weather. RISEgust, a traditional forecasting method based on the gust factor, serves as the baseline for comparison in this study. The results show that the CSI, ETS, and HSS of the RISEgust method are the lowest, and that the remaining DL models improve considerably on the traditional method at lead times of 1–3 h. The improvement percentages of CSI, ETS, and HSS and the corresponding performance diagrams can be seen in Figs. 6 and 7, respectively. The CSI values of TG-Attention U-net for the forecasts 1 to 3 h in advance are 0.353, 0.253, and 0.220, respectively. Meanwhile, the CSI values of TG-net are 0.366, 0.272, and 0.241, respectively, which are 186%, 178%, and 194% higher than those of RISEgust; 4%, 8%, and 10% higher than those of TG-Attention U-net; and 1%, 5%, and 9% higher than those of TG-TransUnet. In addition, the ETS and HSS of TG-net are much higher than those of RISEgust, and slightly higher than those of TG-Attention U-net and TG-TransUnet. It can also be seen from the performance diagrams that the best results are obtained with TG-net, whose position is shifted towards the upper right and lies closer to the 45° diagonal than those of the other models.
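The improvement percentages quoted above are relative differences with respect to the reference model. As a check, the following short sketch (the function name is hypothetical) reproduces the comparison between TG-net and TG-Attention U-net from the CSI values given in the text.

```python
def improvement_percent(score, reference):
    """Relative improvement of score over reference, in percent."""
    return 100.0 * (score - reference) / reference

tg_net = [0.366, 0.272, 0.241]             # CSI at 1-, 2-, 3-h lead times
tg_attention_unet = [0.353, 0.253, 0.220]  # CSI at 1-, 2-, 3-h lead times

gains = [round(improvement_percent(a, b))
         for a, b in zip(tg_net, tg_attention_unet)]
print(gains)  # [4, 8, 10]
```

The relative gain grows with lead time even though the absolute CSI difference stays near 0.01 to 0.02, because the reference score itself falls as the lead time increases.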
Table 5. Skill scores of RISEgust and various DL algorithms in forecasting thunderstorm gusts with a lead time from 1 to 3 h. The bolded entries correspond to the highest results among the eight methods.
Fig. 6. CSI, ETS, and HSS variations in forecasting thunderstorm gusts with a lead time of 1–3 h using various methods, as well as the improvement percentages of DL models compared to RISEgust (with the horizontal axis representing forecast lead time and the vertical axis representing evaluation metric scores and improvement percentages).
Fig. 7. Performance diagrams for forecasting thunderstorm gusts with lead times of (a) 1 h, (b) 2 h, and (c) 3 h.
Overall, the Attention U-net model outperforms the standard U-net model in all forecasts from 1 to 3 h in advance. This suggests that feature splicing with the attention mechanism recovers feature information more effectively. The attention mechanism plays a crucial role in automatically extracting and processing key data, thereby enhancing the model’s generalization and forecasting capabilities. In addition, the CSI, ETS, and HSS of TG-net are the highest at all lead times. Its success is largely attributable to the combination of TG-Attention U-net and TG-TransUnet, which makes it better suited to forecasting thunderstorm gusts in the Beijing–Tianjin–Hebei region. Since the location of thunderstorm gusts is the focus of this paper and different types of meteorological data may have an impact on the experimental results, this paper introduces a method of embedding the location information into the channel attention mechanism (i.e., CA) and combines it with sub-pixel convolution to propose the TG-Block module, applicable to thunderstorm gust forecasting. The CSI, ETS, and HSS results for TG-TransUnet and TG-Attention U-net are marginally superior to those of their counterparts, TransU-net and Attention U-net, in the 1–3-h lead time forecasts. This slight advantage can primarily be attributed to the increasing challenge of feature extraction with extended forecast duration, and the beneficial influence of TG-Block on the forecast performance. Since the testing set comprised four thunderstorm gust days [20210731 (yyyymmdd): 0800–1700 UTC; 20220612: 1100–1900 UTC; 20220818: 1300–1900 UTC; 20230816: 0600–1600 UTC], the testing results are based on these days only. Some researchers may question the small number of thunderstorm gust days used for testing. However, thunderstorm gusts are a relatively uncommon form of severe convective weather, for which accumulating a large number of events can take a long time, and the results of this study, despite the small sample size, can still provide an important reference for future planning.
4.3 Case studies: 1700 UTC 12 June 2022
The 1-h lead time forecast results for thunderstorm gusts at 1700 UTC 12 June 2022 are presented in Table 6 and illustrated in Fig. 8. RISEgust forecasts winds of magnitude 8 or higher in parts of northeastern Hebei and southern–central Tianjin, and the actual thunderstorm gusts also fall within this region, so RISEgust demonstrates a certain forecasting capability. However, its forecast range is clearly smaller than the ground truth, its POD is low, and its CSI is only 0.082; its overall performance is much worse than that of the DL models. The seven DL models forecast the occurrence of thunderstorm gusts more accurately. However, it is worth noting that these methods mostly fail to forecast thunderstorm gusts in the southwest Tianjin area, exhibiting some degree of underestimation. TG-net outperforms U-net, CU-net, Attention U-net, TG-Attention U-net, and TransU-net, but its performance is similar to that of TG-TransUnet.
Table 6. Comparison of skill scores for forecasting thunderstorm gusts by RISEgust and multiple DL methods at a lead time of 1 h at 1700 UTC 12 June 2022. The bolded entries correspond to the highest results among the eight methods.
Fig. 8. Thunderstorm gust forecast results at a lead time of 1 h at 1700 UTC 12 June 2022: (a) ground truth of thunderstorm gusts; (b) RISEgust (color shading represents wind speed; m s−1); (c) U-net; (d) CU-net; (e) Attention U-net; (f) TG-Attention U-net; (g) TransU-net; (h) TG-TransUnet; and (i) TG-net.
The results of the 2-h lead time forecast for thunderstorm gusts at 1700 UTC 12 June 2022 are shown in Table 7 and Fig. 9. The forecasts of all the models are worse than those at the 1-h lead time. RISEgust predicts strong winds of magnitude 8 or above in northeastern Hebei and central–southern Tianjin. Compared to the ground truth, RISEgust misses a large part of the gust area, with a CSI of only 0.017. Although the seven DL models predict the actual extent of thunderstorm gusts more accurately, they evidently miss some gusts in the southern region of Tangshan. In addition, CU-net shows obvious overprediction, mainly concentrated in the Beijing area and some parts of northeastern Hebei. Compared with the other models, TG-net has the highest POD, CSI, ETS, and HSS, which are 0.295, 0.259, 0.240, and 0.388, respectively.
Table 7. As in Table 6, but for a lead time of 2 h
The results of the 3-h lead time forecast for thunderstorm gusts at 1700 UTC 12 June 2022 are shown in Table 8 and Fig. 10. The forecasting performance of RISEgust is very poor, with a CSI of only 0.022. Among the seven DL models, U-net performs the worst, with a CSI of only 0.068. In addition, Attention U-net is also relatively ineffective, with a CSI not exceeding 0.1. TransU-net, TG-TransUnet, and TG-net all produce false alarms in some areas of southern Hebei, but overall they accurately forecast the areas of thunderstorm gusts in the eastern and northern parts of the Beijing–Tianjin–Hebei region. CU-net, Attention U-net, and TG-Attention U-net have serious misses and relatively small forecast ranges, while TG-net has the highest CSI (0.207), ETS (0.173), and HSS (0.296), and the best overall forecast performance. The specific changes in CSI, ETS, HSS, and performance diagrams can be seen in Figs. 11 and 12. Unlike the conventional method RISEgust, all the DL models exhibit a gradual decrease in the various evaluation metrics as the forecast lead time increases.
Table 8. As in Table 6, but for a lead time of 3 h
Fig. 11. CSI, ETS, and HSS variations for forecasting thunderstorm gusts with a lead time of 1–3 h using eight different methods at 1700 UTC 12 June 2022. The abscissa denotes the lead time of forecasts, while the ordinate represents the scores of the evaluation metrics.
Fig. 12. Performance diagrams for forecasting thunderstorm gusts with a lead time of (a) 1 h, (b) 2 h, and (c) 3 h at 1700 UTC 12 June 2022.
5. Interpretability analysis
While DL models like TG-net have shown considerable improvements in predicting thunderstorm gusts, they are still regarded as black-box models with limited interpretability. This implies that while these models generate outputs from specific input data, the underlying mechanisms are not fully transparent, which may influence feature selection and model optimization efforts. In weather forecasting, particularly for severe convective events like thunderstorm gusts, a stable and interpretable model is essential. Currently, the most commonly used interpretability methods are primarily post-hoc approaches (Ji et al., 2019). These approaches involve understanding the predictive outcomes of models after training is completed, typically through various methods or techniques, including assessing the contributions of different factors to the model’s predictions. Because thunderstorm gusts occur and dissipate rapidly, with a small impact range and strong local characteristics, this section primarily discusses the contribution of multiple meteorological elements to the forecast results of thunderstorm gusts. Specifically, it focuses on prioritizing the importance of the inputs of multiple meteorological elements.
The most commonly used interpretability methods include IG, DeepLIFT, and SHAP. Among them, IG is a prevalent post-hoc interpretability approach. In traditional gradient calculations, a common issue is reaching a “saturation zone” where gradients vanish, yielding no useful information. IG addresses this by establishing a reference baseline and integrating the gradient information. For each input of multisource meteorological data, IG computes the gradients at points along the path from the baseline to the actual input, and then integrates these gradients for each individual feature. DeepLIFT is also a post-hoc interpretability method and is often used to analyze feature importance. It is based on the chain rule and the backpropagation algorithm. By comparing the input data with a reference baseline, it attributes the resulting difference in output to each input feature, thereby obtaining the contribution of each input feature, namely, its feature importance. Unlike IG, DeepLIFT uses discrete gradients, which also avoids a zero gradient in the saturation zone. It is worth noting that DeepLIFT may yield negative values, which indicate that an input feature has an inhibitory effect on the model’s decision; this is nevertheless a reflection of the importance of the feature. Therefore, such “negative contributions” need special treatment.
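The integration step in IG can be illustrated with a small numerical sketch. It assumes a toy logistic model with hypothetical weights (not TG-net), an all-zero baseline, and a midpoint Riemann sum in place of the exact path integral; all function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy logistic model standing in for a trained network."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to its inputs."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate IG by averaging gradients along the straight-line
    path from the baseline to the input (midpoint Riemann sum)."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += grad[i] / steps
    # Scale the averaged gradient by the input-minus-baseline difference.
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]


# Hypothetical weights and a normalized input, for illustration only.
w, b = [2.0, -1.0, 0.5], 0.1
x, baseline = [0.9, 0.4, 0.7], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
print(attributions)
print(sum(attributions), model(x, w, b) - model(baseline, w, b))
```

The two numbers on the last line should nearly coincide, because IG attributions sum to the difference between the model output at the input and at the baseline. Note also that the second attribution is negative, which is the kind of “negative contribution” discussed above.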
SHAP is based on the Shapley value from game theory and calculates, for each input sample, the contribution of each feature; the results reflect the importance of the different features. Like DeepLIFT, SHAP may assign a negative value to a feature, signifying that the feature suppresses the model’s decision and has an inhibitory effect in its interactions with other features. However, SHAP considers all feature subsets and therefore has a high time complexity. To address this, GradientShap uses stochastic sampling to calculate the expected gradient over different baselines, thereby approximating the integrated gradients and, in turn, the Shapley values. The time complexity of GradientShap is much lower, which makes it more suitable for practical applications. In this paper, the baseline is a vector of all zeros, which corresponds to a pure black image in computer vision.
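To make the cost of considering all feature subsets concrete, the sketch below computes exact Shapley values by brute-force enumeration for a hypothetical three-feature toy function with a zero baseline. It is an illustration of the game-theoretic definition only, not the SHAP or GradientShap implementation used in the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every subset of the other
    features; features outside the subset are set to the baseline."""
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy(v):
    """Hypothetical toy function with an interaction between features 0 and 1."""
    return v[0] * v[1] + v[2]


print(shapley_values(toy, [2.0, 3.0, 1.0], [0.0, 0.0, 0.0]))
```

For this toy function the interaction term is shared equally between the first two features, giving values close to 3, 3, and 1, which sum to the difference between the output at the input and at the baseline. The number of model evaluations grows as 2^n with the number of features n, which is why sampling-based approximations such as GradientShap are preferred in practice.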
Because the thunderstorm gust dataset used in this study was augmented, the feature importance determined here applies to all augmented samples. To mitigate the impact of extreme values in meteorological variables (such as pressure), all meteorological data underwent Min–Max normalization to the [0, 1] range, the models were retrained, and the model with the best results was selected for analyzing the importance of the different meteorological variables. Furthermore, negative values (such as the “negative contribution” of DeepLIFT) may arise when calculating feature importance. Thus, the feature importance calculated for each individual sample is converted to its absolute value and then normalized uniformly. The feature importance rankings for predicting thunderstorm gusts with a lead time of 1 h using TG-net are presented in Table 9 and Fig. 13, with importance decreasing from top to bottom. RADAR, UVana, and PRES are consistently identified as the most important features for TG-net by all three interpretability methods within the 1-h forecast lead time. Remarkably, for DeepLIFT, the cumulative importance of these three features exceeds 50%.
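The two preprocessing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the aggregation rule (sum the absolute per-sample attributions, then divide by the grand total so that the scores sum to 1) is one plausible reading of "normalized uniformly", and the attribution values are hypothetical rather than taken from Table 9.

```python
def min_max_normalize(values):
    """Scale a sequence of values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant field: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


def normalized_importance(attributions_per_sample):
    """Aggregate signed per-sample attributions into importance scores.

    Absolute values are summed over samples so that inhibitory (negative)
    contributions still count toward importance, then divided by the
    grand total so that the scores sum to 1.
    """
    totals = {}
    for sample in attributions_per_sample:
        for name, attribution in sample.items():
            totals[name] = totals.get(name, 0.0) + abs(attribution)
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {name: total / grand_total for name, total in ranked}


# Hypothetical pressure values (hPa) and attributions, for illustration only.
print(min_max_normalize([1000.0, 1010.0, 1020.0]))  # [0.0, 0.5, 1.0]
samples = [
    {"RADAR": 0.5, "UVana": -0.25, "PRES": 0.125},
    {"RADAR": -0.5, "UVana": 0.5, "PRES": 0.125},
]
print(normalized_importance(samples))
```

In this example the signed RADAR attributions cancel exactly, so a plain average would rank RADAR last, whereas the absolute-value aggregation ranks it first. This is the motivation for taking absolute values before normalizing.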
Table 9. Feature importance distribution for TG-net’s 1-h lead time forecasting of thunderstorm gusts and the ratio of CSI reduction with removed features
Fig. 13. Feature importance ranking for thunderstorm gust forecasting using TG-net at a lead time of 1 h, where the horizontal axis represents the feature importance score and the vertical axis represents the feature names.
The feature removal experiment (Table 9) also indicates that removing these three variables significantly affects model performance, confirming their importance within the 1-h forecast lead time. RADAR represents the composite reflectivity factor from radar observations at the current moment. Thunderstorm gusts typically accompany intense precipitation, which corresponds to high composite reflectivity factors. Therefore, RADAR can be viewed as a manifestation of precipitation intensity and its spatial distribution. Areas of intense precipitation often coincide with strong updrafts and downdrafts, which may lead to severe thunderstorm gusts. Thus, the current composite reflectivity factor can help identify regions experiencing heavy precipitation, aiding the prediction of potential thunderstorm gusts. Additionally, the composite reflectivity factor can reflect the vertical development trend of individual thunderstorm cells. In convective storms, the thermal effect caused by precipitation particles descending from higher altitudes can induce downdrafts, thereby influencing thunderstorm gusts (Roberts and Wilson, 1989). UVana represents the analysis field of gust wind speed at the current moment. Strong winds are typically caused by abrupt changes in the wind field. The current wind speed provides temporal and spatial indications for forecasting wind speed changes in the next hour, reflecting the development and trends of future wind fields. PRES represents atmospheric pressure. Under specific conditions, horizontal pressure gradients generate horizontal wind fields, while vertical pressure gradients cause air to rise or sink. Wind speed is directly influenced by pressure gradients, which in turn affect the formation and development of thunderstorm gusts.
Recent observations and analysis fields more effectively capture the storm’s status and offer insights into its impending movement velocity and trajectory for the 1-h forecast.
The features UVpred_ST, SHEAR2, TMPdiff, and CAPE, all of which are derived from numerical models, consistently receive low rankings in the feature importance evaluations, and the model’s performance is unchanged or even improves after their removal. We therefore conducted a supplementary experiment to assess the collective influence of these features on the CSI. When all four features were removed, the CSI increased by 3.95% relative to its value before removal, indicating that these features can be dropped together without degrading the model, which is helpful for feature simplification and improving model efficiency.
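The comparison made in the feature removal experiments can be written as a simple relative change. The sketch below assumes that the ratio of CSI reduction is the CSI drop expressed as a percentage of the full-feature CSI; this definition and the numbers used are assumptions for illustration, not values from Table 9.

```python
def csi_reduction_ratio(csi_full, csi_ablated):
    """Percentage drop in CSI after removing one or more features.

    Positive values mean the model got worse without the features;
    negative values mean the CSI increased after removal.
    """
    return (csi_full - csi_ablated) / csi_full * 100.0


# Hypothetical scores, for illustration only.
print(csi_reduction_ratio(0.40, 0.38))  # removal hurts: about +5%
print(csi_reduction_ratio(0.40, 0.42))  # removal helps: about -5%
```

Under this convention, a CSI increase after removal, as reported for the jointly removed features above, appears as a negative reduction ratio.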
The feature importance ranking for the 2-h forecast is similar to that of the 1-h forecast (Table 10 and Fig. 14). RADAR and UVana remain the two most critical features, but TQdiff also ranks relatively high. TQdiff denotes the temperature difference between the future and current times and is primarily used to assess the formation of cold pools. Cold pools are regions of cooler, denser air that typically accumulate beneath thunderstorm clouds and are generated by descending airflow. As warm, moist air ascends within thunderstorm clouds, precipitation-induced evaporative cooling causes colder air to descend, forming cold pools. These colder, denser air masses accelerate the downdrafts within thunderstorm clouds; upon reaching the ground, the downdrafts spread rapidly in the horizontal direction, giving rise to intense straight-line winds, namely, thunderstorm gusts. Furthermore, when the leading edge of a cold pool encounters the surrounding warm air, it can trigger new convective cells, occasionally intensifying or prolonging thunderstorms. Moreover, TQdiff usually has a strong correlation with PRES.
Table 10. As in Table 9, but for a lead time of 2 h
Fig. 14. As in Fig. 13, but for a lead time of 2 h.
In comparison, the importance scores of SHEAR1, SHEAR2, TMPdiff, RADARST, and CAPE remain relatively low. However, it should be noted that removing the four numerical-model-derived features UVpred_ST, SHEAR2, TMPdiff, and CAPE left the CSI unchanged or increased it in the 1-h forecast, but decreased it in the 2-h forecast. UVana has a high importance score, yet its CSI reduction ratio is 0%, which may indicate that the predictive signal of this feature is already captured by other features in the model.
By comparing the 1- and 2-h results, it can be observed that the importance of numerical model features changes with the extension of the forecast lead time, indicating that the model’s dependency on features varies across different timescales. Relying solely on observations and analysis fields for thunderstorm gust forecasting is insufficient as the forecast lead time increases.
The feature importance rankings for the 3-h forecast of thunderstorm gusts by TG-net are shown in Table 11 and Fig. 15. The ranking differs somewhat from those for the 1- and 2-h forecasts: RADARST becomes the most important feature, and RADAR, PRES, and SHEAR1 also rank near the top. By examining the change in CSI after each variable is removed (Table 11) alongside the feature importance scores, we can determine whether features deemed highly important also cause a notable decrease in performance when omitted from the model. The results show that RADARST, PRES, and SHEAR1 have a significant impact in both the feature importance evaluation and the variable removal tests, and can be identified as key features of the model. Features such as TMPdiff and RRpred do not appear particularly important when evaluated alone, but their removal has a significant impact on model performance, indicating that they may complement other features and are therefore also key features of the model. RADAR is rated as important in the feature evaluation, but its removal has a minimal impact on model performance, possibly because other features provide similar information, allowing the model to obtain the necessary information without RADAR. Removing features such as UVpred_RISE and UVpred_ST actually improves the model’s performance. Therefore, we added a sensitivity test in which RADAR, UVpred_RISE, and UVpred_ST were removed simultaneously. With all three features removed, the CSI increased by 4.69% relative to its value before removal, indicating that these variables are dispensable.
Table 11. As in Table 9, but for a lead time of 3 h
Fig. 15. As in Fig. 13, but for a lead time of 3 h.
The above analysis provides a more comprehensive understanding of how each feature contributes to the performance of the thunderstorm gust forecasting model at different forecast lead times. Gradually removing redundant or irrelevant features can simplify the model, reduce its computational complexity, and help prevent overfitting, providing strong support for model optimization and improvement. At the same time, features that are particularly helpful for prediction should be retained and emphasized to ensure that they are fully utilized in the model.
6. Summary
In this paper, we propose a multiscale feature fusion module named TG-Block and a new interpretable DL model, TG-net, for thunderstorm gust forecasting in the Beijing–Tianjin–Hebei region. In addition, DL interpretability methods such as IG, DeepLIFT, and SHAP are used to analyze the importance of the multisource meteorological features input into TG-net. Currently, TG-net is being run operationally at the Institute of Urban Meteorology. Evaluation of the operational results demonstrates the following:
(1) The DL models significantly outperform the traditional method, and TG-Block provides a further improvement. For the 1-h forecast, the CSI values of TG-Attention U-net and TG-TransUnet, which incorporate TG-Block, reach 0.353 and 0.361, respectively, representing improvements of 3% and 4% over Attention U-net and TransU-net (0.342 and 0.346), and of 176% and 182% over the traditional method, respectively. The CSI of TG-net, which also incorporates TG-Block, reaches 0.366, an improvement of 4% and 1% over TG-Attention U-net and TG-TransUnet, respectively.
(2) The feature importance analysis of the TG-net model reveals that the radar observation at the current moment (the radar composite reflectivity factor) is the most crucial factor for forecasting thunderstorm gusts with a lead time of 1–2 h. Additionally, the difference between the 2-m temperature forecast and the analysis, the analysis of wind speed at the current moment, and the composite radar reflectivity from the numerical model forecasts also play significant roles. In contrast, UVpred_ST, SHEAR2, TMPdiff, and CAPE have little effect on thunderstorm gust forecasting with a lead time of 1 h, and RADAR, UVpred_RISE, and UVpred_ST can optionally be ignored at the 3-h lead time. Therefore, it is essential to pay special attention to the key meteorological features identified above in thunderstorm gust forecasting to enhance the accuracy of predictions.
It is important to note, however, that the results may be influenced by specific weather types and sample sizes, as the conclusions are derived from a limited number of thunderstorm gust cases. For models to have practical value, they need to be tested and validated with a large amount of data. The amount of data used at present is still relatively small and does not allow the model’s performance characteristics to be determined reliably, meaning that further validation and optimization of the DL model are needed before it can be applied in practice. In addition, the causes of thunderstorm gusts are complex, and feature importance analysis can enhance our understanding in this regard. There is, however, much work to be done before a systematic and scientifically sound theory can be fully formed.
Acknowledgments
The authors express their deep gratitude to the editors and anonymous reviewers, as well as to the Institute of Urban Meteorology for providing the pertinent radar, lightning, and automatic weather station data. Additionally, the authors acknowledge the invaluable support from the Beijing Meteorological Service Data Centre in facilitating access to GPU computing resources.
Ashley, W. S., and T. L. Mote, 2005: Derecho hazards in the United States. Bull. Amer. Meteor. Soc., 86, 1577–1592, https://doi.org/10.1175/BAMS-86-11-1577.
Ashley, W. S., T. L. Mote, and M. L. Bentley, 2007: The extensive episode of derecho-producing convective systems in the United States during May and June 1998: A multi-scale analysis and review. Meteor. Appl., 14, 227–244, https://doi.org/10.1002/met.23.
Brown, A., A. Dowdy, T. P. Lane, et al., 2023: Types of severe convective wind events in eastern Australia. Mon. Wea. Rev., 151, 419–448, https://doi.org/10.1175/MWR-D-22-0096.1.
Chen, J. N., Y. Y. Lu, Q. H. Yu, et al., 2021: TransUNet: Transformers make strong encoders for medical image segmentation. arXiv, 2102.04306, https://doi.org/10.48550/arXiv.2102.04306.
Cintineo, J. L., M. J. Pavolonis, J. M. Sieglaff, et al., 2014: An empirical model for assessing the severe weather potential of developing convection. Wea. Forecasting, 29, 639–653, https://doi.org/10.1175/WAF-D-13-00113.1.
Cintineo, J. L., M. J. Pavolonis, J. M. Sieglaff, et al., 2018: The NOAA/CIMSS ProbSevere model: Incorporation of total lightning and validation. Wea. Forecasting, 33, 331–345, https://doi.org/10.1175/WAF-D-17-0099.1.
Cintineo, J. L., M. J. Pavolonis, J. M. Sieglaff, et al., 2020: NOAA ProbSevere v2.0—ProbHail, ProbWind, and ProbTor. Wea. Forecasting, 35, 1523–1543, https://doi.org/10.1175/WAF-D-19-0242.1.
Dikshit, A., and B. Pradhan, 2021b: Interpretable and explainable AI (XAI) model for spatial drought prediction. Sci. Total Environ., 801, 149797, https://doi.org/10.1016/j.scitotenv.2021.149797.
Dosovitskiy, A., L. Beyer, A. Kolesnikov, et al., 2020: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv, 2010.11929, https://doi.org/10.48550/arXiv.2010.11929.
Doswell III, C. A., 2003: Societal impacts of severe thunderstorms and tornadoes: Lessons learned and implications for Europe. Atmos. Res., 67–68, 135–152, https://doi.org/10.1016/S0169-8095(03)00048-6.
Gagne II, D. J., S. E. Haupt, D. W. Nychka, et al., 2019: Interpretable deep learning for spatial analysis of severe hailstorms. Mon. Wea. Rev., 147, 2827–2845, https://doi.org/10.1175/MWR-D-18-0316.1.
Guastavino, S., M. Piana, M. Tizzi, et al., 2022: Prediction of severe thunderstorm events with ensemble deep learning and radar data. Sci. Rep., 12, 20049, https://doi.org/10.1038/s41598-022-23306-6.
Haiden, T., A. Kann, C. Wittmann, et al., 2011: The Integrated Nowcasting through Comprehensive Analysis (INCA) system and its validation over the eastern alpine region. Wea. Forecasting, 26, 166–183, https://doi.org/10.1175/2010WAF2222451.1.
Han, L., M. X. Chen, K. K. Chen, et al., 2021: A deep learning method for bias correction of ECMWF 24–240 h forecasts. Adv. Atmos. Sci., 38, 1444–1459, https://doi.org/10.1007/s00376-021-0215-y.
He, K. M., X. Y. Zhang, S. Q. Ren, et al., 2016: Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, NV, USA, 770−778, https://doi.org/10.1109/CVPR.2016.90.
Hou, Q. B., D. Q. Zhou, and J. S. Feng, 2021: Coordinate attention for efficient mobile network design. Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, TN, USA, 13,708−13,717, https://doi.org/10.1109/CVPR46437.2021.01350.
Ji, S. L., J. F. Li, T. Y. Du, et al., 2019: Survey on techniques, applications and security of machine learning interpretability. J. Comput. Res. Dev., 56, 2071–2096, https://doi.org/10.7544/ISSN1000-1239.2019.20190540. (in Chinese)
Karpatne, A., I. Ebert-Uphoff, S. Ravela, et al., 2019: Machine learning for the geosciences: Challenges and opportunities. IEEE Trans. Knowl. Data Eng., 31, 1544–1554, https://doi.org/10.1109/TKDE.2018.2861006.
Lagerquist, R., J. Q. Stewart, I. Ebert-Uphoff, et al., 2021: Using deep learning to nowcast the spatial coverage of convection from Himawari-8 satellite data. Mon. Wea. Rev., 149, 3897–3921, https://doi.org/10.1175/MWR-D-21-0096.1.
Liebel, L., and M. Körner, 2018: Auxiliary tasks in multi-task learning. arXiv, 1805.06334, https://doi.org/10.48550/arXiv.1805.06334.
Lin, T.-Y., P. Goyal, R. Girshick, et al., 2020: Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell., 42, 318–327, https://doi.org/10.1109/tpami.2018.2858826.
Liu, S. J., W. J. Yan, X. R. Liu, et al., 2024: Interpretable convolutional neural network for analyzing precipitation in the pre-rainy season of South China. J. Appl. Meteor. Climatol., 63, 387–399, https://doi.org/10.1175/JAMC-D-23-0075.1.
Liu, Y. Q., L. Yang, M. X. Chen, et al., 2024: A deep learning approach for forecasting thunderstorm gusts in the Beijing–Tianjin–Hebei region. Adv. Atmos. Sci., 41, 1342–1363, https://doi.org/10.1007/s00376-023-3255-7.
Lundberg, S. M., and S. I. Lee, 2017: A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems, Curran Associates Inc., Long Beach, CA, USA, 4768−4777.
Milletari, F., N. Navab, and S.-A. Ahmadi, 2016: V-Net: Fully convolutional neural networks for volumetric medical image segmentation. Proceedings of 2016 4th IEEE International Conference on 3D Vision, IEEE, Stanford, CA, USA, 565−571, https://doi.org/10.1109/3DV.2016.79.
Oktay, O., J. Schlemper, L. Le Folgoc, et al., 2018: Attention U-Net: Learning where to look for the pancreas. arXiv, 1804.03999, https://doi.org/10.48550/arXiv.1804.03999.
Ramseyer, C. A., and T. L. Mote, 2016: Atmospheric controls on Puerto Rico precipitation using artificial neural networks. Climate Dyn., 47, 2515–2526, https://doi.org/10.1007/s00382-016-2980-3.
Roberts, R. D., and J. W. Wilson, 1989: A proposed microburst nowcasting procedure using single-Doppler radar. J. Appl. Meteor. Climatol., 28, 285–303, https://doi.org/10.1175/1520-0450(1989)028<0285:APMNPU>2.0.CO;2.
Ronneberger, O., P. Fischer, and T. Brox, 2015: U-Net: Convolutional networks for biomedical image segmentation. Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015, Springer, Munich, Germany, 234−241, https://doi.org/10.1007/978-3-319-24574-4_28.
Schaefer, J. T., 1990: The critical success index as an indicator of warning skill. Wea. Forecasting, 5, 570–575, https://doi.org/10.1175/1520-0434(1990)005<0570:TCSIAA>2.0.CO;2.
Shrikumar, A., P. Greenside, and A. Kundaje, 2017: Learning important features through propagating activation differences. Proceedings of the 34th International Conference on Machine Learning, JMLR.org, Sydney, Australia, 3145–3153.
Si, J. W., B. X. Huang, H. Yang, et al., 2022: A no-reference stereoscopic image quality assessment network based on binocular interaction and fusion mechanisms. IEEE Trans. Image Process., 31, 3066–3080, https://doi.org/10.1109/TIP.2022.3164537.
Silva, S. J., C. A. Keller, and J. Hardin, 2022: Using an explainable machine learning approach to characterize Earth system model errors: Application of SHAP analysis to modeling lightning flash occurrence. J. Adv. Model Earth Syst., 14, e2021MS002881, https://doi.org/10.1029/2021MS002881.
Strudel, R., R. Garcia, I. Laptev, et al., 2021: Segmenter: Transformer for semantic segmentation. Proceedings of 2021 IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, QC, Canada, 7242–7252, https://doi.org/10.1109/ICCV48922.2021.00717.
Sundararajan, M., A. Taly, and Q. Y. Yan, 2017: Axiomatic attribution for deep networks. Proceedings of the 34th International Conference on Machine Learning, JMLR.org, Sydney, Australia, 3319–3328.
Tang, W. Y., Q. L. Zhou, X. H. Liu, et al., 2017: Analysis on verification of national severe convective weather categorical forecasts. Meteor. Mon., 43, 67–76, https://doi.org/10.7519/j.issn.1000-0526.2017.01.007. (in Chinese)
Wakimoto, R. M., 2001: Convectively driven high wind events. Severe Convective Storms. C. A. Doswell, Ed., American Meteorological Society, Boston, MA, 255–298, https://doi.org/10.1007/978-1-935704-06-5_7.
Xiao, H. X., Y. Q. Wang, Y. Zheng, et al., 2023: Convective-gust nowcasting based on radar reflectivity and a deep learning algorithm. Geosci. Model Dev., 16, 3611–3628, https://doi.org/10.5194/gmd-16-3611-2023.
Zhang, X. L., J. H. Sun, Y. G. Zheng, et al., 2020: Progress in severe convective weather forecasting in China since the 1950s. J. Meteor. Res., 34, 699–719, https://doi.org/10.1007/s13351-020-9146-2.
Citation:
Liu, Y. Q., L. Yang, M. X. Chen, et al., 2025: TG-net: A physically interpretable deep learning forecasting model for thunderstorm gusts. J. Meteor. Res., 39(1), 59–78, https://doi.org/10.1007/s13351-025-4080-y.