A Review of Research Progress on Numerical Model Verification and Evaluation in China

A Review of Research Progress on Numerical Model Evaluation Methods in China

Funds: 

Supported by the National Natural Science Foundation of China (U2142214), Key Research and Development Program of Yunnan Province (202403AC100040-2), and S&T Development Fund of Chinese Academy of Meteorological Sciences (2023KJ028).

Note: This paper will appear in the forthcoming issue. It is not the finalized version yet. Please use with caution.


  • Evaluation constitutes a crucial component in the creation and application of numerical models. As these models have escalated in complexity in recent years, there has been an increasing demand for more accurate forecasting, thereby catalyzing substantial advancements in model verification and evaluation techniques. This paper begins with a review of recent domestic and international progress in model evaluation and verification, and subsequently delves into the significant contributions made by Chinese scholars in evaluating the dynamical frameworks of next-generation numerical models, the spatiotemporal variability of the East Asian monsoon, and the assessment of weather and climate conditions around the Qinghai–Xizang Plateau (QXP). Given the identification of precipitation as a key process, the paper presents a comprehensive review of evaluation efforts pertaining to diurnal variations and process evolution in precipitation. The paper concludes by outlining potential future research outlooks, including the integration of multi-source observation data, the development of evaluation techniques specific to certain models and application contexts, and the synergy between model evaluation and machine learning technology.

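    Evaluation is an important part of the development and application of numerical models. In recent years, as numerical models have grown more complex and the demand for fine-scale forecasts has increased, model evaluation methods have developed rapidly. This paper first briefly reviews domestic and international research on numerical model evaluation methods, and then focuses on the main achievements of Chinese scholars in recent years in evaluating the dynamical frameworks of next-generation numerical models, the multi-scale spatiotemporal variability of the East Asian monsoon, and weather and climate simulations over the Qinghai–Xizang Plateau. With precipitation as a key process, it also reviews fine-scale evaluation work on the diurnal variation and process evolution of precipitation. Finally, it proposes that the application of multi-source observational data, evaluation methods tailored to different model development and application scenarios, and the combination of model evaluation with new technologies such as machine learning are future research directions.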

  • Numerical models have become indispensable tools in weather forecasting, climate prediction, and climate change research (Bauer et al., 2015). However, the long-standing incompleteness and inaccuracy of numerical models result in uncertainties in their simulations or products. How to conduct a more reasonable and efficient evaluation, improvement, and application of numerical models is an important scientific issue in the field of weather and climate (Stevens and Bony, 2013; Dorninger et al., 2020; Pagano et al., 2024). Model evaluation, as an effective means to understand model biases, improve numerical models, and enhance the quality of forecast service products, has become one of the fundamental steps in the development of modern numerical models (Yu et al., 2019a).

    In the past few decades, model evaluation has played an important role in the research of numerical simulation methods, the development and improvement of numerical weather prediction (NWP) systems, and the application of NWP products (Leung et al., 2022; Frassoni et al., 2023). With the increasing complexity of numerical models and the growing demand for fine-scale diagnostic analysis of forecast products (Casati et al., 2008), advanced evaluation methods and tools have become increasingly significant (Ebert et al., 2013; Brown et al., 2021). To promote the development and application of quantitative verification and diagnostic evaluation methods, the Joint Working Group on Forecast Verification Research (JWGFVR) was established in 2003 by the World Weather Research Programme (WWRP) and the Working Group on Numerical Experimentation (WGNE) of the World Meteorological Organization (WMO). It continuously promotes the research and international cooperation on the evaluation of weather and climate products with a focus on precipitation, and develops process-oriented model evaluation and diagnostic methods.

    Model evaluation research primarily addresses two objectives. The first strand of such research targets the development and refinement of numerical models. By conducting targeted diagnostic evaluations of different calculation schemes and physical parameterization processes in the model, potential problems and possible causes in the initial field, dynamical framework, and physical processes of the model can be identified, which may provide a scientific basis for model improvement (Eyring et al., 2016; Neelin et al., 2023). The second strand focuses on assessing the accuracy of forecast products. Through quantitative verification and evaluation of model products, users may deeply understand their performance, such as the simulation ability at different spatiotemporal scales and key bias characteristics, providing quantitative references for the post-processing and correction of model products (Dorninger et al., 2018a; Brown et al., 2021).

    This paper first summarizes the main worldwide research progress in model evaluation in recent years and then reviews the main work carried out by Chinese scholars in evaluating the dynamical framework of next-generation models, the East Asian monsoon, and the weather and climate of the Qinghai–Xizang Plateau (QXP), as well as in verifying precipitation forecasts. Finally, it explores the current challenges and future research directions in this field.

    The purposes and focuses of model evaluation vary across different research and application scenarios. For model development and refinement, the central concern is how to improve the key dynamic and physical processes of the models based on bias evaluation and attribution. In contrast, for weather forecasting and climate prediction applications, the evaluation mainly focuses on verifying the accuracy of model results, which provides a direct basis for post-processing and product application. This section briefly reviews the main international research progress in model evaluation in recent years from these two aspects.

    Early diagnostic evaluations of numerical models, especially climate models, mainly focused on statistical analyses of monthly averages or long-term climate averages. In recent years, as more research has focused on weather and climate processes, process-oriented diagnostics (POD) have been developed. POD aims to provide more direct information for the improvement of physical process parameterization, which can ultimately address the long-standing model bias problem (Maloney et al., 2019).

    Table 1 summarizes relevant research projects applying POD-based evaluation over the past decade. Major international organizations have successively established working groups for this purpose. For example, the WGNE proposed evaluation metrics to enhance the Madden–Julian Oscillation (MJO) simulations in global models, providing rich diagnostic information on the simulation bias and its causes (Wheeler et al., 2013). The Process Evaluation Study (PROES) of the Global Energy and Water Cycle Experiment (GEWEX) uses multiple satellite observations to diagnose the processes related to the water and energy balance, so as to better understand the global water and energy cycle, diagnose the causes of model biases in simulating related physical processes, and improve the description of key processes of the water and energy cycle in models (Stephens et al., 2015). To support the diagnostic evaluation of climate models in the Coupled Model Intercomparison Project (CMIP), the Program for Climate Model Diagnosis and Intercomparison (PCMDI) of the Lawrence Livermore National Laboratory (LLNL) of the U.S. (Gleckler et al., 2016), the Model Diagnostics Task Force (MDTF) of the NOAA (Maloney et al., 2019), and the Earth System Model Evaluation Tool (ESMValTool) of the European Union (Eyring et al., 2016; Weigel et al., 2021) have successively proposed a series of comprehensive evaluation metrics for climate models, covering phenomena such as monsoon, El Niño–Southern Oscillation (ENSO), land–atmosphere interaction, and climate sensitivity, and released relevant diagnostic evaluation tools, which have become key supports for major climate assessments such as the Intergovernmental Panel on Climate Change (IPCC) reports. In addition, some research projects focus on key physical processes. For example, the Cloud Feedback Model Intercomparison Project (CFMIP) was proposed to support the evaluation of cloud processes and the understanding of cloud-climate feedback, providing information for improving the evaluation of cloud feedbacks in climate change (Tsushima et al., 2017).

    Table 1. Research projects using process-oriented diagnostic (POD) evaluation
    | Existing POD effort | Scientific objective | Evaluation of physical process |
    | --- | --- | --- |
    | WGNE MJO task force | Promote the improvement of the MJO simulation in global models | Sensitivity of tropical convection to low-level water vapor; atmospheric stability; strength of cloud-radiation feedback |
    | GEWEX PROES | Understand the global energy and water cycles, diagnose the causes of simulation biases in related physical processes, and improve the representation of the basic processes of the energy and water cycles in climate models | Upper tropospheric clouds and convective processes; radiation processes; surface mass and energy balance of ice sheets; midlatitude cyclones |
    | PCMDI Coordinated Model Evaluation Capabilities (CMEC) | Provide a diagnostic evaluation platform for climate research and the development of earth system models, and offer a framework for community engagement | Annual cycle from regional to global scales; tropical and extratropical variability; ENSO; MJO; regional monsoon systems; multi-time scale characteristics of precipitation; cloud feedback |
    | NOAA MDTF | Support the research, development, and diagnostic workflows of the model center, and provide a physical understanding of the sources of model biases | Cloud microphysical processes; tropical and extratropical cyclones; ENSO teleconnections and atmospheric dynamical processes; land–atmosphere interactions; MJO moisture, convection, and radiation processes; diurnal variation of precipitation; Atlantic meridional overturning circulation (AMOC); Arctic sea ice; lake processes; North American monsoon; radiative forcing and cloud-circulation feedbacks; temperature and precipitation extreme events |
    | ESMValTool | Support the diagnostic comparison of CMIP models and understand possible causes of model biases and differences among models | Tropical climate change; global and regional monsoons; Southern Ocean processes; land–atmosphere interactions; climate sensitivity; atmospheric CO2 budget; ozone and tropospheric aerosols |
    | CFMIP Diagnostic Codes Catalogue | Improve the understanding of cloud-climate feedback mechanisms and processes, and understand the errors and uncertainties in key physical processes | Global and regional cloud distribution; seasonal variations in cloud type distribution; vertical structure of low clouds; cloud-radiation feedback; warm cloud microphysical processes |

    NWP product verification provides important guidance for forecasters and other decision-makers to reasonably apply model products. Early verifications of NWP products primarily adopted quantitative verification scores such as the threat score (TS), equitable threat score (ETS), false alarm ratio, missing ratio, and forecast bias (BIAS), which are widely used in precipitation forecast verification (Schaefer, 1990; Rodwell et al., 2010). For the precipitation evaluation of short-term climate prediction, commonly used metrics mainly include spatial correlation coefficient, temporal correlation coefficient, anomaly correlation coefficient, root mean square error (RMSE), and anomaly sign consistency rate (James et al., 2018).
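    The categorical scores listed above all derive from a 2×2 contingency table of forecast and observed events. The following is a minimal sketch, not taken from the reviewed studies: the function names, the 10 mm threshold, and the sample values are hypothetical, and the miss ratio is assumed here to be misses divided by observed events.

```python
# Minimal sketch of contingency-table scores for a precipitation threshold.
# All names and sample values are illustrative assumptions.

def contingency_counts(forecast, observed, threshold):
    """Count hits, misses, false alarms, and correct negatives."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def safe_ratio(numerator, denominator):
    """Return NaN instead of raising when a score is undefined."""
    return numerator / denominator if denominator else float("nan")


def categorical_scores(hits, misses, false_alarms, correct_negatives):
    total = hits + misses + false_alarms + correct_negatives
    # Number of hits expected from a random forecast with the same marginals.
    hits_random = safe_ratio((hits + misses) * (hits + false_alarms), total)
    return {
        "TS": safe_ratio(hits, hits + misses + false_alarms),
        "ETS": safe_ratio(hits - hits_random,
                          hits + misses + false_alarms - hits_random),
        "FAR": safe_ratio(false_alarms, hits + false_alarms),
        "miss_ratio": safe_ratio(misses, hits + misses),  # assumed definition
        "BIAS": safe_ratio(hits + false_alarms, hits + misses),
    }


# Hypothetical 24-h precipitation (mm) at eight stations.
forecast = [0.0, 12.0, 30.0, 5.0, 0.5, 18.0, 2.0, 11.0]
observed = [0.0, 15.0, 8.0, 12.0, 0.0, 25.0, 1.0, 3.0]

counts = contingency_counts(forecast, observed, threshold=10.0)
scores = categorical_scores(*counts)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

    A BIAS above 1 indicates that the event is forecast more often than it is observed, while the ETS discounts the hits that a random forecast would obtain by chance.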

    These verification scores are concise and are commonly used in forecasting operations. However, traditional verification scores cannot fully reflect the performance of forecast products. For precipitation, the “double penalty” phenomenon often occurs in both temporal and spatial dimensions. That is, even when the area, shape, and intensity of the rain belt predicted by the model are basically consistent with the observations, a slight shift in the position of the rain belt produces both false alarms and misses, resulting in a low final forecast skill score (Dorninger et al., 2018b). At the same time, these methods lack the capacity to explain the physical meaning of forecast errors, and may neglect the spatial correlation of meteorological variables (Casati et al., 2008; Ebert et al., 2013). With the increasing requirements for the accuracy and refinement of weather forecasts, novel diagnostic evaluation methods have undergone rapid development. Hoffman et al. (1995) introduced the feature calibration and alignment (FCA) method based on the concept of optical flow. In this method, forecast errors are decomposed into displacement, amplitude, and residual components, all directly measured in physical units, which laid the foundation for spatial forecast verification. Subsequently, feature-based methods have been increasingly employed in precipitation forecast verification. For example, Ebert and McBride (2000) proposed the contiguous rain area (CRA) method, Davis et al. (2006) developed the Method for Object-based Diagnostic Evaluation (MODE), and Wernli et al. (2008) introduced the structure, amplitude, and location (SAL) method. These are all objective approaches for comparing the morphological, positional, and intensity-related attributes of forecast and observed objects. By systematically identifying and analyzing characteristic features (including position, size, and intensity) in forecast and observed fields, these approaches provide a comprehensive quality assessment. The scale-decomposition method decomposes the forecast and observed fields into spatial components of different scales for separate verification and evaluation, which enables multiscale analysis of forecast quality and skill. The neighborhood (fuzzy) method is particularly suitable for evaluating high-resolution forecasts as it accounts for uncertainties in both space and time of forecasts and observations.
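    The double penalty and the benefit of a neighborhood approach can be illustrated with a small synthetic case. The sketch below uses the fractions skill score (FSS), one widely used neighborhood score that the text above does not name explicitly; the grid, the rain block, the threshold, and all function names are hypothetical, and cells outside the domain are assumed to contain no events.

```python
# Minimal sketch of a neighborhood verification score (fractions skill score).
# Grid, threshold, and names are illustrative assumptions.

def neighborhood_fractions(binary, half_width):
    """Fraction of event cells in a square window around each grid cell.

    Cells outside the domain are treated as non-events (an assumption).
    """
    ny, nx = len(binary), len(binary[0])
    window_size = (2 * half_width + 1) ** 2
    fractions = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            count = 0
            for jj in range(max(0, j - half_width), min(ny, j + half_width + 1)):
                for ii in range(max(0, i - half_width), min(nx, i + half_width + 1)):
                    count += binary[jj][ii]
            fractions[j][i] = count / window_size
    return fractions


def fractions_skill_score(forecast, observed, threshold, half_width):
    """FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2))."""
    f_bin = [[1 if v >= threshold else 0 for v in row] for row in forecast]
    o_bin = [[1 if v >= threshold else 0 for v in row] for row in observed]
    pf = neighborhood_fractions(f_bin, half_width)
    po = neighborhood_fractions(o_bin, half_width)
    mse = sum((a - b) ** 2 for rf, ro in zip(pf, po) for a, b in zip(rf, ro))
    ref = sum(a * a for row in pf for a in row) + sum(b * b for row in po for b in row)
    return 1.0 - mse / ref if ref > 0 else float("nan")


def rain_block(first_col, size=11):
    """Synthetic field with a 3x3 block of 20 mm rain starting at first_col."""
    field = [[0.0] * size for _ in range(size)]
    for j in range(4, 7):
        for i in range(first_col, first_col + 3):
            field[j][i] = 20.0
    return field


observed = rain_block(first_col=2)
forecast = rain_block(first_col=5)  # same rain area, displaced by three cells

fss_point = fractions_skill_score(forecast, observed, threshold=10.0, half_width=0)
fss_wide = fractions_skill_score(forecast, observed, threshold=10.0, half_width=3)
fss_perfect = fractions_skill_score(observed, observed, threshold=10.0, half_width=0)
print(fss_point, fss_wide, fss_perfect)
```

    With a single-cell window the displaced forecast is penalized twice, once for the missed cells and once for the false alarms, so the score collapses to zero even though the shape and intensity of the rain area are correct. Widening the neighborhood gives partial credit for the near miss.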

    In recent years, the application scope of spatial verification methods in forecasting operations has been continuously expanding. They are not only used for precipitation forecast verification but also for the verification and evaluation of different forecast scenarios, such as wind, extreme events, tropical cyclone formation, weather warnings, and weather in complex terrain. For example, Roy and Turcotte (2007) evaluated the average distance between observed and forecast extreme weather events (e.g., hail, gusts, heavy rain, and tornadoes) in their extreme event forecasting study. They calculated classification scores by defining contingency table counts based on increasing radius distances along with the extreme weather probability index. Additionally, they employed the Brier score (BS) for verification.

    Ensemble forecasting plays an increasingly important role in weather and climate forecasting. To assess the overall performance of ensemble forecasts, most current studies employ dichotomous (two-category) scores such as the BS to evaluate the predictive capability of numerical models (Hersbach, 2000). On this basis, the relative operating characteristic (ROC) curve is used to examine the relationship between the hit rate and the false alarm rate of ensemble forecasts. To verify the reliability and spread of ensemble forecast members, the Talagrand histogram can be applied based on the “equal accuracy” principle among different forecast members (Hamill and Colucci, 1997; Hamill, 2001). This method measures probability distribution differences between the ensemble members and the corresponding observations, thereby quantifying the degree of divergence of ensemble forecasts. Probabilistic forecast verification techniques have also advanced significantly, and verification metrics such as reliability, sharpness, and resolution are applied to evaluate the performance of ensemble forecasts (Brown et al., 2010).
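    As an illustration of the ensemble metrics described above, the sketch below computes a Brier score from ensemble-derived event probabilities and a Talagrand (rank) histogram. It is a simplified example under stated assumptions: the four-member ensembles, the 5 mm threshold, and the function names are hypothetical, and ties between members and observations are not given special treatment.

```python
# Minimal sketch of two ensemble verification tools: Brier score and
# Talagrand (rank) histogram. Data and names are illustrative assumptions.

def ensemble_probability(members, threshold):
    """Event probability estimated as the fraction of members at or above threshold."""
    return sum(1 for m in members if m >= threshold) / len(members)


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def rank_histogram(ensembles, observations):
    """Count how often the observation falls in each rank among the members.

    Rank 0 means the observation is below all members; the last bin means
    it is above all members. Ties are not randomized (a simplification).
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


# Hypothetical precipitation forecasts (mm) from a four-member ensemble.
ensembles = [
    [2.0, 4.0, 6.0, 8.0],
    [0.0, 1.0, 1.5, 3.0],
    [10.0, 12.0, 15.0, 20.0],
    [0.0, 0.0, 0.5, 1.0],
    [5.0, 7.0, 9.0, 11.0],
]
observations = [5.0, 4.0, 9.0, 0.2, 6.0]
threshold = 5.0

probabilities = [ensemble_probability(m, threshold) for m in ensembles]
outcomes = [1 if o >= threshold else 0 for o in observations]

bs = brier_score(probabilities, outcomes)
histogram = rank_histogram(ensembles, observations)
print("Brier score:", bs)
print("Rank histogram:", histogram)
```

    For a reliable ensemble the rank histogram is approximately flat over many cases; a U shape suggests too little spread and a dome shape too much. Five cases, as used here, are far too few to draw such conclusions and serve only to show the mechanics.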

    With society’s increasing demand for finer and more timely numerical weather forecasts, major international operational forecasting centers and scientific research institutions have launched the research and development of next-generation model dynamical frameworks. Developing next-generation seamless NWP systems has become a key frontier scientific issue in the field of earth sciences (Brown et al., 2012; Yu et al., 2019b). The dynamical framework is the core component of an atmospheric numerical model used to solve the governing equations of atmospheric motion. Through its interaction with the model’s physics, the dynamical framework drives the model integration forward and directly affects the accuracy, stability, and computational efficiency of model forecasts. Systematically evaluating the performance of the dynamical framework and assessing the applicability of the constructed high-precision, efficient, and highly scalable discretization algorithms and solvers are crucial for iteratively improving the model’s overall performance.

    The scalar-field correlation evaluation method provides an important indicator for assessing numerical algorithms’ applicability in dynamical frameworks by examining the spherical transport algorithm’s ability to characterize correlations between intrinsically correlated scalar fields. It can also indirectly reflect the computational accuracy and shape-preserving positive-definiteness ability of the algorithm. For example, Zhang et al. (2019) evaluated the correlation between two fields of their proposed spherical transport algorithm (Zhang et al., 2017; Zhang, 2018). Their analysis demonstrated the algorithm’s ability to effectively limit spurious numerical oscillations, confirming that this unstructured grid algorithm can be successfully applied to positive-definite substance transport processes. Wang et al. (2019) analyzed the kinetic energy spectrum and nonlinear spectral flux of the shallow-water model to evaluate its rationality and stability in addressing computational noise and numerical stability issues. Zhang (2018) systematically evaluated the evolution of calculation errors of the two-dimensional barotropic model, examining both its temporal development and resolution sensitivity, and confirmed that the proposed high-precision upwind potential vorticity flux transport strategy improved the model’s computational performance in terms of both dispersion and dissipation errors. Li et al. (2019) evaluated the application effect of the proposed non-negativity correction algorithm in the multi-moment finite-volume advection model through benchmark tests, proving that the algorithm effectively ensured the non-negativity of numerical solutions while maintaining computational accuracy.

    To assess the accuracy of the model in representing multi-scale atmospheric motion, Zhang et al. (2019) evaluated the ability of the unstructured-mesh dry dynamical framework to accurately reproduce multi-scale waves, multi-year climate states, and related vortex statistical characteristics by simulating midlatitude baroclinic waves as well as non-hydrostatic mountain gravity waves, and analyzing the model’s statistical features under idealized dry-atmosphere climate forcing. They also assessed whether the non-hydrostatic model performed reasonably in hydrostatic regimes, leveraging the similar characteristics between hydrostatic and non-hydrostatic models. Chen et al. (2023) verified the effectiveness of the high-order accuracy scheme of the dynamical framework based on the multi-moment constrained finite-volume method (MCV; Li et al., 2013) through a series of benchmark experiments such as three-dimensional Rossby–Haurwitz waves and mountain Rossby waves. The results of idealized experiments showed that the dynamical framework of the China Meteorological Administration Mesoscale Numerical Weather Prediction System (CMA-MESO) exhibited high stability and accuracy in atmospheric dynamic simulations, which may provide strong technical support for weather forecasts (Yang et al., 2008; Shen et al., 2020). Based on the evaluation of the dry dynamical framework (Li et al., 2015), Li and Peng (2018) further verified the advantages of the global non-hydrostatic Yin–Yang grid model in simulating the global atmospheric circulation through long-term integration experiments on an aqua planet. For the moist atmospheric dynamic model, Zhang et al. (2020) employed idealized physical processes and performed numerical experiments such as supercells and tropical cyclones [Dynamical Core Model Intercomparison Project (DCMIP); Ullrich et al., 2016] to investigate the convergence of simulation results with increasing resolution, compare the sensitivities to different transport algorithms, and assess the model’s moist atmospheric climate characteristics under prescribed forcing and aqua-planet conditions (Fig. 1). This demonstrated the reliability of the multi-scale atmospheric dynamic model as the prototype basis for the continuous development of weather–climate integrated models.

    Fig. 1. Evaluation of the next-generation Global–Regional Integrated Forecast System (GRIST) using the DCMIP idealized moist baroclinic wave test. The field is relative vorticity (10⁻⁵ s⁻¹) at the 23rd model level at 0000 UTC of day 11 (initialized from day 0) simulated in the (a) global and (b) regional models, adapted from Zhang et al. (2024).

    Because locally increased resolution in global models affects numerical accuracy and stability, systematic evaluations of global variable-resolution models are essential. Zhou et al. (2020) conducted idealized experiments and verified that the variable-resolution model could accurately capture fine-scale fluid structures in the refinement region and effectively control the potential adverse effects in the transition zone and coarse-resolution region. Zhou Y. H. et al. (2023) applied an hourly-scale diagnostic method for dynamic and thermodynamic processes to real precipitation events, and showed that the variable-resolution model could accurately capture the features of the precipitation diurnal cycle and produce a reasonable dynamic–thermodynamic response. Such evaluation work has laid an important foundation for the continuous advancement of domestically developed global variable-resolution atmospheric models.

    The accurate and physically reasonable representation of model topography remains a critical challenge in numerical modeling. Particularly in regions with extreme terrain complexity like the QXP, developing numerically robust algorithms for steep topography presents computational obstacles and is one of the most persistent challenges in numerical simulation research in China. Evaluating the impact of different transport schemes on the simulation of the advection-condensation process in the model can provide solutions for improving the long-standing precipitation bias in high-elevation areas. For example, Yu et al. (2015b) replaced the semi-Lagrangian transport algorithm in the Community Atmosphere Model (CAM) version 5 (CAM5) with the two-step shape-preserving scheme (TSPAS; Yu, 1994; Zhang Y. et al., 2013). The evaluation demonstrated that the TSPAS effectively constrained long-distance cross-grid water vapor transport, causing it to condense and precipitate at the lower windward slope, thereby significantly reducing spurious precipitation in the steep terrain of the QXP’s southern slope. Similarly, the dynamical framework of the climate system model of the Chinese Academy of Meteorological Sciences (CAMS-CSM) also achieved significant improvements in simulation biases over the terrain by introducing this advection scheme (Rong et al., 2018). Zhang et al. (2021) evaluated how increasing nominal accuracy and incorporating upwind terms affected precipitation simulations on the QXP’s southern slope. The results revealed how separating the model’s advection-condensation process contributes to precipitation overestimation in the plateau’s steep terrain areas. Zheng et al. (2024) assessed the sensitivity of water vapor advection scheme selection and class-controlled advection scheme variables across steep terrain regions spanning from the eastern QXP to the western Sichuan basin. Based on their study, the advection scheme in the high-resolution regional numerical prediction model’s dynamical framework was subsequently optimized.

    Terrain-idealized experiments and case-based simulation evaluations have proved particularly effective in assessing dynamical framework improvements targeting complex terrain forecasting challenges. For example, based on the Advanced Regional Eta-coordinate Model (AREM; Yu and Xu, 2004), which uses η as the vertical coordinate (Yu, 1989), Cheng et al. (2019) comprehensively evaluated the performance of the designed non-hydrostatic dynamical framework through idealized convective thermal bubble experiments and case-based rainstorm process experiments. The comparisons between hydrostatic and non-hydrostatic framework simulations at different resolutions demonstrated the stability and accuracy of the newly developed non-hydrostatic model, which may provide guidance for better handling complex terrains at high resolutions. Zhang et al. (2015) evaluated the calculation errors of the pressure gradient force under different coordinates through idealized resting atmosphere experiments and proved that the new height-based terrain-following coordinate could simulate gravity waves more accurately, with results closer to the analytical solution. Li C. et al. (2020) evaluated the impact of different smoothed-level hybrid coordinates on the simulation of the precipitation process of the quasi-stationary front over the QXP, providing a better reference for coordinate selection to effectively alleviate problems such as wind field forecast biases, false precipitation, and false weather systems. Cheng et al. (2022) conducted mountain wave idealized experiments and case studies to compare the impact of vertical coordinate selection on the simulation of systems such as terrain waves and westerly troughs, and to evaluate the correctness and effectiveness of the Weather Research and Forecasting (WRF) model’s dynamical framework after introducing the η-coordinate. They found that with increased vertical resolution or extended simulation time, the stepped-terrain coordinate dynamical framework could effectively capture mountain waves.

    China is situated in the East Asian monsoon region, with the QXP dominating its southwestern interior and the Pacific Ocean to the east. This unique geographical environment, with diverse terrain, complex land–sea distribution, and a monsoon climate system, creates significant challenges for numerical weather and climate modeling. To understand the key biases in models and enhance the performance of numerical simulations, Chinese scholars have conducted extensive and valuable research on diagnosing the physical processes within these models. This section reviews the key advancements in evaluating numerical simulations of monsoon multi-scale variations and the weather and climate of the QXP and its surrounding areas.

    The East Asian monsoon is a crucial component of the Asian monsoon system, characterized by a blend of tropical and subtropical monsoon influences. Shaped by both external forcings and internal climate dynamics, it displays substantial fluctuations across temporal scales and intricate spatial patterns. Therefore, the East Asian monsoon has long been used as a key reference for evaluating climate model performance, and its accurate simulation remains one of the enduring challenges in China’s model development (Ding and Chan, 2005; Yao et al., 2017; Li J. et al., 2020; Wang et al., 2024).

    In the mean climate state, models exhibit differential skill in simulating various monsoon components. Generally, they perform better over flat terrain than over complex topography, represent circulation patterns better than precipitation fields, and capture zonal circulation features more accurately than meridional ones (Chen et al., 2010; Boo et al., 2011; Gong et al., 2014; Liu et al., 2018; Yu et al., 2023). Jiang et al. (2005) evaluated the performance of coupled ocean–atmosphere models in simulating surface air temperature and precipitation over East Asia. Their analysis showed that the models consistently achieved higher accuracy in temperature simulations than in precipitation simulations, with particularly degraded performance in the QXP region relative to flat terrains. In their evaluation of the CAM version 3.5 (CAM3.5), Chen et al. (2010) found that its simulation of the East Asian summer monsoon (EASM) mean state exhibited pronounced biases in both precipitation patterns and meridional circulation. These errors were primarily linked to the model’s underestimation of the meridional land–sea thermal gradient.

    On decadal timescales, the EASM has shown a weakening trend since the 1970s, and precipitation in eastern China has shifted to a “southern flooding and northern drought (SFND)” pattern (Yu et al., 2004). Sun and Ding (2008) conducted an assessment using multiple observational datasets, finding that most models struggled to simulate the decadal changes in the EASM due to their failure to capture the dynamic and thermodynamic mechanisms behind precipitation changes. Xin et al. (2020) evaluated the linear trends of the EASM and its precipitation as simulated by eight CMIP phase 6 (CMIP6) models, quantitatively comparing these changes with CMIP phase 5 (CMIP5) results. They found that some models reproduced the weakening trend of the monsoon from 1961 to 2005, with two models exhibiting high skill in capturing the negative correlation between the monsoon and sea surface temperatures in the East Indian Ocean and Western Pacific.

    The East Asian monsoon also exhibits pronounced interannual variability. Song and Zhou (2014b) examined CMIP phase 3 (CMIP3) and CMIP5 models’ ability to simulate interannual precipitation variability in the EASM. They found that while some models partially reproduced the observed dipole precipitation pattern, there were biases of weak intensity and southward displacement, attributed to a weak and southward-displaced Western Pacific anticyclone in the models. Yu et al. (2023) evaluated the interannual variability of the EASM using CMIP6 models and found that compared to CMIP5, CMIP6 models show significant improvement in simulating the Northwest Pacific anticyclone intensity, which reduced the bias in the precipitation dipole pattern.

    While these evaluations focus on long-term model integrations, they are limited in tracing the sources of model biases, as many errors stem from fast physical processes, such as cloud dynamics. The Transpose Atmospheric Model Intercomparison Project (Transpose AMIP or TAMIP) experiment runs climate models in short-term forecast mode. This approach reduces nonlinear interactions of systematic biases and enables comparison of simulations with refined observations, thereby helping identify key processes causing model biases (Williams et al., 2013; Li et al., 2019). For instance, Li et al. (2018) conducted a TAMIP experiment using the CAMS-CSM model to simulate a severe East Asian precipitation event in the summer of 2016. The results exhibited biases consistent with those in AMIP experiments, i.e., underestimation of maximum hourly precipitation intensity in the intensity–frequency structure, and spurious precipitation centers near mountainous regions.

    The QXP, with its high altitude, complex terrain, and unique climate, plays a significant role in influencing land-surface water and heat exchanges, boundary layer structure, convective characteristics, and cloud microphysical properties, resulting in sharp contrasts with adjacent plains. These unique features make the plateau one of the most uncertain regions in numerical simulations. Accurate precipitation simulation over the QXP and its vicinity is a global challenge and a key research topic in China’s weather and climate studies. In particular, numerical models show a notable wet bias over the high-relief terrain of the eastern QXP and along the steep terrain of its southern edge (Wang et al., 2018). Despite progressive model development from CMIP3 to CMIP6, the multi-model ensemble-mean simulations continue to demonstrate limited improvement in reproducing the QXP’s precipitation patterns (Song and Zhou, 2014a; Zhou T. J. et al., 2018; Zhao et al., 2022). Existing studies have evaluated the possible relationships between the simulation biases of precipitation and temperature over the QXP and specific physical processes in models from aspects such as land–atmosphere exchange, convection, and cloud microphysical processes.

    Land-surface processes strongly influence the energy and mass exchange between the land and the atmosphere. The accuracy of a model’s description of land-surface processes significantly affects the simulation of weather and climate models (Dai, 2020). Regarding the potential connection between the complex terrain–atmosphere interaction and the precipitation/temperature simulation biases over the TXP, Yue et al. (2021), through the Second TXP Comprehensive Scientific Expedition, identified that the southern slope, characterized predominantly by rocky surfaces, shows significant discrepancies between WRF land-surface parameterization schemes and those developed for plains. This emphasizes the need for improving the parameterization of thermal and moisture exchange processes on rocky surfaces to better simulate temperature and precipitation over the TXP’s southern slope. Zhou et al. (2022) found that the WRF model underestimated the thermal roughness of bare ground, resulting in biases in near-surface air and surface temperatures, particularly in regions with large diurnal temperature differences. Further in-depth evaluations by Zhou X. et al. (2018, 2019) on the WRF model’s performance in simulating surface wind speed and precipitation over the TXP demonstrated that the turbulent orographic drag effect is crucial for wind and precipitation simulations in complex terrain areas. Additionally, the terrain can also affect the spatial distribution and magnitude of precipitation through the gravity-wave drag effect. Xu et al. (2023) found that the failure to simulate the gravity-wave drag effect in complex terrain led to a wet bias in winter precipitation over the TXP’s western side, underscoring the need for a terrain-gravity-wave drag parameterization scheme that incorporates more detailed terrain features and wave-propagation mechanisms.

    The TXP’s complex terrain also profoundly affects physical processes such as the solar-radiation effect and the snow-albedo feedback, thereby influencing the model’s simulation of precipitation and temperature. Cai et al. (2023) showed that refined terrain features such as height, orientation, shading, and reflection significantly affect the simulated solar radiation intensity and its regional differences. Zhou X. et al. (2023) noted that accurately depicting terrain height, slope, and snow cover evolution is crucial for simulating snow and near-surface temperatures. Moreover, processes such as convection, cloud microphysics, and cloud-radiation feedback can directly influence the simulation of precipitation over the plateau. Liu et al. (2024) found that the shallow-convection process in complex terrain influences precipitation formation and surface heat transfer through terrain forcing and local circulation systems such as valley and slope winds. Ou et al. (2020) found that by adopting a convection-permitting strategy in the gray zone (~9 km), the WRF model improves convective process simulations. This approach better captures the triggering time, occurrence frequency, and diurnal variation of convection, leading to a more accurate reproduction of the spatial distribution of summer precipitation over the TXP. Zhao et al. (2023) highlighted that the traditional cloud-physics parameterization scheme failed to accurately simulate the formation and development of clouds, leading to overestimated summer precipitation. Process-based model evaluations demonstrate that inadequate representation of cloud-radiation feedback affects large-scale circulation, causing a wet bias over the TXP (Liu et al., 2024).

    In daily weather forecasting, precipitation-related forecasts are of the utmost concern. The dynamical, physical, and even chemical processes of precipitation occurrence, development, and evolution are complex and involve numerous factors. Consequently, precipitation serves as a crucial metric for assessing the efficacy of numerical models in simulation and forecasting (Tapiador et al., 2019). China’s unique geographical conditions, characterized by complex terrain dynamics and intricate atmosphere–land–sea interactions, result in particularly multifaceted precipitation processes. This complexity exacerbates the challenges in understanding the evolution and anomalous shifts in precipitation patterns, thereby elevating the challenges in achieving accurate simulation and prediction. Notably, Chinese researchers have made significant contributions to precipitation forecast evaluation.

    In recent years, to address the “double penalty” phenomenon in traditional precipitation verification, some new verification techniques have been widely adopted in quantitative precipitation forecast evaluation, which provide more detailed information on model forecast biases. Chinese researchers have conducted systematic reviews and critical assessments of precipitation spatial verification methods and their applications (Pan et al., 2014; Chen et al., 2021). Mainstream spatial verification approaches include scale-separation, neighborhood, object-based, and full-field deformation verification. Neighborhood verification upscales higher-resolution forecast and observed elements to a coarser scale, and then adopts methods such as the Fractions Skill Score (FSS) to compare the proportion of values that exceed a defined threshold within a spatial region of a given size, thus quantifying the similarity between the fine-grid forecast and the observed values (Zhao and Zhang, 2018). To reduce the impact of displacement and intensity errors during scale-separation on evaluation metrics, Yu et al. (2020) developed a skill-score index that combines the neighborhood and scale-separation methods, named the intensity-scale-based fraction skill score (IS_FSS), and applied it to the forecast verification of Meiyu precipitation in China. This method demonstrates better alignment with subjective expectations than traditional verification metrics.
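    The neighborhood comparison described above can be illustrated with a minimal sketch of the FSS in its commonly used form, FSS = 1 − Σ(Pf − Po)² / (ΣPf² + ΣPo²), where Pf and Po are the fractions of grid points exceeding a threshold within a square window. The function names, the zero padding at the domain edge, and the restriction to odd window widths are assumptions made for illustration; this is not the implementation used in the cited studies, and it does not reproduce IS_FSS.

```python
import numpy as np


def neighborhood_fraction(binary, n):
    """Fraction of exceedance points in the n x n window centred on each grid point.

    Assumption: n is odd and points outside the domain count as non-exceedance.
    """
    if n % 2 == 0:
        raise ValueError("window width n must be odd")
    pad = n // 2
    padded = np.pad(np.asarray(binary, dtype=float), pad, mode="constant")
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    sat[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    ny, nx = np.asarray(binary).shape
    window_sum = (sat[n:n + ny, n:n + nx] - sat[:ny, n:n + nx]
                  - sat[n:n + ny, :nx] + sat[:ny, :nx])
    return window_sum / (n * n)


def fss(forecast, observed, threshold, n):
    """Fractions skill score for one threshold and one window width."""
    pf = neighborhood_fraction(np.asarray(forecast) >= threshold, n)
    po = neighborhood_fraction(np.asarray(observed) >= threshold, n)
    denominator = np.sum(pf ** 2) + np.sum(po ** 2)
    if denominator == 0:
        return float("nan")  # neither field exceeds the threshold
    return float(1.0 - np.sum((pf - po) ** 2) / denominator)


# A rain cell forecast two grid points east of where it was observed.
observed = np.zeros((11, 11))
forecast = np.zeros((11, 11))
observed[5, 5] = 20.0
forecast[5, 7] = 20.0

score_point = fss(forecast, observed, threshold=10.0, n=1)
score_neighborhood = fss(forecast, observed, threshold=10.0, n=5)
```

    With a window of one grid point the displaced cell scores zero, which is the double penalty in its simplest form; widening the window gives partial credit for a forecast that is close in space.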

    Object-based and process-oriented diagnostic evaluation methods can provide specific information on forecast biases in terms of location, intensity, area, and spatiotemporal evolution, helping better understand the physical process characteristics related to precipitation forecast biases. Qi et al. (2024) used the MODE verification method to evaluate the heavy-precipitation forecasts under different circulation patterns in Northeast China, and provided detailed bias information on the spatial characteristics of heavy precipitation. Pan et al. (2024) systematically reviewed the practical applications of the MODE verification method in precipitation and multi-element weather forecasts, and also pointed out the limitations of the MODE verification and the challenges in rationally applying verification results. Fu and Dai (2016) used the CRA spatial verification technique to compare and analyze the area and intensity errors of precipitation of three different circulation patterns in Southwest China. They found that morphological errors dominated precipitation forecast errors in eastern Southwest China, followed by rainfall area errors, with intensity errors contributing the least.
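    To make the idea of separating error sources concrete, the following is a minimal sketch of a widely used form of the CRA decomposition, in which the mean squared error is split into displacement, volume, and pattern components after the forecast is shifted to best match the observations. The brute-force shift search, the periodic wrap-around of `np.roll`, and the function and key names are assumptions made for illustration. The component names also differ from the morphological, area, and intensity errors reported by Fu and Dai (2016), so this should not be read as their procedure.

```python
import numpy as np


def cra_decomposition(forecast, observed, max_shift=5):
    """Split MSE into displacement, volume, and pattern parts.

    Assumption: the best shift is found by exhaustive search within
    +/- max_shift grid points, with periodic wrap-around at the edges.
    """
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    mse_total = float(np.mean((f - o) ** 2))

    best_mse, best_shift, best_field = mse_total, (0, 0), f
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(f, (dy, dx), axis=(0, 1))
            mse = float(np.mean((shifted - o) ** 2))
            if mse < best_mse:
                best_mse, best_shift, best_field = mse, (dy, dx), shifted

    # Displacement: error removed by moving the forecast to the best position.
    mse_displacement = mse_total - best_mse
    # Volume: squared difference of the mean values after the shift.
    mse_volume = float((best_field.mean() - o.mean()) ** 2)
    # Pattern: whatever remains once position and mean amount are accounted for.
    mse_pattern = best_mse - mse_volume
    return {
        "shift": best_shift,
        "total": mse_total,
        "displacement": mse_displacement,
        "volume": mse_volume,
        "pattern": mse_pattern,
    }


# Observed 3 x 3 rain area, and a forecast that is correct but 3 points too far east.
observed = np.zeros((12, 12))
observed[4:7, 4:7] = 10.0
displaced = np.roll(observed, (0, 3), axis=(0, 1))
result_displaced = cra_decomposition(displaced, observed)

# A forecast in the right place but twice as intense.
result_intense = cra_decomposition(2.0 * observed, observed)
```

    In the first case all of the error is attributed to displacement; in the second none is, and the error is shared between the volume and pattern terms.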

    In recent years, due to global climate change, the frequency of extreme events has been increasing. How to develop an integrated scoring method to evaluate forecasts under different climate backgrounds, and how to evaluate the performance of forecast models for small-probability events, such as rainstorms, have become two critical research questions. Conventional approaches are increasingly inadequate to address these emerging challenges. In the verification of extreme weather events, Chen and Chen (2015) used the Stable Equitable Error in Probability Space (SEEPS) score method to evaluate quantitative precipitation forecasts in China. This novel methodology effectively addresses two long-standing challenges in precipitation verification, i.e., excessive classification categories and excessive sensitivity to regional climate characteristics. In response to the large spatiotemporal differences in the predictability of rainstorms in China, Chen J. et al. (2019) developed a new comprehensive index named Synthetic Predictability Index (SPI) for the predictability of Chinese rainstorms, which consists of three components, i.e., rainstorm climatic frequency, area ratio of rainstorm, and rainstorm forecasting success index. On this basis, Chen F. J. et al. (2019) proposed a neighborhood-matching approach to compare forecasts with observations, yielding results that better matched forecasters’ psychological expectations. Recently, Zhang et al. (2024a) proposed a new general comprehensive evaluation approach for cross-scale precipitation forecasts, i.e., General Comprehensive Evaluation Method (GCEM), which comprehensively considers the limitations of traditional precipitation verification approaches based on threshold classification and the subjectivity in scale selection of neighborhood-based spatial verification methods.
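    For reference, the traditional threshold-classification verification that these newer methods aim to improve upon can be sketched as follows. The scores are the standard contingency-table quantities (threat score, probability of detection, false alarm ratio, and frequency bias). The function name, the sample values, and the 50-mm threshold in the example are assumptions made for illustration; SEEPS, SPI, and GCEM themselves are not reproduced here.

```python
import numpy as np


def categorical_scores(forecast, observed, threshold):
    """Contingency-table scores for exceedance of a single threshold."""
    f = np.asarray(forecast) >= threshold
    o = np.asarray(observed) >= threshold
    hits = int(np.sum(f & o))
    misses = int(np.sum(~f & o))
    false_alarms = int(np.sum(f & ~o))

    def ratio(numerator, denominator):
        # Undefined scores (no events forecast or observed) are returned as NaN.
        return numerator / denominator if denominator > 0 else float("nan")

    return {
        "TS": ratio(hits, hits + misses + false_alarms),
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "BIAS": ratio(hits + false_alarms, hits + misses),
    }


# Hypothetical 24-h accumulations (mm) at four stations.
observed = [60.0, 10.0, 0.0, 55.0]
forecast = [70.0, 55.0, 0.0, 20.0]
scores = categorical_scores(forecast, observed, threshold=50.0)
```

    Because every score depends on a fixed threshold, the same forecast can look skillful in a wet climate and useless in a dry one, which is one motivation for climatology-aware scores such as SEEPS.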

    With the increasing demand for fine-scale forecasting, hourly precipitation forecasts have received increasing attention. At the hourly scale, the diurnal variation and other fine characteristics of precipitation are the comprehensive manifestations of the impacts of dynamic and thermal processes in the climate system on the water-cycle process. Evaluating the model’s ability in forecasting the diurnal-scale evolution characteristics of precipitation and elucidating its forecast biases in terms of precipitation intensity and frequency can strengthen the in-depth understanding of the uncertainty of precipitation forecasts. This is of great scientific value for improving the application level of numerical forecast products and enhancing the fine-scale forecasting of heavy precipitation in China (Yu et al., 2014). The diurnal variation of precipitation has become an important aspect or metric for evaluating the uncertainty of numerical models. Yu et al. (2014) presented the climate characteristics and regional differences of precipitation diurnal variation in China, summarized the revealed scientific connotations of this variation, and provided observational metrics for evaluation including hourly precipitation frequency and intensity, the amplitude and peak time of diurnal variation, single-station and regional precipitation events, as well as the start time, end time, and duration of precipitation events. Li and Yu (2014) developed a new method to evaluate the frequency–intensity distribution of precipitation using a linear approach and a double E-index fitting method to quantitatively assess the precipitation intensity features. Based on this quantitative index, the model’s ability to forecast the occurrence frequency of precipitation of varying intensity can be evaluated (Li et al., 2015; Xie and Wang, 2021). Yu et al. (2015a) defined regional rainfall events and their associated coefficients to quantify the spatiotemporal changes of precipitation events. Based on these new evaluation metrics, the CMA has established an operational evaluation index system for numerical forecast models and continuously improved it through operational trials (Chen et al., 2021).
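    Several of the hourly-scale metrics listed above (frequency, intensity, peak time and amplitude of the diurnal cycle, and the start, end, and duration of precipitation events) can be computed from an hourly series along the following lines. This is a sketch under stated assumptions: the wet-hour threshold of 0.1 mm h−1, the normalized amplitude defined as (maximum − mean) / mean, and all function names are illustrative choices, not the definitions used in the cited papers or in the CMA operational system.

```python
import numpy as np


def diurnal_metrics(hourly, wet_threshold=0.1):
    """Diurnal-cycle metrics from an array of shape (n_days, 24) in mm per hour."""
    hourly = np.asarray(hourly, dtype=float)
    wet = hourly >= wet_threshold

    amount = hourly.mean(axis=0)      # mean precipitation at each hour of day
    frequency = wet.mean(axis=0)      # fraction of days with a wet hour
    wet_sum = np.where(wet, hourly, 0.0).sum(axis=0)
    wet_count = wet.sum(axis=0)
    intensity = np.full(hourly.shape[1], np.nan)
    # Mean rate over wet hours only; hours that are never wet stay NaN.
    np.divide(wet_sum, wet_count, out=intensity, where=wet_count > 0)

    mean_amount = amount.mean()
    amplitude = (amount.max() - mean_amount) / mean_amount if mean_amount > 0 else float("nan")
    return {
        "amount": amount,
        "frequency": frequency,
        "intensity": intensity,
        "peak_hour": int(amount.argmax()),
        "amplitude": float(amplitude),
    }


def rain_events(series, wet_threshold=0.1):
    """Return (start, end, duration) for each run of consecutive wet hours."""
    events = []
    start = None
    for t, value in enumerate(series):
        if value >= wet_threshold and start is None:
            start = t
        elif value < wet_threshold and start is not None:
            events.append((start, t - 1, t - start))
            start = None
    if start is not None:  # the series ends during an event
        events.append((start, len(series) - 1, len(series) - start))
    return events


# Two hypothetical days with late-afternoon rain.
hourly = np.zeros((2, 24))
hourly[0, 16] = 2.0
hourly[1, 16] = 4.0
hourly[1, 17] = 1.0

metrics = diurnal_metrics(hourly)
events = rain_events(hourly.ravel())
```

    Applying the same functions to the forecast and to the observations, and comparing the results hour by hour, separates a bias in how often it rains from a bias in how hard it rains.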

    Based on the scientific understanding of the diurnal variation of precipitation in China, many scholars have also carried out relevant model evaluation research. Xu et al. (2017) evaluated precipitation forecast biases in the CMA-MESO 3-km model, focusing on frequency, intensity, and diurnal variation characteristics across southeastern China. Results indicated that the model’s improved precipitation forecasting skill primarily stems from its enhanced representation of cloud and precipitation microphysical processes. Lu et al. (2017) studied the forecasting ability of the Beijing Rapid Update Cycle Assimilation Forecast System for the diurnal variation of precipitation in Beijing and the causes of biases, and pointed out that better simulation of near-surface temperatures could enhance precipitation forecasting skill. Li et al. (2020a) showed that kilometer-scale convection-resolving models could more reasonably depict the afternoon precipitation peaks in South China and the middle and lower reaches of the Yangtze River, and better reproduce the intensity and diurnal variation characteristics of the low-level jet. Li et al. (2021) revealed that kilometer-scale convection-permitting models eliminated the false afternoon precipitation peaks, significantly reducing the wet bias in plateau precipitation found in the large-scale models (LSM) that use convective parameterization (Fig. 2). From the perspective of the evolution of precipitation events, Yuan et al. (2020) pointed out that the ECMWF Integrated Forecasting System (IFS) could better capture the spatial distribution of the amount, frequency, and intensity of precipitation in Southwest China, but tended to forecast excessively high precipitation frequency and low precipitation intensity, as well as earlier start and peak times of precipitation events than observed. Hu and Yuan (2021) showed that ECMWF reanalysis version 5 (ERA5) could reasonably reflect the spatiotemporal variations of regional rainfall events.

    Fig. 2. Diurnal variations of summer mean precipitation (mm h−1) averaged over the central and eastern TXP (31.5°–36.5°N, 89.5°–102.5°E) among rain gauge station observations (station), CMA Multi-source Precipitation Analysis (CMPA), two large-scale models, i.e., LSM-35 and LSM-13, with horizontal resolutions of 35 and 13 km, respectively, and convection-permitting model (CPM), with a resolution of 4 km, adapted from Fig. 5a in Li et al. (2021).

    In recent years, some studies have shifted the research perspective of precipitation from traditional station- or grid-point-based quantitative statistics (Eulerian perspective) to the rain-cluster perspective. From a Lagrangian perspective, they dynamically track and statistically analyze the fine-scale characteristics of all rain clusters during the research period, establishing linkages between the spatial and temporal characteristics of precipitation systems. This enables an in-depth investigation of the dynamic evolution of fine-scale precipitation characteristics such as the rain area, precipitation intensity, moving speed, and duration of the precipitation system during its life cycle (Moseley et al., 2019; Li et al., 2020b). Evaluation has thus evolved from traditional single-point assessment to spatiotemporally continuous assessment of precipitation processes, comprehensively examining the model’s simulation performance for the spatiotemporal synergy characteristics and evolution processes of precipitation systems. For example, Li et al. (2023) conducted a model evaluation of the fine-scale precipitation characteristics at the rain-cluster scale for the high-resolution Met Office Unified Model (MetUM). They revealed the advantages of the convection-permitting processing strategy at a resolution of 10 km in simulating the evening mountain-triggering and downstream-plain-moving characteristics of mesoscale convective systems (MCSs) in the complex terrain of North China at the diurnal scale.
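    The rain-cluster view described above can be illustrated with a minimal sketch: contiguous rainy grid points are grouped into clusters at each time step, and clusters at consecutive steps are linked when they overlap. The 4-point connectivity, the largest-overlap linking rule, the rain threshold, and all function names are assumptions made for illustration; the tracking algorithms used in the cited studies may differ.

```python
from collections import deque

import numpy as np


def label_clusters(field, threshold):
    """Label 4-connected groups of grid points with rain >= threshold."""
    wet = np.asarray(field) >= threshold
    labels = np.zeros(wet.shape, dtype=int)
    current = 0
    for seed in zip(*np.nonzero(wet)):
        if labels[seed]:
            continue  # already part of an earlier cluster
        current += 1
        labels[seed] = current
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed point
            i, j = queue.popleft()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < wet.shape[0] and 0 <= nj < wet.shape[1]
                if inside and wet[ni, nj] and not labels[ni, nj]:
                    labels[ni, nj] = current
                    queue.append((ni, nj))
    return labels, current


def cluster_stats(field, labels, k):
    """Area (grid points), mean intensity, and centroid of cluster k."""
    mask = labels == k
    rows, cols = np.nonzero(mask)
    return {
        "area": int(mask.sum()),
        "mean_intensity": float(np.asarray(field)[mask].mean()),
        "centroid": (float(rows.mean()), float(cols.mean())),
    }


def link_clusters(labels_prev, n_prev, labels_next):
    """Map each earlier cluster to the later cluster it overlaps most, or None."""
    links = {}
    for k in range(1, n_prev + 1):
        overlap = labels_next[labels_prev == k]
        overlap = overlap[overlap > 0]
        links[k] = int(np.bincount(overlap).argmax()) if overlap.size else None
    return links


# Two hypothetical hourly fields: one cluster moves east, another appears.
rain_t0 = np.zeros((6, 6))
rain_t0[1:3, 1:3] = 5.0
rain_t1 = np.zeros((6, 6))
rain_t1[1:3, 2:4] = 3.0
rain_t1[5, 5] = 1.0

labels_t0, n_t0 = label_clusters(rain_t0, threshold=0.1)
labels_t1, n_t1 = label_clusters(rain_t1, threshold=0.1)
links = link_clusters(labels_t0, n_t0, labels_t1)
stats_t0 = cluster_stats(rain_t0, labels_t0, 1)
stats_t1 = cluster_stats(rain_t1, labels_t1, links[1])
```

    Following the links through time gives each cluster a life cycle, from which duration, moving speed (centroid displacement per time step), and the evolution of area and intensity can be compared between model and observations.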

    Affected by multiple weather systems such as the subtropical high, westerly trough, low-level jet, Northeast cold vortex, and Southwest low vortex, as well as complex underlying surfaces, the mechanism of precipitation in China is complicated, making it difficult to forecast precipitation intensity and the distribution of heavy-precipitation areas. Evaluating the model’s ability for different precipitation processes is of great value for understanding the characteristics and causes of forecast biases under the influence of different weather systems.

    Against the background of the increasing frequency of extreme heavy precipitation events, how to comprehensively evaluate the forecasting performance of NWP models for extreme precipitation influenced by different types of weather processes from multiple perspectives has emerged as a hot research topic in recent years. Zhang D.-L. et al. (2013) conducted kilometer-scale forecasts and in-depth evaluations of the “12·7” extreme rainstorm in North China. The study revealed that although the model obtained “correct” results in terms of rainfall magnitude, these results mainly stemmed from the model’s misrepresentation of certain physical processes rather than accurately capturing the actual atmospheric processes that govern the extreme rainstorm. The study further stressed that when evaluating and refining numerical models, understanding the physical mechanisms behind forecast results is as critical as ensuring their accuracy. Regarding the “21·7” extreme rainstorm in Henan Province, Wan et al. (2024) pointed out that the weak low-level jet predicted by the model led to insufficient water vapor transport, and the model failed to reasonably predict the stable and less-mobile surface mesoscale convergence line near Zhengzhou. As a result, the model severely underestimated the precipitation intensity and the duration of severe convective storms. Moreover, the intensified easterly wind predicted by the model caused stronger orographically enhanced precipitation on the steep terrain of the Taihang Mountains, and the predicted heavy-precipitation area was displaced substantially westward of the observed location (Li et al., 2022). For the “12·7” and “23·7” extreme rainstorms in North China, previous studies indicated that various models can predict the location and intensity of the rainstorms 1–2 days in advance. However, the predicted intensity is relatively weak, and the peak time is delayed. The CMA Beijing (CMA-BJ) model’s forecasts of heavy-rainfall distribution and intensity provide valuable guidance for operational forecasting, and radar-echo extrapolation is an effective means for timely updating of numerical forecast outputs (Tao and Zheng, 2013; Zhang et al., 2024b).

    Differences in precipitation-forecast errors caused by varying influencing systems have also received extensive attention. Systematic evaluation of model performance across various weather processes enables developers and forecasters to identify characteristic biases and implement targeted improvements. Verification of numerical model performance for the prolonged Meiyu-period precipitation in the Jianghuai River basin in 2020 demonstrates a consistent northward bias in the forecasted rain belt position, coupled with spatial overprediction and intensity underestimation. The forecasting biases of the subtropical high, low-level jet, and shear line, along with the impact of the precipitation latent-heat feedback process, jointly contribute to the rain-belt forecasting bias. The evaluation results indicate that the upper-level synoptic field is a primary focus for forecasting such rainstorm processes (Qi L. B. et al., 2020; Bu et al., 2022; Qi Q. Q. et al., 2023). Warm-sector rainstorms during the pre-flood season in South China are difficult to forecast because their development has a weak correlation with baroclinic forcing at the synoptic scale. Models often fail to predict the mesoscale convective systems in the warm sector, as they cannot reasonably reproduce the preconditioning factors for warm-sector convection (i.e., boundary layer dynamic convergence, water vapor supply from the low-level jet, and orographic lifting) (Fu et al., 2020; Qin et al., 2020; Li and Fu, 2021), which leads to missed forecasts of heavy precipitation (Hu and Yuan, 2021). The cold vortex is one of the main weather systems affecting precipitation in Northeast and North China. The forecasting performance of various model products was compared in Fu et al. (2019) for two precipitation processes associated with cold vortices in North China. The results show that during the mature stage of the cold vortices, the model overestimates the dynamic conditions, resulting in false alarms of precipitation, while during the development stage, heavy precipitation events are often missed. The model has poor ability in forecasting the occurrence of such heavy precipitation events (Qi et al., 2024). Zhong et al. (2022) pointed out that the heavy precipitation forecast biases under strong synoptic forcing mainly show systematic biases that are linked to terrain height, while those influenced by weak weather systems have greater uncertainty. In contrast to precipitation systems associated with low troughs and shear lines, low-vortex-related precipitation processes exhibit poorer model predictability (Xiao et al., 2013). Evaluations of four long-lived Southwest low vortex processes show that although the models demonstrate capability in predicting the formation and evolution of the Southwest vortex, they systematically simulate its initiation earlier than observed. The predicted path of the eastward-moving Southwest low vortex shows significant northward deviation, leading to northward displacement of the predicted precipitation area (Wang et al., 2017).

    Model evaluation serves as a pivotal component in the development and application of models. The escalating complexity of numerical models, coupled with the increasing demand for sophisticated applications of model products, has led to heightened attention towards model evaluation in recent years. Consequently, novel evaluation methods and metrics have been evolving at a rapid pace. This paper provides a succinct overview of the significant advancements in model evaluation in China over the past two decades. It also reviews the substantial work conducted by Chinese scholars in verifying the dynamical framework of next-generation numerical models, simulating and evaluating the East Asian monsoon, and assessing weather and climate patterns over the TXP and its vicinity. Additionally, the paper discusses the verification and evaluation of precipitation forecasts in China. The key conclusions are summarized as follows.

    (1) Significant advancements have been made in the evaluation methods for various numerical models and application requirements. The assessment of weather and climate models has transitioned from the traditional analysis of long-term statistical characteristics to a process-oriented diagnostic evaluation. This change offers more direct references for understanding the key biases in the model’s physical processes. Similarly, the verification of numerical forecast products has evolved from quantitative verification scores to evaluations that place greater emphasis on the characteristics of forecast objects.

    (2) The dynamical framework is the central component of a numerical model designed to solve the governing equations of atmospheric motion. By using comprehensive multi-scale evaluation examples and analysis strategies, we can effectively assess the model’s performance across hydrostatic to non-hydrostatic domains and weather to climate scales. This approach provides a robust evaluation of the model’s accuracy in simulating multi-scale atmospheric motion, offering crucial support for the research and development of the dynamical framework of China’s next-generation numerical models.

    (3) Physical processes primarily contribute to model uncertainty. Numerous studies by Chinese scholars have investigated multi-scale spatiotemporal variations of the East Asian monsoon, as well as weather and climate pattern simulations and evaluations over the TXP and its surrounding areas. These investigations have identified key simulation biases in East Asian monsoon variability at various scales and their underlying dynamic and thermal mechanisms. Furthermore, they have identified the effects of unique land-surface water–heat status, boundary layer structure, convective characteristics, and cloud microphysical characteristics on the simulation of weather and climate patterns within the TXP and its surrounding regions.

    (4) Precipitation serves as a crucial metric that holistically represents the predictive accuracy of models. The assessment of precipitation simulation and forecasting has transitioned from mere quantitative verification to a more detailed evaluation focusing on the evolutionary characteristics of precipitation processes. With regard to hourly-scale features, such as the diurnal variation, several innovative evaluation metrics and methodologies have been introduced in recent years. These have subsequently been employed in the model evaluation procedures of the CMA.

    As numerical simulation technologies continue to progress and application requirements expand, the evaluation and diagnosis of models will increasingly evolve in a more sophisticated, multi-dimensional, and interdisciplinary fashion. For instance, there is an imperative need to develop an innovative evaluation system that is apt for weather–climate integration. Such a system would facilitate seamless forecasting. The cornerstone of model evaluation lies in the utilization of comprehensive and ample observational data. However, with the increasing complexity and resolution of models, conventional observational datasets are increasingly inadequate for comprehensively assessing sophisticated modeling systems. Furthermore, the influence of observational data uncertainty is likely to become more pronounced. Thus, for critical scientific research, obtaining multi-source observational data through field experiments, combined with thorough diagnosis of model physical processes, serves as a key approach to understanding and improving the physical processes associated with model biases. It is essential to build upon existing scientific knowledge by deriving deeper and more nuanced insights from multi-source observational data. This will facilitate establishing targeted, comprehensive evaluation systems and innovative methodologies, deriving quantitative metrics, and continuously improving skills to interpret and apply these metrics, ultimately laying a more robust foundation for post-processing and rectifying model outputs.

    Furthermore, the rapid advancement of machine learning technology has heightened interest in data-driven methods for weather and climate research. On the one hand, the escalating complexity of numerical models presents increasing challenges in reducing model uncertainty via traditional bias correction approaches. Machine learning offers innovative approaches for post-processing numerical forecast products. However, the challenge lies in effectively providing model bias information for machine learning evaluation methods. On the other hand, as purely data-driven models continue to evolve (Bi et al., 2023), there is an urgent need to develop reasonable assessments of such models’ forecasting abilities and more effective, generalized evaluation metrics for their training. These remain pressing research topics demanding further investigation.

    Fig. 1. Evaluation of the next-generation Global–Regional Integrated Forecast System (GRIST) using the DCMIP idealized moist baroclinic wave test. The field is relative vorticity (10−5 s−1) at the 23rd model level at 0000 UTC of day 11 (initialized from day 0) simulated in the (a) global and (b) regional models, adapted from Zhang et al. (2024).


    Table 1. Research projects using process-oriented diagnostic (POD) evaluation

    | Existing POD effort | Scientific objective | Evaluation of physical process |
    | --- | --- | --- |
    | WGNE MJO task force | Promote the improvement of the MJO simulation in global models | Sensitivity of tropical convection to low-level water vapor; atmospheric stability; strength of cloud-radiation feedback |
    | GEWEX PROES | Understand the global energy and water cycles, diagnose the causes of simulation biases in related physical processes, and improve the representation of the basic processes of the energy and water cycles in climate models | Upper tropospheric clouds and convective processes; radiation processes; surface mass and energy balance of ice sheets; midlatitude cyclones |
    | PCMDI Coordinated Model Evaluation Capabilities (CMEC) | Provide a diagnostic evaluation platform for climate research and the development of earth system models, and offer a framework for community engagement | Annual cycle from regional to global scales; tropical and extratropical variability; ENSO; MJO; regional monsoon systems; multi-time scale characteristics of precipitation; cloud feedback |
    | NOAA MDTF | Support the research, development, and diagnostic workflows of the model center, and provide a physical understanding of the sources of model biases | Cloud microphysical processes; tropical and extratropical cyclones; ENSO teleconnections and atmospheric dynamical processes; land–atmosphere interactions; MJO moisture, convection, and radiation processes; diurnal variation of precipitation; Atlantic meridional overturning circulation (AMOC); Arctic sea ice; lake processes; North American monsoon; radiative forcing and cloud-circulation feedbacks; temperature and precipitation extreme events |
    | ESMValTool | Support the diagnostic comparison of CMIP models and understand possible causes of model biases and differences among models | Tropical climate change; global and regional monsoons; Southern Ocean processes; land–atmosphere interactions; climate sensitivity; atmospheric CO2 budget; ozone and tropospheric aerosols |
    | CFMIP Diagnostic Codes Catalogue | Improve the understanding of cloud-climate feedback mechanisms and processes, and understand the errors and uncertainties in key physical processes | Global and regional cloud distribution; seasonal variations in cloud type distribution; vertical structure of low clouds; cloud-radiation feedback; warm cloud microphysical processes |
    Bauer, P., A. Thorpe, and G. Brunet, 2015: The quiet revolution of numerical weather prediction. Nature, 525, 47–55, https://doi.org/10.1038/nature14956.
    Bi, K. F., L. X. Xie, H. H. Zhang, et al., 2023: Accurate medium-range global weather forecasting with 3D neural networks. Nature, 619, 533–538, https://doi.org/10.1038/s41586-023-06185-3.
    Boo, K.-O., G. Martin, A. Sellar, et al., 2011: Evaluating the East Asian monsoon simulation in climate models. J. Geophys. Res. Atmos., 116, D01109, https://doi.org/10.1029/2010JD014737.
    Brown, A., S. Milton, M. Cullen, et al., 2012: Unified modeling and prediction of weather and climate: A 25-year journey. Bull. Amer. Meteor. Soc., 93, 1865–1877, https://doi.org/10.1175/BAMS-D-12-00018.1.
    Brown, B., T. Jensen, J. H. Gotway, et al., 2021: The model evaluation tools (MET): More than a decade of community-supported forecast verification. Bull. Amer. Meteor. Soc., 102, E782–E807, https://doi.org/10.1175/BAMS-D-19-0093.1.
    Brown, J. D., J. Demargne, D.-J. Seo, et al., 2010: The Ensemble Verification System (EVS): A software tool for verifying ensemble forecasts of hydrometeorological and hydrologic variables at discrete locations. Environ. Modell. Software, 25, 854–872, https://doi.org/10.1016/j.envsoft.2010.01.009.
    Bu, W. H., H. M. Chen, and P. X. Li, 2022: Analysis of the deviation of precipitation forecast of ECMWF model over the Yangtze–Huaihe River Valley in summer 2020. Torr. Rain Dis., 41, 315–323, https://doi.org/10.3969/j.issn.1004-9045.2022.03.008. (in Chinese)
    Cai, S. X., A. N. Huang, K. F. Zhu, et al., 2023: The forecast skill of the summer precipitation over Tibetan Plateau improved by the adoption of a 3D sub-grid terrain solar radiative effect scheme in a convection-permitting model. J. Geophys. Res. Atmos., 128, e2022JD038105, https://doi.org/10.1029/2022JD038105.
    Casati, B., L. J. Wilson, D. B. Stephenson, et al., 2008: Forecast verification: Current status and future directions. Meteor. Appl., 15, 3–18, https://doi.org/10.1002/met.52.
    Chen, C. G., X. L. Li, F. Xiao, et al., 2023: A nonhydrostatic atmospheric dynamical core on cubed sphere using multi-moment finite-volume method. J. Comput. Phys., 473, 111717, https://doi.org/10.1016/j.jcp.2022.111717.
    Chen, F. J., and J. Chen, 2015: The application experiment of a new score for precipitation verification based on the SEEPS principle. Adv. Meteor. Sci. Technol., 5, 6–13. Available online at http://www.cmalibrary.cn/amst/2015/201505/yjlw/fmbd/201609/t20160906_63827.htm. Accessed on 8 April 2025. (in Chinese)
    Chen, F. J., J. Chen, Q. Wei, et al., 2019: A new verification method for heavy rainfall forecast based on predictability II: Verification method and test. Acta Meteor. Sinica, 77, 28–42, https://doi.org/10.11676/qxxb2019.003. (in Chinese)
    Chen, H. M., T. J. Zhou, R. B. Neale, et al., 2010: Performance of the new NCAR CAM3.5 in East Asian summer monsoon simulations: Sensitivity to modifications of the convection scheme. J. Climate, 23, 3657–3675, https://doi.org/10.1175/2010JCLI3022.1.
    Chen, H. M., P. X. Li, and Y. Zhao, 2021: A review and outlook of verification and evaluation of precipitation forecast at convection-permitting resolution. Adv. Meteor. Sci. Technol., 11, 155–164, https://doi.org/10.3969/j.issn.2095-1973.2021.03.018. (in Chinese)
    Chen, J., C. H. Liu, F. J. Chen, et al., 2019: A new verification method for heavy rainfall forecast based on predictability I: Synthetic predictability index of heavy rainfall in China. Acta Meteor. Sinica, 77, 15–27, https://doi.org/10.11676/qxxb2019.002. (in Chinese)
    Cheng, R., R. C. Yu, Y. P. Xu, et al., 2019: Design of non-hydrostatic AREM model and its numerical simulation. Part II: Numerical simulation experiments. Chinese J. Atmos. Sci., 43, 1–12, https://doi.org/10.3878/j.issn.1006-9895.1712.17201. (in Chinese)
    Cheng, R., R. C. Yu, Y. P. Xu, et al., 2022: A new Eta-coordinate-based WRF dynamic core and its numerical experiments. Chinese J. Atmos. Sci., 46, 237–250, https://doi.org/10.3878/j.issn.1006-9895.2102.20173. (in Chinese)
    Dai, Y. J., 2020: Issues in research and development of land surface process model. Trans. Atmos. Sci., 43, 33–38, https://doi.org/10.13878/j.cnki.dqkxxb.20200103006. (in Chinese)
    Davis, C., B. Brown, and R. Bullock, 2006: Object-based verification of precipitation forecasts. Part I: Methodology and application to mesoscale rain areas. Mon. Wea. Rev., 134, 1772–1784, https://doi.org/10.1175/MWR3145.1.
    Ding, Y. H., and J. C. L. Chan, 2005: The East Asian summer monsoon: An overview. Meteor. Atmos. Phys., 89, 117–142, https://doi.org/10.1007/s00703-005-0125-z.
    Dorninger, M., P. Friederichs, S. Wahl, et al., 2018a: Forecast verification methods across time and space scales—Part I. Meteor. Z., 27, 433–434, https://doi.org/10.1127/metz/2018/0955.
    Dorninger, M., E. Gilleland, B. Casati, et al., 2018b: The setup of the MesoVICT project. Bull. Amer. Meteor. Soc., 99, 1887–1906, https://doi.org/10.1175/BAMS-D-17-0164.1.
    Dorninger, M., A. Ghelli, and S. Lerch, 2020: Recent developments and application examples on forecast verification. Meteor. Appl., 27, e1934, https://doi.org/10.1002/met.1934.
    Ebert, E., L. Wilson, A. Weigel, et al., 2013: Progress and challenges in forecast verification. Meteor. Appl., 20, 130–139, https://doi.org/10.1002/met.1392.
    Ebert, E. E., and J. L. McBride, 2000: Verification of precipitation in weather systems: Determination of systematic errors. J. Hydrol., 239, 179–202, https://doi.org/10.1016/S0022-1694(00)00343-7.
    Eyring, V., M. Righi, A. Lauer, et al., 2016: ESMValTool (v1.0) – a community diagnostic and performance metrics tool for routine evaluation of earth system models in CMIP. Geosci. Model Dev., 9, 1747–1802, https://doi.org/10.5194/gmd-9-1747-2016.
    Frassoni, A., C. Reynolds, N. Wedi, et al., 2023: Systematic errors in weather and climate models: Challenges and opportunities in complex coupled modeling systems. Bull. Amer. Meteor. Soc., 104, E1687–E1693, https://doi.org/10.1175/BAMS-D-23-0102.1.
    Fu, J. L., and K. Dai, 2016: The ECMWF model precipitation systematic error in the east of Southwest China based on the contiguous rain area method for spatial forecast verification. Meteor. Mon., 42, 1456–1464. Available online at http://qxqk.nmc.cn/qxen/article/abstract/20161203. Accessed on 8 April 2025. (in Chinese)
    Fu, J. L., S. Chen, X. L. Shen, et al., 2019: Comparative study of the cause of rainfall and its forecast biases of two cold vortex rainfall events in North China. Meteor. Mon., 45, 606–620. Available online at http://qxqk.nmc.cn/qxen/article/abstract/20190502. Accessed on 8 April 2025. (in Chinese)
    Fu, W., M. H. Tang, and C. Z. Ye, 2020: Analysis of two forecast failure cases of warm-sector rainstorms on Hunan–Guangxi border area in severe southwest jet. Meteor. Mon., 46, 1001–1014, https://doi.org/10.7519/j.issn.1000-0526.2020.08.001. (in Chinese)
    Gleckler, P. J., C. Doutriaux, P. J. Durack, et al., 2016: A more powerful reality test for climate models. Eos, 97, https://doi.org/10.1029/2016EO051663.
    Gong, H. N., L. Wang, W. Chen, et al., 2014: The climatology and interannual variability of the East Asian winter monsoon in CMIP5 models. J. Climate, 27, 1659–1678, https://doi.org/10.1175/JCLI-D-13-00039.1.
    Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560, https://doi.org/10.1175/1520-0493(2001)129<0550:IORHFV>2.0.CO;2.
    Hamill, T. M., and S. J. Colucci, 1997: Verification of Eta–RSM short-range ensemble forecasts. Mon. Wea. Rev., 125, 1312–1327, https://doi.org/10.1175/1520-0493(1997)125<1312:VOERSR>2.0.CO;2.
    Hersbach, H., 2000: Decomposition of the continuous ranked probability score for ensemble prediction systems. Wea. Forecasting, 15, 559–570, https://doi.org/10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2.
    Hoffman, R. N., Z. Liu, J.-F. Louis, et al., 1995: Distortion representation of forecast errors. Mon. Wea. Rev., 123, 2758–2770, https://doi.org/10.1175/1520-0493(1995)123<2758:DROFE>2.0.CO;2.
    Hu, X. L., and W. H. Yuan, 2021: Evaluation of ERA5 precipitation over the eastern periphery of the Tibetan Plateau from the perspective of regional rainfall events. Int. J. Climatol., 41, 2625–2637, https://doi.org/10.1002/joc.6980.
    James, R., R. Washington, B. Abiodun, et al., 2018: Evaluating climate models with an African lens. Bull. Amer. Meteor. Soc., 99, 313–336, https://doi.org/10.1175/BAMS-D-16-0090.1.
    Jiang, D. B., H. J. Wang, and X. M. Lang, 2005: Evaluation of East Asian climatology as simulated by seven coupled models. Adv. Atmos. Sci., 22, 479–495, https://doi.org/10.1007/BF02918482.
    Leung, L. R., W. R. Boos, J. L. Catto, et al., 2022: Exploratory precipitation metrics: Spatiotemporal characteristics, process-oriented, and phenomena-based. J. Climate, 35, 3659–3686, https://doi.org/10.1175/JCLI-D-21-0590.1.
    Li, C., D. H. Chen, X. L. Li, et al., 2020: Impact of height-based terrain-following coordinates on a case of quasi-stationary front simulations around the Tibetan Plateau. Torr. Rain Dis., 39, 117–124, https://doi.org/10.3969/j.issn.1004-9045.2020.02.002. (in Chinese)
    Li, H., X. M. Wang, and F. Zhu, 2022: Comprehensive evaluations of multi-model forecast performance of “21.7” Henan extreme rainstorm. Trans. Atmos. Sci., 45, 573–590, https://doi.org/10.13878/j.cnki.dqkxxb.20211019002. (in Chinese)
    Li, J., and R. C. Yu, 2014: A method to linearly evaluate rainfall frequency–intensity distribution. J. Appl. Meteor. Climatol., 53, 928–934, https://doi.org/10.1175/JAMC-D-13-0272.1.
    Li, J., H. M. Chen, X. Y. Rong, et al., 2018: How well can a climate model simulate an extreme precipitation event: A case study using the transpose-AMIP experiment? J. Climate, 31, 6543–6556, https://doi.org/10.1175/JCLI-D-17-0801.1.
    Li, J., B. Wang, and Y.-M. Yang, 2020: Diagnostic metrics for evaluating model simulations of the East Asian monsoon. J. Climate, 33, 1777–1801, https://doi.org/10.1175/JCLI-D-18-0808.1.
    Li, P. X., Z. Guo, K. Furtado, et al., 2019: Prediction of heavy precipitation in the eastern China flooding events of 2016: Added value of convection-permitting simulations. Quart. J. Roy. Meteor. Soc., 145, 3300–3319, https://doi.org/10.1002/qj.3621.
    Li, P. X., K. Furtado, T. J. Zhou, et al., 2020a: The diurnal cycle of East Asian summer monsoon precipitation simulated by the Met Office Unified Model at convection-permitting scales. Climate Dyn., 55, 131–151, https://doi.org/10.1007/s00382-018-4368-z.
    Li, P. X., C. Moseley, A. F. Prein, et al., 2020b: Mesoscale convective system precipitation characteristics over East Asia. Part I: Regional differences and seasonal variations. J. Climate, 33, 9271–9286, https://doi.org/10.1175/JCLI-D-20-0072.1.
    Li, P. X., K. Furtado, T. J. Zhou, et al., 2021: Convection-permitting modelling improves simulated precipitation over the central and eastern Tibetan Plateau. Quart. J. Roy. Meteor. Soc., 147, 341–362, https://doi.org/10.1002/qj.3921.
    Li, P. X., M. Muetzelfeldt, R. Schiemann, et al., 2023: Sensitivity of simulated mesoscale convective systems over East Asia to the treatment of convection in a high-resolution GCM. Climate Dyn., 60, 2783–2801, https://doi.org/10.1007/s00382-022-06471-2.
    Li, X. H., and X. D. Peng, 2018: Long-term integration of a global non-hydrostatic atmospheric model on an aqua planet. J. Meteor. Res., 32, 517–533, https://doi.org/10.1007/s13351-018-8016-7.
    Li, X. H., X. D. Peng, and X. L. Li, 2015: An improved dynamic core for a non-hydrostatic model system on the Yin–Yang grid. Adv. Atmos. Sci., 32, 648–658, https://doi.org/10.1007/s00376-014-4120-5.
    Li, X. L., and J. L. Fu, 2021: Forecast error analysis of EC model for heavy rainfall during annually first rainy season in South China based on CRA method. J. Trop. Meteor., 37, 194–206, https://doi.org/10.16032/j.issn.1004-4965.2021.019. (in Chinese)
    Li, X. L., C. G. Chen, X. S. Shen, et al., 2013: A multimoment constrained finite-volume model for nonhydrostatic atmospheric dynamics. Mon. Wea. Rev., 141, 1216–1240, https://doi.org/10.1175/MWR-D-12-00144.1.
    Liu, J. R., K. Yang, J. M. Wang, et al., 2024: Impacts of a shallow convection scheme on kilometer-scale atmospheric simulations over the Tibetan Plateau. Climate Dyn., 62, 8019–8034, https://doi.org/10.1007/s00382-024-07320-0.
    Liu, Y., H.-L. Ren, A. A. Scaife, et al., 2018: Evaluation and statistical downscaling of East Asian summer monsoon forecasting in BCC and MOHC seasonal prediction systems. Quart. J. Roy. Meteor. Soc., 144, 2798–2811, https://doi.org/10.1002/qj.3405.
    Lu, B., J. S. Sun, J. Q. Zhong, et al., 2017: Analysis of characteristic bias in diurnal precipitation variation forecasts and possible reasons in a regional forecast system over Beijing area. Acta Meteor. Sinica, 75, 248–259, https://doi.org/10.11676/qxxb2017.021. (in Chinese)
    Maloney, E. D., A. Gettelman, Y. Ming, et al., 2019: Process-oriented evaluation of climate and weather forecasting models. Bull. Amer. Meteor. Soc., 100, 1665–1686, https://doi.org/10.1175/BAMS-D-18-0042.1.
    Moseley, C., O. Henneberg, and J. O. Haerter, 2019: A statistical model for isolated convective precipitation events. J. Adv. Model. Earth Syst., 11, 360–375, https://doi.org/10.1029/2018MS001383.
    Neelin, J. D., J. P. Krasting, A. Radhakrishnan, et al., 2023: Process-oriented diagnostics: Principles, practice, community development, and common standards. Bull. Amer. Meteor. Soc., 104, E1452–E1468, https://doi.org/10.1175/BAMS-D-21-0268.1.
    Ou, T. H., D. L. Chen, X. C. Chen, et al., 2020: Simulation of summer precipitation diurnal cycles over the Tibetan Plateau at the gray-zone grid spacing for cumulus parameterization. Climate Dyn., 54, 3525–3539, https://doi.org/10.1007/s00382-020-05181-x.
    Pagano, T. C., B. Casati, S. Landman, et al., 2024: Challenges of operational weather forecast verification and evaluation. Bull. Amer. Meteor. Soc., 105, E789–E802, https://doi.org/10.1175/BAMS-D-22-0257.1.
    Pan, L. J., H. F. Zhang, and J. P. Wang, 2014: Progress on verification methods of numerical weather prediction. Adv. Earth Sci., 29, 327–335, https://doi.org/10.11867/j.issn.1001-8166.2014.03.0327. (in Chinese)
    Pan, L. J., H. F. Zhang, J. H. M. Liu, et al., 2024: Advancements in study on the application of MODE verification method in weather forecasting. Adv. Earth Sci., 39, 193–206, https://doi.org/10.11867/j.issn.1001-8166.2024.012. (in Chinese)
    Qi, D., X. P. Cui, L. Q. Chen, et al., 2024: Method of object-based diagnostic evaluation for numerical heavy-precipitation prediction based on subjective and objective circulation classification: Application and testing over Northeast China during the warm season of 2019. Chinese J. Atmos. Sci., 48, 1113–1130, https://doi.org/10.3878/j.issn.1006-9895.2210.22107. (in Chinese)
    Qi, L. B., J. J. Wu, and C. H. Shi, 2020: Rethink on forecast focus of a torrential rainfall event at Jianghuai region. Torr. Rain Dis., 39, 647–657, https://doi.org/10.3969/j.issn.1004-9045.2020.06.013. (in Chinese)
    Qi, Q. Q., Y. J. Zhu, J. Chen, et al., 2023: Assessment of CMA-GEPS prediction capability for the extreme Meiyu process over China. Trans. Atmos. Sci., 46, 415–430, https://doi.org/10.13878/j.cnki.dqkxxb.20220306001. (in Chinese)
    Qin, W., G. Z. Liu, Z. Q. Lai, et al., 2020: Study on forecast errors and predictability of a warm-sector rainstorm in South China. Meteor. Mon., 46, 1039–1052, https://doi.org/10.7519/j.issn.1000-0526.2020.08.004. (in Chinese)
    Rodwell, M. J., D. S. Richardson, T. D. Hewson, et al., 2010: A new equitable score suitable for verifying precipitation in numerical weather prediction. Quart. J. Roy. Meteor. Soc., 136, 1344–1363, https://doi.org/10.1002/qj.656.
    Rong, X. Y., J. Li, H. M. Chen, et al., 2018: The CAMS climate system model and a basic evaluation of its climatology and climate variability simulation. J. Meteor. Res., 32, 839–861, https://doi.org/10.1007/s13351-018-8058-x.
    Roy, G., and V. Turcotte, 2007: Verification des Algorithms Radars du Gemlam 2.5. Internal Report, Severe Weather National Lab, Environment Canada, 13 pp.
    Schaefer, J. T., 1990: The critical success index as an indicator of warning skill. Wea. Forecasting, 5, 570–575, https://doi.org/10.1175/1520-0434(1990)005<0570:TCSIAA>2.0.CO;2.
    Shen, X. S., J. J. Wang, Z. C. Li, et al., 2020: Research and operational development of numerical weather prediction in China. J. Meteor. Res., 34, 675–698, https://doi.org/10.1007/s13351-020-9847-6.
    Song, F. F., and T. J. Zhou, 2014a: The climatology and interannual variability of East Asian summer monsoon in CMIP5 coupled models: Does air–sea coupling improve the simulations? J. Climate, 27, 8761–8777, https://doi.org/10.1175/JCLI-D-14-00396.1.
    Song, F. F., and T. J. Zhou, 2014b: Interannual variability of East Asian summer monsoon simulated by CMIP3 and CMIP5 AGCMs: Skill dependence on Indian Ocean–Western Pacific anticyclone teleconnection. J. Climate, 27, 1679–1697, https://doi.org/10.1175/JCLI-D-13-00248.1.
    Stephens, G., C. Jakob, and G. Tselioudis, 2015: The GEWEX process evaluation study: GEWEX-PROES. GEWEX News, 27, 4–5.
    Stevens, B., and S. Bony, 2013: What are climate models missing? Science, 340, 1053–1054, https://doi.org/10.1126/science.1237554.
    Sun, Y., and Y. H. Ding, 2008: Validation of IPCC AR4 climate models in simulating interdecadal change of East Asian summer monsoon. Acta Meteor. Sinica, 66, 765–780, https://doi.org/10.3321/j.issn:0577-6619.2008.05.010. (in Chinese)
    Tao, Z. Y., and Y. G. Zheng, 2013: Forecasting issues of the extreme heavy rain in Beijing on 21 July 2012. Torr. Rain Dis., 32, 193–201, https://doi.org/10.3969/j.issn.1004-9045.2013.03.001. (in Chinese)
    Tapiador, F. J., R. Roca, A. Del Genio, et al., 2019: Is precipitation a good metric for model performance? Bull. Amer. Meteor. Soc., 100, 223–233, https://doi.org/10.1175/BAMS-D-17-0218.1.
    Tsushima, Y., F. Brient, S. A. Klein, et al., 2017: The cloud feedback model intercomparison project (CFMIP) diagnostic codes catalogue – metrics, diagnostics and methodologies to evaluate, understand and improve the representation of clouds and cloud feedbacks in climate models. Geosci. Model Dev., 10, 4285–4305, https://doi.org/10.5194/gmd-10-4285-2017.
    Ullrich, P. A., C. Jablonowski, K. A. Reed, et al., 2016: Dynamical Core Model Intercomparison Project (DCMIP 2016) Test Case Document, 46 pp. Available online at https://github.com/ClimateGlobalChange/DCMIP2016. Accessed on 14 April 2025.
    Wan, Z. W., S. Y. Sun, B. Zhao, et al., 2024: Evaluation and error analysis of the July 2021 extremely severe rainstorm in Henan Province simulated by CMA-MESO model. Meteor. Mon., 50, 33–47, https://doi.org/10.7519/j.issn.1000-0526.2023.062101. (in Chinese)
    Wang, J., J. Chen, Y. L. Zhong, et al., 2017: Verification and evaluation of the southwest vortex forecast by GRAPES-REPS. Meteor. Mon., 43, 385–401. Available online at http://qxqk.nmc.cn/qxen/article/abstract/20170401. Accessed on 9 April 2025. (in Chinese)
    Wang, L., Y. Zhang, J. Li, et al., 2019: Understanding the performance of an unstructured-mesh global shallow water model on kinetic energy spectra and nonlinear vorticity dynamics. J. Meteor. Res., 33, 1075–1097, https://doi.org/10.1007/s13351-019-9004-2.
    Wang, N., H.-L. Ren, Y. Deng, et al., 2024: Understanding the causes of rapidly declining prediction skill of the East Asian summer monsoon rainfall with lead time in BCC_CSM1.1m. Climate Dyn., 62, 2807–2821, https://doi.org/10.1007/s00382-021-05819-4.
    Wang, X. J., G. J. Pang, and M. X. Yang, 2018: Precipitation over the Tibetan Plateau during recent decades: A review based on observations and simulations. Int. J. Climatol., 38, 1116–1131, https://doi.org/10.1002/joc.5246.
    Weigel, K., L. Bock, B. K. Gier, et al., 2021: Earth system model evaluation tool (ESMValTool) v2.0 – diagnostics for extreme events, regional and impact evaluation, and analysis of earth system models in CMIP. Geosci. Model Dev., 14, 3159–3184, https://doi.org/10.5194/gmd-14-3159-2021.
    Wernli, H., M. Paulat, M. Hagen, et al., 2008: SAL—a novel quality measure for the verification of quantitative precipitation forecasts. Mon. Wea. Rev., 136, 4470–4487, https://doi.org/10.1175/2008MWR2415.1.
    Wheeler, M., E. Maloney, and MJO Task Force, 2013: Madden–Julian oscillation task force: A joint effort of the climate and weather communities. CLIVAR Exchanges, 61, 9–12.
    Williams, K. D., A. Bodas-Salcedo, M. Déqué, et al., 2013: The transpose-AMIP II experiment and its application to the understanding of southern ocean cloud biases in climate models. J. Climate, 26, 3258–3274, https://doi.org/10.1175/JCLI-D-12-00429.1.
    Xiao, Y. H., L. Kang, L. N. Xu, et al., 2013: Discussion on relationship between prediction performance of mesoscale numerical model and weather process in Southwest China. Meteor. Mon., 39, 1257–1264. Available online at http://qxqk.nmc.cn/qxen/article/abstract/20131003. Accessed on 9 April 2025. (in Chinese)
    Xie, Y. Y., and J. J. Wang, 2021: Preliminary study on the deviation and cause of precipitation prediction of GRAPES kilometer scale model in southwest complex terrain area. Acta Meteor. Sinica, 79, 732–749, https://doi.org/10.11676/qxxb2021.053. (in Chinese)
    Xin, X. G., T. W. Wu, J. Zhang, et al., 2020: Comparison of CMIP6 and CMIP5 simulations of precipitation in China and the East Asian summer monsoon. Int. J. Climatol., 40, 6423–6440, https://doi.org/10.1002/joc.6590.
    Xu, C. L., J. J. Wang, and L. P. Huang, 2017: Evaluation on QPF of GRAPES-Meso4.0 model at convection-permitting resolution. Acta Meteor. Sinica, 75, 851–876, https://doi.org/10.11676/qxxb2017.068. (in Chinese)
    Xu, X., Y. Z. Ji, X. Zhou, et al., 2023: Reducing winter precipitation biases over the western Tibetan Plateau in the Model for Prediction Across Scales (MPAS) with a revised parameterization of orographic gravity wave drag. J. Geophys. Res. Atmos., 128, e2023JD039123, https://doi.org/10.1029/2023JD039123.
    Yang, X. S., J. L. Hu, D. H. Chen, et al., 2008: Verification of GRAPES unified global and regional numerical weather prediction model dynamic core. Chinese Sci. Bull., 53, 3458–3464, https://doi.org/10.1007/s11434-008-0417-z.
    Yao, J. C., T. J. Zhou, Z. Guo, et al., 2017: Improved performance of high-resolution atmospheric models in simulating the East Asian summer monsoon rain belt. J. Climate, 30, 8825–8840, https://doi.org/10.1175/JCLI-D-16-0372.1.
    Yu, B. Y., K. F. Zhu, M. Xue, et al., 2020: Using new neighborhood-based intensity-scale verification metrics to evaluate WRF precipitation forecasts at 4 and 12 km grid spacings. Atmos. Res., 246, 105117, https://doi.org/10.1016/j.atmosres.2020.105117.
    Yu, R. C., 1989: Design of the limited area numerical weather prediction model with steep mountains. Chinese J. Atmos. Sci., 13, 139–149, https://doi.org/10.3878/j.issn.1006-9895.1989.02.02. (in Chinese)
    Yu, R. C., 1994: A two-step shape-preserving advection scheme. Adv. Atmos. Sci., 11, 479–490, https://doi.org/10.1007/BF02658169.
    Yu, R. C., and Y. P. Xu, 2004: AREM and its simulations on the daily rainfall in summer in 2003. Acta Meteor. Sinica, 62, 715–723, https://doi.org/10.11676/qxxb2004.068. (in Chinese)
    Yu, R. C., B. Wang, and T. J. Zhou, 2004: Tropospheric cooling and summer monsoon weakening trend over East Asia. Geophys. Res. Lett., 31, L22212, https://doi.org/10.1029/2004GL021270.
    Yu, R. C., J. Li, H. M. Chen, et al., 2014: Progress in studies of the precipitation diurnal variation over contiguous China. Acta Meteor. Sinica, 72, 948–968, https://doi.org/10.11676/qxxb2014.047. (in Chinese)
    Yu, R. C., H. M. Chen, and W. Sun, 2015a: The definition and characteristics of regional rainfall events demonstrated by warm season precipitation over the Beijing plain. J. Hydrometeor., 16, 396–406, https://doi.org/10.1175/JHM-D-14-0086.1.
    Yu, R. C., J. Li, Y. Zhang, et al., 2015b: Improvement of rainfall simulation on the steep edge of the Tibetan Plateau by using a finite-difference transport scheme in CAM5. Climate Dyn., 45, 2937–2948, https://doi.org/10.1007/s00382-015-2515-3.
    Yu, R. C., J. Li, and P. Q. Jia, 2019a: Development of operational weather forecasting shaped by the “triple-in” properties of numerical models. WMO Bulletin, 68, 56–62. Available online at https://wmo.int/media/magazine-article/development-of-operational-weather-forecasting-shaped-triple-properties-of-numerical-models. Accessed on 14 April 2025.
    Yu, R. C., Y. Zhang, J. J. Wang, et al., 2019b: Recent progress in numerical atmospheric modeling in China. Adv. Atmos. Sci., 36, 938–960, https://doi.org/10.1007/s00376-019-8203-1.
    Yu, T. T., W. Chen, H. N. Gong, et al., 2023: Comparisons between CMIP5 and CMIP6 models in simulations of the climatology and interannual variability of the East Asian summer monsoon. Climate Dyn., 60, 2183–2198, https://doi.org/10.1007/s00382-022-06408-9.
    Yuan, W. H., X. L. Hu, and Y. S. Li, 2020: Evaluation of the hourly rainfall in the ECMWF forecasting over southwestern China. Meteor. Appl., 27, e1936, https://doi.org/10.1002/met.1936.
    Yue, S. Y., K. Yang, H. Lu, et al., 2021: Representation of stony surface–atmosphere interactions in WRF reduces cold and wet biases for the southern Tibetan Plateau. J. Geophys. Res. Atmos., 126, e2021JD035291, https://doi.org/10.1029/2021JD035291.
    Zhang, B., M. J. Zeng, A. N. Huang, et al., 2024a: A general comprehensive evaluation method for cross-scale precipitation forecasts. Geosci. Model Dev., 17, 4579–4601, https://doi.org/10.5194/gmd-17-4579-2024.
    Zhang, B., F. H. Zhang, X. L. Li, et al., 2024b: Verification and assessment of “23·7” severe rainstorm numerical prediction in North China. J. Appl. Meteor. Sci., 35, 17–32, https://doi.org/10.11898/1001-7313.20240102. (in Chinese)
    Zhang, D.-L., Y. H. Lin, P. Zhao, et al., 2013: The Beijing extreme rainfall of 21 July 2012: “Right results” but for wrong reasons. Geophys. Res. Lett., 40, 1426–1431, https://doi.org/10.1002/grl.50304.
    Zhang, X., W. Huang, and B. D. Chen, 2015: Implementation of the Klemp height-based terrain-following coordinate in the GRAPES regional model: Idealized tests and inter-comparison. Acta Meteor. Sinica, 73, 331–340, https://doi.org/10.11676/qxxb2015.014. (in Chinese)
    Zhang, Y., 2018: Extending high-order flux operators on spherical icosahedral grids and their applications in the framework of a shallow water model. J. Adv. Model. Earth Syst., 10, 145–164, https://doi.org/10.1002/2017MS001088.
    Zhang, Y., R. C. Yu, J. Li, et al., 2013: An implementation of a leaping-point two-step shape-preserving advection scheme in the high-resolution spherical latitude–longitude grid. Acta Meteor. Sinica, 71, 1089–1102, https://doi.org/10.11676/qxxb2013.085. (in Chinese)
    Zhang, Y., R. C. Yu, and J. Li, 2017: Implementation of a conservative two-step shape-preserving advection scheme on a spherical icosahedral hexagonal geodesic grid. Adv. Atmos. Sci., 34, 411–427, https://doi.org/10.1007/s00376-016-6097-8.
    Zhang, Y., J. Li, R. C. Yu, et al., 2019: A layer-averaged nonhydrostatic dynamical framework on an unstructured mesh for global and regional atmospheric modeling: Model description, baseline evaluation, and sensitivity exploration. J. Adv. Model. Earth Syst., 11, 1685–1714, https://doi.org/10.1029/2018MS001539.
    Zhang, Y., J. Li, R. C. Yu, et al., 2020: A multiscale dynamical model in a dry-mass coordinate for weather and climate modeling: Moist dynamics and its coupling to physics. Mon. Wea. Rev., 148, 2671–2699, https://doi.org/10.1175/MWR-D-19-0305.1.
    Zhang, Y., R. C. Yu, J. Li, et al., 2021: AMIP simulations of a global model for unified weather–climate forecast: Understanding precipitation characteristics and sensitivity over East Asia. J. Adv. Model. Earth Syst., 13, e2021MS002592, https://doi.org/10.1029/2021MS002592.
    Zhang, Y., Z. Liu, Y. M. Wang, et al., 2024: Establishing a limited-area model based on a global model: A consistency study. Quart. J. Roy. Meteor. Soc., 150, 4049–4065, https://doi.org/10.1002/qj.4804.
    Zhao, B., and B. Zhang, 2018: Application of neighborhood spatial verification method on precipitation evaluation. Torr. Rain Dis., 37, 1–7, https://doi.org/10.3969/j.issn.1004-9045.2018.01.001. (in Chinese)
    Zhao, D. C., Y. L. Lin, W. H. Dong, et al., 2023: Alleviated WRF summer wet bias over the Tibetan Plateau using a new cloud macrophysics scheme. J. Adv. Model. Earth Syst., 15, e2023MS003616, https://doi.org/10.1029/2023MS003616.
    Zhao, Y., T. J. Zhou, W. X. Zhang, et al., 2022: Change in precipitation over the Tibetan Plateau projected by weighted CMIP6 models. Adv. Atmos. Sci., 39, 1133–1150, https://doi.org/10.1007/s00376-022-1401-2.
    Zheng, Q., W. Sun, J. Li, et al., 2024: Impacts of moisture advection scheme on precipitation in the steep topography region between the Tibetan Plateau and the Sichuan basin. J. Appl. Meteor. Climatol., 63, 781–801, https://doi.org/10.1175/JAMC-D-23-0111.1.
    Zhong, Q., Z. Sun, H. M. Chen, et al., 2022: Multi-model forecast biases of the diurnal variations of intense rainfall in the Beijing–Tianjin–Hebei region. Sci. China Earth Sci., 65, 1490–1509, https://doi.org/10.1007/s11430-021-9905-4.
    Zhou, T. J., B. Wu, Z. Guo, et al., 2018: A review of East Asian summer monsoon simulation and projection: Achievements and problems, opportunities and challenges. Chinese J. Atmos. Sci., 42, 902–934, https://doi.org/10.3878/j.issn.1006-9895.1802.17306. (in Chinese)
    Zhou, X., K. Yang, and Y. Wang, 2018: Implementation of a turbulent orographic form drag scheme in WRF and its application to the Tibetan Plateau. Climate Dyn., 50, 2443–2455, https://doi.org/10.1007/s00382-017-3677-y.
    Zhou, X., K. Yang, A. Beljaars, et al., 2019: Dynamical impact of parameterized turbulent orographic form drag on the simulation of winter precipitation over the western Tibetan Plateau. Climate Dyn., 53, 707–720, https://doi.org/10.1007/s00382-019-04628-0.
    Zhou, X., K. Yang, Y. Z. Jiang, et al., 2022: The influence of bare ground thermal roughness length parameterization on the simulation of near-surface air and skin temperatures over the Tibetan Plateau. J. Geophys. Res. Atmos., 127, e2022JD037245, https://doi.org/10.1029/2022JD037245.
    Zhou, X., B. H. Ding, K. Yang, et al., 2023: Reducing the cold bias of the WRF model over the Tibetan Plateau by implementing a snow coverage–topography relationship and a fresh snow albedo scheme. J. Adv. Model. Earth Syst., 15, e2023MS003626, https://doi.org/10.1029/2023MS003626.
    Zhou, Y. H., Y. Zhang, J. Li, et al., 2020: Configuration and evaluation of a global unstructured mesh atmospheric model (GRIST-A20.9) based on the variable-resolution approach. Geosci. Model Dev., 13, 6325–6348, https://doi.org/10.5194/gmd-13-6325-2020.
    Zhou, Y. H., R. C. Yu, Y. Zhang, et al., 2023: Dynamic and thermodynamic processes related to precipitation diurnal cycle simulated by GRIST. Climate Dyn., 61, 3935–3953, https://doi.org/10.1007/s00382-023-06779-7.