Discrimination and validation of clouds and dust aerosol layers over the Sahara desert with combined CALIOP and IIR measurements

This study validates a method for discriminating between daytime clouds and dust aerosol layers over the Sahara Desert that uses a combination of active CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) and passive IIR (Infrared Imaging Radiometer) measurements; hereafter, the CLIM method. The CLIM method reduces misclassification of dense dust aerosol layers in the Sahara region relative to other techniques. When evaluated against a suite of simultaneous measurements from CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations), CloudSat, and MODIS (Moderate-resolution Imaging Spectroradiometer), the misclassification rate for dust using the CLIM technique is 1.16% during boreal spring 2007. This rate is lower than the misclassification rates for dust using the cloud aerosol discriminations performed for version 2 (V2-CAD; 16.39%) or version 3 (V3-CAD; 2.01%) of the CALIPSO data processing algorithm. The total identification errors for data from spring 2007 are 13.46% for V2-CAD, 3.39% for V3-CAD, and 1.99% for CLIM. These results indicate that CLIM and V3-CAD are both significantly better than V2-CAD for discriminating between clouds and dust aerosol layers. Misclassifications by CLIM in this region are mainly limited to mixed cloud-dust aerosol layers. V3-CAD sometimes misidentifies low-level aerosol layers adjacent to the surface as thin clouds, and sometimes fails to detect thin clouds entirely. The CLIM method is both simple and fast, and may be useful as a reference for testing or validating other discrimination techniques and methods.


Introduction
Dust is a major component of atmospheric aerosol with important direct and indirect effects on the earth-atmosphere system. Dust particles directly modulate the surface radiation budget by absorbing and reflecting solar radiation and trapping outgoing longwave radiation (Charlson et al., 1992; Kaufman et al., 2001; Wang et al., 2008). They also affect the microphysical properties of clouds by changing cloud condensation nuclei (CCN) concentrations, cloud droplet number concentrations, and cloud droplet sizes (Twomey, 1977; Albrecht, 1989; Tegen and Lacis, 1996; Miller et al., 2004; Huang et al., 2006a, b, 2010). The Sahara is the world's largest source of dust (Middleton and Goudie, 2001; Prospero et al., 2002), accounting for 40%-70% of the dust in the global atmosphere every year (Engelstaedter et al., 2006). Saharan dust is frequently transported across the Atlantic Ocean and the Mediterranean and Caribbean Seas (Prospero and Carlson, 1972; Ganor and Mamane, 1982; Prospero, 1996; Moulin et al., 1998; Colarco et al., 2003), and accounts for nearly 50% of the dust that settles in oceans (Miller et al., 2004). Saharan dust aerosols have wide-ranging global effects; however, the direct and indirect effects of dust aerosols on the global radiation budget have not yet been fully characterized (IPCC, 2007). Accurate monitoring and identification of the vertical and horizontal characteristics of clouds and dust aerosols over the Sahara Desert are therefore imperative for reducing uncertainty about the climatic effects of dust via the radiation budget. This monitoring and identification represent important contributions to research on climate change.
The Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) mission is an integral part of the A-Train satellite constellation (Stephens et al., 2002). CALIPSO was launched in April 2006 (Winker et al., 2006, 2007) with three instruments on board: the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), the passive Infrared Imaging Radiometer (IIR), and the Wide Field Camera (WFC). CALIOP collects global measurements of the vertical distributions of aerosols and clouds, cloud particle phase, and aerosol size (Winker et al., 2003; Hu et al., 2007a, b, c, 2010). CALIOP's ability to accurately retrieve profiles of aerosols and clouds, including depolarization, at high vertical resolution makes it a superb platform for reducing uncertainty in measurements of dust aerosols (Huang et al., 2007a, 2009; Liu et al., 2008; Chen et al., 2009). The first step in the scene classification algorithm (SCA) component of CALIOP data processing is cloud and aerosol discrimination (CAD). Liu et al. (2004) developed a CAD scheme based on three-dimensional probability density functions (3D PDFs). This CAD scheme was implemented in version 2 of the CALIPSO data processing algorithm (hereafter referred to as V2-CAD). Unfortunately, this scheme misclassifies dense dust layers as cloud because the PDFs of dense dust and smoke aerosols overlap with the PDFs of optically thin clouds. Previous research has demonstrated that infrared (IR) split window techniques based on brightness temperature differences (BTDs) can be used to discriminate between clouds and dust storms (Ackerman, 1997; Legrand et al., 2001; Zhang et al., 2006), although this approach tends to fail for optically thin dust layers. Chen et al. (2010) developed a new algorithm to detect dust aerosols that took the respective strengths and weaknesses of these two methods into account. Their method (hereafter, the CLIM method) combines CALIPSO active lidar and passive IIR measurements to greatly improve the identification of dust aerosols in desert areas, where dust aerosol layers can be particularly dense. Liu et al. (2010) implemented a new CAD scheme in version 3 of the CALIPSO data processing algorithm (hereafter, V3-CAD). This method extended V2-CAD to include additional information about volume depolarization ratio (VDR) and latitude, so that CAD is based on five-dimensional probability density functions (5D PDFs). The V3-CAD approach yields significantly better results for very dense dust and smoke layers located over or near dust source regions because the 5D PDFs allow accurate discrimination between these dense aerosol layers and clouds.
Here, we compare cloud and aerosol discrimination using the V2-CAD, V3-CAD, and CLIM methods over the Sahara Desert, with particular focus on quantifying the misidentification of aerosols and clouds using the V2-CAD method. We assess the improvement of cloud and aerosol discrimination achieved by using the V3-CAD and CLIM methods. We start by using nearly-simultaneous measurements from CALIPSO, CloudSat, and the Moderate-resolution Imaging Spectroradiometer (MODIS) to retrieve cloud and aerosol properties over the Sahara. Cloud and aerosol properties retrieved using these three datasets together are more accurate than those retrieved from the datasets separately. CAD based on the V2-CAD, V3-CAD, and CLIM methods is then compared for data obtained during spring 2008. Misclassification rates over the Sahara are computed for the V2-CAD, V3-CAD, and CLIM results. We also examine possible sources of error. Our results help to clarify and add value to results obtained in previous studies that used data based on V2-CAD.

Satellite data
We use measurements taken by the CALIPSO lidar (versions 2 and 3) and IIR, Aqua MODIS, and CloudSat over the Sahara during boreal spring (March-May) 2007 and 2008. The CALIPSO, CloudSat, and Aqua missions are all part of the NASA A-Train satellite constellation. Because these satellites fly in formation and are separated by just a few minutes, they provide approximately collocated, near-simultaneous observations of cloud and aerosol properties. Together, these observations comprise a multi-satellite observing platform for accurate retrievals of cloud and aerosol properties.

Aqua MODIS
A second MODIS instrument was launched onboard the Aqua satellite on 4 May 2002. This instrument observes a swath 2330-km wide and covers the entire surface of the earth every 1-2 days. Data are acquired in 36 spectral bands between 0.405 and 14.385 µm at three spatial resolutions: 250, 500, and 1000 m. We use Aqua MODIS Level-1B 500-m calibrated radiances in this study.
We also use global monthly data from Aqua MODIS (MYD08_M3) at a horizontal resolution of 1° × 1° to derive aerosol optical depth (AOD) at 550 nm over the Sahara in 2007. The gridded Level-3 MODIS product consists of three atmospheric products, each of which covers a different temporal scale: daily, 8-day, and monthly. The deep-blue retrieval algorithm for AOD (Hsu et al., 2004, 2006) is especially valuable over bright arid surfaces (e.g., deserts and sparsely vegetated areas), where the dark-target retrieval algorithm fails. The corrected algorithm is useful for dark surfaces over land. Based on monthly mean AOD distributions in 2007 (figure omitted), boreal spring was the most active season for dust generation over the Sahara. Figure 1 shows AODs over the Sahara and surrounding regions retrieved during boreal spring with the corrected and deep-blue algorithms. Almost all of the corrected AOD values are missing between 15° and 30°N. By contrast, the deep-blue AOD values have been retrieved effectively. The deep-blue AODs are also larger, indicating that thick dust aerosol layers are common over the Sahara region during this season. Our purpose in this study is to quantify the misclassification of dense dust aerosol layers and the improvement offered by the CLIM method. We therefore focus on the region where the dust layer is the thickest (15°-20°N, 0°-20°E). This region is indicated by a black rectangle in Fig. 1.

CALIPSO
The CALIOP lidar instrument acquires three simultaneous calibrated and geolocated lidar profiles: total attenuated backscatter at 532 nm, total attenuated backscatter at 1064 nm, and the perpendicular polarization components of backscatter at 532 nm (Hostetler et al., 2006). These profiles are provided in the CALIOP level-1B dataset and can be used to derive information about the sizes and shapes of atmospheric particles. We use versions 2 and 3 of the CALIOP Vertical Feature Mask (VFM) to identify the positions and vertical distributions of clouds and dust aerosol layers. We focus primarily on the misclassification of VFM data with the version 2 algorithm and improvements with the version 3 algorithm.
The IIR instrument has three channels centered at 8.65, 10.60, and 12.05 µm, which can be used to derive information about cirrus cloud particle size and infrared emissivity. Brightness temperature differences (BTDs) based on IIR level-2 swath data also provide an effective means for identifying dust storms (Ackerman, 1997; Legrand et al., 2001; Zhang et al., 2006).

CloudSat
The CloudSat platform carries the first satellite-based millimeter-wavelength cloud profiling radar (CPR). This instrument is more sensitive to cloud particles than existing weather radars, allowing the detection of smaller (cloud-size) liquid water and ice particles. The CPR is a 94-GHz nadir-pointing radar that measures the power backscattered by clouds as a function of distance from the radar. The vertical resolution of CloudSat profiles is 500 m with 240-m sampling. The cross-track horizontal resolution is 1.4 km, while the along-track resolution is 1.7 km. We use CloudSat level 2B-CLDCLASS vertical profiles and cloud classifications to determine the existence of clouds.

Method
The CALIPSO cloud-aerosol discrimination (CAD) algorithm was initially developed in 2004. This algorithm uses a confidence function based primarily on three-dimensional (3D) probability density functions (PDFs) of layer-averaged attenuated backscatter at 532 nm (β_532), layer-integrated color ratio (χ), and mid-layer altitude (z) (Liu et al., 2004). Previous studies have reported that V2-CAD correctly identifies thin clouds and aerosols but often misclassifies dense aerosol layers (Liu et al., 2004, 2009; Ma et al., 2011). This algorithm has recently been extended to use five-dimensional (5D) PDFs, with volume depolarization ratio (VDR) and latitude as the additional dimensions (Liu et al., 2010). Chen et al. (2010) developed the CLIM method for detecting dust aerosol by combining CALIPSO active lidar and passive IIR measurements. This method substantially improved the identification of dense aerosol layers relative to V2-CAD. They defined the dust index (DI_CLIM) as
$$DI_{CLIM} = A_0 + A_1\,BTD_1 + A_2\,BTD_2 + A_3\,\beta + A_4\,\delta + A_5\,\chi + A_6\,\varepsilon + A_7\,\zeta, \tag{1}$$
where BTD_1 is the BTD between the IIR channels at 10.60 and 12.05 µm, while BTD_2 is the BTD between the IIR channels at 8.65 and 10.60 µm; β is the layer-mean attenuated backscatter at 532 nm, which indicates the intensity of the backscatter from particles; δ is the layer-mean depolarization ratio (layer-integrated ratio of perpendicular-to-parallel attenuated backscatter at 532 nm), which provides information about the shape of particles; χ is the layer-integrated 1064/532-nm volume color ratio, which is sensitive to particle size; ε is the altitude above mean sea level (MSL) at the layer top; ζ is the altitude above MSL at the layer base; and A_0 to A_7 are fitting coefficients. Negative values of DI_CLIM indicate aerosols, while positive values indicate clouds. The coefficients A_0 to A_7 are first determined for a given region. The dust index can then be used to quickly discriminate between clouds and aerosols in that region. The CLIM method is simple and fast once the regional coefficients are well-defined.
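As an illustration of how the dust index is applied once the coefficients are known, the following minimal sketch assumes that DI_CLIM is a linear discriminant (an intercept plus one weight per predictor, which is consistent with eight coefficients for seven predictors). The coefficient values, the `Layer` container, the sample layers, and the sign convention chosen for each BTD are all hypothetical; they are not the values fitted for the Sahara.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One detected feature; every field name here is illustrative."""
    bt_865: float    # IIR brightness temperature at 8.65 um (K)
    bt_1060: float   # IIR brightness temperature at 10.60 um (K)
    bt_1205: float   # IIR brightness temperature at 12.05 um (K)
    beta: float      # layer-mean attenuated backscatter at 532 nm
    delta: float     # layer-mean depolarization ratio
    chi: float       # layer-integrated 1064/532-nm color ratio
    top_km: float    # layer-top altitude above MSL (km)
    base_km: float   # layer-base altitude above MSL (km)


# HYPOTHETICAL coefficients A_0..A_7, chosen only to make the example run.
HYPOTHETICAL_COEFFS = [-1.0, 0.5, 0.2, 10.0, -4.0, -0.5, 0.1, 0.1]


def dust_index(layer, coeffs):
    """Linear dust index: A_0 plus the weighted sum of the seven predictors."""
    btd1 = layer.bt_1060 - layer.bt_1205   # assumed sign: 10.60 minus 12.05
    btd2 = layer.bt_865 - layer.bt_1060    # assumed sign: 8.65 minus 10.60
    predictors = [btd1, btd2, layer.beta, layer.delta,
                  layer.chi, layer.top_km, layer.base_km]
    return coeffs[0] + sum(a * x for a, x in zip(coeffs[1:], predictors))


def classify(layer, coeffs):
    """Negative index -> aerosol, otherwise cloud (zero is treated as cloud)."""
    return "aerosol" if dust_index(layer, coeffs) < 0 else "cloud"


# Two made-up layers: a low, strongly depolarizing layer and a colder,
# optically brighter elevated layer.
dust_layer = Layer(285.0, 284.0, 285.5, 0.005, 0.30, 0.8, 4.0, 1.0)
cloud_layer = Layer(262.0, 260.0, 258.0, 0.08, 0.02, 1.0, 7.0, 6.0)
```

Because classification reduces to one weighted sum per layer, the cost per feature is constant, which is what makes the method fast once regional coefficients exist.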
To obtain the fitting values for A_0 to A_7 over the Sahara, we first select known cloud and dust scenes using a combination of CALIPSO, MODIS, and CloudSat imagery. If neither CloudSat nor MODIS identifies a cloud, any feature observed by CALIPSO that is not classified as a cloud is identified as dust aerosol. If neither CloudSat nor MODIS observes a cloud but CALIPSO identifies a cloud, this means that the 532-nm attenuated backscatter, depolarization ratio, and color ratio from CALIPSO measurements are relatively high while the altitude of the feature is relatively low. This situation indicates a feature that has been misidentified as a cloud by CALIPSO. If CloudSat and MODIS observe a cloud and CALIPSO identifies a cloud, the phase of the cloud is determined using the CALIPSO operational cloud phase algorithm (which is based on relationships between the depolarization ratio, backscatter intensity, temperature, and attenuated backscatter color ratio). Overall, 3437 cloud segments and 17492 aerosol segments were identified over the Sahara during spring 2008. These samples are presupposed to be accurate, and are used to determine the coefficients in Eq. (1) using a Fisher discriminant analysis (Mika et al., 1999). Table 1 lists the values of the coefficients derived for the Sahara desert. Figure 2 shows the occurrence of clouds and dust aerosols associated with different values of DI_CLIM. This method reliably discriminates between dust aerosol layers and clouds with very few misclassifications.
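The fitting step can be sketched with a classic two-class linear Fisher discriminant. The example below is a simplified illustration, not the procedure used to produce Table 1: it uses only two predictors (a BTD and the depolarization ratio) so that the within-class scatter matrix can be inverted in closed form, the training samples are invented, and placing the decision threshold at the midpoint of the projected class means is an assumption.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter_2d(rows, mu):
    """2x2 scatter matrix of one class about its own mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - mu[0], r[1] - mu[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_fisher_2d(dust, cloud):
    """Return (a0, a1, a2) so that a0 + a1*x1 + a2*x2 < 0 means dust."""
    mu_d, mu_c = mean_vec(dust), mean_vec(cloud)
    sd, sc = scatter_2d(dust, mu_d), scatter_2d(cloud, mu_c)
    # Pooled within-class scatter matrix S_w.
    sw = [[sd[i][j] + sc[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (mu_c[0] - mu_d[0], mu_c[1] - mu_d[1])
    # w = S_w^{-1} (mu_cloud - mu_dust), using the closed-form 2x2 inverse.
    a1 = (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det
    a2 = (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det
    # Intercept places zero at the midpoint between the two class means.
    mid = ((mu_c[0] + mu_d[0]) / 2, (mu_c[1] + mu_d[1]) / 2)
    a0 = -(a1 * mid[0] + a2 * mid[1])
    return a0, a1, a2


def score(coeffs, x):
    """Evaluate the fitted discriminant for one sample (x1, x2)."""
    return coeffs[0] + coeffs[1] * x[0] + coeffs[2] * x[1]


# INVENTED training samples: (BTD in K, depolarization ratio).
DUST = [(-1.5, 0.30), (-1.0, 0.25), (-2.0, 0.35), (-1.2, 0.28)]
CLOUD = [(1.5, 0.02), (2.0, 0.05), (1.0, 0.01), (2.5, 0.03)]

coeffs = fit_fisher_2d(DUST, CLOUD)
```

Orienting the weight vector from the dust mean toward the cloud mean reproduces the sign convention in the text (negative for aerosol, positive for cloud). Extending the sketch to all seven predictors only requires a general linear solver in place of the 2x2 inverse.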
The method is validated using observations taken over the same region during boreal spring 2007. [...] along the track corresponding to the blue segments shown in Fig. 3. The CloudSat cloud scenario classification for this segment is shown in Fig. 4d. The non-spherical shapes of dust particles mean that the depolarization ratio for dust is relatively high (> 0.1). The total attenuated backscatter and color ratio (> 0.5) are also large for dust. Depolarization ratios and color ratios are much lower (≈ 0) for other types of aerosols, with the exception of sea salt aerosols. Water clouds can be discriminated from dust aerosols using depolarization ratios because water droplets are spherical. Although ice cloud particles typically have high values of δ, dust aerosols can be distinguished from ice clouds by examining the BTD and altitude of the feature. The values of 532-nm total attenuated backscatter, depolarization ratio, and color ratio in Figs. 4a-c all indicate that the features observed along this segment were dust aerosols and not clouds. This conclusion is confirmed by the CloudSat classification along the satellite track (Fig. 4d), which also did not indicate the presence of clouds. The CALIPSO lidar level-2 VFM classifies aerosols into several types and classifies clouds into ice or liquid phases. Figure 5 shows the vertical feature masks from V2-CAD, V3-CAD, and CLIM for the case shown in Fig. 4. The V2-CAD method frequently misclassifies dust as cloud (Fig. 5a). The V3-CAD method represents a significant improvement relative to V2-CAD, but some footprints are still misclassified as clouds (Fig. 5b). The CLIM method (Fig. 5c) provides the most accurate discrimination between clouds and dust aerosol layers for this case.
[...] The V2-CAD algorithm misidentifies 27 cloud segments as dust (Fig. 6a) and 2326 (16.39%) dust segments as cloud (Fig. 6b).
The V2-CAD algorithm frequently misclassifies aerosol layers (particularly dense dust layers) as clouds because the PDFs of 532-nm total attenuated backscatter intensity and color ratio overlap for dense dust layers and ice clouds. Liu et al. (2010) added volume depolarization ratio and latitude as extra parameters for discriminating between clouds and aerosols in the V3-CAD algorithm. These additions significantly reduce the misclassification of dust layers as clouds, with only 308 (2.01%) dust layers misclassified (Fig. 6b); however, the V3-CAD algorithm misidentifies 285 (9.37%) cloud segments as dust (Fig. 6a). The CLIM algorithm misidentifies 182 (5.54%) cloud segments as dust (Fig. 6a) and 165 (1.16%) dust segments as clouds (Fig. 6b). The numbers of misidentified cloud segments are 27, 285, and 182 for the V2-CAD, V3-CAD, and CLIM methods, respectively. Although the numbers of misclassified cloud segments for V3-CAD and CLIM (285 and 182) are small, the corresponding misidentification rates reach 9.37% and 5.54% because the sample contains far fewer cloud segments than dust segments. The higher cloud identification errors of the V3-CAD and CLIM methods are therefore partially attributable to the small number of cloud segments. Overall, the results indicate that the CLIM method can reduce misclassification rates of clouds and dust aerosol layers in the Sahara region.

Case study
Following Chen et al. (2010), the total dust identification error R_d can be defined as
$$R_d = \frac{N_{ed} + N_{ec}}{N_d} \times 100\%, \tag{2}$$
where N_ed and N_ec are the numbers of segments in which dust or cloud, respectively, were misclassified and N_d is the total number of dust segments. The values of R_d for V2-CAD, V3-CAD, and CLIM were 16.58%, 4.18%, and 2.45%, respectively. The total identification error R_t for this class of scenes can be similarly defined as
$$R_t = \frac{N_{ed} + N_{ec}}{N} \times 100\%, \tag{3}$$
where the denominator N is the total number of segments. The values of R_t for V2-CAD, V3-CAD, and CLIM are 13.46%, 3.39%, and 1.99%, respectively. These results confirm that the CLIM method improves the automated discrimination between clouds and aerosols in CALIPSO observations over the Sahara region. The following section investigates the sources of identification errors in each method.
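The two error measures can be checked with a few lines of arithmetic. The sketch below assumes that both are the sum of misclassified dust and cloud segments divided by, respectively, the number of dust segments and the total number of segments. The misclassified counts are those reported for spring 2007; the totals `N_DUST` and `N_TOTAL` are not stated in the text and are approximate values back-computed here from the reported counts and percentages, so they should be treated as assumptions.

```python
def identification_errors(n_ed, n_ec, n_dust, n_total):
    """Return (R_d, R_t) in percent.

    n_ed: dust segments misclassified as cloud
    n_ec: cloud segments misclassified as dust
    """
    wrong = n_ed + n_ec
    return 100.0 * wrong / n_dust, 100.0 * wrong / n_total


# ASSUMED totals, inferred rather than reported:
# 2326 dust segments at 16.39% implies roughly 14192 dust segments, and
# R_t = 13.46% for V2-CAD implies roughly 17481 segments in total.
N_DUST = 14192
N_TOTAL = 17481

# Misclassified counts (dust as cloud, cloud as dust) quoted in the text.
COUNTS = {
    "V2-CAD": (2326, 27),
    "V3-CAD": (308, 285),
    "CLIM": (165, 182),
}

RATES = {
    name: identification_errors(n_ed, n_ec, N_DUST, N_TOTAL)
    for name, (n_ed, n_ec) in COUNTS.items()
}
```

With these inferred totals, the computed values agree with the reported R_d and R_t for all three methods to within rounding, which supports reading both measures as combined-error ratios that differ only in their denominator.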

Errors in the V3-CAD algorithm
The V3-CAD algorithm has been validated using data collected in the spring of 2007. The dust identification error was 4.18%, a significant improvement relative to the V2-CAD algorithm (16.58%). The case study presented in Section 4 also revealed frequent misclassification of very dense dust layers by V2-CAD, which was substantially improved when V3-CAD was used instead. However, the V3-CAD algorithm still misclassifies some scenes, such as low-level aerosol layers adjacent to the surface. Figure 7 shows an example of this type of misclassification.
The attenuated backscatter at 532 nm, the depolarization ratio, and the backscatter color ratio indicate the presence of dust aerosol layers in CALIPSO observations from 2 May 2008 at 16.3°-17.0°N and 17.8°-18.5°N. CloudSat did not identify clouds in either of these regions (Figs. 8a and 8b). The V2-CAD (Fig. 7d) and CLIM (Fig. 7f) algorithms both correctly identify these features as dust layers, but the V3-CAD method (Fig. 7e) misidentifies some near-surface dust layers as clouds. This is the most common type of misidentification of dust as clouds using the V3-CAD algorithm, but the reasons behind this type of misclassification are unclear. The V3-CAD algorithm also misses some optically thin clouds, which may introduce some errors.

Errors in the CLIM algorithm
The CLIM algorithm introduced by Chen et al. (2010) yielded a total dust misidentification rate of 2.45%, the lowest rate among these three algorithms. The weakness of this method is that it only works for single-layer features, and frequently misclassifies clouds or dust aerosol layers when dust aerosols are mixed with clouds. By contrast, V3-CAD performs well in these situations. Figure 9 presents an example of this type of misclassification using the CLIM algorithm. The total 532-nm attenuated backscatter and depolarization ratio (Figs. 9a and 9b) indicate thin dust layers near 4.8 km with water clouds located above them. The CloudSat cloud classification for this track supports this interpretation. V2-CAD only identifies clouds (Fig. 9c) because the optical properties of this layer were dominated by the contributions of the cloud layer (Liu et al., 2010). Dust layers without overlying clouds are correctly classified as dust in Fig. 9d, but the CLIM method identifies the layers with clouds mixed with dust aerosols as exclusively cloud layers. This misidentification occurs because the CLIM method does not account for features with multiple layers when the vertical spacing of adjacent layers is less than 0.6 km. Overall, the CLIM method performs better than the V2-CAD method for mixed cloud and dust aerosol layers.
Fig. 7. Along-track CALIPSO (a) total attenuated backscatter at 532 nm, (b) depolarization ratio, (c) 1064/532-nm backscatter color ratio, (d) vertical feature mask using the V2-CAD method, (e) vertical feature mask using the V3-CAD method, and (f) vertical feature mask using the CLIM method from 2 May 2008. The V3-CAD algorithm misidentifies some near-surface dust aerosol layers as clouds.

Conclusions
The Sahara Desert emits large quantities of dust aerosol into the atmosphere every year. This dust has important effects on the earth's climate system. The CALIPSO satellite was launched in 2006 to provide vertical information about clouds and aerosols. Previous studies have indicated that the V2-CAD data processing algorithm misclassifies some scenes, especially those with dense dust aerosol layers (Liu et al., 2004, 2009). To address this problem, Chen et al. (2010) developed the CLIM algorithm based on simultaneous IIR and lidar measurements. Liu et al. (2010) developed the V3-CAD algorithm using an expanded set of PDFs. Here, we have used all three algorithms to validate their cloud and aerosol classifications over the Sahara Desert region.
The CLIM method yielded a total dust misclassification rate of 2.45%, lower than either V2-CAD (16.58%) or V3-CAD (4.18%). The misclassification rate of dust as clouds using the CLIM method is only 1.16%, again lower than the rates using V2-CAD (16.39%) or V3-CAD (2.01%). The overall misidentification rates (i.e., clouds as dust or dust as clouds) are 13.46% for V2-CAD, 3.39% for V3-CAD, and 1.99% for CLIM. These results demonstrate that the CLIM method classifies features more accurately over the Sahara than either V2-CAD or V3-CAD. V3-CAD also provides a substantial improvement over V2-CAD with respect to the classification of dense aerosol layers, including dust layers over the source regions. However, the V3-CAD algorithm misclassifies some dust aerosol layers adjacent to the surface as clouds, and misses some optically thin cloud edges. The lidar parameters of thin cloud edges are similar to those of dust, resulting in misidentification. Misclassification using the CLIM method is largely due to mixed cloud-dust layers, as this method was developed specifically for single-layer features. Feature classification using the V2-CAD and V3-CAD algorithms is more accurate over other regions than over the Sahara source region, so their overall performance on a global scale is not reflected in the results presented in this paper.
The results of this study are based on satellite data taken over the Sahara Desert during boreal spring (March-May) of 2007 and 2008. This time frame and focus area are insufficient to definitively quantify the ability of CLIM and V3-CAD to discriminate between clouds and aerosols. Further research should be conducted using surface and space-based observations to validate and further improve the CLIM method. The global application of the method should be tested for other dust source regions, during nighttime, during different seasons, and for multi-layer features, which account for 37.3% of all features globally. The method depends only on infrared and lidar measurements. It should therefore apply to both daytime and nighttime conditions. In cases of multiple dust layers or dust layers covered by clouds, microwave radiation can penetrate the dust layer with little attenuation. Moreover, Huang et al. (2007b) showed that microwave radiation is well suited for monitoring dust storms located under ice clouds, indicating that the performance of the CLIM method may be improved by integrating microwave observations. Ge et al. (2008) found that the signal from atmospheric dust could be separated from surface radiation because atmospheric dust particles produce stronger scattering at high frequencies, thereby depolarizing the background desert signature. Using these methods in combination may help to overcome some of the weaknesses that appear when each technique is used alone.
(http://www.cloudsat.cira.colostate.edu). CALIPSO data were obtained from the Atmospheric Sciences Data Center (ASDC) at NASA Langley Research Center. The MODIS data were obtained from the NASA Earth Observing System Data and Information System, Distributed Active Archive Center (DAAC) at the GSFC. The language editor for this manuscript is Dr. Jonathon S. Wright.