Forecast Systems Laboratory
Boulder, Colorado
For several years, the National Weather Service (NWS) has been engaged in activities directed toward modernizing and restructuring its operations. The major components of these activities are the development of a new radar system (whose individual hardware units are known as WSR-88D), a new Automated Surface Observing System (ASOS), and a new communications and forecaster workstation system, the Advanced Weather Interactive Processing System (AWIPS).
To assist the NWS with the deployment of the AWIPS system, NOAA's Forecast Systems Laboratory (FSL) is currently building an AWIPS prototype, known as WFO-Advanced (Grote and Biere 1998). WFO-Advanced components include national-scale data ingest, data management, on-demand product generation, meteorological display, hydrometeorological application, and Local Data Acquisition and Dissemination (LDAD).
The LDAD component automates NWS field office interactions with local data observation systems, spotter networks, cooperative observers, and members of the local decisionmaking community. The system will enable each NWS Weather Forecast Office (WFO) to 1) provide support for the acquisition, quality control, and integration of weather observations into the AWIPS processing environment; 2) facilitate two-way communications between the WFOs and state and local government agencies; and 3) disseminate AWIPS weather information using advanced visualization and integration techniques to decisionmakers in local and state communities, and to the public at large.
This paper describes the LDAD observation Quality Control and Monitoring System (QCMS). For overviews of the LDAD system, see Jesuroga et al. (1998) and Subramaniam et al. (1998a).
Requirements for the quality control (QC) of incoming data to the AWIPS system are provided by the NWS Techniques Specification Package (TSP) 88-21-R1 (1993). The techniques described in the TSP are meant to
"assure that watches, warnings, and general information disseminated to the public are based on accurate and current data by:
Two categories of QC checks, static and dynamic, are described in the TSP for a variety of observation types, including surface, buoy, ship, profiler, aircraft, and rawinsonde data. The static QC checks are single-station, single-time checks which, as such, are unaware of the previous and current meteorological or hydrologic situation described by other observations and grids. Checks falling into this category include: validity, climatological, internal consistency, and vertical consistency checks. Although useful for locating extreme outliers in the observational database, the static checks have difficulty with statistically reasonable, but invalid data. To address these difficulties, the TSP also describes dynamic checks which refine the QC information by taking advantage of other available hydrometeorological information. Examples of dynamic QC checks include: positional consistency, temporal consistency, spatial consistency, and model consistency checks.
The TSP also describes the requirement for a "data descriptor," a data structure intended to provide an overall opinion of the quality of an observation by combining the information from the various QC checks. Algorithms described to compute the data descriptor are a function of the types of QC checks applied to the observation, the sophistication of those checks, and the departure of the observation from the expected values provided by the QC checks.
Further requirements include establishing a QC database, so that the results of the QC procedures can be stored, logged, and provided to both NWS forecasters at the WFO and the data providers in charge of station maintenance. Forecasters are required to have the capability to override the results of the QC checks.
Time constraints prevent the full implementation of the TSP by AWIPS Build 4.0 deadlines. The QC capabilities that will be provided are described in this section. Note that only hourly surface and buoy observations of sea-level pressure (SLP), temperature, winds, and humidity will be processed for the Build 4.0 system.
Two QC checks, one static and one dynamic, will be used in the initial version of the LDAD QCMS. The checks are a validity check, which compares observed values to specified tolerance limits, and a spatial consistency check, which compares observed values to estimated values derived using observations at neighboring locations. Both checks were originally developed as part of a surface assimilation system (Miller and Benjamin 1992) developed at FSL as part of the Mesoscale Analysis and Prediction System (MAPS). The surface system, known as MSAS for the MAPS Surface Assimilation System, has been running hourly at FSL since 1986, and at the National Centers for Environmental Prediction (NCEP) since 1989. At NCEP, it is known as the Rapid Update Cycle Surface (RUCS) system. QC results from the MSAS and RUCS systems have been used for several years by the Profiler Control Center (PCC) to monitor the quality of the stations in the Profiler Surface Observing System (PSOS) (Miller and Fozzard 1994), and by the ASOS Operations and Monitoring Center to monitor the quality of stations in the ASOS network (Miller and Morone 1993). The initial LDAD implementation of MSAS will ingest, QC, and analyze most AWIPS surface observations contained in a domain covering the 48 contiguous states and neighboring areas of Canada and Mexico. The observations include surface data available over the Satellite Broadcast Network, i.e. standard Meteorological Aviation Reports (METARs) and reports from PSOS and buoy stations, as well as reports from the local mesonets available at each WFO.
Table 1 lists the tolerance limits used in the validity check. Observations not falling within these limits are flagged as bad by the validity check.
Table 1. Tolerance limits for the validity quality control check. Observations not falling between these limits are flagged as bad.
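The validity check described above amounts to a range test per variable. A minimal sketch follows, assuming a simple dictionary of limits; the variable names and numeric limits are hypothetical placeholders chosen for illustration and are not the values from Table 1.

```python
from typing import Dict, Tuple

# Hypothetical placeholder limits for illustration only.
# These are NOT the tolerance limits listed in Table 1.
VALIDITY_LIMITS: Dict[str, Tuple[float, float]] = {
    "sea_level_pressure_hpa": (850.0, 1100.0),
    "temperature_c": (-60.0, 60.0),
}


def validity_check(variable: str, value: float) -> str:
    """Return 'good' if the value lies within the limits, otherwise 'bad'."""
    low, high = VALIDITY_LIMITS[variable]
    return "good" if low <= value <= high else "bad"


if __name__ == "__main__":
    print(validity_check("temperature_c", 20.0))
    print(validity_check("temperature_c", 95.0))
```

Because the check uses only the observation itself, it can run as soon as a report arrives, without waiting for neighboring data.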
The spatial consistency (or "buddy") check is performed using an Optimal Interpolation (OI) technique developed by Belousov et al. (1968). At each observation location, the difference between the measured value and the value analyzed by OI is computed. If the magnitude of the difference is small, the observation agrees with its neighbors and is considered correct. If, however, the difference is large, either the observation being checked or one of the observations used in the analysis is bad. To determine which is the case, a reanalysis to the observation location is performed by eliminating one neighboring observation at a time. If successively eliminating each neighbor does not produce an analysis that agrees with the target observation (the observation being checked), the observation is flagged as bad. If eliminating one of the neighboring observations produces an analysis that agrees with the target observation, then the target observation is flagged as "good" and the neighbor is flagged as "suspect." Suspect observations are not used in subsequent OI analyses. Figure 1 illustrates the reanalysis procedure.
Figure 1. Graphic illustration of reanalysis procedure used in the spatial consistency check to determine if the target observation is bad or if one of the observations used in the QC analysis is bad. The reanalysis procedure is implemented only if the difference between the target observation and the analysis is greater than an error threshold that is a function of the analysis error.
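The leave-one-out logic of the reanalysis procedure can be sketched as follows. This is an illustration only: the function `analyze` is a stand-in that uses inverse-distance-squared weighting rather than optimal interpolation, and the fixed `threshold` argument replaces the analysis-error-dependent threshold used in the actual system. All names are hypothetical.

```python
from typing import List, Optional, Tuple

Obs = Tuple[float, float, float]  # (x_km, y_km, value)


def analyze(x: float, y: float, neighbors: List[Obs]) -> float:
    """Stand-in for the OI analysis: inverse-distance-squared weighted mean."""
    num = den = 0.0
    for nx, ny, value in neighbors:
        weight = 1.0 / max((nx - x) ** 2 + (ny - y) ** 2, 1e-6)
        num += weight * value
        den += weight
    return num / den


def buddy_check(target: Obs, neighbors: List[Obs],
                threshold: float) -> Tuple[str, Optional[int]]:
    """Return (flag for the target, index of the suspect neighbor or None)."""
    x, y, observed = target
    if abs(analyze(x, y, neighbors) - observed) <= threshold:
        return "good", None
    # Large difference: reanalyze, eliminating one neighbor at a time.
    for i in range(len(neighbors)):
        reduced = neighbors[:i] + neighbors[i + 1:]
        if reduced and abs(analyze(x, y, reduced) - observed) <= threshold:
            return "good", i  # neighbor i is flagged as suspect
    return "bad", None


if __name__ == "__main__":
    nbrs = [(10.0, 0.0, 10.2), (0.0, 10.0, 9.8),
            (-10.0, 0.0, 30.0), (0.0, -10.0, 10.1)]
    print(buddy_check((0.0, 0.0, 10.0), nbrs, threshold=1.0))
```

In the usage example the third neighbor carries an outlying value, so removing it brings the analysis back into agreement with the target, and that neighbor is the one marked suspect.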
To improve the performance of the OI, MSAS analysis fields from the previous hour are used as background grids. The analyses provide an accurate 1-h persistence forecast and allow the incorporation of previous surface observations, thus improving temporal continuity near stations that report less frequently than hourly. The differences between the observations and the background are calculated and then interpolated to each observation point before the OI analysis is performed. In addition, uniform distribution of the neighboring observations used in the spatial consistency check is guaranteed (whenever possible) by the search algorithm illustrated in Figure 2. The algorithm locates the nearest observation in each of eight directional sectors distributed around the target observation.
Figure 2. Observation search sectors for the QC analysis. The target observation is marked by G. Grid volumes in each sector are searched with increasing distance from the center until a grid volume is found that contains a report.
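The effect of the sector search can be illustrated with a simplified sketch. It assumes station positions in a flat Cartesian plane and selects the nearest station in each of eight 45-degree sectors directly from the bearings, rather than stepping outward through grid volumes as the actual algorithm does. The function name and data layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def nearest_per_sector(target: Tuple[float, float],
                       stations: List[Tuple[float, float]],
                       n_sectors: int = 8) -> Dict[int, int]:
    """Map sector number -> index of the nearest station in that sector.

    Sector 0 starts at north and sectors are numbered clockwise.
    """
    tx, ty = target
    width = 360.0 / n_sectors
    best: Dict[int, Tuple[int, float]] = {}
    for idx, (x, y) in enumerate(stations):
        dx, dy = x - tx, y - ty
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # skip the target itself
        bearing = math.degrees(math.atan2(dx, dy)) % 360.0
        sector = int(bearing // width) % n_sectors
        if sector not in best or dist < best[sector][1]:
            best[sector] = (idx, dist)
    return {sector: idx for sector, (idx, _) in best.items()}


if __name__ == "__main__":
    stations = [(1.0, 2.0), (2.0, 4.0), (-2.0, 0.5), (0.5, -5.0)]
    print(nearest_per_sector((0.0, 0.0), stations))
```

Choosing one neighbor per sector keeps a dense cluster of stations on one side of the target from dominating the analysis.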
Temperature observations are converted to potential temperature before application of the spatial consistency check. Potential temperature varies more smoothly over mountainous terrain when the boundary layer is relatively deep and well mixed, a marked advantage during daytime hours. For example, potential temperature gradients associated with fronts tend to be well defined during the day even in mountainous terrain (Sanders and Doswell 1995). Unfortunately, this advantage often disappears at night when cool air pools in valleys. To improve the efficacy of the spatial consistency check in these circumstances, elevation differences are incorporated to help model the horizontal correlation between mountain stations (Miller and Benjamin 1992).
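For readers unfamiliar with the conversion, the sketch below applies the standard definition of potential temperature (Poisson's equation) with a 1000-hPa reference pressure. The constant and function name are assumptions for illustration; the text does not specify the exact formulation used in MSAS.

```python
KAPPA = 0.2857  # R_d / c_p for dry air (approximate, assumed here)


def potential_temperature(temp_k: float, pressure_hpa: float,
                          ref_pressure_hpa: float = 1000.0) -> float:
    """Potential temperature (K) from temperature (K) and station pressure (hPa)."""
    return temp_k * (ref_pressure_hpa / pressure_hpa) ** KAPPA


if __name__ == "__main__":
    # A valley station and a mountain station in a well-mixed boundary layer:
    # the temperatures differ by about 13 K, the potential temperatures by far less.
    print(round(potential_temperature(293.0, 1000.0), 1))
    print(round(potential_temperature(280.0, 850.0), 1))
```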
The error threshold (to which the absolute value of the difference between analyzed and observed values is compared) is a function of the forecast error, the observational measurement error, and the expected analysis error (Belousov et al. 1968, pg. 128).
Specifically, the threshold is computed as
where $\sigma_f$ is the standard deviation of the background error, $\sigma_o$ is the standard deviation of the expected observational error, $w_i$ is the OI weight for the $i$th observation used in the analysis, $\rho_i$ is the correlation between the target observation and the $i$th observation used in the analysis, $N$ is an empirical constant to be described later, and
is the expected OI analysis error.
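As a hedged illustration of how a threshold of this kind might be evaluated, the sketch below makes two assumptions that are not taken from the text: that the expected OI analysis error variance has the standard form sigma_a^2 = sigma_f^2 (1 - sum_i w_i rho_i), and that the threshold is N times the square root of the sum of the analysis and observation error variances. The exact expression used in MSAS may differ; see Belousov et al. (1968).

```python
import math
from typing import Sequence


def analysis_error_variance(sigma_f: float, weights: Sequence[float],
                            correlations: Sequence[float]) -> float:
    """ASSUMED form: sigma_f^2 * (1 - sum_i w_i * rho_i), floored at zero."""
    reduction = sum(w * rho for w, rho in zip(weights, correlations))
    return max(sigma_f ** 2 * (1.0 - reduction), 0.0)


def error_threshold(n_const: float, sigma_f: float, sigma_o: float,
                    weights: Sequence[float],
                    correlations: Sequence[float]) -> float:
    """ASSUMED form: N * sqrt(sigma_a^2 + sigma_o^2)."""
    var_a = analysis_error_variance(sigma_f, weights, correlations)
    return n_const * math.sqrt(var_a + sigma_o ** 2)


if __name__ == "__main__":
    # Illustrative numbers only, not MSAS settings.
    print(error_threshold(3.0, 2.0, 1.0, [0.3, 0.3], [0.8, 0.7]))
```

Under these assumptions, a target with few or poorly correlated neighbors has a larger analysis error and therefore a more tolerant threshold, which matches the motivation given in the following paragraphs.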
Including the OI error in the threshold is necessary to determine whether large differences between analyses and observations are caused by observational error or uncertainty in the analyzed value at the observation point. Optimal interpolation allows this distinction by providing an estimate of the interpolation error. The magnitude of the error depends on the type, location, density, and number of surrounding observations used in the OI analysis.
The expected observational error accounts for both the measurement and the sampling (unrepresentativeness) errors present in the observations. Since all observations contain these errors, it is more effective to compare the observation minus analysis difference to the expected analysis error plus the observation error than to the analysis error alone. Observational errors in the MSAS system are empirically determined. Pressure observations are considered the highest quality data and are assigned the lowest observational error. Temperature observations are considered second highest quality and are assigned the second lowest observation error. Wind and humidity observations follow pressure and temperature in data quality, with humidity receiving the highest observational error.
Background error standard deviations are included in the threshold function as part of the expected OI analysis error. These errors, like the observational errors, are empirically determined in the MSAS system. New background errors based on MSAS 1-h persistence forecasts are under consideration, but have not yet been implemented.
The empirical constant N in the threshold formula accounts for the fact that the true variability in meteorological fields is several times greater than that which can be represented by the analysis and observational errors (Belousov et al. 1968). For example, near the center of lows and highs, observed values may differ significantly from the analyzed values but are not "bad" in the sense that they contain errors and should be flagged. Horizontal quality control procedures that contain values of N that are too small (close to one) would incorrectly flag these observations as bad. On the other hand, values of N that are too large result in erroneous observations passing through the quality control procedures unflagged. Correct specification of this parameter is crucial to the performance of the quality control. Statistical procedures were used to set the original threshold parameters for each analysis variable in the MSAS system (Miller and Benjamin 1991). The parameters have since been refined using results from numerical simulations, and will continue to be refined based on additional statistical and numerical studies. The current threshold parameters range from 2.0 for sea-level pressure to 4.2 for potential temperature. The resulting error thresholds identify bad observations, but may also occasionally flag good observations.
The LDAD QCMS will also keep statistics on the frequency and magnitude of the observational errors encountered for sea-level pressure, potential temperature, dewpoint, and surface wind. At the completion of each hourly analysis, the system provides the total number of observations for each variable, the number of observations that failed the QC check, the station names for the failed observations, and the error and threshold values for each of the failed observations. The error is defined as the difference between the QC analysis value and the observed value, as computed in the spatial consistency check described above.
Statistics will be calculated for all stations in the MSAS domain. Stations from different networks will be kept statistically separate. Specifically, for the Build 4.0 implementation of the LDAD QCMS, the following stratifications will be maintained: "ASOS," "SAO" (METAR manual), "AUTO" (METAR automated, but not ASOS), "BUOY," and "NPN" (NOAA Profiler Network). Local mesonets will be stratified by provider, for example "CDOT" for the Colorado Department of Transportation.
The statistics will be reported in the form of hourly QC messages. Figure 3 shows the CDOT hourly message for 16 September 1997 at 1300 UTC. Statistics for the total number of observations ("TOTAL OBS"), the total number of observations that failed the QC check ("QST OBS"), and the percentage of failed observations ("PERCENT QST") are given at the top of each page of the hourly message. "QST" represents "questionable" observations. Errors and threshold values for the failed observations are listed in alphabetical order in the columns. In ASOS hourly messages, the stations will also be stratified by NWS region.
Figure 3. Hourly QC message for Colorado Department of Transportation stations on 16 September 1997 at 1300 UTC. The station listed was found bad by the spatial consistency check.
Note that threshold values are not given for the surface wind errors in Figure 3. Wind observations are tested by computing observation errors and threshold values for each of the u and v components of the wind. However, observation errors are converted to polar coordinates before display in the QC messages. If either the u or the v component fails, both direction and speed errors are computed.
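One way to express component errors as speed and direction errors is sketched below. The text does not specify the conversion, so this is an assumption: speed and meteorological direction (the direction the wind blows from) are computed separately for the analysis and the observation, and the errors are taken as analysis minus observation, with the direction difference wrapped to the range -180 to 180 degrees. Function names are hypothetical.

```python
import math
from typing import Tuple


def to_polar(u: float, v: float) -> Tuple[float, float]:
    """Return (speed, direction in degrees the wind blows FROM)."""
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


def polar_errors(u_anl: float, v_anl: float,
                 u_obs: float, v_obs: float) -> Tuple[float, float]:
    """ASSUMED convention: analysis minus observation, direction wrapped."""
    spd_a, dir_a = to_polar(u_anl, v_anl)
    spd_o, dir_o = to_polar(u_obs, v_obs)
    dir_err = (dir_a - dir_o + 180.0) % 360.0 - 180.0
    return spd_a - spd_o, dir_err


if __name__ == "__main__":
    # Analysis: northerly wind at 10; observation: easterly wind at 5.
    print(polar_errors(0.0, -10.0, -5.0, 0.0))
```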
Stations listed in the QC messages are either in error due to hardware or software failure, or are unrepresentative of the observation scale and, as such, are susceptible to diurnal, mesoscale, and terrain effects. To help distinguish between the two, daily, weekly, and monthly (4-week) summaries of the hourly QC messages are also provided. The summaries include the percentage of failed observations and the average error and rms error for individual stations and for all stations combined. Figure 4 shows the daily NPN QC message for 16 May 1997. As with the hourly messages, all stations in the domain are used to calculate the statistics reported at the top of each page, but only stations that have failed the QC checks are listed in the individual statistics. Stations with large percentages of failed observations are most likely experiencing hardware or software failures. For example, the QC message in Figure 4 shows PRCO2 (Purcell, OK) as reporting bad dewpoint temperature observations 85% of the time. The rms errors for the station are also identical to the absolute value of the mean error, an indication that a persistent bias exists in the observations. Because the error is defined as the analysis value minus the observed value, the negative sign further indicates that the observations are biased high. With this information, the PCC in Boulder was able to determine that the dewpoint sensor at Purcell had failed. The sensor has since been fixed, and the percentage of dewpoint observations failing the QC checks is back to zero.
Figure 4. Daily QC message for NOAA Profiler network (NPN) surface stations on 16 May 1997.
Daily, weekly, and monthly QC messages will include only those stations with observations that have failed the QC checks more than 25% of the time.
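The summary statistics and the 25% reporting rule can be sketched as below. The sketch assumes that the mean and rms errors are computed over all of a station's observations in the period, which the text does not state explicitly; the names are hypothetical and the numbers in the example are invented for illustration.

```python
import math
from typing import List, Tuple


def summarize(errors: List[float], failed: List[bool]) -> Tuple[float, float, float]:
    """Return (percent failed, mean error, rms error) for one station."""
    n = len(errors)
    mean = sum(errors) / n
    rms = math.sqrt(sum(e * e for e in errors) / n)
    percent_failed = 100.0 * sum(failed) / n
    return percent_failed, mean, rms


def should_report(percent_failed: float, cutoff: float = 25.0) -> bool:
    """Stations appear in the summaries only above the failure cutoff."""
    return percent_failed > cutoff


if __name__ == "__main__":
    # A constant error: rms equals |mean|, the signature of a persistent bias.
    errors = [-4.0] * 20
    failed = [True] * 17 + [False] * 3
    pct, mean, rms = summarize(errors, failed)
    print(pct, mean, rms, should_report(pct))
```

When the errors scatter around zero instead, the rms error exceeds the magnitude of the mean error, which points to unrepresentative or noisy observations rather than a sensor bias.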
Two text files, a "reject" list and an "accept" list, will be provided to allow forecasters to override the results of the QC checks. The reject list is a list of stations and associated input observations that a forecaster wishes to label as bad, regardless of the outcome of the QC checks; the accept list is the corresponding list for stations that a forecaster wishes to label as good, regardless of the outcome of the QC. Applications reading the lists (e.g. the MSAS analysis) will then reject or accept the stations specified. In both cases, observations associated with the stations in the lists can be individually flagged. For example, the forecaster may wish to add the wind observations at a particular station to the reject list, but not the temperature observations.
QC and station monitoring procedures will not be affected by forecaster intervention lists, with the sole exception that observations on the reject list will be labeled as "suspect" and not used to check the spatial consistency of neighboring observations. This will allow forecasters to continue to monitor the performance of the stations contained in the lists. For example, a forecaster may notice a station with wind observations that fail the QC checks a large percentage of the time, and choose to add that station to the reject list. However, once the failure rate at the station falls back to near zero (possibly due to an anemometer repair), the forecaster will likely delete that station from the list.
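How an application might honor the two lists is sketched below. The file layout (one "STATION VARIABLE" pair per line), the station identifier, and the rule that the reject list takes precedence when a pair appears on both lists are all assumptions made for illustration; the text does not define them.

```python
from typing import Set, Tuple

Key = Tuple[str, str]  # (station, variable)


def parse_list(text: str) -> Set[Key]:
    """Parse a HYPOTHETICAL list format: 'STATION VARIABLE' per line."""
    entries: Set[Key] = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            entries.add((parts[0], parts[1]))
    return entries


def use_observation(station: str, variable: str, qc_flag: str,
                    reject: Set[Key], accept: Set[Key]) -> bool:
    """Decide whether an application should use the observation."""
    key = (station, variable)
    if key in reject:      # assumed precedence: reject wins over accept
        return False
    if key in accept:
        return True
    return qc_flag == "good"


if __name__ == "__main__":
    reject = parse_list("STN01 wind\n")
    accept = parse_list("STN02 temperature\n")
    print(use_observation("STN01", "wind", "good", reject, accept))
    print(use_observation("STN01", "temperature", "good", reject, accept))
    print(use_observation("STN02", "temperature", "bad", reject, accept))
```

Keying the lists on station and variable together reflects the statement that observations can be individually flagged, so rejecting a station's winds leaves its temperatures untouched.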
In addition to the text QC messages described above, the LDAD QCMS will provide netCDF (Rew and Davis 1997) files containing the hourly quality controlled observations, and the following QC structures: "QcApplied," a bit map indicating which QC checks were applied to each observation (including an indication of forecaster intervention), "QcResults," a bit map indicating the results of the various QC checks, and "QcAdjustVals," an array holding the estimated values calculated by the QC checks (e.g. the analysis-minus-observation value calculated by the spatial consistency check). Also included in the netCDF files are lists of the (surrounding) observations used in the spatial consistency check, as well as the background and threshold values utilized.
The netCDF files will combine stations from each network ingested. However, if desired, "Subsrc" numbers given in the files, along with a text translation file, can be used to identify the originating network for each station.
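The relationship between the "QcApplied" and "QcResults" bit maps can be illustrated with a short decoding sketch. The bit assignments below are hypothetical, as is the convention that a set bit in "QcResults" means the corresponding check failed; the actual layout is defined by the LDAD netCDF files and is not given in the text.

```python
from enum import IntFlag
from typing import Dict


class QcCheck(IntFlag):
    """HYPOTHETICAL bit assignments for illustration only."""
    VALIDITY = 0x1
    SPATIAL_CONSISTENCY = 0x2
    FORECASTER_OVERRIDE = 0x4


def decode(qc_applied: int, qc_results: int) -> Dict[str, str]:
    """Describe each check as 'passed', 'failed', or 'not applied'."""
    report: Dict[str, str] = {}
    for check in QcCheck:
        if qc_applied & check:
            # Assumed convention: a set result bit marks a failed check.
            report[check.name] = "failed" if qc_results & check else "passed"
        else:
            report[check.name] = "not applied"
    return report


if __name__ == "__main__":
    print(decode(qc_applied=0b011, qc_results=0b010))
```

Keeping the applied and result bits in separate words lets a reader distinguish an observation that passed a check from one to which the check was never applied.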
Both the text QC messages and hourly observation plots color-coded by QC results will be available to forecasters and data providers via the LDAD system.
More work will be required to meet the NWS requirements for the quality control of incoming data. Of highest priority will be the continued implementation of the TSP specifications for the QC of surface data. This will involve the installation of additional QC checks for the observations discussed in this paper (SLP, temperature, winds, and humidity), as well as the implementation of QC checks for other observation types. Table 2 gives an overview of the TSP specifications for QC checks by surface observation. The asterisks indicate checks already implemented. Notice that, although QC results are not yet output, validity and spatial consistency checks have already been implemented for surface pressure change and dewpoint depression observations.
Table 2. NWS-specified quality control checks for surface data. Checks implemented in the initial version of the LDAD QCMS are indicated by asterisks. The checks consist of internal consistency (IC), validity (VC), temporal consistency (TC), climatological (CC), model consistency (MC), spatial consistency (SC), and a consistency check (MC2) based on comparing observations to the Generalized Exponential Markov process (Miller 1981).
Work will also begin on the specification of the "data descriptors," the data structures intended to define an overall opinion of the quality of each observation by combining the information from the various QC checks. In the current system, an observation is labeled as "bad" if it fails either of the QC checks, and "good" if it passes both. Future versions of the QCMS will label observations as "verified," "screened," "questionable," or "rejected" based on the QC checks currently applied to the observation, the sophistication of those checks, and the departure of the observation from the expected values provided by the QC checks.
Improved forecaster visualization is another high priority item. In future versions of LDAD and WFO-Advanced, two QC subsystems will be maintained to maximize the timeliness of the QC information. In the first subsystem, QC checks will be applied, data descriptors computed, and forecaster displays updated as soon as an observation is available to the WFO-Advanced workstation. This "ingest-driven" subsystem will incorporate all of the single-station QC checks, i.e., the static checks and the temporal consistency check. The second subsystem, the "schedule-driven" subsystem, will incorporate the model and spatial consistency checks, and will be scheduled to run at times when model grids and surrounding observations are most likely to be available. Data descriptors and forecaster displays will be updated immediately after the schedule-driven QC subsystem has completed its update of the QC information.
Additional future work will include enhancements to existing QC checks, the incorporation of new station and observation types, and cooperative research on new quality control techniques.
For example, current plans for MSAS include an increase in spatial resolution from 60 to 20 kilometers, and an increase in temporal resolution from an hourly to a 15- or 20-minute update cycle. These upgrades will produce not only improved surface analyses, but also subhourly QC results and improved background grids for the spatial consistency check.
QC techniques for data from boundary layer profilers will also be implemented when available. Boundary layer profilers measure winds from the surface to approximately 3 km above ground level, and as such require vertical QC checks, in addition to spatial and temporal checks. Preliminary results for several checks, including the TSP-specified model consistency check, are given in Barth et al. (1998).
Research will also continue with a QC scheme being tested for use with local mesonet observations (McGinley and Stamus 1998). The scheme utilizes a Kalman filter, a signal processing technique that has found many applications, including meteorological data assimilation.
The quality control and monitoring system chosen for AWIPS Build 4.0 implementation has been described. This initial version of the system utilizes QC checks and monitoring procedures originally developed as part of a surface assimilation system that has been running at FSL since 1986 and at NCEP since 1989. QC results from the system have been used for several years by both the ASOS and profiler monitoring centers to help identify erroneous observations and surface stations with hardware or software maintenance problems.
Planned upgrades to the QC system include the incorporation of additional NWS-specified techniques for the quality control of AWIPS data, enhancements to the existing QC techniques, improved forecaster visualization, the incorporation of new station and observation types, and the implementation of new, proven QC techniques from the research community.
Preliminary results from the application of the initial QC system to a local surface mesonet are given in Hartsough et al. (1998).
Barth, M.F., P.A. Miller, M.H. Savoie, and C.S. Hartsough, 1998: The LDAD observation quality control and monitoring system: results from the model consistency check applied to boundary layer profiler winds. 10th Symposium on Meteorological Observations and Instrumentation, Phoenix, AZ, Amer. Meteor. Soc., (Paper FA5.9).
Belousov, S.L., L.S. Gandin, and S.A. Mashkovich, 1968: Computer Processing of Current Meteorological Data. Ed. V. Bugaev. Meteorological Translation No. 18, 1972, Atmospheric Environment Service, Downsview, Ontario, Canada, 227 pp.
Grote, U.H., and M. Biere, 1998: The WFO-Advanced system software architecture. 14th International Conference on Interactive Information and Processing Systems, Phoenix, AZ, Amer. Meteor. Soc., (Paper FA8.3).
Hartsough, C.S., P.A. Miller, M.F. Barth, and M.H. Savoie, 1998: The LDAD observation quality control and monitoring system: results from the spatial consistency check applied to surface observations. 10th Symposium on Meteorological Observations and Instrumentation, Phoenix, AZ, Amer. Meteor. Soc., (Paper FA5.8).
Jesuroga, R.T., C. Subramaniam, M. Kelsch, and P.A. Miller, 1998: The AWIPS Local Data Acquisition and Dissemination System. 14th International Conference on Interactive Information and Processing Systems, Phoenix, AZ, Amer. Meteor. Soc., (Paper FA 8.16).
McGinley, J.A., and P.A. Stamus, 1998: Second Symposium on Integrated Observing Systems, Phoenix, AZ, Amer. Meteor. Soc., (Paper 7A.1).
Miller, P.A., and S.G. Benjamin, 1991: Horizontal quality control for a real-time 3-h assimilation system configured in isentropic coordinates. Ninth Conf. Numerical Wea. Prediction, Denver, CO, Amer. Meteor. Soc., 32-35.
Miller, P.A., and S.G. Benjamin, 1992: A system for the hourly assimilation of surface observations in mountainous and flat terrain. Mon. Wea. Rev., 120, 2342-2359.
Miller, P.A., and R.L. Fozzard, 1994: Real-time quality control of hourly surface observations at NOAA's Forecast Systems Laboratory. Tenth Conf. Numerical Wea. Prediction, Portland, OR, Amer. Meteor. Soc., 7-9.
Miller, P.A., and L.L. Morone, 1993: Real-time quality control of hourly reports from the Automated Surface Observing System. Eighth Symposium on Meteorological Observations and Instrumentation, Anaheim, CA, Amer. Meteor. Soc., 373-378.
Miller, R.G., 1981: GEM: A statistical weather forecasting procedure. NOAA Technical Report NWS 28, National Oceanic and Atmospheric Administration, U.S. Department of Commerce. 103 pp.
Rew, K.R. and G.P. Davis, 1997: 13th International Conference on Interactive Information and Processing Systems, Long Beach, CA, Amer. Meteor. Soc.
Sanders, F., and C.A. Doswell III, 1995: A case for detailed surface analysis. Bull. Amer. Meteor. Soc., 76, 505-521.
Subramaniam, C., R.F. Prentice, J. Adams, and L. Angus, 1998: The AWIPS Local Data Acquisition and Dissemination System Architecture. 14th International Conference on Interactive Information and Processing Systems, Phoenix, AZ, Amer. Meteor. Soc., (Paper FA 8.17).
Technique Specification Package 88-21-R1 For AWIPS-90 RFP Appendix G Requirements Numbers: Quality Control Incoming Data, 1993. AWIPS Document Number TSP-032-1992R1, NOAA, National Weather Service, Office of Systems Development.