LDAD System Manager's Manual

4.0 General Data Management

4.1 Overview

The Local Data Acquisition (LDA) system has been developed with three things in mind: there is inherent site-to-site variability; site-specific information changes with time; and the sources from which the LDA needs to acquire data run the gamut from twenty-year-old technology to current technology and beyond, such as video data streams. Thus, from the beginning, the LDA system was built on a modular, generalized component software architecture. For example, all modem commands are recorded in session files or scripts, which a program reads and then uses to send the appropriate commands to the modem.

Second, the processing load of the data acquisition system is shared between the local data provider and the NWS. For a long time, for example, ALERT data were acquired over a dedicated connection between the ALERT base station and a computer that accepted the weather data in sensor form: tips for rain gauges, voltages for temperature and pressure, and so on. These values had to be converted from electronic values to meteorological values using calibrated conversions. Every time the provider re-calibrated, added, or removed sensors, the acquisition team had to go through the sensor tables and update them. Also, because each provider had its own format, a decoder and storage module had to be developed for each data type to be acquired. Over time this became a data acquisition and management nightmare, and this is where the LDA strategy proves its value.

The basic strategy is to keep things simple both for the data providers and for the acquisition team. The LDA has developed a simple data format that just about any data stored in a database can use with ease. It is an ASCII format called Comma Separated Values Text (CSVText) where each data field is separated by a comma. Additionally, a set of metadata files created and maintained by the data provider or in conjunction with site personnel will be used by the acquisition decoder to decode the data files for the weather data. Thus, data can be obtained from the weather observation database in meteorological units instead of sensor units. This removes the need for the acquisition team to deal with non-meteorological data and conversion routines. Also, because the CSVText format is easy to generate, the provider is more likely to accept this task and pre-format the data. All data are categorized and stored into four databases: mesonet for surface weather observations, hydro for rain and stream observations, manual for manual observations such as cooperative observers, and upper air for multi-level observations such as profilers. In addition, the CSV text is stored in the text database.
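To illustrate how a metadata description lets one generic decoder interpret a CSVText record, here is a minimal Python sketch. The field names, units, and sample records are hypothetical; the actual LDAD metadata files have their own layout, which is not reproduced here.

```python
import csv
import io

# Hypothetical metadata: the order, name, and type of each comma-separated
# field. In LDAD this description would come from provider-maintained
# metadata files rather than being hard-coded.
FIELDS = [
    ("station_id", str),
    ("obs_time", str),
    ("temperature_F", float),
    ("rain_in", float),
]

def decode_csvtext(text, fields=FIELDS):
    """Decode CSVText records into dictionaries keyed by field name."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(fields):
            raise ValueError(f"expected {len(fields)} fields, got {len(row)}: {row}")
        records.append({name: convert(value.strip())
                        for (name, convert), value in zip(fields, row)})
    return records

# Illustrative records, already in meteorological units.
sample = "STN01,2000-11-30 12:00,41.5,0.02\nSTN02,2000-11-30 12:00,38.0,0.00\n"
decoded = decode_csvtext(sample)
```

Because the field description lives outside the decoder, adding a sensor in this sketch means editing the field list, not the decoding code.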

4.2 Data Processing and Storage

***
Please note that the figures in this section have not been updated for Build 5.1.1. Changes have been made in the decoder and storage part of the internal system. The text in the enumerated lists does reflect the modified structure of the code, using the various routerXXX processes.
***

The LDAD decoder is built using a generic architecture that allows it to handle all local data sets that we will be receiving now and in the future. The specific information that is needed to decode an individual data file is contained in ASCII metadata tables that can be modified as necessary for any new data type. Each data file is assigned a type. The self-describing data types such as SHEF, GRIB, and BUFR that have existing decoders will be incorporated within the LDAD framework in later versions. For the non-self describing data type CSVText, we utilize the set of metadata files to obtain a full description of the data. If a message comes in any other ASCII format, a pre-processor is needed to convert it to a CSVText file which can then be decoded.

When a local data file arrives on the LDAD server, a program called newLDADdataNotification, a component of the LDAD Gateway, uses its name and a look-up table (LDADinfo.txt) to determine the type of data contained in the file. Then the file is copied across the firewall using rcp to the internal data server (DS) to be decoded and stored. The LDAD decoder obtains the metadata information using the data-type key and proceeds to decode and store the data in the appropriate database(s).
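The name-based type lookup can be pictured with the following Python sketch. The patterns and type keys are invented for illustration; the real LDADinfo.txt layout and matching rules are not shown in this section and may differ.

```python
import fnmatch

# Hypothetical lookup table: file-name pattern -> data-type key.
# This stands in for LDADinfo.txt, whose actual format is not reproduced here.
LOOKUP = [
    ("*.larc", "LARC"),
    ("mesonet_*.csv", "mesonet"),
    ("alert_*.txt", "ALERT"),
]

def data_type_for(filename, table=LOOKUP):
    """Return the data-type key of the first matching pattern, or None."""
    for pattern, data_type in table:
        if fnmatch.fnmatchcase(filename, pattern):
            return data_type
    return None

known = data_type_for("mesonet_site1.csv")
unknown = data_type_for("unrecognized.bin")
```

A file whose name matches no entry yields None in this sketch, so a caller can decide to log and skip it.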

The local data that are decoded are stored in the text database and in one of four predefined netCDF data formats: mesonet, hydro, manual, and upperAir.

Each of these types has its own directory where the netCDF data are stored, i.e. $LDAD_DATA/mesonet, $LDAD_DATA/hydro, $LDAD_DATA/manual, and $LDAD_DATA/upperAir, respectively. The storage program bins the data into hourly netCDF files, so all data acquired within a particular hour are time-stamped and stored in the same netCDF file. These data sets are subsequently quality controlled (QCed) and then made available to the public.
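The hourly binning can be sketched in Python as follows. The file-name pattern (yyyymmdd_hh00) and the /tmp/ldad root are assumptions made for the example; only the one-file-per-hour behavior comes from the text.

```python
import os
from datetime import datetime

def hourly_file_for(data_dir, data_type, obs_time):
    """Return the path of the hourly file that an observation time falls into."""
    # Truncate the observation time to the start of its hour.
    hour_start = obs_time.replace(minute=0, second=0, microsecond=0)
    # The name pattern below is an assumption for illustration only.
    return os.path.join(data_dir, data_type, hour_start.strftime("%Y%m%d_%H00"))

a = hourly_file_for("/tmp/ldad", "mesonet", datetime(2000, 11, 30, 12, 5))
b = hourly_file_for("/tmp/ldad", "mesonet", datetime(2000, 11, 30, 12, 55))
c = hourly_file_for("/tmp/ldad", "mesonet", datetime(2000, 11, 30, 13, 0))
```

Observations at 12:05 and 12:55 map to the same file, while one at 13:00 starts the next hourly file.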

In AWIPS 5.1.1, the display reads hydro and mesonet netCDF files; no display is available for the other types. However, field sites can develop their own plots using the techniques described in the Adaptive Plan View Plotting description (part of the localization documentation).

4.2.1 Logging

All LDAD data decoder and storage processes log their activity in directory $LDAD_INTERNAL_LOGDIR/<yyyymmdd>. Log files are named using the convention <process-name><pid><host><hhmmss>, where hhmmss is the time when the file was opened. Log files for external processes (on LS) are located in the $LDAD_EXTERNAL_LOGDIR directory. Please refer to Chapter 10 for more details on log files.
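As a rough Python illustration of the naming convention, the sketch below builds a log path from its parts. The components are concatenated exactly as the convention is written; actual log names may contain separators between them, and the host name and directory used here are hypothetical.

```python
import os
from datetime import datetime

def log_path(log_dir, process_name, pid, host, opened):
    """Build <log_dir>/<yyyymmdd>/<process-name><pid><host><hhmmss>."""
    day_dir = opened.strftime("%Y%m%d")
    opened_at = opened.strftime("%H%M%S")
    # Plain concatenation is an assumption; real names may use separators.
    name = f"{process_name}{pid}{host}{opened_at}"
    return os.path.join(log_dir, day_dir, name)

# "/logs" stands in for $LDAD_INTERNAL_LOGDIR; "ds1" is a made-up host name.
p = log_path("/logs", "routerLdadDecoder", 1234, "ds1",
             datetime(2000, 11, 30, 9, 5, 7))
```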

4.2.2 Data Routing/Flow

The data flow diagram in Figure 4.1 shows all the steps that are performed in the acquiring, decoding, storing, and quality control processing of a mesonet base station file through the LDAD Gateway. This flow is valid only for mesonet type files but the concept holds true for all acquired datasets.
  1. The base station via a dedicated line puts the data onto the LDAD Server, in directory /data/Incoming.
  2. newLDADdataNotification monitors /data/Incoming. As soon as a new data file arrives, a notification message is sent from newLDADdataNotification to CO_serv.
  3. CO_serv sends a message across the firewall to the listener process on the internal side.
  4. listener rcps the new data file from the external side to the /data/fxa/LDAD/tmp directory, in the process adding a date-time string to the file name. For security reasons, data files on the external side can get across the firewall to the internal side only via rcp from the internal side.
  5. If the dataset requires some pre-processing, the appropriate pre-processor is notified and it filters and reformats the datafile to the acceptable form. Preprocessors use /awips/ldad/ldadtmp for scratch space, then write the final file to $LDAD_RAWDATA (/data/fxa/LDAD/Raw). If no preprocessor is specified in /awips/ldad/data/LDADinfo.txt, then the file is moved directly to $LDAD_RAWDATA.
  6. The listener sends a notification to the LDAD CommsRouter, which forwards it to the DataController. Assuming it matches the decoder pattern, the routerLdadDecoder process gets awakened, reads in raw data files, and decodes them, storing the files in $LDAD_DECODED_DATA.
  7. Once the complete data file has been decoded, the decoder alerts the storage processes (routerShefEncoder and routerStoreNetcdf).
  8. The storage processes format and store files for display or further processing.
  9. Once data are stored, notifications are sent to the Notification Server that new data are available. The QC routines will read in the newly created netCDF data file and quality control the data.
  10. The QC process creates a QCed datafile and QC summary messages that are stored in separate directories.
  11. The QC summary messages also are stored in the text database.
  12. This process triggers the LDAD Database Trigger function that retrieves the data and sends a message to the textdbNotify.pl program.
  13. textdbNotify.pl cleans up the data, reformats it, and notifies listener that a file for dissemination is ready for transfer.
  14. listener rcps the data back over to the external side and notifies CO_serv that a new file is available for dissemination.
  15. The hmIngest program (a component of the hmFactor Server) on the external side receives a notification from CO_serv and proceeds to disseminate it to clients.
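Steps 4 and 5 above (time-stamping the file name, then either preprocessing the file or moving it straight to the raw directory) might look like the following Python sketch. The suffix format, the function name stage_file, and the semicolon-to-comma preprocessor are all hypothetical; the real listener and preprocessors are separate programs whose interfaces are not described here.

```python
import os
import shutil
import tempfile
from datetime import datetime, timezone

def stage_file(src, raw_dir, preprocessor=None, now=None):
    """Stamp a file name with a date-time string and place it in raw_dir.

    preprocessor is an optional callable (text -> text). None means the
    data type needs no reformatting, so the file is moved unchanged.
    """
    now = now or datetime.now(timezone.utc)
    # Suffix position and format are assumptions for illustration.
    stamped = f"{os.path.basename(src)}.{now.strftime('%Y%m%d_%H%M%S')}"
    dest = os.path.join(raw_dir, stamped)
    if preprocessor is None:
        shutil.move(src, dest)
    else:
        with open(src) as f:
            text = f.read()
        with open(dest, "w") as f:
            f.write(preprocessor(text))
        os.remove(src)
    return dest

# Demonstration in a scratch directory.
work = tempfile.mkdtemp()
raw = os.path.join(work, "Raw")
os.mkdir(raw)
src = os.path.join(work, "site1.dat")
with open(src, "w") as f:
    f.write("a;b;c\n")
out = stage_file(src, raw, preprocessor=lambda t: t.replace(";", ","),
                 now=datetime(2000, 11, 30, 12, 0, 0))
```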

Figure 4.1 LDAD Data Flow Diagram

A Specific Example: LARC Acquisition

  1. Using the Collection/Dissemination function from the workstation, a user can make a one-time or recurring request for LARC data. In either case, the LdadScheduler process on the workstation contacts the ldadServer on the ds. It uses tell_co to retrieve information about the LARC station.
  2. tell_co retrieves all necessary information regarding collection and storage of data from $LDAD_INTERNAL_DATA/gageinfo and gaugeDescriptions. User information is retrieved from files usrpro and userDescriptions.
  3. tell_co sends a get LARC Message IPC message to listener.
  4. listener opens a TCP/IP socket to CO_serv requesting LARC data transfer.
  5. CO_serv monitors its own socket, waiting for connections. When the LARC request arrives, it builds a specific session file, based on the session file template listed in gageinfo (e.g., HANDAR_sess_1), then uses it to dial the LARC gauge.
  6. CO_serv stores the data in $LDAD_EXTERNAL_DATA/Incoming.
  7. CO_serv copies the data to the $LDAD_EXTERNAL_PUBLIC/hydro directories and then passes the notification to listener. At this point the scheduled processes are completed.
  8. listener retrieves the LARC file using rcp and sends a notification when finished.
  9. listener informs the LDAD scheduler that the dataset is available and the GUI displays it to the user.
  10. The dataset is also passed to the decoder/storage suite, via which the data are stored in netCDF files and SHEF-encoded to be placed in the hydro database. (See Section 4.3.3.1, below.) Precip reports from gauges so equipped can be displayed on D2D; these and river stages are also available through the hydro apps.

Figure 4.2 LARC Data Acquisition

4.2.3 Text Product Dissemination

The steps listed here describe how data get into the external database for access by remote users via the bulletin board service (BBS).

  1. When a product is stored in the text database (Informix), the trigger mechanism looks to see if it is in the list for dissemination (as set up during install).
  2. If the new product is in the list, the trigger process stores it in the $FXA_DATA/trigger directory.
  3. Trigger calls textdbNotify.pl providing as arguments the file name and file type (from the trigger configuration file).
  4. textdbNotify.pl reads the product and performs some checks and logging.
  5. textdbNotify.pl sends a message to listener that the file is ready to be sent.
  6. listener registers itself through a TCP/IP socket connection with sendLDADnotification.pl and waits to receive notification for the requested data. When it does, it rcps the file over to the $LDAD_EXTERNAL_DATA_REPOSITORY.
  7. CO_serv initially sets up a TCP/IP socket and waits for connections. listener passes the notification to CO_serv.
  8. CO_serv sends a message to the hmIngest process that the dataset is available.
  9. hmIngest massages the text file and places it into the $LDAD_EXTERNAL_PUBLIC/nwswwas directory (as directed by pollForData.conf), with name <CCCNNNXXX>1.txt. Any previous version(s) of this file are renamed <CCCNNNXXX>2.txt, etc., with 10 versions of each maintained in the directory. (The limit of 10 is set within the hmIngest postprocessing code, and is not site-configurable.)
  10. The requested data are now in the database. The listener also receives the file name and address of the text product.
  11. tmain, the BBS Interface, accesses the file based on the client request.
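The version rotation described in step 9 can be sketched in Python as below. This is an illustration of the renaming scheme only, not the hmIngest code; the product identifier DENWSWDEN and the function name are made up for the example.

```python
import os
import tempfile

MAX_VERSIONS = 10  # limit stated in step 9

def store_new_version(directory, product_id, text, max_versions=MAX_VERSIONS):
    """Write text as <product_id>1.txt, shifting older versions up by one."""
    def path(n):
        return os.path.join(directory, f"{product_id}{n}.txt")

    # Drop the oldest version if the full set already exists.
    if os.path.exists(path(max_versions)):
        os.remove(path(max_versions))
    # Rename N -> N+1, oldest first, so nothing is overwritten.
    for n in range(max_versions - 1, 0, -1):
        if os.path.exists(path(n)):
            os.rename(path(n), path(n + 1))
    with open(path(1), "w") as f:
        f.write(text)

# Store twelve versions of a hypothetical product; only ten survive.
demo_dir = tempfile.mkdtemp()
for i in range(12):
    store_new_version(demo_dir, "DENWSWDEN", f"product {i}\n")
```

After the twelve stores, version 1 holds the newest product and version 10 holds the tenth newest.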



Figure 4.3 WWA Dissemination Flow

4.2.4 Non-Text Dissemination

In addition to text, other datasets can be sent to the external system to make them available to outside users.

  1. On a timed basis, pollForData.pl checks files on the DS to see if file(s) of interest have been updated.
  2. When a file is found to be sent, sendLDADnotification.pl is executed.
  3. On the LS, the hmIngest process (Java) gets the notification, pulls the file across the firewall, and stores it in the directory specified in pollForData.conf.
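A timed check for updated files could be sketched in Python as follows. Comparing modification times is an assumption; this section does not say how pollForData.pl actually detects an update, and the function name is hypothetical.

```python
import os
import tempfile

def find_updated(paths, last_seen):
    """Return the paths whose modification time changed since the last poll.

    last_seen maps path -> mtime recorded by the previous poll and is
    updated in place.
    """
    updated = []
    for p in paths:
        try:
            mtime = os.stat(p).st_mtime_ns
        except FileNotFoundError:
            continue  # file has not been produced yet
        if last_seen.get(p) != mtime:
            last_seen[p] = mtime
            updated.append(p)
    return updated

# Demonstration: a new file, an unchanged file, then a touched file.
demo = os.path.join(tempfile.mkdtemp(), "product.dat")
with open(demo, "w") as f:
    f.write("v1")
seen = {}
first = find_updated([demo], seen)
second = find_updated([demo], seen)
st = os.stat(demo)
os.utime(demo, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
third = find_updated([demo], seen)
```

In a polling loop, each path returned would trigger one notification, and unchanged files would be skipped until their next update.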

4.3 Data Controller Processes

4.3.1 Overview

The data controller processes are responsible for generating the appropriate messages to the decoder and communication processes. Some processes are standalone: they do not require messages, but perform their functions independently and on a schedule. There are currently 5 data controllers in the system, each using one or more common LDAD processes.

4.3.2 Interaction with Decoders

When the DS and LS systems start up, all LDAD processes including the LDAD decoder/storage suite are started up by the main start script startLDAD.csh on the DS system.

4.3.3 Decoding and Storage

The figures presented throughout this section depict a high-level process and data flow: from ingest and communication, to the data controllers, to decoding and storage, and finally to dissemination. Processes are shown in boxes. Servers are labeled using the two-letter system designator and, where applicable, are marked as primary or secondary. The arrows depict primary data flow. Log files are depicted on the server where the logging process resides. As described in Section 4.2.1, all internal and external log files reside in the $LDAD_INTERNAL_LOGFILES/<yymmdd> and $LDAD_EXTERNAL_LOGFILES directories, respectively. Generated data files and accessed configuration files are depicted as well.

4.3.3.1 LARC Decoding

  1. newLDADdataNotification reads data periodically from LDAD Incoming directory - $LDAD_EXTERNAL_DATA/Incoming.
  2. It creates and sends an IPC message to CO_serv (TCP/IP socket) informing of the availability of LARC files.
  3. CO_serv sets up a TCP/IP socket and waits for connections. CO_serv passes the notification to listener.
  4. listener sets up a TCP/IP socket connection, binds its local address so that the client (CO_serv) can send to it, and waits for the connection from the client process. When the whole message is read, it retrieves the LARC file by rcp, writes it to $LDAD_DATA/tmp, and sends a notification.
  5. listener executes preProcessLDAD.pl to preprocess LARC data report files. CO_serv passes a fully qualified data file, the host name used by the sendNotification subroutine, the port number of the process that requires notification upon completion of the preprocessing operation, and necessary information about the LARC data file format.
  6. preProcessLDAD.pl generates the date and time string from the filename and appends it to the file name. Upon completion of the preprocessing, LARC files are stored in $LDAD_RAWDATA.
  7. routerLdadDecoder goes to the LDAD directory, searches for preprocessed LARC files, decodes them, and writes them to $LDAD_DECODED_DATA.
  8. routerLdadDecoder notifies the storage processes.
  9. routerStoreNetcdf and routerShefEncoder store the decoded files to netCDF and SHEF files, respectively.
  10. Storage sends a message to the Notification Server that new data are available.

Figure 4.4 LARC Data Decoder

4.3.3.2 Any CSV Mesonet

  1. newLDADdataNotification reads data periodically from LDAD Incoming directory - $LDAD_EXTERNAL_DATA/Incoming.
  2. It creates and sends an IPC message to CO_serv (TCP/IP socket) informing of the availability of mesonet files.
  3. CO_serv sets up a TCP/IP socket and waits for connections. CO_serv passes the notification to listener.
  4. listener sets up a TCP/IP socket connection, binds its local address so that the client (CO_serv) can send to it and waits for the connection from the client process. When the whole message is read, it retrieves the mesonet file by rcp, writes it to $LDAD_DATA/tmp, and sends a notification.
  5. listener executes preProcessLDAD.pl to preprocess mesonet data report files. CO_serv passes a fully qualified data file, the host name used by the sendNotification subroutine, the port number of the process that requires notification upon completion of the preprocessing operation, and necessary information about the mesonet data file format.
  6. preProcessLDAD.pl generates the date and time string from the filename and appends it to the file name. Upon completion of the preprocessing, mesonet files are stored in $LDAD_RAWDATA.
  7. routerLdadDecoder goes to the LDAD directory, searches for preprocessed mesonet files, decodes them, and writes them to $LDAD_DECODED_DATA.
  8. routerLdadDecoder notifies the storage processes.
  9. routerStoreNetcdf stores the decoded files to netCDF files.
  10. Storage sends a message to the FX Notification Server that new data are available.

Figure 4.5 CSV Mesonet Data Controller

4.3.3.3 ALERT

Same as Section 4.3.3.2, with the addition of SHEF encoding of the data.

4.3.3.4 IFLOWS

Same as Section 4.3.3.2, with the addition of SHEF encoding of the data.

4.3.3.5 SUA

TBD

4.4 Quality Control

LDAD observations are quality controlled by the AWIPS Quality Control and Monitoring System (QCMS), which in turn relies on the MAPS Surface Assimilation System (MSAS) to perform automated quality control checks. MSAS/QCMS is a multi-scheduled system, with four main subprocesses: the model ingest cycle, the subhourly QC cycle, the hourly QC/analysis cycle, and the daily QC summary generators. Each subprocess runs independently based on crontab entries on AS2: the model ingest cycle runs twice a day, the subhourly QC runs every 5 minutes, the hourly QC/analysis cycle runs immediately after the xx:18 subhourly run, and the QC summary programs run near 00Z.

The QCMS for AWIPS Build 5.1.1 is a partial implementation of the overall requirements for the quality control of surface observations. The subhourly processing applies validity, internal consistency, and temporal consistency checks to LDAD mesonet observations of sea-level pressure, temperature, dewpoint temperature, wind, station pressure, altimeter setting, pressure change, relative humidity, visibility, and precipitation. The hourly processing applies validity, internal consistency, temporal consistency, and spatial consistency checks to LDAD mesonet and NOAAPORT observations of sea-level pressure, temperature, wind, and dewpoint temperature.
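To give a feel for three of the check types named above, here is a small Python sketch. The thresholds, function names, and sample values are illustrative assumptions and are not taken from MSAS or the QCMS; the dewpoint-versus-temperature test is simply a common example of an internal consistency check. Spatial consistency checks are omitted.

```python
def validity_check(value, lower, upper):
    """Pass if the value lies within the allowed range."""
    return lower <= value <= upper

def internal_consistency_check(temperature, dewpoint):
    """Pass if the dewpoint does not exceed the temperature."""
    return dewpoint <= temperature

def temporal_consistency_check(previous, current, max_change):
    """Pass if the change since the previous observation is plausible."""
    return abs(current - previous) <= max_change

# Hypothetical observation (degrees F) with a suspicious dewpoint.
obs = {"temperature": 41.5, "dewpoint": 45.0}
flags = {
    # Range limits are placeholders, not operational tolerances.
    "validity": validity_check(obs["temperature"], -60.0, 130.0),
    "internal": internal_consistency_check(obs["temperature"], obs["dewpoint"]),
    # Previous temperature and allowed change are placeholders.
    "temporal": temporal_consistency_check(40.0, obs["temperature"], 10.0),
}
```

In this example the temperature passes the validity and temporal checks, while the observation fails the internal consistency check because the dewpoint is higher than the temperature.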

WFO personnel can override the results of the automated QC checks. See Sections 9.6 and 12.1 of this document for more information on the subjective intervention capabilities.

The QCMS also calculates hourly, daily, weekly, and monthly statistics on the frequency and magnitude of the observational errors encountered for sea-level pressure, temperature, dewpoint, and surface winds. It processes both NOAAPORT and LDAD mesonet observations. Messages containing these statistics are stored in the text database.

See Section 10 of the D2D User's Guide for more information on the QCMS; see Section 2 for more information on MSAS.

Figure 4.6 illustrates the overall data flow in MSAS.


Figure 4.6 MSAS Data Flow


4.5 Notification

Within the LDAD system, there are 3 types of notification processes. The IGC notification process is used for communication between listener and CO_serv; it passes through the firewall using the firewall's TCP proxy. The IGC uses standard IPC calls for its processes.

All LDAD notifications within either the internal or external systems use standard IPC calls.

FX has built a package that sits on top of standard IPC that is used for its Notification Server. For communication between the LDAD Storage processes and FX Notification Server, we use FX's dmNotify IPC package.

4.6 Purging

Scour is the standard FX purging process. It purges files based on age.
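Age-based purging can be sketched in Python as below. This is not the Scour implementation; the function name, the one-day retention period, and the file names are assumptions for the example.

```python
import os
import tempfile
import time

def purge_older_than(directory, max_age_seconds, now=None):
    """Delete regular files older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    removed = []
    # Snapshot the listing first so deletions do not disturb the iteration.
    for entry in list(os.scandir(directory)):
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)

# Demonstration: one fresh file and one file back-dated by two days.
demo_dir = tempfile.mkdtemp()
for name in ("new.nc", "old.nc"):
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write("data")
two_days_ago = time.time() - 2 * 86400
os.utime(os.path.join(demo_dir, "old.nc"), (two_days_ago, two_days_ago))
removed = purge_older_than(demo_dir, max_age_seconds=86400)
```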

4.6.1 Internal, AS, DS

Purging for the AS and DS machines is performed by Scour. All internal directories are group readable and writable. Because the ldad account is a member of the fxalpha group, all of its data and log files can be purged by this scour process.

4.6.2 External, LS

Purging for the LS is performed by Scour.




Last updated: 30 Nov 00 AWIPS 5.1.1