LDAD System Manager's Manual
4.0 General Data Management
4.1 Overview
The Local Data Acquisition (LDA) system has been developed with
three things in mind: there is inherent site-to-site variability;
site-specific information changes with time; and the sources from
which the LDA needs to acquire data range from twenty-year-old
technology to current and emerging technology, such as video data
streams. Thus,
from the beginning, the LDA system was built using a very modular and
generalized component software architecture. For example, all modem
commands are recorded in session files or scripts which a program
reads and then uses to send the appropriate command to the modem.
In addition, the processing in the data acquisition system is shared
between the local data provider and the NWS. For example, for a long
time, ALERT data were acquired over a dedicated connection
between the ALERT base station and a computer that accepted the
various weather data in sensor units (tips for rain gauges, voltages for
temperature, pressure, etc.). These values had to be converted
from electronic values to meteorological values using calibrated
conversions. Every time the sensors were re-calibrated, added, or
removed by the provider, the acquisition team had to go through the
sensor tables and update them. Also, because each provider had its own
specific format, a decoder and storage module had to be developed for
each data type to be acquired. After a while, this became a data
acquisition and management nightmare; this is where the LDA strategy
proves its value.
The basic strategy is to keep things simple both for the data
providers and for the acquisition team. The LDA has developed a simple
data format that just about any data stored in a database can use with
ease. It is an ASCII format called Comma Separated Values Text
(CSVText) where each data field is separated by a comma. Additionally,
a set of metadata files created and maintained by the data provider or
in conjunction with site personnel will be used by the acquisition
decoder to decode the data files for the weather data. Thus, data can
be obtained from the weather observation database in meteorological
units instead of sensor units. This removes the need for the
acquisition team to deal with non-meteorological data and conversion
routines. Also, because the CSVText format is easy to generate, the
provider is more likely to accept this task and pre-format the data.
All data are categorized and stored in four databases:
mesonet for surface weather observations, hydro for rain
and stream observations, manual for manual observations such as
those from cooperative observers, and upper air for multi-level
observations such as those from profilers. In addition, the CSVText
is stored in the text database.
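As a rough illustration of the CSVText idea, the sketch below decodes comma-separated records using a field description that stands in for the provider's metadata files. The field names, units, and record layout are hypothetical; the actual LDAD metadata file formats are not shown in this section.

```python
import csv
import io

# Hypothetical field description standing in for the provider's metadata
# files: (field name, converter), in the order the fields appear in a record.
FIELDS = [
    ("station_id", str),
    ("obs_time", str),
    ("temperature_degF", float),
    ("wind_speed_kt", float),
]


def decode_csvtext(text, fields=FIELDS):
    """Decode comma-separated records into dictionaries keyed by field name."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(fields):
            raise ValueError(f"expected {len(fields)} fields, got {len(row)}")
        record = {}
        for (name, convert), value in zip(fields, row):
            value = value.strip()
            # Treating an empty field as a missing value is an assumption.
            record[name] = convert(value) if value else None
        records.append(record)
    return records


if __name__ == "__main__":
    sample = "STN1,2000-11-30 12:05,41.5,7\nSTN2,2000-11-30 12:05,,12\n"
    for rec in decode_csvtext(sample):
        print(rec)
```

Because the values already arrive in meteorological units, the decoder only needs the field order and types; no sensor calibration tables are involved.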
4.2 Data Processing and Storage
***
Please note that the figures in this section have not been
updated for Build 5.1.1. Changes have been made in the decoder and
storage part of the internal system. The text in the enumerated lists
does reflect the modified structure of the code, using the various
routerXXX processes.
***
The LDAD decoder is built using a generic architecture that allows
it to handle all local data sets that are received now or will be
received in the future. The specific information that is needed to decode an
individual data file is contained in ASCII metadata tables that can be
modified as necessary for any new data type. Each data file is
assigned a type. The self-describing data types such as SHEF, GRIB,
and BUFR that have existing decoders will be incorporated within the
LDAD framework in later versions. For the non-self-describing data
type CSVText, we use the set of metadata files to obtain a full
description of the data. If a message arrives in any other ASCII format,
a pre-processor is needed to convert it to a CSVText file, which can
then be decoded.
When a local data file arrives on the LDAD server, a program called
newLDADdataNotification, a component of the LDAD Gateway, uses its
name and a look-up table (LDADinfo.txt) to determine the type of data
contained in the file. Then the file is copied across the firewall
using rcp to the internal data server (DS) to be decoded and stored.
The LDAD decoder obtains the metadata information using the data-type
key and proceeds to decode and store the data in the appropriate
database(s).
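The name-based type lookup described above might be sketched as follows. The table entries, patterns, and data-type keys are hypothetical and only stand in for LDADinfo.txt, whose actual layout is not shown in this section.

```python
from fnmatch import fnmatchcase

# Hypothetical lookup table in the spirit of LDADinfo.txt: each entry maps a
# file-name pattern to a data-type key. The real file's layout may differ.
LOOKUP_TABLE = [
    ("*.larc", "LARC"),
    ("mesonet_*.csv", "CSV_MESONET"),
    ("alert_*.csv", "ALERT"),
]


def data_type_for(filename, table=LOOKUP_TABLE):
    """Return the data-type key of the first matching pattern, else None."""
    for pattern, data_type in table:
        if fnmatchcase(filename, pattern):
            return data_type
    return None


if __name__ == "__main__":
    for name in ("mesonet_0001.csv", "gauge7.larc", "unknown.dat"):
        print(name, "->", data_type_for(name))
```

Returning None for an unrecognized name leaves it to the caller to decide how to handle files with no matching entry.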
The local data that are decoded are stored in the text database
and in one of four predefined netCDF data formats. These formats are:
- Mesonet type base stations (e.g., highway department weather
stations, ALERT weather)
- Hydrological data (e.g., ALERT river stage and rainfall,
Integrated Flood Observing and Warning System (IFLOWS))
- Manual reports (e.g., coop observers)
- Upper Air reports (e.g., microART)
Each of these types has its own directory where the netCDF data are
stored, i.e. $LDAD_DATA/mesonet, $LDAD_DATA/hydro, $LDAD_DATA/manual,
and $LDAD_DATA/upperAir, respectively. The storage program bins the
data into hourly netCDF files, so all data acquired within a
particular hour are time-stamped and stored in the same netCDF file.
These data sets are subsequently quality controlled (QCed) and then
made available to the public.
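The hourly binning can be pictured with the short sketch below, which maps an observation time to the file for its hour. The four subdirectory names come from the text; the file-name pattern and the example root directory are assumptions made for illustration only.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Subdirectories named in the text, relative to $LDAD_DATA.
TYPE_DIRS = ("mesonet", "hydro", "manual", "upperAir")


def hourly_file(ldad_data, data_type, obs_time):
    """Return the path of the hourly file that an observation falls into."""
    if data_type not in TYPE_DIRS:
        raise ValueError(f"unknown data type: {data_type}")
    # Truncate the observation time to the start of its hour.
    hour_start = obs_time.replace(minute=0, second=0, microsecond=0)
    # The file-name pattern below is an assumption for this sketch.
    name = hour_start.strftime("%Y%m%d_%H00")
    return PurePosixPath(ldad_data) / data_type / name


if __name__ == "__main__":
    # "/data/ldad" is a placeholder for the value of $LDAD_DATA.
    print(hourly_file("/data/ldad", "hydro", datetime(2000, 11, 30, 12, 5)))
    print(hourly_file("/data/ldad", "hydro", datetime(2000, 11, 30, 12, 55)))
```

Both calls resolve to the same file, which is the behavior the text describes for data acquired within a particular hour.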
In AWIPS 5.1.1, the display reads hydro and mesonet netCDF files;
no display is available for the other types. However, field sites can
develop their own plots using the techniques described in the Adaptive Plan
View Plotting description (part of the localization
documentation).
4.2.1 Logging
All LDAD data decoder and storage processes log their activity in
directory $LDAD_INTERNAL_LOGDIR/<yyyymmdd>. Log files are named
using the convention <process-name><pid><host><hhmmss>,
where hhmmss is the time when the file was opened. Log files for
external processes (on LS) are located in the $LDAD_EXTERNAL_LOGDIR
directory. Please refer to Chapter
10 for more details on log files.
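The naming convention above can be sketched as a small helper. The convention as written does not show any separator between the components, so the separator is left as a parameter that defaults to none; that choice, and the example values, are assumptions.

```python
from datetime import datetime
from pathlib import PurePosixPath


def log_file_path(log_dir, process_name, pid, host, opened, sep=""):
    """Build <log_dir>/<yyyymmdd>/<process-name><pid><host><hhmmss>.

    sep is inserted between the name components; the manual's convention
    shows none, so the default is an empty string (an assumption).
    """
    day_dir = opened.strftime("%Y%m%d")
    name = sep.join([process_name, str(pid), host, opened.strftime("%H%M%S")])
    return PurePosixPath(log_dir) / day_dir / name


if __name__ == "__main__":
    # "/logs" stands in for $LDAD_INTERNAL_LOGDIR; pid and host are made up.
    opened = datetime(2000, 11, 30, 7, 8, 9)
    print(log_file_path("/logs", "routerLdadDecoder", 1234, "ds1", opened))
```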
4.2.2 Data Routing/Flow
The data flow diagram in Figure 4.1 shows all the steps that are
performed in the acquiring, decoding, storing, and quality control
processing of a mesonet base station file through the LDAD Gateway.
This flow is valid only for mesonet type files but the concept holds
true for all acquired datasets.
- The base station via a dedicated line puts the data onto the LDAD
Server, in directory /data/Incoming.
- newLDADdataNotification monitors /data/Incoming. As soon as a new
data file arrives, a notification message is sent from
newLDADdataNotification to CO_serv.
- CO_serv sends a message across the firewall to the listener
process on the internal side.
- listener rcps the new data file from the external side to the
/data/fxa/LDAD/tmp directory, in the process adding a date-time string
to the file name. For security reasons, data files on the external
side can get across the firewall to the internal side only via rcp
from the internal side.
- If the dataset requires some pre-processing, the appropriate
pre-processor is notified and it filters and reformats the datafile to
the acceptable form. Preprocessors use /awips/ldad/ldadtmp for scratch
space, then write the final file to $LDAD_RAWDATA (/data/fxa/LDAD/Raw).
If no preprocessor is specified in /awips/ldad/data/LDADinfo.txt, then
the file is moved directly to $LDAD_RAWDATA.
- The listener sends a notification to the LDAD CommsRouter, which
forwards it to the DataController. Assuming it matches the decoder
pattern, the routerLdadDecoder process gets awakened, reads in raw
data files, and decodes them, storing the files in
$LDAD_DECODED_DATA.
- Once the complete data file has been decoded, the decoder alerts
the storage processes (routerShefEncoder and routerStoreNetcdf).
- The storage processes format and store files for display or
further processing.
- Once data are stored, notifications are sent to the Notification
Server that new data are available. The QC routines will read in the
newly created netCDF data file and quality control the data.
- The QC process creates a QCed datafile and QC summary messages
that are stored in separate directories.
- The QC summary messages also are stored in the text database.
- This process triggers the LDAD Database Trigger function that
retrieves the data and sends a message to the textdbNotify.pl
program.
- textdbNotify.pl cleans up the data, reformats it, and notifies
listener that a file for dissemination is ready for transfer.
- listener rcps the data back over to the external side and
notifies CO_serv that a new file is available for dissemination.
- The hmIngest program (a component of the hmFactor Server) on the
external side receives a notification from CO_serv and proceeds to
disseminate it to clients.
Figure 4.1 LDAD Data Flow Diagram
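The preprocessing step in the flow above (run a preprocessor if one is specified, otherwise move the file straight to the raw-data directory) can be sketched as follows. The function and the callable preprocessor are hypothetical stand-ins; the real system runs separate preprocessor programs selected through LDADinfo.txt.

```python
import shutil
from pathlib import Path


def route_to_raw(tmp_file, raw_dir, preprocessor=None):
    """Deliver a file from the tmp directory to the raw-data directory.

    preprocessor is a hypothetical callable that takes the file's text and
    returns reformatted text; None means no preprocessor is specified.
    """
    tmp_file = Path(tmp_file)
    target = Path(raw_dir) / tmp_file.name
    if preprocessor is None:
        # No preprocessor: the file is moved directly to the raw directory.
        shutil.move(str(tmp_file), str(target))
    else:
        # Preprocessor: write the reformatted result, then drop the original.
        target.write_text(preprocessor(tmp_file.read_text()))
        tmp_file.unlink()
    return target
```

For example, `route_to_raw(path, raw_dir, lambda text: text.replace(";", ","))` would convert a semicolon-delimited file to comma-separated form on its way to the raw directory.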
A Specific Example: LARC Acquisition
- Using the Collection/Dissemination function from the workstation,
a user can make a one-time or recurring request for LARC data.
In either case, the LdadScheduler process on the workstation contacts
the ldadServer on the ds. It uses tell_co to retrieve information
about the LARC station.
- tell_co retrieves all necessary information regarding collection
and storage of data from $LDAD_INTERNAL_DATA/gageinfo and
gaugeDescriptions. User information is retrieved from files usrpro
and userDescriptions.
- tell_co sends a "get LARC Message" IPC message to listener.
- listener opens a TCP/IP socket to CO_serv requesting LARC
data transfer.
- CO_serv monitors its own socket, waiting for connections. When
the LARC request arrives, it builds a specific session file, based
on the session file template listed in gageinfo (e.g., HANDAR_sess_1),
then uses it to dial the LARC gauge.
- CO_serv stores the data in $LDAD_EXTERNAL_DATA/Incoming.
- CO_serv copies the data to the $LDAD_EXTERNAL_PUBLIC/hydro
directories and then passes the notification to listener. At this
point the scheduled processes are completed.
- listener retrieves the LARC file using rcp and sends a
notification when finished.
- listener informs the LDAD scheduler that the dataset is available
and the GUI displays it to the user.
- The dataset is also passed to the decoder/storage suite, via
which the data are stored in netCDF files and SHEF-encoded to be
placed in the hydro database. (See Section
4.3.3.1, below.) Precip reports from gauges so equipped can be
displayed on D2D; these and river stages are also available through
the hydro apps.
Figure 4.2 LARC Data Acquisition
4.2.3 Text Product Dissemination
The steps listed here describe how data get into the external
database for access by remote users via the bulletin board service
(BBS):
- When a product is stored in the text database (Informix), the
trigger mechanism looks to see if it is in the list for dissemination
(as set up during install).
- If the new product is in the list, the trigger process stores it
in the $FXA_DATA/trigger directory.
- Trigger calls textdbNotify.pl providing as arguments the file name
and file type (from the trigger configuration file).
- textdbNotify.pl reads the product and performs some checks and
logging.
- textdbNotify.pl sends a message to listener that the file is
ready to be sent.
- listener registers itself through a TCP/IP socket connection with
sendLDADnotification.pl and waits to receive notification for the
requested data. When it does, it rcps the file over to the
$LDAD_EXTERNAL_DATA_REPOSITORY.
- CO_serv initially sets up a TCP/IP socket and waits for
connections. listener passes the notification to CO_serv.
- CO_serv sends a message to the hmIngest process that the
dataset is available.
- hmIngest massages the text file and places it into the
$LDAD_EXTERNAL_PUBLIC/nwswwas directory (as directed by pollForData.conf), with name
<CCCNNNXXX>1.txt. Any previous version(s) of this file are renamed
<CCCNNNXXX>2.txt, etc., with 10 versions of each maintained in the
directory. (The limit of 10 is set within the hmIngest postprocessing
code, and is not site-configurable.)
- The requested data are now in the database. The listener also
receives the file name and address of the text product.
- tmain, the BBS Interface, accesses the file based on the client
request.
Figure 4.3 WWA Dissemination Flow
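The version numbering that hmIngest applies to disseminated text files can be sketched as below. This is only an illustration of the renaming scheme described in the list above, not the hmIngest code; the function name is hypothetical.

```python
from pathlib import Path

MAX_VERSIONS = 10  # limit stated in the text


def store_new_version(directory, product_id, text, max_versions=MAX_VERSIONS):
    """Store text as <product_id>1.txt, shifting older versions up by one."""
    directory = Path(directory)
    # Drop the oldest version if the full set of versions already exists.
    oldest = directory / f"{product_id}{max_versions}.txt"
    if oldest.exists():
        oldest.unlink()
    # Rename N -> N+1, highest number first, so nothing is overwritten.
    for n in range(max_versions - 1, 0, -1):
        current = directory / f"{product_id}{n}.txt"
        if current.exists():
            current.rename(directory / f"{product_id}{n + 1}.txt")
    newest = directory / f"{product_id}1.txt"
    newest.write_text(text)
    return newest
```

After twelve calls for the same product, the directory holds versions 1 through 10, with the most recent text in version 1 and the two oldest texts discarded.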
4.2.4 Non-Text Dissemination
In addition to text, other datasets can be sent to the external
system to make them available to outside users.
- On a timed basis, pollForData.pl checks files on the ds to see
if file(s) of interest have been updated.
- When a file that needs to be sent is found, sendLDADnotification.pl is
executed.
- On the ls, the hmIngest process (Java) gets the notification,
pulls the file across the firewall, and stores it in the directory
specified in pollForData.conf.
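A timed check for updated files, as in the first step above, might look like the following sketch. Detecting an update by comparing modification times is an assumption; this section does not say how pollForData.pl decides that a file has changed.

```python
import os


def find_updated(paths, last_seen):
    """Return the paths whose modification time changed since the last poll.

    last_seen maps path -> modification time recorded at the previous poll
    and is updated in place.
    """
    updated = []
    for path in paths:
        try:
            mtime = os.stat(path).st_mtime
        except FileNotFoundError:
            continue  # file of interest does not exist yet
        if last_seen.get(path) != mtime:
            last_seen[path] = mtime
            updated.append(path)
    return updated
```

A caller would invoke `find_updated` on each polling interval and trigger a notification for every path it returns.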
4.3 Data Controller Processes
4.3.1 Overview
The data controller processes are responsible for generating the
appropriate messages to the decoder and communication processes. Some
processes are standalone: they do not require messages but perform
their functions independently and on a schedule. There are currently 5
data controllers in the system, each using one or more common LDAD
processes:
- LARC Controller
- One Time
- Scheduled
- CSV Formatted Data
- Trigger Controller
4.3.2 Interaction with Decoders
When the DS and LS systems start up, all LDAD processes including the
LDAD decoder/storage suite are started up by the main start script
startLDAD.csh on the DS system.
4.3.3 Decoding and Storage
The figures presented throughout this section depict the high-level
process and data flow from ingest and communication, through the data
controllers, to decoding and storage, and finally to
dissemination. Processes are shown in boxes. Servers are
labeled using the two-letter system designator. Where applicable, the
primary or secondary is also noted. The arrows depict primary data
flow. Log files are depicted on the server where the logging process
resides. As described in Section 4.2.1, all internal and external
log files reside in the $LDAD_INTERNAL_LOGDIR/<yyyymmdd> and
$LDAD_EXTERNAL_LOGDIR directories, respectively. Generated data
files and accessed configuration files are depicted as well.
4.3.3.1 LARC Decoding
- newLDADdataNotification reads data periodically from LDAD Incoming
directory - $LDAD_EXTERNAL_DATA/Incoming.
- It creates and sends an IPC message to CO_serv (TCP/IP socket)
informing of the availability of LARC files.
- CO_serv sets up a TCP/IP socket and waits for
connections. CO_serv passes the notification to listener.
- listener sets up a TCP/IP socket connection, binds its local
address so that the client (CO_serv) can send to it, and waits for
the connection from the client process. When the whole message is
read, it retrieves the LARC file by rcp, writes it to $LDAD_DATA/tmp,
and sends a notification.
- listener executes preProcessLDAD.pl to preprocess LARC data report
files. CO_serv passes a fully qualified data file, the host name
used by the sendNotification subroutine, the port number of the
process that requires notification upon completion of the
preprocessing operation, and necessary information about the LARC data
file format.
- preProcessLDAD.pl generates the date and time string from the
filename and appends it to the file name. Upon completion of the
preprocessing, LARC files are written to $LDAD_RAWDATA.
- routerLdadDecoder goes to the LDAD directory, searches for
preprocessed LARC files, decodes them, and writes them to
$LDAD_DECODED_DATA.
- routerLdadDecoder notifies the storage processes.
- routerStoreNetcdf and routerShefEncoder store the decoded files
to netCDF and SHEF files, respectively.
- Storage sends a message to the Notification Server that new data
are available.
Figure 4.4 LARC Data Decoder
4.3.3.2 Any CSV Mesonet
- newLDADdataNotification reads data periodically from LDAD Incoming
directory - $LDAD_EXTERNAL_DATA/Incoming.
- It creates and sends an IPC message to CO_serv (TCP/IP socket)
informing of the availability of mesonet files.
- CO_serv sets up a TCP/IP socket and waits for
connections. CO_serv passes the notification to listener.
- listener sets up a TCP/IP socket connection, binds its local
address so that the client (CO_serv) can send to it and waits for
the connection from the client process. When the whole message is
read, it retrieves the mesonet file by rcp, writes it to
$LDAD_DATA/tmp, and sends a notification.
- listener executes preProcessLDAD.pl to preprocess mesonet data
report files. CO_serv passes a fully qualified data file, the host
name used by the sendNotification subroutine, the port number of the
process that requires notification upon completion of the
preprocessing operation, and necessary information about the mesonet
data file format.
- preProcessLDAD.pl generates the date and time string from the
filename and appends it to the file name. Upon completion of the
preprocessing, mesonet files are written to $LDAD_RAWDATA.
- routerLdadDecoder goes to the LDAD directory, searches for
preprocessed mesonet files, decodes them, and writes them to
$LDAD_DECODED_DATA.
- routerLdadDecoder notifies the storage processes.
- routerStoreNetcdf stores the decoded files to netCDF files.
- Storage sends a message to the FX Notification Server that new
data are available.
Figure 4.5 CSV Mesonet Data Controller
4.3.3.3 ALERT
Same as Section 4.3.3.2, with the addition of SHEF encoding of the
data.
4.3.3.4 IFLOWS
Same as Section 4.3.3.2, with the addition of SHEF encoding of the
data.
4.3.3.5 SUA
TBD
4.4 Quality Control
LDAD observations are quality controlled by the AWIPS Quality
Control and Monitoring System (QCMS) which, in turn, relies on the
MAPS Surface Assimilation System (MSAS) to perform automated quality
control checks. MSAS/QCMS is a multi-scheduled system, with four main
subprocesses: the model ingest cycle, the subhourly QC cycle, the
hourly QC/analysis cycle, and the daily QC summary generators. Each
subprocess runs independently based on crontab entries on AS2: the
model ingest cycle runs twice a day; the subhourly QC runs every 5
minutes; the hourly QC/analysis cycle runs immediately after the xx:18
subhourly run; and the QC summary programs run near 00Z.
The QCMS for AWIPS Build 5.1.1 is a partial implementation of the
overall requirements for the quality control of surface observations.
The subhourly processing consists of the application of validity,
internal consistency, and temporal consistency checks to LDAD mesonet
observations of sea-level pressure, temperature, dewpoint temperature,
wind, station pressure, altimeter setting, pressure change, relative
humidity, visibility, and precipitation; the hourly processing
consists of the application of validity, internal consistency,
temporal consistency, and spatial consistency checks to LDAD mesonet
and NOAAPORT observations of sea-level pressure, temperature, wind,
and dewpoint temperature.
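To make the three subhourly check types concrete, the sketch below shows one simple form each could take. These are generic illustrations, not the MSAS algorithms; the function names are hypothetical, and any limits passed in are chosen by the caller rather than taken from MSAS.

```python
def validity_check(value, lower, upper):
    """Pass if the observation lies within the allowed range."""
    return lower <= value <= upper


def internal_consistency_check(temperature, dewpoint):
    """Pass if related variables in one report agree physically.

    The example relation used here is that dewpoint should not exceed the
    air temperature.
    """
    return dewpoint <= temperature


def temporal_consistency_check(previous, current, max_change):
    """Pass if the change since the previous observation is not too large."""
    return abs(current - previous) <= max_change


if __name__ == "__main__":
    # All limits below are made-up illustration values, not MSAS tolerances.
    print(validity_check(41.5, -60.0, 130.0))
    print(internal_consistency_check(temperature=41.5, dewpoint=38.0))
    print(temporal_consistency_check(previous=41.5, current=43.0, max_change=10.0))
```

A spatial consistency check, which the hourly cycle adds, would additionally compare an observation against values from neighboring stations and is not sketched here.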
WFO personnel can override the results of the automated QC
checks. See Sections 9.6 and 12.1 of this document for more
information on the subjective intervention capabilities.
The QCMS also calculates hourly, daily, weekly, and monthly
statistics on the frequency and magnitude of the observational errors
encountered for sea-level pressure, temperature, dewpoint, and surface
winds. It processes both NOAAPORT and LDAD mesonet
observations. Messages containing these statistics are stored in the
text database.
See Section 10 of the D2D User's Guide for more information on the
QCMS; see Section 2 for more information on MSAS.
Figure 4.6 illustrates the overall data flow in MSAS.
Figure 4.6 MSAS Data Flow
4.5 Notification
Within the LDAD system, there are 3 types of notification processes:
- Inter-Gateway Communications (IGC)
- Standard IPC
- FX-IPC
The IGC notification process is used by listener and CO_serv
communications; it passes through the firewall using the firewall's
TCP proxy. The IGC uses standard IPC calls for its processes.
All LDAD notifications within either the internal or external
systems use standard IPC calls.
FX has built a package that sits on top of standard IPC that is
used for its Notification Server. For communication between the LDAD
Storage processes and FX Notification Server, we use FX's dmNotify IPC
package.
4.6 Purging
Scour is the standard FX purging process. It purges files based on age.
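Age-based purging of the kind described here can be sketched as follows. This is not the Scour implementation; the function name, the use of modification time as the measure of age, and the restriction to regular files in a single directory are all assumptions.

```python
import os
import time


def purge_older_than(directory, max_age_seconds, now=None):
    """Delete regular files in directory older than max_age_seconds.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    # Snapshot the directory listing before deleting anything.
    with os.scandir(directory) as it:
        entries = list(it)
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue  # leave subdirectories and symlinks alone
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

The process running such a purge needs write permission on the directory, which is why group membership matters for the purging described in the following subsections.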
4.6.1 Internal, AS, DS
Purging for the AS and DS machines is performed by Scour. All internal
directories are group readable and writable. Because the ldad account
is a member of the fxalpha group, all its data and log files can be
purged by this Scour process.
4.6.2 External, LS
Purging for the LS is performed by Scour.
Last updated: 30 Nov 00 AWIPS 5.1.1