This site administrator's guide provides information needed to run and troubleshoot the Message Handling System (MHS) software at sites that are connected to the AWIPS Wide Area Network (WAN). The WAN will replace the Automation of Field Operations and Services (AFOS) communications system now used by the NWS.
The MHS will eventually provide the connectivity needed for AWIPS sites to exchange e-mail, data, and other types of messages. For Build 4.0, the MHS supports the transmission of products created on the Text Workstation, including forecasts, watches, warnings, and administrative messages. These products will also be transmitted via the AFOS system, so it may be difficult to tell whether or not the MHS is working. The only messages that will travel exclusively over the WAN are administrative messages sent from the NCF to field sites.
The MHS is composed of software developed by three groups: ISOCOR, PRC, and FSL. The ISOCOR software is a commercial off-the-shelf (COTS) package responsible for making connections between sites and for sending messages over the WAN. The PRC layer provides an interface to the ISOCOR software and performs the following functions: it transforms messages into the format required by the ISOCOR software, transforms address information into the ISOCOR format, and provides three priority queues to handle messages of different priorities. For Build 4.0, the FSL software provides an interface between the PRC layer and the Text Workstation software, allowing products created by forecasters at the WFOs to be transmitted via the WAN. The FSL software also routes incoming messages to the Text Database, so the messages can be viewed on the Text Workstations.
The PRC and ISOCOR software are both file-based. That is, instead of passing incoming and outgoing messages to each other through an interprocess communication (IPC) mechanism, both write the messages to disk and send notifications to each other. The PRC software writes outgoing message files to /data/x400/msgtbl, and its incoming files to /mhs/msg/rcvq. The ISOCOR software writes outgoing messages to /data/x400/sx400/input, and incoming messages to /data/x400/rx400/output. The ISOCOR software deletes its files as soon as it is done with them, but the PRC software does not: the files in the PRC directories may be accessed by user applications, and need to be retained until the user is finished with them.
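Because all four directories are ordinary disk directories, a quick file count can give a rough picture of message flow. The following is a minimal sketch, not part of the MHS software: the directory paths are the ones named above, while the labels and the function names (count_files, spool_report) are assumptions made for illustration.

```python
import os

# Directory paths are taken from this guide; the labels are this sketch's own.
SPOOL_DIRS = {
    "PRC outgoing": "/data/x400/msgtbl",
    "PRC incoming": "/mhs/msg/rcvq",
    "ISOCOR outgoing": "/data/x400/sx400/input",
    "ISOCOR incoming": "/data/x400/rx400/output",
}


def count_files(directory):
    """Return the number of regular files in directory, or None if it cannot be read."""
    try:
        names = os.listdir(directory)
    except OSError:
        return None
    return sum(1 for name in names if os.path.isfile(os.path.join(directory, name)))


def spool_report(dirs=SPOOL_DIRS):
    """Map each label to its file count (None means the directory was unreadable)."""
    return {label: count_files(path) for label, path in dirs.items()}
```

Calling spool_report() on the ds machine returns one count per directory. Since the ISOCOR software deletes its files when it is done with them, a count that keeps growing in an ISOCOR directory might be worth a closer look; files in the PRC directories are retained by design, so a large count there is not by itself a sign of trouble.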
The ISOCOR, PRC, and FSL software should all run on the ds machine.
The MHS software is automatically started when the ds system comes up. The MHS software will also be restarted any time the data ingest system is restarted, for example when you use the "Restart All" button on the Data Ingest Restart menu. It is not currently possible to restart the MHS software by itself from this menu.
In Build 4.0, if you want to start the MHS software by itself, use the following commands:
These commands are useful for restarting the MHS after a problem has been detected. See Section 3.0, "Trouble-shooting," for more information on system errors that may occur.
After the MHS software has been started, the following processes should be running:
PRC has developed a utility program, /awips/ops/bin/show_x400msg_procs, that shows information about the PRC processes while they are running. It lists all of the currently running processes, and for each msgreq_svr and msgrcv_svr process it also shows the process ID, user ID, group ID, message queue length, and status (UP or DOWN).
The following commands can be used to stop the MHS processes:
As mentioned in the Overview section, it may be difficult to tell whether or not the MHS software is working. However, if you suspect that there is a problem (for example, you haven't received an administrative message you are expecting from the NCF), there are several places to look to see what's going on. All of the processes involved write to log files. There are also several utility programs that can be used to determine the state of the system.
Log files are generally the first things to check when you want to see if the system is functioning properly. The log files are useful for two reasons:
The following sections describe the log files associated with each module (ISOCOR, PRC, and FSL), and what you can expect to find in each log file.
The ISOCOR software writes to a log file called /usr/x400mail/x400.log. This log file is in binary format, so you have to run a program to look at it:
Each line is tagged by day of month and time. For each message it handles, the MTA writes out who the originator is, where the message file is located, where the message is being sent, and some message id information from the message header. If there are problems with any of this information (can't find the message file, error encountered when sending message to the recipient, etc.), the MTA will write an appropriate error message to the log file. (The -e option prints only error and alarm messages.)
If there are errors reported in the x400.log file, they should be reported to the NCF. These errors are usually the result of one of three things:
The PRC software also writes to log files. The log files are located in the /awips/ops/logs/<hostname> directory. A cron script monitors the size of these files. If one exceeds 5 MB, it is renamed <filename>.old and a new one is started.
Two MC/ServiceGuard-related logs, mcStatus.log and mcTrace.log, are also found in this directory.
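The size-based rotation performed by the cron script can be sketched as follows. This is an illustration only, not the actual cron script: the function name rotate_if_large is hypothetical, and treating 5 MB as 5 * 1024 * 1024 bytes is an assumption.

```python
import os

# 5 MB threshold from this guide; the exact byte count is an assumption.
MAX_BYTES = 5 * 1024 * 1024


def rotate_if_large(path, max_bytes=MAX_BYTES):
    """Rename path to path + '.old' and start an empty file if path exceeds max_bytes.

    Returns True if the file was rotated, False otherwise.
    """
    if not os.path.isfile(path) or os.path.getsize(path) <= max_bytes:
        return False
    os.replace(path, path + ".old")  # overwrites any earlier .old copy
    open(path, "w").close()          # start a new, empty log file
    return True
```

One consequence of this scheme, if the real script behaves the same way, is that only one previous generation of each log is kept, so older messages may already be gone by the time a problem is investigated.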
If there seems to be a problem with the MHS, and there are error messages in the PRC log files, try to restart the PRC system using the stop_x400msg_procs and start_x400msg_procs scripts described in sections 2.1 and 2.3. If this doesn't help, contact the NCF and let them know about the error messages.
The FSL message handling server (MhsServer) writes to a log file in /data/logs/fxa/<date>/MhsServer<processId><hostname><time>. The MhsServer log file contains a trace of every message that is sent to the PRC software, and also contains reports of any run-time errors that occur. The most frequent run-time error encountered by the MhsServer is that it can't run the program that interfaces to the PRC software (msg_send). This usually means the msg_send program isn't in the expected location, or has the wrong permissions. If you see this type of error in the MhsServer log file, notify the NCF.
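Finding the most recent MhsServer log and picking out the lines that mention msg_send can be sketched as below. The log root /data/logs/fxa and the MhsServer file-name prefix come from this guide; the function names are hypothetical. The exact wording of the error message is not given here, so the sketch simply searches for the program name, and normal trace lines may match as well.

```python
import glob
import os


def mhs_server_logs(log_root="/data/logs/fxa"):
    """Return MhsServer log paths under log_root/<date>/, most recently modified first."""
    paths = glob.glob(os.path.join(log_root, "*", "MhsServer*"))
    return sorted(paths, key=os.path.getmtime, reverse=True)


def lines_mentioning(path, needle="msg_send"):
    """Return the lines of the log file at path that contain needle."""
    with open(path, errors="replace") as log:
        return [line.rstrip("\n") for line in log if needle in line]
```

A typical use would be to take the first path returned by mhs_server_logs() and pass it to lines_mentioning(), then read the matching lines by eye before deciding whether to notify the NCF.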
There are several utility programs you can use for troubleshooting the MHS software. Some of them have been mentioned already.
There are three table files used by the Message Handling System.
This file is located in $FXA_HOME/data, and contains a list of the sites that are currently connected to the WAN. This file is used by the Text Workstation code to translate the AFOS site IDs entered by the user into the site IDs expected by the PRC software.
There are two sections in this file. The first section identifies the individual sites: the three-character AFOS ID for each site is listed, followed by the PRC site ID. The second section lists the possible addresses the user can enter in the text editor header block. These include ALL, DEF, and the regional addresses C, E, S, and W. In awipsSites.txt, all of these addresses except DEF are followed by the "ALL" PRC address, which results in messages being sent to all sites currently on the WAN. For Build 4.0, messages addressed to DEF also go to all sites on the WAN.
The awipsSites.txt file currently contains the addresses of the first 21 sites to be connected to the WAN. awipsSites.txt also contains the site IDs for the systems located at NWS headquarters, PRC, and FSL.
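The address translation described above amounts to a simple table lookup. The sketch below assumes a layout this guide does not confirm: one entry per line, two whitespace-separated fields (AFOS address, then PRC address), with '#' starting a comment. The function names are hypothetical, and the site IDs in the usage note are placeholders, not real site IDs.

```python
def load_site_table(text):
    """Build a dict mapping AFOS address -> PRC address from table text.

    Assumed layout (not confirmed by this guide): two whitespace-separated
    fields per line, with '#' starting a comment.
    """
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            table[fields[0].upper()] = fields[1]
    return table


def translate(address, table):
    """Return the PRC address for an AFOS address, or None if it is not listed."""
    return table.get(address.strip().upper())
```

For example, with the placeholder text "AAA PRC_AAA" and "C ALL" on separate lines, translate("aaa", table) returns "PRC_AAA" and translate("C", table) returns "ALL"; an address that is not in the table returns None.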
This file is located in the $PROJECT/data/mhs directory. $PROJECT points to the directory that contains PRC Message Handling System code (currently /awips/ops).
"rcvHandler.tbl" contains information needed by the message receiving software developed at PRC. Each line in the file contains a code number, the directory where the incoming message file can be found, a flag indicating the type of handling routine (currently limited to SYS), and the command or list of commands to be executed when a message is received.
This file is located in $FXA_HOME/data. It contains a list of AFOS product IDs, and the corresponding WMO IDs and AWIPS site IDs needed by the Text Workstation to complete the AWIPS product header. This table currently contains product IDs for Guidance and Public Forecast products, as well as Administrative products. Any product created and stored using the AWIPS text editor will be sent out on the WAN if it is in this table.
Note: This table will need to be updated before Build 4.0 goes into the field, since not all products were in the table at the time the Build 4.0 software was completed and handed off. The current afos2awips.txt table is limited to information at the SOC Change Notices Web site (http://www.nws.noaa.gov/cgi-bin/chg_show.pl?fn=CID97-3.TXT).
The following environment variables are used by the Message Handling System:
These environment variables are defined in $FXA_HOME/.environs, and are set when you log in as user "fxa".
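A quick way to confirm that a login session has the expected variables is sketched below. It checks only FXA_HOME and PROJECT, the two variables this guide refers to by name in the table file descriptions; whether both are set by .environs is an assumption, and the function name missing_vars is hypothetical.

```python
import os

# Only the variables named elsewhere in this guide; extend as needed.
REQUIRED = ("FXA_HOME", "PROJECT")


def missing_vars(names=REQUIRED, env=None):
    """Return the names from `names` that are unset or empty in env (default: os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

An empty list from missing_vars() means every checked variable is set; any names returned suggest the session was not started as user "fxa" or that .environs was not read.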