Prototype System Plan

DRAFT

 

 

 

I. Introduction

 

A. Objectives

 

The National Weather Service (NWS) is embarking on a new philosophy of operations that emphasizes collaboration among the various field offices and flexibility in addressing the weather situation and producing forecast information. To support this new concept of operations, the existing information systems must be evolved (or perhaps revolutionized) to improve performance, open the way to new high-resolution data sets, and integrate capabilities that exist in various independent applications.

 

In the near term, NWS plans to migrate AWIPS and other systems to Linux-based computers. This will allow them to refresh their hardware infrastructure while decreasing maintenance costs and improving performance. This work should be expedited to the extent possible, as many components are already end-of-life. A roadmap was recently presented (Feb. 04) to complete this migration to Linux. This prototype project would be complementary in that it will provide a means to test the approach laid out in the roadmap before implementing this nationwide.

 

An important goal in the near term is addressing performance issues. The AWIPS migration to Linux has so far uncovered some performance problems, particularly for the Graphical Forecast Editor. These have come about because the px computers are being tasked to run applications that were not part of the original system architecture. There is also a desire to make additional performance gains in displaying critical weather information, as AWIPS has yet to meet the performance criteria originally specified for it.

 

The paradigm for real-time data in AWIPS has been to acquire the data, process it as necessary, and store it locally on site. With the ever-expanding volume of data, other approaches must be investigated. A promising new approach is a distributed data paradigm, in which the data remain at the location where they were generated or at a suitable serving site. A catalogue of available data is maintained at each site in real time. A site can then request the data, which are served from the storage site to the requesting site. This paradigm assumes sufficient network bandwidth to deliver the data (or a representation of the data) within an acceptable time frame. Such an approach is highly scalable compared with the current approach, which requires increasing broadcast bandwidth whenever new data are added. There will still be a need to store some data locally, particularly data used for short-fused warnings and data that would be requested by most sites. The implication is that some analysis will need to be done to determine which data sets are good candidates for the new distributed data paradigm.
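The request path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Catalog`, `CatalogEntry`, `fetch`, and `demo_remote_fetch` names, the data set names, and the site name are all hypothetical, and the network request is replaced by a placeholder function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One catalogue record (hypothetical schema)."""
    dataset: str
    site: str  # site that stores and serves the data set


class Catalog:
    """In-memory stand-in for the catalogue maintained at each site."""

    def __init__(self):
        self._entries = {}

    def publish(self, entry):
        # Called when a serving site announces a new or updated data set.
        self._entries[entry.dataset] = entry

    def locate(self, dataset):
        # Returns None when no site has announced the data set.
        return self._entries.get(dataset)


def fetch(dataset, catalog, local_store, remote_fetch):
    """Use the local copy if one exists, otherwise pull from the serving site."""
    if dataset in local_store:
        return local_store[dataset]
    entry = catalog.locate(dataset)
    if entry is None:
        raise KeyError(f"{dataset} is not listed in the catalogue")
    return remote_fetch(entry.site, dataset)


def demo_remote_fetch(site, dataset):
    # Placeholder for a network request to the serving site (assumption).
    return f"{dataset} served by {site}".encode()


catalog = Catalog()
catalog.publish(CatalogEntry("hires_model_grid", "regional-server"))

# Data needed for short-fused warnings stays on site in this sketch.
local_store = {"radar_reflectivity": b"stored on site"}

local_result = fetch("radar_reflectivity", catalog, local_store, demo_remote_fetch)
remote_result = fetch("hires_model_grid", catalog, local_store, demo_remote_fetch)
```

The caller uses the same `fetch` call in both cases, so deciding which data sets stay local becomes a configuration question rather than an application change.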

 

A number of special purpose applications have been developed to meet specific needs.  The NMAP application is used by national centers and OCONUS WFOs to generate graphic products for end users.  The generation is done as an overlay while viewing a variety of weather information. The FX-C application allows offices to collaborate interactively while viewing weather data. It can also be used to generate a variety of graphic products for briefings and web sites. Forecasters at WFOs use the Graphical Forecast Editor to modify gridded data that become the basis for all of their forecast products. Currently these run as separate applications in addition to the basic AWIPS system. This is inefficient and limits the flexibility of where tasks can be done; only sites that have the NMAP (or FX-C) infrastructure can produce manual graphics for example. Current systems/applications do not support the new concept of operations very well. A longer-term objective of this project (given continued resources and commitment) will be to put in place a revised display services infrastructure that allows the integration of these capabilities.

 

B. Opportunity

 

NWS is planning to migrate all of AWIPS to Linux. The workstations have already been replaced by Linux-based systems. The server replacement is scheduled for the next several years. This project would allow an earlier checkout of the server migration. There is also an opportunity to explore a few system architecture changes as part of the migration to Linux. A network attached storage device could potentially simplify the overall architecture, make the system more expandable, and improve performance. This needs to be investigated in a prototype/operational environment to see whether the benefits justify the cost. This is precisely what this prototype project would accomplish.

 

Beyond the short-term migration opportunities, the prototype will also serve to demonstrate some longer-term directions. For example, it could host a prototype workstation that has limited functions but is based on a revised design that could be extended to integrate the capabilities represented in the current standalone applications discussed earlier.

 

C. Approach

 

FSL has a long history of exploratory development, generally using a "rapid prototyping" approach, in which systems are built and tested in operational settings. This method closely parallels the widely-known Spiral Model of development, which features incremental development and testing, with requirements and objectives refined based on the experience gained in each cycle. This allows for concepts to be tested rapidly before operational implementation on a nation-wide scale. It also recognizes that not all ideas work in operations. This provides a means to refine or discard those capabilities that don't demonstrate their worth in an operational setting.

 

 

 

 

II. System Overview

 

A. Architecture Alternatives

 

The proposed hardware architecture closely resembles what FSL discussed with NGIT and subsequently presented to the AWIPS SET (System Engineering Team). The target architecture consists entirely of Linux machines interconnected by a Gigabit Ethernet network. One HP machine running HP-UX 10.20 may be retained temporarily to facilitate transition to the new architecture. This machine will make it possible to interconnect to the current field infrastructure (e.g. WAN and Message Handling System) during system evaluation. High system availability is addressed by internal redundancy for the NAS (Network Attached Storage) and fail-over software that allows processes to execute on backup machines. The NAS is the primary storage for all meteorological data that needs to reside locally (i.e. WFO). The current AWIPS system will be retained in the forecast office during the evaluation of the prototype system.

 

An extension of this architecture introduces new technology that is not part of the architecture presented to the AWIPS SET. It includes a machine that serves as a kick-start server and DHCP daemon to simplify software upgrades. A 64-bit Linux machine replaces one of the 32-bit machines to handle access to very large databases and support scientific computations. A 64-bit version of the Linux operating system will be needed for this machine. A high performance cluster may be added to support local modeling. The architecture features a second separate Ethernet between the NAS and some of the Linux machines to reduce network contention between incoming data and processed data.

 

B. Hardware Components

 

The key component of the system is the NAS, since it is the repository for all processed data. This device makes it possible for all of the Linux machines to access the meteorological database. The NAS must be highly reliable, with dual power supplies, dual processors, and RAID. The NAS is dedicated to servicing requests for data storage and retrieval and does not process any meteorological data. The Linux machines are commodity processors that are readily available from various vendors. The workstations are dual-processor 32-bit machines with the highest available clock rate and up to 2 GB of memory. Figure 1 illustrates the basic system architecture and its components.

 

Local Data Acquisition and Distribution (LDAD) and the Graphical Forecast Editor (GFE) have each been given dedicated machines to reduce resource contention with other essential AWIPS processing functions.

    

C. Expandability

                        

The Ethernet architecture makes it easy to add additional processors and storage devices. The network switch must have a sufficient number of ports to support a large number of processors and must be expandable to add additional ports. A second Gigabit Ethernet can easily be added to increase the available bandwidth between the Linux machines and to provide redundancy.

 

Furthermore, the NAS makes it possible to provide full access to all meteorological data from any of the machines on the network. This is significant for system fail-over and for supporting several applications that need access to a variety of data. A second NAS can be added if the capacity of the original NAS is insufficient. Also, the new software architecture will make it possible to access external databases directly, so that not all data that a forecaster may want to access needs to be on the local NAS.

 

D. System Evolution

 

The large anticipated increase in the amount of data (due to higher-resolution models and improved observations) will put significant demands on all parts of the system, from network to data storage and processing. It is envisioned that much of this data could be stored at various remote locations, possibly regional databases. This would reduce the amount of data that would have to be transmitted and stored at each WFO. However, the WAN bandwidth would need to be upgraded significantly to allow adequate response at the local offices. The concept of regional or distributed databases also makes it possible to reassign forecast responsibilities (such as for operational backup).

 

The architecture can readily adapt to different processing requirements by redistributing processes to different machines and adding machines to handle site-specific requirements (such as for the National Centers or RFCs). It is conceivable that a local high performance cluster could be added to perform specific forecast modeling.

 

 

Prototype Hardware

         

Figure 1. AWIPS Linux Architecture

 

 

 

III. Development

 

A.  Workstation Transformation

 

       a. Objectives - Develop a revised workstation design that provides services and appropriate APIs and layers to allow integration of NMAP, FX-C, and ultimately GFE capabilities in an integrated system. This entails a redesign of the current AWIPS API.

 

       b. Hardware Architecture - The workstation will use a standard off-the-shelf 32-bit computer with two CPUs, a 24-bit video card, and a Gigabit Ethernet card. The video memory will be large enough to handle a large number of animation frames in multiple windows. The workstation will support three monitors, each driven by its own X-server.

 

       c. Software architecture - The enhanced architecture will support an API that will lead to a more modular software design and simplify application integration. The objective is to limit further enhancements to the IGC (interactive graphic capability) and make basic IGC functionality available through a well-defined API. New functionality, such as drawing and annotation, would be implemented as an application that would communicate through this interface to perform basic drawing functions. This interface would act similarly to the current D2D extension framework, except that the application would not have to be written in C++ and could be integrated/compiled separately from the display code.

 

       d. Development tasks/(resources) - The API needs to be defined with the help of the applications developer. Since this interface could be extensive to meet the many needs of applications such as GFE, NMAP, and others, only a subset will be implemented for the initial prototype. The focus will be on providing stronger support for multi-color graphics and implementing a "basic" drawing and annotation capability. Future developments will provide a more complete drawing implementation, similar to that provided by NMAP.

 

 Software enhancements are also planned to allow exchange of D2D graphics with GFE and provide an XML interface to outside users for manually generated products.

         

API Definition           4 pm
Basic API development   12 pm
Basic drawing tool       6 pm
GFE Graphic exchange     4 pm
XML Interface            6 pm
                        -----
                        32 pm

 

       e. Deliverables - FSL will deliver the following:

       f. Schedule/Milestone - The first major milestone includes completion of all the tasks identified above:

       g. Issues - The API requirements must be well understood in order to provide a usable interface. The lack of information about future applications will make it difficult to ensure all necessary interfaces are defined.
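
The separation described in item c, where a drawing application talks to the display only through a defined interface, might look roughly like the Python sketch below. The API itself has not yet been defined, so every name here (`DrawingService`, `RecordingDisplay`, `annotate_front`, and the two drawing calls) is a hypothetical placeholder rather than a proposed design.

```python
from abc import ABC, abstractmethod


class DrawingService(ABC):
    """Hypothetical subset of the display API exposed to applications."""

    @abstractmethod
    def draw_polyline(self, points, color):
        ...

    @abstractmethod
    def draw_text(self, position, text, color):
        ...


class RecordingDisplay(DrawingService):
    """Stand-in for the display: records requests instead of rendering them."""

    def __init__(self):
        self.requests = []

    def draw_polyline(self, points, color):
        self.requests.append(("polyline", tuple(points), color))

    def draw_text(self, position, text, color):
        self.requests.append(("text", position, text, color))


def annotate_front(display, points, label):
    """Example application: draws a line and labels it at its first point."""
    display.draw_polyline(points, color="blue")
    display.draw_text(points[0], label, color="white")


display = RecordingDisplay()
annotate_front(display, [(0, 0), (1, 1)], "Cold front")
```

Because `annotate_front` depends only on the abstract interface, it could be developed and built separately from the display code, which is the property the plan is after.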

 

B. Data Management Improvements

 

      a. Objectives – Develop a new data management paradigm for AWIPS that accommodates anticipated increases in data volume more easily than the current approach. The new approach consists of a combination of “push” and “pull” technologies, i.e. some data will continue to be broadcast, while other data will be requested from remote servers.

 

·        Demonstrate NAS and its benefits.

·        Demonstrate a new distributed data paradigm for a few new data sets. 

·        Improve purger functionality and performance

·        Modify data inventory and notification services in response to new paradigm and to improve performance

·        Demonstrate Kick-start server to load new software and O/S

·        System fail-over (processor failure)

·        Migrate server processes to Linux (to the extent possible)

 

 

       b. Hardware Architecture - The NAS device is a key element of the new data management architecture. Raw data will be acquired by the Linux CP or LDAD processor and temporarily stored on the preprocessor's local disks. Once processed, the data will be converted to a standard format and stored on the NAS. The NAS device will be NFS mounted by the workstations and other Linux machines.

 

       c. Software architecture - The new software architecture will eliminate the concept of a single data location (i.e., FXA_DATA) for all meteorological data. An inventory server that contains information on the location and availability of data (i.e., a catalog) will be implemented to point to the desired data. Using the UNIDATA DODS (or OPeNDAP) concept, data can be retrieved from local or remote disks without changing the application. This allows a model grid on a remote data server to be displayed as though it were stored locally, transparently to the user. The notification server will be rewritten to accommodate this new paradigm. It is envisioned that "green times" may only be computed when the user opens a menu or for data that is currently displayed in the various panes.

 

      d. Development tasks/(resources) - The NAS must be evaluated for performance and flexibility. A catalog and inventory server must be developed to store information about the location and availability of the data. This server then needs to be integrated with a system that can access remote data and convert it to a standard format (e.g., NetCDF) that can be used by the application. The current notification server needs to be rewritten.

 

·        NAS evaluation                        4 pm

·        Catalog development                   8 pm

·        Remote data access                    8 pm

·        Notification server                   6 pm

·        Sample (remote) data                  6 pm

·        Postgres text DB                      6 pm

                                              -----

                                              38 pm

      e. Deliverables :

 

·        Prototype of a new system management approach for AWIPS.

 

      f. Schedule/Milestone :

 

·        NAS evaluation                        6 mos

·        New data management                   1 year

 

       g. Issues - The national infrastructure, such as a very high speed WAN, regional databases, and MHS replacement, will not be in place by the time the prototype is developed. This will make it difficult to properly evaluate the new data management paradigm in an operational setting.
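
Item c above envisions computing "green times" only when a menu is opened. A minimal Python sketch of that lazy evaluation follows. It assumes that a green time is the most recent valid time the inventory holds for a product; the `Inventory` and `Menu` classes and the product names are hypothetical and do not reflect the existing AWIPS code.

```python
from datetime import datetime


class Inventory:
    """Hypothetical inventory: maps a product to the valid times available."""

    def __init__(self):
        self._times = {}
        self.lookups = 0  # counts how often the inventory is actually queried

    def add(self, product, valid_time):
        self._times.setdefault(product, []).append(valid_time)

    def latest(self, product):
        self.lookups += 1
        times = self._times.get(product)
        return max(times) if times else None


class Menu:
    """Computes latest-data times only when the menu is opened."""

    def __init__(self, inventory, products):
        self._inventory = inventory
        self._products = list(products)

    def open(self):
        # One inventory lookup per menu entry, performed on demand.
        return {p: self._inventory.latest(p) for p in self._products}


inventory = Inventory()
inventory.add("model_grid", datetime(2004, 6, 8, 0))
inventory.add("model_grid", datetime(2004, 6, 8, 12))
inventory.add("satellite_ir", datetime(2004, 6, 8, 11, 45))

menu = Menu(inventory, ["model_grid", "satellite_ir", "lightning"])
lookups_before_open = inventory.lookups  # arrivals alone trigger no lookups
green_times = menu.open()
```

In this sketch, data arrivals only update the inventory; no per-menu work is done until a user asks for it.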

 

 

C. Network Improvements

 

      a. Objectives – Implement a network that can accommodate growth in data acquisition and processing.

·        Demonstrate state-of-the-art LAN network

·        Implement a network to enable distributed data prototype

 

       b. Hardware Architecture - The current FDDI network will be eliminated. All systems (including a "transition" HP) will be connected to the Ethernet switch.         

 

       c. Software architecture - The new Linux architecture requires use of fail-over software (including a heartbeat LAN) to switch processes from one machine to another.

 

        d. Development tasks/(resources) - The network improvements are hardware changes that are mostly transparent to the application software. The activities to be performed are primarily evaluation of fail-over software and system upgrade using a kick-start server. Providing a second database on FSL’s LAN will permit limited evaluation of the distributed data paradigm.

 

·        Network configuration and evaluation                      4 pm

 

        e. Deliverables:

 

·        System test results (summary)

 

        f. Schedule/Milestone:

 

·        Evaluation complete                        6 mos

 

        g. Issues - The implications of the proposed new DVBS link on the AWIPS architecture are not well understood. Will there be significant changes to the D2D interface?
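
The heartbeat-based fail-over mentioned in item c can be illustrated with a short Python sketch. It is not a description of any particular fail-over product: the `FailoverMonitor` class, the host and process names, and the timeout value are assumptions chosen for illustration, and time is passed in explicitly so the logic can be exercised without a real network.

```python
class FailoverMonitor:
    """Moves processes to a backup host when a heartbeat is overdue."""

    def __init__(self, timeout_s, assignments, backups):
        self.timeout_s = timeout_s
        self.assignments = dict(assignments)  # process -> host running it
        self.backups = dict(backups)          # host -> designated backup host
        self.last_seen = {}                   # host -> time of last heartbeat

    def heartbeat(self, host, now):
        self.last_seen[host] = now

    def check(self, now):
        """Reassign processes whose host has been silent longer than the timeout."""
        moved = []
        for process, host in sorted(self.assignments.items()):
            last = self.last_seen.get(host)
            if last is None or now - last > self.timeout_s:
                backup = self.backups.get(host)
                if backup is not None:
                    self.assignments[process] = backup
                    moved.append((process, host, backup))
        return moved


monitor = FailoverMonitor(
    timeout_s=10,
    assignments={"decoder": "host-a", "notifier": "host-b"},
    backups={"host-a": "host-b", "host-b": "host-a"},
)
monitor.heartbeat("host-a", now=0)
monitor.heartbeat("host-b", now=0)
moved_early = monitor.check(now=5)

monitor.heartbeat("host-b", now=12)  # host-a has gone silent
moved_late = monitor.check(now=15)
```

The sketch does not verify that the backup host is itself alive, and a host that has never reported is treated as failed; an evaluation of real fail-over software would need to cover such cases.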

 

 

D. Workstation Augmentation

 

         a. Objectives - Demonstrate performance improvements using the latest available hardware. Also, investigate the use of OpenGL for 3-D data display.

 

         b. Hardware Architecture - The workstation video cards require implementation of the OpenGL instruction set to achieve maximum gain in performance.

 

         c. Software architecture - No change to the system architecture is envisioned for the initial investigation.

 

         d. Development tasks/(resources) - The current software architecture and code will be examined to determine if caching and redesigning portions of code could result in significant performance improvements. The use of OpenGL will be tested for 3-D visualization (e.g. WDSS II).

 

·        Code analysis and modification       4 pm

 

      e. Deliverables:

 

·        Modified code, if appropriate.

 

      f. Schedule/Milestone:

 

·        Modifications that improve system performance will be delivered with the prototype system    1 year

 

       g. Issues:

 

·        None

           

 

 

E. Resource Summary

 

            Staff resources (Phase 1)

                Workstation Transformation                       32 pm

                 Data Management Improvements              38 pm

                 Network Improvements                              4 pm

                 Workstation Augmentation                         4 pm

                                                                                   --------

                                                                                    78 pm

            Staff resources (Phase 2)

                Workstation Transformation                      25 pm

                 Data Management Improvements              20 pm

                 Workstation Augmentation                         5 pm

                                                                                    --------

                                                                                     50 pm

 

               Major milestones (months after start of project)

                   Preliminary design of proposed API          3

                   Implement flat file for D2D WarnGen       3

                   NAS evaluation                                          6

                   All Linux System Integration                   12

                   Sample App. using drawing Capability    12

                   New data management Infrastructure       12

                   API to load/store D2D display data          18

                   Remote OPeNDAP access (GFE grids)    24

              

 

IV. Issues

 

The tasks described in this plan assume that the basic ports of the AWIPS code (e.g., IGC, decoders, notification servers, and monitors) are performed under the NGIT contract. This plan addresses longer-term issues, such as software transformation, improvements in data management, evolution of the D2D code, and additional hardware components.

 

This plan assumes that the initial prototype (Phase 1) will require additional development (Phase 2) to complete the software transformation and implement the distributed data paradigm at a wider level.

 

V. Appendix

 

The appendix provides a list of near-term AWIPS problems addressed by the prototype, and also depicts the planned one-step transition from the current AWIPS system to the final all-Linux system (presented by NGIT).

 

 

AWIPS OB3 Critical/Major Problems
 
 
Problem:   SSH performance problems on HP

The use of a prototype testbed allows problems such as these to be discovered well ahead of field deployment. Also, the Linux-based system shows large performance gains over the legacy HP equipment.

 
Problem:   ISC traffic using message handling system

GFE inter-site coordination (ISC) in support of the national forecast database continues to consume resources (network, data, CPU). Some sites have moved their forecasts to 2.5 km resolution, stressing the AWIPS system to limits not seen before. Problems storing files larger than 2 GB on the current architecture have appeared, as have problems with the quantity of data sent between sites. The use of the NAS, web-based data pull technology for the distributed database, and an increase in network bandwidth will all help meet future needs in this area.
 
Problem:   Various radar display problems

The new radar volume coverage patterns, increased product frequency, and open RPG have all stressed the ingest and display of radar products on AWIPS. Problems reading data between the HP NFS-mounted disks and the new Linux PC workstations are mounting. Event services are being taxed by almost an order of magnitude more requests, and internal data structures supporting the inter-process communication services are being stretched too far. The number of products for rendering has doubled (6600 in 5.x, 14000 in OB3/OB4), and users are instantiating several D2D sessions on one PC. Forecasters are seeing data dropouts, problems with failovers, and auto-update problems with radar displays during severe weather.

 

The new prototype will solve this problem with intelligent registration for products, higher-speed access to products via network and new storage devices, and redesign of the event services software.
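
One way to read "intelligent registration for products" is that a display receives arrival notices only for products it has registered for. The Python sketch below illustrates that reading. It is an assumption about the intended design, and the `EventService` class and product names are hypothetical.

```python
from collections import defaultdict


class EventService:
    """Delivers arrival notices only to callbacks registered for a product."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def register(self, product, callback):
        self._subscribers[product].append(callback)

    def notify(self, product, payload):
        """Notify registered callbacks and return how many were called."""
        callbacks = self._subscribers.get(product, [])
        for callback in callbacks:
            callback(product, payload)
        return len(callbacks)


received = []
service = EventService()
service.register(
    "radar_reflectivity",
    lambda product, payload: received.append((product, payload)),
)

delivered = service.notify("radar_reflectivity", "scan 1")
ignored = service.notify("radar_velocity", "scan 1")  # nobody registered
```

Products that no display has registered for generate no traffic, which is how registration could reduce the load on event services.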

 
Problem:  WarnGen problems with new follow-up capabilities

The new warning software, which assists the forecaster in creating text-based products for public dissemination, is rapidly changing. The current software depends heavily on a legacy relational database on the HP hardware for follow-up warnings. Delays have been seen when 'popping up' text messages from the interactive warning generator on D2D.

 

The new Linux prototype will remove the RDB from the D2D display as a performance enhancement. This will be the first use of flat-file-based text retrieval, which should result in enhanced performance for warning dissemination.
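
As a rough illustration of flat-file text retrieval, the Python sketch below stores each text product as a file and returns the most recent one by name. The directory layout, the `store_text` and `latest_text` functions, and the product identifiers are assumptions made for this example and are not the planned implementation.

```python
import tempfile
from pathlib import Path


def store_text(root, product_id, issue_time, text):
    """Write one text product to <root>/<product_id>/<issue_time>.txt."""
    directory = Path(root) / product_id
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{issue_time}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def latest_text(root, product_id):
    """Return the most recently issued text for a product, or None."""
    directory = Path(root) / product_id
    if not directory.is_dir():
        return None
    # Fixed-width timestamps sort chronologically when sorted by name.
    files = sorted(directory.glob("*.txt"))
    return files[-1].read_text(encoding="utf-8") if files else None


with tempfile.TemporaryDirectory() as root:
    store_text(root, "TOR", "20040608T1200Z", "first warning")
    store_text(root, "TOR", "20040608T1215Z", "follow-up statement")
    newest = latest_text(root, "TOR")
    missing = latest_text(root, "SVR")
```

Retrieving the latest product requires only a directory listing and one file read, with no database connection on the display path.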

 
 

AWIPS One Step Transition to Linux

 

Hardware Specs



Figure A1.  FSL's Prototype Linux Test Bed

 

 

 

Planned Prototype System Configuration

AWIPS Hardware


Figure A2. Proposed AWIPS Linux Configuration


Prepared by Herb Grote;
last update 8 Jun 04 by J. Wakefield.