The D2D component of the WFO-Advanced system is a "modern computing application" in the foregoing regard. Our software reads data from remote sources (the data ingest processes), distributes tasks between clients and servers (the communications router, the notification server, or the IGC_Process, for example), and presents results to users on workstation and X terminal displays.
Interprocess communication (IPC) of this sort is such a fundamental part of the WFO-Advanced D2D software that we have developed an object-oriented library to make the sending and receiving of data relatively simple. In fact, 60% of the applications in the D2D software link with our IPC library.
The IPC library developed for the D2D has three major components:
Recognizing that various IPC transport mechanisms may exist simultaneously, and that some are better suited for certain communications tasks than others, the D2D IPC library includes support for multiple, concurrent IPC mechanisms. The library hides the underlying IPC transport from an application programmer and automatically chooses the best one for each message to be sent.
Currently, the software's decision of which mechanism to use is easy: our IPC library has never supported more than one transport mechanism at a time. Since the birth of D2D, we have used three different transport mechanisms. Each successive mechanism has satisfied the requirements better than its predecessor, so we have not found a need to support two or more mechanisms concurrently. This may change as more reliable and higher-performing mechanisms emerge.
The rest of this chapter:
This requirement is crucial. In fact, the D2D data ingest system relies on this feature -- 14 of the 24 processes that comprise the data ingest system will not function without it.
Currently, the vast majority of our IPC messages are specified to be at normal priority. In fact, the only type of message that is currently of a high priority is for relaying radar alert data.
For example, a text depictable in an IGC_Process may need data from a TextDB process running on some remote host. The IGC sends an IPC message to the TextDB requesting the data. The IGC does not want to do any more processing until the reply from the TextDB arrives, so it chooses to receive messages only from the TextDB process.
In a UNIX environment, most event multiplexors are implemented with a select() call which uses file descriptors to identify the devices to multiplex. Therefore, the IPC transport mechanism must provide file descriptors for the devices that receive IPC messages.
The alternative to using an event multiplexor is to poll each of the input sources separately in a loop. This wastes CPU time. Considering that a single D2D graphics workstation runs 24 processes (or more as users start extensions and applications), each of which would have to poll for input, the waste would be astronomical.
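As a minimal sketch of the difference, the following uses Python's standard select module as a stand-in for the UNIX select() call. The function name wait_for_input and the two pipes are illustrative assumptions, not part of the D2D library. The process sleeps inside select() until a descriptor is readable or the time-out expires, so no CPU time is spent polling.

```python
import os
import select


def wait_for_input(descriptors, timeout_seconds=None):
    """Sleep in select() until at least one descriptor is readable.

    Returns the list of readable descriptors, or an empty list on time-out.
    A time-out of None blocks indefinitely.
    """
    readable, _, _ = select.select(descriptors, [], [], timeout_seconds)
    return readable


# Two pipes stand in for two independent input sources (say, IPC and X).
ipc_read, ipc_write = os.pipe()
x_read, x_write = os.pipe()

# Nothing has been written yet, so the call times out without busy-waiting.
idle = wait_for_input([ipc_read, x_read], timeout_seconds=0.05)

# Data arrives on one source only; select() reports just that descriptor.
os.write(ipc_write, b"msg")
ready = wait_for_input([ipc_read, x_read], timeout_seconds=1.0)
```

A polling loop would have to wake up repeatedly and test every source in turn; here the kernel wakes the process only when there is something to do.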
The number of bytes for most IPC messages passed between D2D processes is rather small. However, some extensions such as warnGen and the alertAreaEditor can potentially pass large messages (> 64k) to the IGC process.
If one or more auxiliary processes are used to implement the IPC mechanism, then these processes should be able to restart without having to restart the client processes. In other words, these processes should be able to save and restore their state. The IPC mechanism should be able to detect an unresponsive IPC server process, and restart that process if necessary.
As far as being maintainable, the mechanism should not be a black box. It should either have a well-documented theory of operation or provide source code to its implementation.
The mechanism should require little system administrator assistance. It should use familiar forms of addressing.
Users expect that green times displayed on the user interface product menus or the display of a product on an IGC will be updated as soon as new data for that product arrives. The expediency of auto-notification depends heavily on IPC. The decoders notify the notification server, which in turn notifies the IGCs, the UI, and the volume browser. The notification server sends and receives thousands of messages per hour; some of these messages can be quite large.
Remember, the UI and the display of data is divided into separate processes, although the user may not be aware of this. If the user selects a product for display, or a pane to swap with the large pane, he/she expects the results almost immediately. An average user response time of more than two seconds would register serious complaints among our users. Since the IGC may spend a good portion of the two seconds reading and displaying data, rapidly sending the message from the UI to the IGC and back to the UI is crucial.
Like the UI and IGC, extension interactivity is partitioned between the IGC and an extension process. One of the most time-critical tasks for a meteorologist is the generation of warnings, which is managed by the warnGen extension. Potentially very large IPC messages are sent between the IGC and warnGen processes. If the messages take too long to send and receive, the user will be bogged down with sluggish responsiveness.
While it would be nice to state this requirement as "throughput of at least 30 kilobytes per second," or some other figure, this requirement depends on the user's perceptions of system responsiveness, and the speed of non-IPC processing.
Granted, approaching UNIX IPC can be a bit difficult because of all the different mechanisms it provides. Since UNIX had a colorful evolution, the IPC features it provides are equally colorful. There are a number of IPC mechanisms directly provided by the UNIX operating system, each with its own features and blemishes. There are also a number of packages, commercial and free, that layer themselves atop the UNIX IPC mechanisms. These "middleware" packages make using UNIX IPC easier or provide additional features not directly provided by UNIX itself. Since the beginning of our development, we have used three distinct mechanism implementations: one utilizes a middleware package, while the other two have been built on top of native UNIX mechanisms.
Figure 5a.1 DMQ Transport Design
To receive a message, a client process makes a call to the DMQ API, passing a time-out value, in deciseconds. The routine will not return until a message has been received or the time-out has been exceeded. Optionally, the client can pass an address, which tells the routine to return a message only from a process with that address. The DMQ routine makes a request via the bus to a queuing process for a message. If there is a message on the queue for the receiving process, it is sent back via the bus.
DMQ supports queueing processes with multiple queues, which makes it possible to separately receive high- and normal-priority messages, satisfying another important IPC requirement.
Receiving specific messages distinguished by address is another strong feature of the DMQ transport mechanism. The client process can ask the DMQ queuing process for a message with a particular address, leaving all the other messages still queued.
In Figure 5a.1, main components of the DMQ architecture are depicted as black boxes since DEC has supplied minimal information about the internals of these pieces. For example, the DMQ bus might be a TCP socket, a UNIX Domain socket, or something entirely different. Likewise, all we know about the queueing processes is that there are at least three running simultaneously, and sometimes up to six processes. Little is known about what these processes do, or the internal details of the queues; this makes this transport mechanism extremely difficult for D2D developers to maintain.
Another major weakness is that the DMQ transport is definitely not event multiplexor friendly. A UNIX select() call cannot be used to determine when data has arrived on the DMQ bus, since DEC does not supply a file descriptor for the bus. This has tremendous performance ramifications on the D2D display processes, which receive events from multiple sources: IPC, stdin and stdout (application interface), X, timer events, etc. These display processes are forced to resort to polling, which consumes a lot of unnecessary CPU time.
Unfortunately, the DMQ bus limits messages to 32 kilobytes. The software we wrote around DMQ does not automatically fragment and reassemble messages, so the responsibility falls to the users of the IPC library. This is not desirable since IPC clients have enough to worry about.
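To illustrate what this burden looks like for a client, here is a minimal sketch of fragmenting and reassembling a message under a 32-kilobyte limit. The header layout (message ID, fragment index, fragment count) and the function names are assumptions made for this example; they do not describe the DMQ API or our library.

```python
import struct

# Hypothetical fragment header: message id, fragment index, fragment count.
HEADER = struct.Struct("!III")
MAX_FRAGMENT = 32 * 1024  # the transport limit described in the text


def fragment(message_id, payload, max_fragment=MAX_FRAGMENT):
    """Split payload into pieces that each fit within max_fragment bytes."""
    chunk = max_fragment - HEADER.size
    # Ceiling division; an empty payload still produces one fragment.
    count = max(1, -(-len(payload) // chunk))
    return [
        HEADER.pack(message_id, index, count)
        + payload[index * chunk:(index + 1) * chunk]
        for index in range(count)
    ]


def reassemble(fragments):
    """Rebuild one message from its fragments, in whatever order they arrive."""
    parts = {}
    expected = None
    for piece in fragments:
        _message_id, index, count = HEADER.unpack_from(piece)
        expected = count
        parts[index] = piece[HEADER.size:]
    if expected is None or len(parts) != expected:
        raise ValueError("missing fragments")
    return b"".join(parts[index] for index in range(expected))
```

For example, a 100-kilobyte payload is split into four fragments, and reassemble() restores it even if the fragments are handed over in reverse order.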
Besides requiring that system administrators install and maintain the product, DMQ layers an additional level of addressing on top of that already provided by every UNIX workstation. This makes debugging of IPC problems more difficult because of the additional lookup or memorization needed to map from DMQ host number to host name or Internet Protocol (IP) address. Users who initiate processes using IPC are also burdened with setting an environment variable to the DMQ host number for the host of the process.
Using separate processes to queue messages allows this transport mechanism to satisfy our important requirement of asynchronous sends. However, this strategy adversely affects performance and reliability. The dynamic memory for the binary byte stream has to be allocated in the sending process, receiving process, and the intermediate queuing process. The extra context switch to the queuing process increases transmission time, especially for large messages. For reliability reasons, a queuing process is not optimal since it can crash. Even worse, DMQ queueing processes do not save their state upon a crash, which means that all client processes will need to be restarted. This is problematic since the workstation takes considerable time to initialize.
Finally, this middleware product costs money. But UNIX provides several IPC mechanisms, and we already pay for UNIX. Why shell out more money?
UNIX Domain sockets, also known as local sockets, enable processes on a single UNIX system to communicate. These kinds of sockets are identified by creating nodes in the file system, usually under the /tmp directory. The X Window System uses UNIX Domain sockets: programs running on the same host as the X server can connect to the X server on display 0 by using the socket /var/spool/sockets/X11/0. This kind of socket features an easy addressing scheme (nodes in the filesystem) but can transfer data between processes on a single system only.
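The filesystem-based addressing can be seen in a short sketch using Python's standard socket module. The function name local_round_trip and the socket file name are illustrative assumptions. Both ends run in one process here only to keep the example short; normally they would be separate processes on the same host.

```python
import os
import socket
import tempfile


def local_round_trip(message):
    """Send message over a UNIX Domain socket and return the upper-cased echo."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "ipc_demo.sock")  # hypothetical node name
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        listener.bind(path)        # creates the node in the file system
        listener.listen(1)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)       # the path is the address
        server_side, _ = listener.accept()
        with client, server_side:
            client.sendall(message)
            received = server_side.recv(len(message))
            server_side.sendall(received.upper())
            return client.recv(len(message))
    finally:
        listener.close()
        if os.path.exists(path):
            os.unlink(path)        # the node outlives the socket unless removed
        os.rmdir(directory)
```

The sketch assumes a message small enough to fit in the socket buffer, so the single process never blocks on its own write.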
Sun's Remote Procedure Call system, or RPC, is an interface layered above sockets and is free with HP-UX. The client sends a message by making what appears to be a procedure call. The parameters passed into the procedure are the message data. The return value of the procedure is an optional reply message from the destination process.
In order to implement asynchronous sends (fire and forget), an IPC server daemon runs on every host where our client processes need IPC. The server maintains a queue for every running client process on the server's host, which is indexed by the process address (host IP address and process ID). Each queue maintains a local socket that is connected to a client process. RPC calls are used for communication from a client process to the server. Local sockets are used for sending queued messages from the server to the client. This design is very similar to how DMQ supports asynchronous sends except for some notable improvements:
Here is the basic algorithm for this transport mechanism. Figure 5a.2 illustrates this algorithm.
Local sockets and RPC are friendly to event multiplexors since a file (socket) descriptor can easily be obtained. And because this mechanism is event multiplexor friendly, a display process can truly sleep while waiting for an IPC, X, or timer event to arrive. No CPU-intensive polling is necessary.
Both UNIX domain sockets and RPC support a message size that is limited only by the amount of virtual memory to which a process has access. Although this mechanism has code to fragment and reassemble messages, it is rarely used.
The addressing for this mechanism is fairly universal. Every UNIX process has a process ID, and every UNIX host has an IP (Internet Protocol) address. This is a big improvement over DMQ.
The IPC server has separate queues for IPC messages with different priorities, satisfying an important IPC requirement.
The requirement of synchronous sends is not completely satisfied with this mechanism. When the sending process returns from its synchronous send, it knows that the message has been received at the server process running on the host of the target process. However, the message has not arrived at the destination process yet; the destination process might be in the middle of some intensive processing and cannot read the local socket immediately.
Although not as mysterious as DMQ, RPC is somewhat of a black box. Unfortunately, it is difficult to tell how many internal sockets are used for its transport. That information is pertinent to the server daemon since UNIX has a limit on the number of open sockets and files a process can have. In addition, HP's implementation of RPC was flawed in HP-UX 10.10: occasionally, strange values were returned from RPC calls. The problem seems to have been fixed in HP-UX 10.20.
Using separate processes to queue messages allows this transport mechanism to satisfy our important requirement of asynchronous sends. However, this strategy adversely affects performance and reliability. The dynamic memory for the binary byte stream has to be allocated in the sending process, receiving process, and the intermediate queuing process. The extra context switch to the queuing process increases transmission time, especially for large messages. For reliability reasons, a queuing process is not optimal since it can crash. Fortunately, unlike DMQ, our server can restore its state after a crash, eliminating the need for clients to be restarted.
This design is fairly complex, and so is the resulting code, which hurts maintainability. The code to generate RPC calls is particularly complex. Initially, we planned to use the utility rpcgen to generate the RPC code. Unfortunately, it produced code of dubious quality.
The message selection requirement is not completely satisfied with this mechanism. The local socket connected to the server may contain many messages from different target processes. There is no way to read from the middle of a socket and leave the rest of the socket data intact. Thus, unwanted messages have to be read before the desired message can be read. This implementation defers the dispatching and processing of the unwanted messages until the desired message arrives. This could be time-consuming and memory-intensive if the desired message takes a long time to arrive and a lot of unwanted messages arrive in the meantime. DMQ and Thread Based Socket IPC provide better solutions for satisfying selective reception of messages.
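The deferral strategy can be sketched as follows. This is an illustration under assumptions, not the library's code: the class name SelectiveReceiver, its methods, and the (sender, body) message shape are hypothetical, and a plain callable stands in for reading the next message from the local socket.

```python
from collections import deque


class SelectiveReceiver:
    """Read messages in arrival order, setting aside those not yet wanted."""

    def __init__(self, read_next):
        # read_next() returns the next (sender, body) pair from the socket.
        self._read_next = read_next
        self._deferred = deque()

    def receive_from(self, wanted_sender):
        """Return the next body from wanted_sender, deferring everything else."""
        # A message set aside during an earlier call may already match.
        for position, (sender, body) in enumerate(self._deferred):
            if sender == wanted_sender:
                del self._deferred[position]
                return body
        while True:
            sender, body = self._read_next()
            if sender == wanted_sender:
                return body
            self._deferred.append((sender, body))  # grows until a match arrives

    def receive_any(self):
        """Return the oldest deferred message, or read a new one if none."""
        if self._deferred:
            return self._deferred.popleft()
        return self._read_next()

    def pending(self):
        """Number of messages currently held back in memory."""
        return len(self._deferred)
```

The deferred queue is where the memory cost described above shows up: every unwanted message that arrives before the desired one is held until the caller asks for it.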
Instead of a queuing process, we opted for a multiple thread environment in order to support asynchronous sends. Maurice Bach in The Design of the UNIX Operating System defines a thread as the following:
Until now, most if not all applications developed here at FSL have used single-threaded processes, which have a single flow of control through program code. Processes linking with our IPC library will potentially be multi-threaded with multiple flows of control. A multi-threaded process can achieve significant performance gains through the use of concurrent thread execution. This means that two or more threads are in progress at the same time. For some HP hosts with multiple processors, such as the K series, two or more threads can be executed simultaneously.
This mechanism uses multiple threads of execution to implement asynchronous (fire and forget) sends. Here is a possible scenario without using multiple threads. Suppose the notification server process is trying to send a message to an IGC process, but the IGC is really busy doing some other processing, perhaps constructing some radar tables. The UNIX buffer between the two sockets has become full, so when the notification server tries to write to the socket, the UNIX write() call blocks execution until the IGC reads some bytes from its end of the buffer. Since execution is blocked, the notification server cannot process other notifications, which will delay the notifications sent to other workstations whose IGCs are responsive.
Now, consider the above scenario in a multi-threaded process. When the notification server detects the socket to the IGC is full, it creates a new thread of execution whose job is to keep writing to the socket until the entire message is sent. This thread will be blocked inside its write() call. Meanwhile, the main thread can proceed with the business of the notification server. When the IGC removes some bytes from the socket buffer, the socket writing thread wakes up, and receives the CPU so it can write to its socket. If the notification server keeps trying to send messages to the busy IGC, more socket writing threads may be created. Thus, the process could have a main thread and many socket writing threads. Keep in mind that thread creation does carry some performance overhead, so we want to initiate threads only when the socket buffer is full.
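The scenario can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the D2D implementation: the class name ThreadedSender is hypothetical, the MSG_DONTWAIT flag (available on UNIX systems) is used to detect a full buffer, and no ordering is guaranteed among several waiting writer threads.

```python
import socket
import threading


class ThreadedSender:
    """Send inline when the socket buffer has room; otherwise use a thread."""

    def __init__(self, sock):
        self._sock = sock
        self._write_lock = threading.Lock()  # only one writer at a time
        self._threads = []

    def send(self, payload):
        """Return True if sent inline, False if a writer thread took over."""
        view = memoryview(payload)
        if self._write_lock.acquire(blocking=False):
            try:
                sent = self._send_without_blocking(view)
            except OSError:
                self._write_lock.release()
                raise
            if sent == len(view):
                self._write_lock.release()
                return True
            # Buffer is full: a thread finishes the message and inherits the lock.
            self._start_writer(view[sent:], already_locked=True)
        else:
            # A writer thread is still busy on this socket; wait behind it.
            self._start_writer(view, already_locked=False)
        return False

    def _send_without_blocking(self, view):
        sent = 0
        try:
            while sent < len(view):
                sent += self._sock.send(view[sent:], socket.MSG_DONTWAIT)
        except BlockingIOError:
            pass  # the socket buffer is full
        return sent

    def _start_writer(self, view, already_locked):
        def run():
            if not already_locked:
                self._write_lock.acquire()
            try:
                self._sock.sendall(view)  # blocks until the peer drains the buffer
            finally:
                self._write_lock.release()

        thread = threading.Thread(target=run, daemon=True)
        self._threads.append(thread)
        thread.start()

    def join(self):
        """Wait for all writer threads to finish."""
        for thread in self._threads:
            thread.join()
        self._threads.clear()
```

A small message goes out in the caller's thread with no thread creation at all. Only a message that does not fit in the socket buffer pays the cost of a writer thread, and the caller returns immediately in either case.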
The real strength of this design is the elimination of the queueing (server) process. The socket writing threads (threads are often referred to as lightweight processes) take the place of the queueing process and are used only when really needed. The result is a dramatic performance increase over our previous two transport mechanisms. See Section 5a.2.5.
The following diagram depicts the general data flow for this mechanism. Each client process maintains a socket for every other process with which it wants to communicate. The bidirectional TCP socket can be used for both sending and receiving data on the same host or remote hosts. Unfortunately, the number of files (and/or sockets) a process can have open is limited (most of our hosts are configured to have 60, although this is a tunable kernel parameter at the expense of increased memory use). In an effort to be considerate of our clients' needs, the maximum number of file descriptors that the IPC library consumes is a third of the system max. In order to accomplish that goal, each socket object will record the time it was last accessed for either writing or reading. If a request for a new connection will exceed our 20 (or so) socket limit, the least recently used socket will be destroyed.
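The least-recently-used policy can be sketched as below. The class name ConnectionCache and the open/close callables are hypothetical stand-ins for creating and destroying a socket; an ordered dictionary replaces explicit access timestamps, since moving an entry to the end on each access yields the same eviction order.

```python
from collections import OrderedDict


class ConnectionCache:
    """Keep at most `limit` connections, closing the least recently used."""

    def __init__(self, limit, open_connection, close_connection):
        self._limit = limit
        self._open = open_connection
        self._close = close_connection
        self._connections = OrderedDict()  # oldest access first

    def get(self, address):
        """Return the connection for address, opening one if necessary."""
        if address in self._connections:
            self._connections.move_to_end(address)  # record the access
            return self._connections[address]
        if len(self._connections) >= self._limit:
            _, oldest = self._connections.popitem(last=False)
            self._close(oldest)
        connection = self._open(address)
        self._connections[address] = connection
        return connection

    def addresses(self):
        """Addresses currently connected, least recently used first."""
        return list(self._connections)
```

With the figures given above, a host limit of 60 descriptors and a one-third share for the library give limit = 60 // 3 = 20.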
As mentioned already, not having a third party server process involved will increase performance and reliability.
The use of threads fully satisfies the asynchronous send requirement. And the real advantage of using threads is that they are used only when really needed; only when the destination process is not responsive. Performing a send without a thread constitutes a synchronous send, which satisfies another important requirement.
The default buffer size for a TCP socket is 32K (32,768) bytes, but can be configured to the maximum size of 256K (262,144) bytes. This mechanism uses the maximum buffer size, but that is a tunable parameter, specified in our config file (ipc.config). The IPC library does multiple socket writes and reads in order to send messages of unlimited size, satisfying another requirement of our IPC library.
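A minimal sketch of both ideas follows: requesting a larger socket buffer, and sending a message of arbitrary size with repeated writes and reads. The 8-byte length prefix and the function names are assumptions for illustration; the source does not describe the library's actual wire format.

```python
import socket
import struct

# Hypothetical framing: an 8-byte big-endian length precedes each message.
LENGTH_PREFIX = struct.Struct("!Q")


def configure_buffers(sock, size):
    """Ask the kernel for send and receive buffers of `size` bytes.

    The kernel may round or cap the value according to its own limits.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)


def send_message(sock, payload):
    """Write the length prefix and payload; sendall() loops over write calls."""
    sock.sendall(LENGTH_PREFIX.pack(len(payload)) + payload)


def _read_exactly(sock, count):
    """Keep reading until exactly `count` bytes have arrived."""
    data = bytearray()
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the socket mid-message")
        data += chunk
    return bytes(data)


def receive_message(sock):
    """Read one length-prefixed message, however many reads it takes."""
    (length,) = LENGTH_PREFIX.unpack(_read_exactly(sock, LENGTH_PREFIX.size))
    return _read_exactly(sock, length)
```

Because the receiver loops until the announced length has arrived, the message size is independent of the buffer size; a larger buffer only reduces how often the sender has to wait.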
Our design allows for client processes to have two sockets between them, one for normal and one for high priority messages, although the extra socket will be created only as needed. Currently, very few clients send or receive high priority messages, although that could certainly change. Our event multiplexor has been enhanced so that it will flag high priority sockets as readable before normal priority sockets.
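The priority ordering can be illustrated with a small dispatcher built on Python's select module. The class name PriorityDispatcher, the HIGH and NORMAL constants, and the callback signature are assumptions for this sketch and do not reproduce the library's event multiplexor interface.

```python
import select
import socket

HIGH, NORMAL = 0, 1  # lower value is notified first


class PriorityDispatcher:
    """Notify readable sockets, high priority before normal priority."""

    def __init__(self):
        self._clients = {}  # socket -> (priority, callback)

    def register(self, sock, priority, callback):
        self._clients[sock] = (priority, callback)

    def unregister(self, sock):
        self._clients.pop(sock, None)

    def dispatch_once(self, timeout=None):
        """Wait for readable sockets, call their callbacks, return the count."""
        readable, _, _ = select.select(list(self._clients), [], [], timeout)
        # select() reports readiness in no particular order; impose ours.
        readable.sort(key=lambda sock: self._clients[sock][0])
        for sock in readable:
            _priority, callback = self._clients[sock]
            callback(sock)
        return len(readable)
```

If a normal-priority and a high-priority socket both have data waiting when select() returns, the high-priority callback runs first regardless of which message arrived earlier.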
Since each target has its own socket, it is relatively easy to implement the requirement of selecting messages from a particular destination. The algorithm for doing this is explained in Section 5a.4.2.3.
Because we are using threads only when needed and eliminating an extra process context switch, the performance of this mechanism is quite impressive. In the words of Beavis and Butthead, "this system rocks, man!"
We know that doesn't sound promising for a critical library of an operational forecast system, but we are pretty optimistic about HP's future plans for operating systems that support the multi-threaded programming model. HP just announced plans to support kernel threads in HP-UX 10.30. Kernel threads are entities that are visible to the OS kernel, as opposed to user threads, which exist in user space and execute user code. Kernel threads promise to make signal handling in a multi-threaded environment more robust. We shall see about that. As we free our system from commercial off-the-shelf software (COTS), it will become more portable and will have more flexibility in pursuing superior thread libraries such as the one offered by Sun Microsystems.
Table 5a.1 IPC Requirement Comparisons
Requirement                       DMQ    Server Socket   Thread Socket
------------------------------------------------------------------
Asynchronous Sends                Yes    Yes             Yes
Synchronous Sends                 Yes    No+             Yes
Message Priority                  Yes    Yes             Yes
Message Selection                 Yes    No+             Yes
Friendly to Event Multiplexors    No     Yes             Yes
Unlimited Message Size            No     Yes             Yes
Reliable and Maintainable         No+    Yes-            Yes
Performs Well                     No+    Yes-            Yes
------------------------------------------------------------------
For the performance tests, we used an asynchronous, normal priority message containing the string, "This is a performance test". Each test sent the message 200 times; the time displayed in the following table is the average of those 200 times. We ran five tests: a local transmission and a remote transmission (the sender and receiver running on different hosts within the development network) for each of the three transport mechanisms, except for DMQ, for which we were not able to run the remote test because of address configuration problems. The same hosts were used for all the tests, which were run when the hosts were relatively idle.
Here are the results from those tests.
Table 5a.2 IPC Performance Comparisons
Test              Time it took in seconds to send a message
Description       and receive a reply from that message
-----------------------------------------------------------
DMQ (local)       0.0596364
DMQ (remote)      Not Available
Server (local)    0.0168488
Server (remote)   0.0160134
Thread (local)    0.0022547
Thread (remote)   0.0104642
-----------------------------------------------------------
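To put the figures in relative terms, the short calculation below divides the measured round-trip times from Table 5a.2. Only the table values come from our tests; the dictionary layout and the function name speedup are illustrative.

```python
# Average round-trip times in seconds, copied from Table 5a.2.
round_trip_seconds = {
    ("DMQ", "local"): 0.0596364,
    ("Server", "local"): 0.0168488,
    ("Server", "remote"): 0.0160134,
    ("Thread", "local"): 0.0022547,
    ("Thread", "remote"): 0.0104642,
}


def speedup(slower, faster):
    """How many times faster `faster` is than `slower`."""
    return round_trip_seconds[slower] / round_trip_seconds[faster]


thread_vs_dmq_local = speedup(("DMQ", "local"), ("Thread", "local"))
thread_vs_server_local = speedup(("Server", "local"), ("Thread", "local"))
thread_vs_server_remote = speedup(("Server", "remote"), ("Thread", "remote"))
```

For local transmissions, the thread-based mechanism comes out roughly 26 times faster than DMQ and about 7.5 times faster than the server-based mechanism. For remote transmissions, the advantage over the server-based mechanism is about 1.5 times.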
Suppose you are a software engineer working for the hippest 3D visualization animation software producer in the industry. And your company is desperately trying to escape a hostile take over, and you're in charge of researching financial trends in the industry. You decide to enlist the help of some hot shot financial analyst, and unfortunately you find my number, an analyst with a shady Wall Street company who has been involved in covert financing of the Irish Republican Army.
Anyway, your desktop phone has multiple line capability (up to 10); each line is accessible by a button which flashes when an incoming call arrives on the line. So with my number in hand, you select a line on your phone; wait for the dial tone; dial; and voila, you are talking to my receptionist, 1800 miles away. You ask the receptionist for my extension. I have a similar phone and one of the buttons is flashing. I push the button and start listening to your company's woes.
What does this have to do with our design? With a little bit of imagination, it's not hard to see at all. The companies for which you and I work are analogous to computer processes. Obviously, communication between other companies is part of our business, but our companies have specific missions and methods for accomplishing those missions. Certainly the notificationServer and the IGC_Process are processes needing communication with other processes, but both have specific reasons for being beyond IPC. You and I are analogous to objects in a process, each having a specific task in the context of a larger mission. The multi-line phone manages connections and thus would be considered a connection manager. Our design has a connection manager except we call it a SocketConnection object. Each button on the phone manages a communication endpoint. That certainly fits the definition of a socket. Between my button and your button is an electronic telephone line (or maybe fiber-optic, but neither of us really cares about the details). Between every two sockets is a two-way socket buffer; the actual implementation of this buffer can be many things, all transparent to us.
What about the receptionist? I'm glad you asked. No actual data is transferred between him/her and you. You're just requesting a connection between me and you. The receptionist uses the same phone we are using, so s/he has access to a button/socket also. Except this button is used only for accepting and forwarding connections. Every company usually has some sort of receptionist, albeit some are automated, but usually there is only one per company. Our design has a receptionist also, except we call ours an AcceptSocket object. Like the receptionist, there is only one per process, and its sole function is to accept and forward connections.
Back to me and you. As I talk, you listen and vice versa. This is a polite conversation, even though I have a Type A personality. The words I use are converted to electronic signals, sent over the phone line, and then converted back to words. Our IPC messages start out in data formats that the client object understands, but are converted to a binary stream that provides host/architecture independence. Once the message is received, it is converted back into those data formats for the benefit of the client objects that process the incoming message. Once you hear my message, I may look up some figures on my PowerBook, and then reply. Our communication is two way, and we are exchanging data back and forth. The receptionist's button is analogous to a socket, but it has a special purpose, so we gave it a special name, AcceptSocket. My button and your button are also analogous to sockets. In our design, the TCP sockets that serve as endpoints of two-way communication lines carrying actual data messages are encapsulated in what we call DataSocket objects.
A SocketConnection object can have multiple DataSocket objects, each connected to a different process. However, like our phones, there is a limit to how many active connections a process can simultaneously maintain. What if I needed to call you, and all my lines were being used? One possible solution is to hang up on the person whose line has been idle the longest. Not a solution that will give you style points or good karma, but at least it will free up the line for that important phone call to me. Our design does the same thing. When a process has reached its limit of outside connections (usually about 20), the SocketConnection object looks for the least recently used DataSocket object and destroys it, which is analogous to hanging up on someone. But in the digital world, it is not quite as rude.
Ok, we admit it. Here's where the analogy gets kind of creepy. Suppose that I try to call you. I get through to the receptionist, but since you are using the last line for some other conversation, I get put on hold. Perhaps, you are talking to your boss about a raise, or maybe to your significant other about some new way of (never mind...). Meanwhile, I have this important message to give you, and even though the music to the Hawaii Five-O TV show is very refreshing, I have some important work to do, and can't do it effectively being on hold. So what do I do? I clone myself. My clone waits for your line to become available, and then gives you my message and kills himself. Very tragic, I know, but at least I was able to get some work done while my clone waited for you to get off the line. This sounds very similar to the problem of a process sending a message to another process that has a full socket buffer because the destination process is busy talking to its significant other, the meteorologist. Our solution is similar to the cloning idea although it doesn't have as many moral ramifications. We cut a thread whose job is to wait for the destination process to read some of the socket buffer. The thread then writes the message and exits once the message is complete. Cloning can be expensive (new technology), so we want to clone only when absolutely necessary. Likewise, thread creation carries some overhead with it, so we want to cut a thread only when the socket buffer is full.
If you can indulge this metaphor a little longer, we have one more point to deliver. Suppose you are on the phone with your significant other, and your boss knocks on your door. BUSTED! He interrupts you with some important information about a tech review that you couldn't care less about. You have a great idea. You use a clone that will receive your boss's information. Then the clone calls you up, and waits on hold until you get off the phone. The clone then gives you that very important information, but he doesn't kill himself because he may be useful later on. We handle asynchronous signals in very much the same manner. Suppose the IGC is busy rendering a depictable and a SIGUSR1 signal comes in. We have a signal thread that is always running which receives the signal. The thread then writes the signal to a pipe managed by a SignalPipe object. A pipe is very similar to a socket, and in fact is identical to a socket as far as the EventDispatcher is concerned. The IGC reads the pipe and processes the signal once the depictable rendering is complete.
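The signal thread and pipe arrangement can be sketched with Python's signal module on a UNIX system. The class below borrows the name SignalPipe for readability, but its interface is an assumption made for this example. The signals are blocked in the creating thread (and so in every thread started afterwards), a dedicated thread collects them with sigwait(), and each signal number is written to a pipe that the main loop can hand to select() like any other descriptor.

```python
import os
import select
import signal
import threading


class SignalPipe:
    """Forward asynchronous signals to a pipe that select() can watch."""

    def __init__(self, signals):
        self._signals = set(signals)
        self.read_fd, self._write_fd = os.pipe()
        # Block the signals here so that only sigwait() ever receives them.
        # Threads started after this call inherit the same mask.
        signal.pthread_sigmask(signal.SIG_BLOCK, self._signals)
        self._thread = threading.Thread(target=self._forward, daemon=True)
        self._thread.start()

    def _forward(self):
        # The signal thread: always running, never doing any real work.
        while True:
            number = signal.sigwait(self._signals)
            os.write(self._write_fd, bytes([number]))

    def read_signal(self):
        """Return the number of the next signal written to the pipe."""
        return os.read(self.read_fd, 1)[0]


# Usage: the main thread finishes its current work, then notices the signal.
pipe = SignalPipe({signal.SIGUSR1})
os.kill(os.getpid(), signal.SIGUSR1)
ready, _, _ = select.select([pipe.read_fd], [], [], 5.0)
received = pipe.read_signal() if ready else None
```

Because the main thread sees the signal only as a readable descriptor, it is never interrupted in the middle of its work; it handles the signal when it next returns to its event loop.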
SocketConnection objects manage a group of sockets of a particular priority. These objects are derived from the Connection class. This is to support a future possibility that our IPC library will have another transport mechanism in addition to TCP sockets.
Currently we support only two levels of message priorities, normal and high. At static initialization, a process creates a SocketConnection object for normal messages. A second SocketConnection object is created only if the process sends or receives high priority messages.
DataSocket objects can be constructed in two ways, depending on whether the object is used to initiate a connection to another process or was created to receive a connection request initiated by another process. Both constructors perform some common initialization tasks, which are executed by the private member DataSocket::initialize() that both constructors invoke. These tasks include initializing the data structures needed for thread management, and adjusting the socket buffer size to the value specified in the configuration file, localization/nationalData/ipc.config.
The DataSocket object also manages the threads that may be used to send to the connecting process. If a DataSocket object is destroyed, running threads are also terminated. The object also maintains mutual exclusion locks (mutexes) to ensure that only one thread at a time will be writing to a socket.
Anonymous processes from the same executable will each have different IPC addresses since the port number will be generated by UNIX when the AcceptSocket object is created. This allows many instances of the same executable to be running simultaneously; each instance can be addressed independently.
The text string representation of an anonymous address has the following format: <host name>/<port number>/<process id>. The host name can be given either in fully qualified domain format (vulture.fsl.noaa.gov) or as a dotted IP address (127.0.0.1). The process ID is not used in the internal address representation and is not needed to connect to a socket. However, it is useful for logging and debugging purposes since it is much easier to identify a process by its process ID than by its port number.
Only one instance of an executable with a named address can be running on the network at one time. Named processes must run on the host specified in the system configuration file (ipc.config) and their AcceptSocket object will use the port number specified in the config entry for that process.
A config entry for a named process has the following format: <process name> <host name> <port number>. The process name has no restrictions but usually coincides with the executable name. The text string representation of an address for a named process is simply the process name specified in the config file. As with anonymous targets, host names can either be in fully qualified domain format or as a dotted IP address. The port number can be any positive number less than 32K, but all the named processes running on the same host must have unique port numbers.
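The entry format and port rules above could be checked with a sketch like the following. The names NamedProcessEntry, parseConfigEntry, and hasPortConflict are hypothetical, and the sketch assumes one whitespace-separated entry per line; the real ipc.config reader may accept comments or additional fields that are not handled here.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one named-process entry.
struct NamedProcessEntry {
    std::string name;  // usually the executable name
    std::string host;  // fully qualified name or dotted IP address
    int port;          // positive and less than 32K (32768)
};

// Parses "<process name> <host name> <port number>".
// Returns false for missing fields, trailing fields, or an out-of-range port.
bool parseConfigEntry(const std::string& line, NamedProcessEntry& out) {
    std::istringstream in(line);
    NamedProcessEntry entry;
    std::string extra;
    if (!(in >> entry.name >> entry.host >> entry.port)) return false;
    if (in >> extra) return false;
    if (entry.port <= 0 || entry.port >= 32768) return false;
    out = entry;
    return true;
}

// Returns true if two entries on the same host share a port number.
bool hasPortConflict(const std::vector<NamedProcessEntry>& entries) {
    std::set<std::pair<std::string, int> > seen;
    for (std::vector<NamedProcessEntry>::const_iterator it = entries.begin();
         it != entries.end(); ++it) {
        if (!seen.insert(std::make_pair(it->host, it->port)).second) return true;
    }
    return false;
}
```

Note that the conflict check compares host strings literally, so a host listed once by name and once by IP address would not be detected as the same machine.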
Both anonymous and named processes must be assigned an IPC address before any messages can be received or sent. A client of the IPC library can query the process address by calling the static method, Connection::myTarget().
How a process is assigned its address depends on whether it's an anonymous or a named process:
The EventDispatcher singleton object maintains three sets of client objects.
In order for the EventDispatcher to maintain these sets, client objects must register with the singleton during construction and cancel their registration during destruction.
A process can wait for events to arrive in four different ways.
The IPC library contains a global function, selectDescriptorEvents(), which is a wrapper around the select() call. It is passed a set of DescriptorEventClient objects and a pointer to a UNIX struct that represents time with a granularity of microseconds. A null pointer can be passed in, which indicates that the select() will never time out. This routine blocks its process until one of the devices is ready or the time-out value has been exceeded. It uses the objects to construct three sets of file descriptors indicating which devices are waiting for reading, writing, or an exception. It passes these sets and the pointer to the time struct to select(). If select() returns because one or more devices are ready, selectDescriptorEvents() determines which objects are ready by examining the descriptor sets set by select(). It then invokes the callback, DescriptorEventClient::handleEvent(), for each object that is ready. If two or more objects are ready at the same time, the objects with higher priority devices will be notified first. The selectDescriptorEvents() routine returns an enumerated value depending on the four possible results of calling select(), which are:
If the process wants to wait continually for IPC messages to arrive, this code fragment might be used in main() after all initialization is complete.
while (Connection::waitForMessage (IPC_Types::WAIT_FOREVER) ==
IPC_Types::IPC_SUCCESS);
A caveat of using this approach is that only IPC devices are monitored. Objects that manage non-IPC devices will not be notified when their devices are ready. Also, TimerEventClient objects will not be notified when their timer has expired. If the process has these kinds of objects, then the approach described in Section 5a.4.2.4 should be used.
Here is how Connection::waitForMessage() implements this approach.
Here's how we implemented this approach.
The EventDispatcher supports multiple interchangeable dispatch engines. Even though we currently have only one variety, we may have several in the near future, one for Tcl executables and one for non-Tcl/Tk executables. To specify which engine to use, a client passes one of these enumerated values into EventDispatcher::enterDispatchLoop(): EventDispatcher::GENERIC or EventDispatcher::TCL. If no engine type is specified, then the generic engine is assumed.
Note: Tcl/Tk interpreters built in our software tree do not use our event dispatcher at all but use the Tcl event notifier directly. Tcl/Tk interpreters are executables that are used to interpret Tcl scripts; Tcl/Tk executables are programs which have Tcl/Tk built into them, but otherwise are used for their own purposes instead of for script evaluation. These programs would use our EventDispatcher with the TclDispatchEngine, which is yet to be developed.
We will discuss only the generic dispatch engine since the Tcl engine has not been implemented yet. The implementation is very similar to what was described in Section 5a.4.2.2. The dispatch method will continue to loop until an internal flag inside the EventDispatcher singleton is set to false by calling EventDispatcher::exitDispatchLoop(). Inside the body of the loop, the method does the following:
For most of our interpreters, IPC is considered a module and is initialized by the global function, IPC_Init(). This function creates a Tcl command that a script can invoke, which simply invokes either Connection::myTarget() or Connection::setMyTarget(), which were explained in Section 5a.4.1.3. More importantly, all the IPC DescriptorEventClient objects that were registered with the EventDispatcher at static initialization are now registered with Tcl. This is done by calling Tcl_CreateFileHandler(), passing the UNIX file descriptor, a mask describing the type of I/O to wait for, a callback, and a pointer to the DescriptorEventClient object. Unregistering with Tcl is done with Tcl_DeleteFileHandler(). When Tcl invokes the callback, it passes in the pointer to the DescriptorEventClient object. Thus, the callback can call the callback method of the object, DescriptorEventClient::handleEvent().
After IPC_Init() is executed, socket and pipe objects will register with Tcl when they also register with the EventDispatcher singleton. Likewise, when these objects are destructed, they cancel their registration with both the EventDispatcher and Tcl.
DataTime aDataTime;
TextString aString;
double aNumber;
ArgPkg3<DataTime, TextString, double> args (aDataTime, aString, aNumber);
An important caveat is that each data type must have an associated serialize(), serialLength(), and quantize() routine.
To convert from data to binary, the following set of functions is used:
byte* serialize(const <some data type>& arg, byte* addr)
The data to write (arg), and the binary stream to write to (addr) are passed in. The routines return the pointer to the byte on the stream after the data just written.
To determine how many bytes a piece of data occupies on the byte stream, the following functions are used:
long serialLength(const <some data type>& arg)
The object or variable is passed in (arg) and the number of bytes it will take to encode onto a binary stream is returned.
To convert from binary to data, the following set of functions is used:
byte* quantize(<some data type>* arg, byte* addr)
A pointer to the data to fill in (arg) and the binary stream to read from (addr) are passed in. The caller is responsible for allocating memory for the data. Quantize methods do not allocate any memory. The routines return the pointer to the byte on the stream after the data just read.
Fortunately, the conversion functions for almost all of the common data types have been written. Occasionally, a client may want to send an unusual class of object across process boundaries and will have to write these conversion routines. Complex data types are converted by calling serialize, serialLength, or quantize on the individual data members that are to be converted. The author of the serialize/quantize functions for a particular class can choose which data members will be encoded/decoded to/from a binary stream. For atomic data types, we use the XDR package for our conversion needs. XDR was a good choice since it's available on a great number of UNIX and non-UNIX systems. In fact, any system that has NFS had better have XDR.
With an ArgPkg object, data is converted to binary by simply calling its serialize() method. This method in turn calls the serialize() routine for each of the packaged data types. An ArgPkg object also supports a serialLength() method that is implemented in the same recursive fashion.
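To make the three conversion routines concrete, here is a minimal sketch for one atomic type and one composite type, following the signatures shown above. It is an assumption-laden illustration: the hand-written big-endian encoding stands in for the XDR calls the library actually uses (XDR also encodes integers as four big-endian bytes), and the RadarAlertSketch struct is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned char byte;

// Atomic type: a 32-bit integer written as four big-endian bytes.
byte* serialize(const std::int32_t& arg, byte* addr) {
    assert(addr != 0);
    const std::uint32_t v = static_cast<std::uint32_t>(arg);
    addr[0] = static_cast<byte>(v >> 24);
    addr[1] = static_cast<byte>(v >> 16);
    addr[2] = static_cast<byte>(v >> 8);
    addr[3] = static_cast<byte>(v);
    return addr + 4;  // first byte after the data just written
}

long serialLength(const std::int32_t&) { return 4; }

byte* quantize(std::int32_t* arg, byte* addr) {
    assert(arg != 0 && addr != 0);
    const std::uint32_t v = (static_cast<std::uint32_t>(addr[0]) << 24) |
                            (static_cast<std::uint32_t>(addr[1]) << 16) |
                            (static_cast<std::uint32_t>(addr[2]) << 8) |
                             static_cast<std::uint32_t>(addr[3]);
    *arg = static_cast<std::int32_t>(v);
    return addr + 4;  // first byte after the data just read
}

// Hypothetical composite type: converted member by member.
struct RadarAlertSketch {
    std::int32_t stationId;
    std::int32_t level;
};

byte* serialize(const RadarAlertSketch& arg, byte* addr) {
    addr = serialize(arg.stationId, addr);
    return serialize(arg.level, addr);
}

long serialLength(const RadarAlertSketch& arg) {
    return serialLength(arg.stationId) + serialLength(arg.level);
}

byte* quantize(RadarAlertSketch* arg, byte* addr) {
    addr = quantize(&arg->stationId, addr);
    return quantize(&arg->level, addr);
}
```

A caller would typically ask serialLength() for the buffer size, allocate that many bytes, and then call serialize(); the receiving side passes an already-allocated object to quantize().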
The delivery type and priority default to IPC_Types::ASYNC and IPC_Types::NORMAL_PRIORITY, so these can be omitted by the client.
Quite often, a client develops a class that inherits from ParameterizedMsg. This is usually convenient, but not necessary unless the client wants to directly control how the message data is translated to binary. In this case, the client passes a null pointer for the arg object when constructing the message object. The derived class should then provide implementations for the virtual methods: structToByteStream() and byteStreamLength().
Now that the data and the attributes are packaged, the client can initiate the sending of the message by calling one of the two send methods, passing either a pointer to a ChildProcess object or an IPC_Target object (described in Section 5a.4.1), indicating where the message should be sent. These methods will return one of the enumerations of IPC_Types::ErrorCodes. If the delivery type is IPC_Types::SYNC, then these methods may take a while to return, especially if the destination process is not responsive, or network traffic is heavy. These methods will return immediately if the transmission is asynchronous. Clients should check the return values of these methods. If the methods return IPC_Types::IPC_TRY_AGAIN_LATER, then the destination is unreachable. If IPC_Types::IPC_HOPELESS is returned, then there is something wrong with the destination address that was passed in or some of the meta-data values are invalid.
As soon as one of the send methods returns, the ParameterizedMsg object is no longer needed unless the client wants to send the same message at a later time.
Most of the real work is done by ParameterizedMsg::send (const IPC_Target *target). ParameterizedMsg::send (const ChildProcess *process) simply extracts the IPC_Target from the process object and then calls the other send method, which uses the following algorithm.
The message is now completely packaged, converted and ready to transmit!
SocketConnection::sendMsg() then invokes the DataSocket::writeMsg() method on the selected socket object, passing the delivery mode and the PendingMsg object. DataSocket::writeMsg() will return a status indicating whether the write was successful; that status value is then returned by SocketConnection::sendMsg().
The first task for DataSocket::writeMsg() is to record the current time. This will be useful in determining which is the least recently used socket.
Next, the socket object places the socket in non-blocking mode. This means that if we try to write to a full socket, the write() call will return immediately with an error, setting errno to EAGAIN, instead of blocking. We use UNIX fcntl() calls to toggle between blocking and non-blocking mode.
As we mentioned earlier, the PendingMsg object will perform the write to the socket. The constructor of this object determines how many bytes are in the message stream by reading the first four bytes of the binary stream. PendingMsg::send() will call the UNIX write() routine, which does the socket writing. Write() will return the number of bytes written to the socket. At this point, there are three possibilities:
Because of that third possibility, it may take several calls to PendingMsg::send() before the complete message is sent.
After the DataSocket object places the socket in non-blocking mode, the PendingMsg object is asked to try to write its message. The socket is then immediately placed back into blocking mode. If the write has completed either successfully or unsuccessfully, then we are done. Yippee! If the write did not complete, the receiving process must be busy with some other task which caused the socket buffer to fill up. With that being the case, the situation becomes much more complicated and is explained in detail in Section 5a.4.7.
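The toggle-write-toggle sequence described above can be sketched as follows. This is a simplified stand-in for PendingMsg::send() and DataSocket::writeMsg(), not their actual code: the function name tryWriteOnce and the WriteResult enumeration are hypothetical, and the sketch restores whatever mode the descriptor had before the call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

enum WriteResult { WRITE_COMPLETE, WRITE_PARTIAL, WRITE_FAILED };

// Makes one non-blocking attempt to write data[offset .. length) to fd.
// `offset` is advanced by the number of bytes accepted, so the same call
// can be repeated later to send the remainder of a fragmented message.
WriteResult tryWriteOnce(int fd, const unsigned char* data,
                         std::size_t length, std::size_t& offset) {
    assert(offset <= length);
    const int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return WRITE_FAILED;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return WRITE_FAILED;

    const ssize_t written = write(fd, data + offset, length - offset);
    const int savedErrno = errno;
    fcntl(fd, F_SETFL, flags);  // immediately return to the previous mode

    if (written >= 0) {
        offset += static_cast<std::size_t>(written);
        return offset == length ? WRITE_COMPLETE : WRITE_PARTIAL;
    }
    // A full buffer is not an error: nothing was written this time.
    if (savedErrno == EAGAIN || savedErrno == EWOULDBLOCK) return WRITE_PARTIAL;
    return WRITE_FAILED;
}
```

A WRITE_PARTIAL result corresponds to the case where the receiving process has let its buffer fill up, which is the point at which the more involved handling takes over.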
After the write completes, the PendingMsg object will be destructed causing the binary stream to be deallocated (which can be a fair chunk of memory for large messages). DataSocket::writeMsg() will then return a status indicating whether the write has succeeded or not. If SocketConnection::sendMsg() gets a status back indicating the write has failed, we can assume that the connecting process no longer wants to engage in conversation. As a result, we destroy the DataSocket object which closes our end of the connection, and then remove the socket object from the table of connections.
The primary task of the method DataSocket::handleEvent() is to remove binary data from a socket and to assemble that data into a message that can be received and processed by clients of the IPC library. It is possible that the message will be sent in fragments, requiring multiple calls to this method in order to read the entire message. The following algorithm implements this method.
In addition to the logical module, every message has a type associated with it. The type tells the client what the format of the data is, and what the message should be used for. For example, one type of message could be an instruction for the IGC to load a color table. This type of message contains one piece of data, a color table key. A logical module may contain one or many types of messages. Like the logical module, a type has an enumerated value associated with it that is passed with every message as part of the metadata.
For every logical module, the client must instantiate a receiver object which receives all the messages for that module. Ideally, this object should be created during initialization before a process begins waiting for events. This object must be derived from the abstract base class Receiver containing a single pure virtual function, receive(), for which the client must provide an implementation. When the IPC library calls this method, it passes the byte stream, the number of bytes in the stream, the address (IPC_Target object) of the sending process, and the message type.
Once a DataSocket object receives an entire message, it passes the byte stream to the static method, Dispatcher::route(). This method will then extract the message header from the byte stream, and use the logical module ID in the header to locate an associated registered receiver object. The receive method for that object is then called, passing the message byte stream without the header, the size of the header-less byte stream, and some other metadata that was included in the header. Once the receive method returns, Dispatcher::route() deallocates the memory for the byte stream.
A client must provide an implementation for the method that receives different types of messages for a particular logical module. Typically, a receive method will contain a switch statement, selecting on the various types of messages for that module. Each case in the switch should convert the binary stream to some meaningful data types and then pass that data to some object or module that can process the request and data.
ArgPkg3<DataTime, TextString, double> args (binaryByteStream);
DataTime aDataTime (args.param1());
TextString aString (args.param2());
double aNumber = args.param3();
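The routing and receiving steps described above might be sketched as follows. Everything here is a simplified assumption made for illustration: MiniDispatcher is not the library's Dispatcher class, the two-byte header (module ID, then message type) is invented, the sender address is left out of receive(), and ColorTableReceiver and LOAD_COLOR_TABLE are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

typedef unsigned char byte;

// Simplified form of the abstract base class: one pure virtual receive().
class Receiver {
public:
    virtual ~Receiver() {}
    virtual void receive(const byte* stream, std::size_t length, int msgType) = 0;
};

// Invented header layout for this sketch: [module ID][message type].
const std::size_t SKETCH_HEADER_SIZE = 2;

class MiniDispatcher {
public:
    static void registerReceiver(int moduleId, Receiver* receiver) {
        assert(receiver != 0);
        table()[moduleId] = receiver;
    }
    static void unregisterReceiver(int moduleId) { table().erase(moduleId); }

    // Strips the header and hands the rest to the module's receiver.
    // Returns false if the stream is too short or no receiver is registered.
    static bool route(const byte* stream, std::size_t length) {
        if (length < SKETCH_HEADER_SIZE) return false;
        std::map<int, Receiver*>::iterator entry = table().find(stream[0]);
        if (entry == table().end()) return false;
        entry->second->receive(stream + SKETCH_HEADER_SIZE,
                               length - SKETCH_HEADER_SIZE, stream[1]);
        return true;
    }

private:
    static std::map<int, Receiver*>& table() {
        static std::map<int, Receiver*> receivers;
        return receivers;
    }
};

const int LOAD_COLOR_TABLE = 1;  // hypothetical message type

// A receiver for one logical module, selecting on the message type.
class ColorTableReceiver : public Receiver {
public:
    ColorTableReceiver() : lastKey(-1), unknownCount(0) {}
    virtual void receive(const byte* stream, std::size_t length, int msgType) {
        switch (msgType) {
        case LOAD_COLOR_TABLE:
            if (length >= 1) lastKey = stream[0];  // one datum: the table key
            break;
        default:
            ++unknownCount;  // unrecognized type for this module
            break;
        }
    }
    int lastKey;
    int unknownCount;
};
```

In a real receiver, each case would convert the byte stream with an ArgPkg object, as in the fragment above, rather than reading a raw byte.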
Whenever a process initiates or receives a connection request, we count the number of DataSocket objects that each SocketConnection object owns. Then we add two to the count since every process has at least one SignalPipe and AcceptSocket object also.
If our count is over twenty, we have to close down one of our existing connections. There are a number of ways we could choose the unlucky connection, all equally appealing. We opted for the least-recently-used approach. Each time a socket is written or read, we record the current time inside the DataSocket object. We destroy the DataSocket object with the oldest recorded time which will close the socket and allow us to have a new connection without exceeding our limit.
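The least-recently-used policy can be illustrated with a small sketch. The ConnectionTable class is hypothetical and tracks only addresses and usage order; the connection limit is a constructor parameter rather than the fixed count described above, and a logical counter replaces the recorded wall-clock time so that the ordering is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

class ConnectionTable {
public:
    explicit ConnectionTable(std::size_t maxConnections)
        : limit(maxConnections), ticks(0) {
        assert(maxConnections > 0);
    }

    // Records a read or write on an existing connection.
    void touch(const std::string& address) {
        Table::iterator entry = lastUsed.find(address);
        if (entry != lastUsed.end()) entry->second = ++ticks;
    }

    // Adds a connection, first evicting the least recently used one if the
    // table is full. Returns the evicted address, or "" if none was evicted.
    std::string add(const std::string& address) {
        std::string evicted;
        if (lastUsed.find(address) == lastUsed.end() && lastUsed.size() >= limit) {
            Table::iterator victim = lastUsed.begin();
            for (Table::iterator it = lastUsed.begin(); it != lastUsed.end(); ++it) {
                if (it->second < victim->second) victim = it;
            }
            evicted = victim->first;
            lastUsed.erase(victim);  // the real code would close the socket here
        }
        lastUsed[address] = ++ticks;
        return evicted;
    }

    bool contains(const std::string& address) const {
        return lastUsed.count(address) != 0;
    }
    std::size_t size() const { return lastUsed.size(); }

private:
    typedef std::map<std::string, unsigned long> Table;
    std::size_t limit;
    unsigned long ticks;
    Table lastUsed;
};
```

The linear scan for the oldest entry is adequate for a table of about twenty connections; a larger table would justify an ordered structure.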
To initiate a new connection request, the SocketConnection object makes room for the new socket if necessary. It then constructs a DataSocket object by passing the address of the destination process. This DataSocket constructor creates a TCP socket and then calls the UNIX routine connect(), passing the address. The address of the process to connect to is actually the IP address of the process's host and the port number of the TCP socket managed by the connecting process's AcceptSocket object. As long as the accept socket has been created, and a listen request has been submitted, a connection can be made, even if the connecting process is suspended or busy processing. Generally, connect() will determine quickly whether the connection is feasible, although it is possible that connect() could block the initiating process if network traffic is heavy. For now, we decided to allow the blocking because during our testing, connect() always returned almost immediately. If it is a problem later on, we can install a time-out value on the connection. This can be done by placing the socket in non-blocking mode, and then issuing a connect request. If connect() returns EINPROGRESS, invoke a select() call, waiting until the socket becomes writable or the time-out has been exceeded.
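The time-out technique outlined at the end of the paragraph above could look roughly like this. It is a sketch of the idea, not code from the library: connectWithTimeout and makeLoopbackAddress are hypothetical names, and the SO_ERROR check after select() is an addition beyond what the text describes, included because a socket also becomes writable when the connection attempt fails.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Returns 0 once connected, or -1 with errno set (ETIMEDOUT on time-out).
int connectWithTimeout(int fd, const sockaddr* address, socklen_t length,
                       int timeoutSeconds) {
    assert(timeoutSeconds >= 0);
    const int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return -1;

    int result = connect(fd, address, length);
    if (result == -1 && errno == EINPROGRESS) {
        fd_set writable;
        FD_ZERO(&writable);
        FD_SET(fd, &writable);
        timeval timeout;
        timeout.tv_sec = timeoutSeconds;
        timeout.tv_usec = 0;
        const int ready = select(fd + 1, 0, &writable, 0, &timeout);
        if (ready == 1) {
            // Writable means "finished", not "succeeded": ask for the outcome.
            int error = 0;
            socklen_t errorLength = sizeof error;
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &errorLength) == 0) {
                if (error == 0) result = 0;
                else errno = error;
            }
        } else if (ready == 0) {
            errno = ETIMEDOUT;
        }
    }
    const int savedErrno = errno;
    fcntl(fd, F_SETFL, flags);  // return the socket to its previous mode
    errno = savedErrno;
    return result;
}

// Helper for trying the sketch locally: an IPv4 loopback address.
sockaddr_in makeLoopbackAddress(unsigned short port) {
    sockaddr_in address;
    std::memset(&address, 0, sizeof address);
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    address.sin_port = htons(port);
    return address;
}
```

Because the listening side only needs to have called listen(), the connection in this sketch completes even if the peer never calls accept(), which matches the behavior described for a suspended or busy process.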
If connect() returns ECONNREFUSED, we know that the connection cannot be made because the address was bad, the network or the connecting host is down, or the connecting process is not running. The DataSocket constructor then closes the socket and sets the file descriptor data member to -1.
After the DataSocket object is created, the SocketConnection object checks to see if the object contains a valid file descriptor. If so, the object is added to the dictionary of connected DataSocket objects maintained by the SocketConnection object. If not, the DataSocket object is destroyed, and a problem is logged with LogStream, indicating that the connection attempt has failed.
Every process capable of IPC has a single AcceptSocket object that manages a TCP socket whose sole purpose is to accept connection requests. When a request arrives, this socket becomes ready for reading. SelectDescriptorEvents() detects this and calls the AcceptSocket::handleEvent() method for this object. This method will tell the owner of this object, the normal priority SocketConnection object, to make room for a new socket if necessary. Next, the method calls the UNIX routine accept(). This creates a new socket that is connected to the socket managed by the DataSocket object living in the process that initiated the connection. Accept() returns the file descriptor of this socket, which is used to construct a DataSocket object. This constructor doesn't have much to do since a socket was already created. It just saves the file descriptor passed in, and then performs the initialization common to both constructors. The new DataSocket object is then given to the parent SocketConnection object.
At this point, the SocketConnection object managing normal priority sockets cannot add this new DataSocket object to its dictionary of connected sockets because we do not know the address of the process to which the new socket is connected. We also do not know the priority level of this socket. The accept() call can optionally return the address of the peer socket in the other process; however, this won't help us, since that address contains the port number of the socket managed by a DataSocket object. The address of the connecting process must contain the port number of the socket managed by the AcceptSocket object. Fortunately, the process address and the priority are sent in the header of every message, so the mystery surrounding this new socket will be resolved once its first message arrives. In the meantime, the SocketConnection object adds the socket object to a set of sockets that do not yet know who their connecting processes are. DataSocket::handleEvent(), the method that reads messages from a socket, will check to see if the connection address is known. If it isn't, the read method reads the header from the front of the byte stream, looking for the address of the sending process and the priority. With the address and priority known, this DataSocket object can be placed properly in the SocketConnection object that manages the priority of this socket. Also, this DataSocket object can be removed from the set of mystery sockets maintained by the normal priority SocketConnection object.
In order to clean up a broken connection, one of the SocketConnection objects destructs the DataSocket object and removes it from its dictionary of known sockets or its set of unknown sockets. The DataSocket destructor closes the socket, freeing up the file descriptor for other incoming connections. Also, any threads waiting to write to this socket are terminated.
The best approach to use is left up to the client of the IPC library. If the client needs to know that its message has been completely received and processed, then the synchronous wait should be used.
This section discusses how to detect if these approaches are needed, and the implementation of both approaches. However, before discussing the approaches, we discuss how our design utilizes mutual exclusion mechanisms.
First, we count the number of active threads that the DataSocket object has invoked. If there is at least one, we know that the socket buffer is still full, since a thread will terminate as soon as it can do its write, and a thread is created only in response to a full buffer.
If there are no threads active, the buffer could still be full. For the second check, the socket buffer is placed in non-blocking mode and the PendingMsg object will attempt to write the entire message to the buffer. In non-blocking mode, the UNIX write() routine will return the number of bytes written, even if it wasn't able to write the requested number of bytes. If the entire message was not written, the socket buffer is full. After the write attempt, the socket buffer is returned to blocking mode. One hopes that the complete message will be written to the buffer, and neither of the following two approaches needs to be applied.
For both approaches, this design makes use of the mutual exclusion mechanism (also called mutex locks) provided by the pthread library. Mutex locks are used to ensure that only one thread of execution has access to a data structure or an operation that is global to all threads. Thus, many threads may be waiting for a single thread to finish an operation like writing to a socket buffer. Once that thread is finished, it unlocks the mutex, and one of the waiting threads is then granted access. The choice of which thread goes next is left up to the pthread library. A first come, first served approach might make sense, but the library's choices seem to be random.
Each DataSocket object manages two mutex lock mechanisms. The first ensures that fragmented messages are not interleaved with other messages. Remember, sometimes it takes several write() calls to send a complete message. Without a locking mechanism, thread A might write 20 percent of its message causing the write() to return since the buffer is now full. Thread B is given access to the socket buffer and writes a different message. The DataSocket object in the receiving process has trouble reading the message from thread A, since it has no way of telling that a new message is interleaved.
The second mutex lock is used to protect the DataSocket object's thread table, a dictionary that keeps track of the running threads. If a lock were not used, it would be quite possible for the main thread of execution to be reading or writing to the table while a sending thread is also writing to the table.
Once the main thread of execution obtains the mutex lock, it can proceed with writing the message. However, we don't want to block indefinitely! The number of seconds to wait for space to become available on the buffer is specified in the configuration file, ipc.config. Current configuration is set for 3 seconds. Before trying to write to the socket, the UNIX select() routine is used to determine if the socket is ready for writing. If select() returns 1, we know that there is space on the buffer and the write() call will not block. If select() returns 0, our patience has reached its limit. At this point, it is safe to assume that the receiving process is truly unresponsive, so we should break our connection with the receiving process.
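The bounded wait described above can be sketched with a thin wrapper around select(). The names waitUntilWritable and writeWithPatience are hypothetical, and the time-out is passed as a parameter here rather than read from ipc.config.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Waits until fd has room for writing or the time-out expires.
// Returns select()'s result: 1 if writable, 0 on time-out, -1 on error.
int waitUntilWritable(int fd, long seconds, long microseconds) {
    assert(fd >= 0 && fd < FD_SETSIZE);
    fd_set writable;
    FD_ZERO(&writable);
    FD_SET(fd, &writable);
    timeval timeout;
    timeout.tv_sec = seconds;
    timeout.tv_usec = microseconds;
    return select(fd + 1, 0, &writable, 0, &timeout);
}

// Waits up to `seconds` for space and then performs a single write().
// Returns the number of bytes written, or -1 if the wait timed out or
// failed, in which case the caller would break the connection.
ssize_t writeWithPatience(int fd, const void* data, std::size_t length,
                          long seconds) {
    if (waitUntilWritable(fd, seconds, 0) != 1) return -1;
    return write(fd, data, length);
}
```

With the configured value mentioned above, a caller would pass 3 for the seconds argument. Since a single write() may still accept only part of a large message, the wait and write would be repeated until the message is complete or the wait times out.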
The synchronous approach may sometimes be used even though the client requested the asynchronous approach. Some processes link with middleware libraries that register handlers for asynchronous signals without using our signal catcher mechanism. If a sending thread is invoked, then those signals will not be delivered consistently to the library. This is the case with our D2D executables that link with the Freeway library (wfoApi, dialRadar, etc.). The best solution is to prevent these executables from ever creating threads. This can be done by calling the static method, Connection::preventThreadCreation().
Here is the algorithm for doThreadSend()
This approach is far more efficient and sexier than the synchronous approach, but the client sending the message may not care about sexiness or efficiency.
A thread is invoked by calling the pthread library routine pthread_create() by passing the thread routine doThreadSend() and a pointer to the PendingMsg object. The creation routine returns an identifier which can be used to make a new entry in the table.
We made a design decision that we should somehow limit the number of threads running in the system because an abundance of pending threads means that potentially a lot of message memory is allocated. Also, a plethora of running threads implies that the receiving process is not responding or is possibly hung. After mulling over several possibilities, we came up with the approach of counting the number of bytes for all the pending messages whenever we are about to create a new thread. If it is over a certain threshold value defined in ipc.config we will break our end of the connection, which involves destructing the DataSocket object and terminating all the running threads for that DataSocket.
Under normal circumstances, a thread terminates under its own control, and then tells its DataSocket object that it has finished. However, whenever a DataSocket is destructed, it needs to tell all the running threads to terminate, ASAP. This is done by using the pthread library routine pthread_cancel(), passing the thread identifier. Unfortunately, pthread_cancel() is just a request, and the return of this routine does not guarantee that all running threads have been cancelled. As a result, the destructor has to wait for all the threads to terminate before it can continue execution. It does this by trying to lock the sending mutex lock. Once it obtains the lock, we can be sure that all the threads have terminated. When a thread receives a cancel request, it does not automatically give up any mutexes that it locked. Fortunately, the pthread library calls a callback routine whenever a thread is about to be cancelled. Our callback routine, doThreadCancel(), simply unlocks the sending mutex.
This section describes how a client process can register for signals and also control their delivery. It then explains the difference between synchronous and asynchronous signals and how these kinds of signals can cause re-entrancy problems. Finally, this section discusses the implementation of delivering signals in both single- and multiple-threaded environments.
A client process does not have to provide a SignalClient object; one with default behavior is created and registered at static initialization. Should the client process register several SignalClient objects, signals will be passed to only the most recently registered client.
Signals are categorized in the following way:
A few of our D2D processes such as the acqserver receive IPC messages from clients that do not link with our IPC library. As a result, these processes do not wait for events using the approaches described in Section 5a.4.2. Since our approach for handling asynchronous signals safely is contingent on the process using one of our event-waiting approaches, these processes must handle asynchronous signals at the risk of re-entering code that is not designed to be re-entered. If a client process calls the static method SignalCatcher::useUnsafeSignalDelivery(), then asynchronous signals will be handled as soon as they are delivered. Normally, their delivery is deferred until the client process is ready to handle them.
Even in a single-threaded process, delivery of signals differs depending on whether the signal was generated synchronously or asynchronously because of the risk of re-entering non-re-entrant code.
Synchronous signals are the result of some error condition that occurs inside a process, and are delivered synchronously with respect to that error. For example, if a floating point calculation results in an overflow, a SIGFPE (floating point exception signal) is delivered to the process immediately following the instruction that resulted in the overflow. Our signal catching software handles the following synchronous signals: SIGILL, SIGTRAP, SIGABRT, SIGEMT, SIGFPE, SIGBUS, SIGSEGV, SIGPIPE, and SIGSYS.
Synchronous signals can be handled immediately without the danger of re-entering code since the signal was in response to an error in the code that it is interrupting.
Asynchronous signals are normally the result of an event that is external to the process, and are delivered whenever during the process execution such an event occurs. For example, when a user running a program types the interrupt character at the terminal (generally <ctrl-c>), a SIGINT (interrupt signal) is delivered to the process. Our signal catching software handles the following asynchronous signals: SIGHUP, SIGINT, SIGTERM, SIGALRM, SIGUSR1, SIGUSR2, SIGCHLD, SIGLOST, SIGPROF, and SIGQUIT.
If asynchronous signals are handled as soon as they arrive, it is quite possible to re-enter code that doesn't support re-entrancy, such as malloc() which may cause memory errors or a process crash. For example, suppose a depictable is busy allocating memory for a radar table and a SIGCHLD signal is delivered to our process, signaling that one of our child processes has died and needs to be restarted. While restarting the child, some memory is allocated and the process crashes due to a segmentation violation. Why? Malloc() accesses some shared global data structures. The second malloc() call during the signal handling was issued before the first malloc() call from the depictable had a chance to complete.
In order to avoid re-entrancy problems, our signal catching software delivers async signals using a strategy of deferred delivery. The vast majority of D2D processes alternate in a loop between waiting for events and handling new events. An event can be a timer expiring, an IPC message, a mouse button click, etc. As we saw with the previous example, async signals can be delivered while a process is busy handling an event. However, we defer the handling of that signal until the process has finished its event handling and is now waiting for a new event to arrive. The new event could be that deferred signal. We procrastinate on signal processing by using a pipe which is UNIX's implementation of a FIFO (first-in, first-out) queue. The details of using a pipe are covered in the next section.
Signal solicitation is done during static initialization time before the process executes its main() routine. The UNIX routine sigaction() is used to install one of two signal handling routines for each signal that we are handling. When one of those signals is sent to our processes, UNIX will call one of those routines, passing the signal number.
Also, during static initialization, the SignalCatcher will create a SignalPipe object that encapsulates a UNIX pipe (FIFO queue). During construction, the object creates the pipe and saves the two file descriptors for reading and writing to the pipe. It also registers with the EventDispatcher as a DescriptorEventClient so that the object will be notified when its pipe is ready for reading.
The signal handling routine for synchronous signals is called handleSignal(). This routine contains one large switch statement with cases for each supported signal. Each case will invoke the appropriate method for the currently registered SignalClient object. Thus, synchronous signals are not deferred; they are handled as soon as they arrive.
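The dispatch described above might look roughly like the following sketch. The routine name handleSignal() and the SignalClient class come from the text, but the client method names (childDied, terminateRequested, floatingPointError), the registeredClient variable, the boolean return value, and the RecordingClient class are all hypothetical and exist only to make the example self-contained.

```cpp
#include <cassert>
#include <csignal>
#include <string>

// Hypothetical client interface: one overridable method per supported signal.
class SignalClient {
public:
    virtual ~SignalClient() {}
    virtual void childDied() {}
    virtual void terminateRequested() {}
    virtual void floatingPointError() {}
};

// The currently registered client (assumed to be a single global registration).
static SignalClient* registeredClient = nullptr;

// One switch statement with a case per supported signal. Returns true if the
// signal was dispatched to the registered client.
static bool handleSignal(int signo) {
    assert(signo > 0);
    if (registeredClient == nullptr) return false;
    switch (signo) {
        case SIGCHLD: registeredClient->childDied();          return true;
        case SIGTERM: registeredClient->terminateRequested(); return true;
        case SIGFPE:  registeredClient->floatingPointError(); return true;
        default:      return false;   // unsupported signal
    }
}

// Example client that records which of its methods were invoked.
class RecordingClient : public SignalClient {
public:
    std::string log;
    void childDied() override          { log += "child;"; }
    void terminateRequested() override { log += "term;"; }
};
```

A process would register its own SignalClient subclass and override only the methods for the signals it cares about; the default implementations do nothing.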
The signal handling routine for asynchronous signals is called receiveAsyncSignal(). If the client process instructed the SignalCatcher object not to defer signal delivery, this routine calls handleSignal(), which handles the signal immediately. Most processes do not do this, since they do not want to risk a crash caused by re-entrancy. Instead, receiveAsyncSignal() tells the SignalPipe object to write the signal number to its pipe. When the process has finished processing other events and is idle waiting for an event to arrive, the selectDescriptorEvents() routine (covered in Section 5a.4.2.1) determines that the pipe managed by the SignalPipe object is ready for reading and calls SignalPipe::handleEvent(). This method reads all the data in the pipe into an array of integers. Each cell in the array contains an async signal that has been delivered but not yet handled. The method then loops through the array, calling handleSignal() for every cell. Thus, the pipe is used to defer the delivery of async signals until a process is ready to handle them.
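The deferred delivery path can be sketched as follows. This is a minimal illustration under several assumptions, not the D2D implementation: free functions stand in for the SignalCatcher and SignalPipe objects, a direct select() call stands in for selectDescriptorEvents(), handleSignal() merely records the signal number, and the names setupSignalPipe, dispatchDeferredSignals, pipeFds, and handled are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <vector>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int pipeFds[2] = {-1, -1};   // [0] is the read end, [1] is the write end
static std::vector<int> handled;    // signals processed so far, in order

// Stand-in for the real handleSignal(): just record the signal number.
static void handleSignal(int signo) { handled.push_back(signo); }

// Async handler: does nothing except write the signal number to the pipe.
// write() is async-signal-safe, so this is safe even if malloc() was interrupted.
static void receiveAsyncSignal(int signo) {
    int savedErrno = errno;
    ssize_t written = write(pipeFds[1], &signo, sizeof signo);
    (void)written;                  // nothing useful can be done on failure here
    errno = savedErrno;
}

// Create a non-blocking pipe and install the deferring handler for one signal.
static bool setupSignalPipe(int signo) {
    if (pipe(pipeFds) != 0) return false;
    for (int fd : pipeFds) {
        int flags = fcntl(fd, F_GETFL);
        if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return false;
    }
    struct sigaction action = {};
    action.sa_handler = receiveAsyncSignal;
    sigemptyset(&action.sa_mask);
    return sigaction(signo, &action, nullptr) == 0;
}

// Called when the process is idle: wait until the pipe is readable, drain it
// into an array of integers, and handle each deferred signal in turn.
// Returns the number of signals handled.
static int dispatchDeferredSignals(int timeoutMs) {
    assert(pipeFds[0] != -1);
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(pipeFds[0], &readable);
    struct timeval timeout = {timeoutMs / 1000, (timeoutMs % 1000) * 1000};
    if (select(pipeFds[0] + 1, &readable, nullptr, nullptr, &timeout) <= 0)
        return 0;                   // timed out or interrupted; try again later
    int total = 0;
    int signals[64];
    for (;;) {
        ssize_t bytes = read(pipeFds[0], signals, sizeof signals);
        if (bytes <= 0) break;      // pipe is empty (EAGAIN) or an error occurred
        int count = static_cast<int>(bytes / sizeof(int));
        for (int i = 0; i < count; ++i) handleSignal(signals[i]);
        total += count;
    }
    return total;
}
```

Because the handler only calls write(), it never re-enters non-re-entrant code. The actual work happens later in dispatchDeferredSignals(), which runs in the normal flow of the event loop.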
Handling asynchronous signals in a multi-threaded environment is fairly complicated, since an async signal can be delivered to any one of the running threads. The decision of which thread gets the signal depends on the implementation of the DCE pthreads library. Also, this library does not allow clients to install handlers for async signals with sigaction() once the process becomes multi-threaded.
After consulting with HP, we developed the following approach for soliciting and delivering async signals in a multi-threaded environment. Right before the first sending thread is invoked, the SignalCatcher object starts a thread whose sole purpose is to listen for async signals. This async signal thread uses the UNIX routine sigprocmask() to block all other threads from receiving the supported set of asynchronous signals. It then calls another UNIX routine, sigwait(), which blocks the thread's execution until one of the async signals arrives and then returns the signal number.
At this point, if the client process instructed the SignalCatcher object not to defer signal delivery, the thread calls handleSignal(), which handles the signal immediately. This is very dangerous, since the signal handling occurs in a different thread and may access some of the same data structures that the main thread is accessing at the same moment. Fortunately, only a few D2D processes require the immediate delivery of async signals.
More likely, once the async signal thread receives a signal from sigwait(), it uses the SignalPipe object to write that signal number to the pipe. The main thread of execution later reads the signal number from the pipe, without any of its processing having been interrupted, and handles the signal by calling handleSignal(). This approach addresses not only the re-entrancy problem but also the issue of two threads accessing the same data structures simultaneously.
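The threaded approach can be sketched as follows. This illustration assumes modern POSIX threads rather than the DCE threads library the text describes, so it blocks the signals with pthread_sigmask() in the main thread before the listener is created (new threads inherit the mask) instead of calling sigprocmask() from the listener. The names startAsyncSignalThread, asyncSignalThread, handleOneDeferredSignal, signalPipe, and handledByMain are hypothetical, and the listener forwards a fixed number of signals only so that the example can terminate; a real listener would loop for the life of the process.

```cpp
#include <cassert>
#include <csignal>
#include <thread>
#include <vector>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

static int signalPipe[2] = {-1, -1};     // [0] is the read end, [1] is the write end
static std::vector<int> handledByMain;   // signals handled by the main thread

// Stand-in for the real handleSignal(): just record the signal number.
static void handleSignal(int signo) { handledByMain.push_back(signo); }

// Body of the async signal thread: wait for a signal with sigwait() and
// forward its number to the main thread through the pipe.
static void asyncSignalThread(sigset_t asyncSet, int count) {
    for (int i = 0; i < count; ++i) {
        int signo = 0;
        if (sigwait(&asyncSet, &signo) != 0) return;
        ssize_t written = write(signalPipe[1], &signo, sizeof signo);
        (void)written;
    }
}

// Create the pipe, block the async signals in the calling thread, and start
// the listener. Threads created afterwards inherit the blocked mask, so only
// sigwait() in the listener ever receives these signals.
static std::thread startAsyncSignalThread(int count) {
    int rc = pipe(signalPipe);
    assert(rc == 0);
    (void)rc;
    sigset_t asyncSet;
    sigemptyset(&asyncSet);
    sigaddset(&asyncSet, SIGUSR1);
    sigaddset(&asyncSet, SIGUSR2);
    pthread_sigmask(SIG_BLOCK, &asyncSet, nullptr);
    return std::thread(asyncSignalThread, asyncSet, count);
}

// Main thread: read one deferred signal from the pipe and handle it.
// This blocks until the listener has forwarded a signal.
static bool handleOneDeferredSignal() {
    assert(signalPipe[0] != -1);
    int signo = 0;
    ssize_t bytes = read(signalPipe[0], &signo, sizeof signo);
    if (bytes != static_cast<ssize_t>(sizeof signo)) return false;
    handleSignal(signo);
    return true;
}
```

Since handleSignal() runs only in the main thread, the data structures it touches are never accessed from the listener thread, which is the property the pipe is meant to guarantee.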
Clearly, interprocess communication in the D2D and data components of WFO-Advanced (FX-Advanced) is a complex issue. Even though the current IPC implementation is quite likely our clearest so far, fully understanding everything involved remains a daunting prospect. That is the reason for this document, which describes the IPC system in its entirety.
In this document, we have covered the requirements and history of IPC in WFO-Advanced, including descriptions of the two predecessors to the current thread-based implementation. We learned how well each product worked and what features made the previous systems unacceptable. We also examined the current thread-based implementation and discussed its performance gains, its simplicity, its complexity, and its weaknesses.
By using an analogy of a business telephone system, we discovered what components existed in the thread-based implementation by correlating them with real-world objects. The analogy served as a focal point for the discussion of the implementation itself.
The majority of this document exposed the thread-based IPC implementation in great detail, enabling even the passive observer to understand and maintain the system. The discussion included addressing, event dispatching, message receipt, message transmission and metadata, synchronous and asynchronous sends, the use of threads, signal handling, and most of the classes and objects involved.
The WFO-Advanced IPC library continues to be open-ended and still allows for multiple concurrent transports, with threaded IPC coexisting seamlessly with any future transports. Future work on the IPC library might include the addition of a new transport, but this is doubtful, as the thread-based implementation satisfies all of our requirements quite well. Future development of the thread-based implementation may evolve as vendors' thread libraries and kernel thread support improve. And although our use of threads doesn't involve concurrency, multiprocessor versions of UNIX may have some interesting (if minor) impacts on our library's performance. We look forward to better thread support as threaded programming becomes more and more the norm in software engineering.