This document describes three additional capabilities that have been implemented as a proof of concept. My hope is that all three will be adopted for AWIPS 5.1. These capabilities are enhancements; they are backward compatible since they don't affect the existing functionality or performance. This document just describes what these enhancements, why they were done, and how to use them. For implementation details, use the source, Luke!
Questions or comments: email me
Thread Safe LoggingWhat is Thread Safe?
Collective Logging
Why Now?
Will All Executables Use Thread Safe Log Stream?
How is Thread Safe Log Stream Enabled?What Is Collective Logging?
Stack Traces
How is the Collective Filename Specified?
How is Collective Logging Enabled?
Are Collective Log Files Automatically Broken At Midnight?Enhancement? Hasn't This Feature Always Been Part of LogStream?
How are Stack Traces Different Now?
You Have to be Kidding; More COTS to do Stack Traces?
Are Stack Traces Possible With a Stripped Executable?
LogStream was implemented using the C++ IO Stream package which is not thread safe (at least in the Linux implementation.). Even if it was, LogStream still uses static data structures which can easily be corrupted by two threads logging a message at the same time. One solution is to only use LogStream in the main thread of execution. This approach is OK for simple threads that don't need to log anything useful, but could be problematic for more complex threads. A better approach is to make LogStream thread safe by wrapping mutex locks around the critical logging code.
LOG_STREAM_TEST_OBJECTS = \
logStreamTest.o \
${SIGNAL_CATCHER} \
${LIB_PROC_MGMT} \
${LIB_IPC} \
${LIB_CONFIG} \
${LIB_LOG_STREAM} \
${LIB_UTIL} \
${LIB_FOUNDATION}
Here's a link list of an executable using the thread safe log stream:
LOG_STREAM_TEST_OBJECTS = \
logStreamTest.o \
${SIGNAL_CATCHER} \
${LIB_PROC_MGMT} \
${LIB_IPC} \
${LIB_CONFIG} \
${LIB_THREAD_SAFE_LOG_STREAM} \
${LIB_UTIL} \
${LIB_FOUNDATION}
As with all of our macros, LIB_THREAD_SAFE_LOG_STREAM is defined in ${STAGING}/Makefile.aliases
The collective sink could be useful for collecting the logs of different runs of the same program into one file. Or maybe collecting all the output of one of the heads of a two headed display in one file. Or maybe it might be useful to have all the programs from a particular host all log to the same file.
| %L | LOG_DIR environment variable |
| %H | Hostname with domain name |
| %D | UTC date, YYYYMMDD |
| %N | Program name (just the leaf, not the full path) |
| %d | DISPLAY environment variable |
For example, suppose you were to run the D-2D today (3/20/2000) on ws1 and the LOG_PREF environment variable is set to $FXA_HOME/logs. If you set the COLLECTIVE_FILE environment variable to %L/%D/%N-%H, all the five of the IGC processes would log to the following file: $FXA_HOME/logs/20000320/IGC_Process-ws1.fsl.noaa.gov.
A filter has the following syntax:
Here's an example. These three filters will turn on collective
logging for the program "asyncProdScheduler" for all categories except
debug or verbose.
asyncProdScheduler all collective all = on
asyncProdScheduler all collective verbose = off
asyncProdScheduler all collective debug = off
A prefix has the following syntax.
collective processID = on
It is possible to get a stack trace on a HP gcc executable since gdb (the GNU debugger) can do these flawlessly. After much head scratching and studying the gdb source code, I was able to write code that produces stack traces even through virtual functions, shared libraries, signal handlers, and long jumps. On Linux, the task was much simpler. There is already a function in the libc library that gets the addresses of each frame in the calling sequence. All that was left to do was translate those addresses into function names.
IGC_Process 4118 IGCSignalClient.C 55 BUG: Signal 11 (Segmentation violation) received. 0x8106950 in IGCSignalClient::shutdown() at IGCSignalClient.C:57 0x831c864 in SignalCatcher::handleSignal() at SignalCatcher.C:131 0x831d83f in ThreadSignalCatcher::handleSignal() at ThreadSignalCatcher.C:257 0x831c798 in handleSignal() at SignalCatcher.C:94 <Cannot determine where signal occurred> 0x81f7c5e in PVImageDepict::paint() at PVImageDepict.C:119 0x8110885 in ImageDepictTuple::getPsuedoData() at DepictTuple.C:1280 0x811055d in ImageDepictTuple::render() at DepictTuple.C:1191 0x810eb39 in DepictTuple::displayDrawable() at DepictTuple.C:289 0x811ad9f in Frame::display() at Frame.C:355 0x8123281 in FrameSeq::refresh() at FrameSeq.C:1758 0x81202ac in FrameSeq::load() at FrameSeq.C:421 0x8137540 in DisplayMgr::handleKeyEvent() at KeyPressHandler.C:872 0x8116710 in DisplayPanel::handleDisplayAreaEvent() at DisplayPanel.C:1172 0x812e29d in GraphicsDispatcher::dispatchEvent() at GraphicsDispatcher.C:113 0x812e180 in GraphicsDispatcher::handleEvent() at GraphicsDispatcher.C:84 0x831c250 in dispatchFileEvent() at TclDispatchEngine.C:63 0x84029d2 in FileHandlerEventProc() at tclUnixNotfy.c:407 0x83f20d0 in Tcl_ServiceEvent() at tclNotify.c:444 0x83f236e in Tcl_DoOneEvent() at tclNotify.c:747 0x831c69a in TclDispatchEngine::dispatch() at TclDispatchEngine.C:204 0x8320f7b in EventDispatcher::enterDispatchLoop() at EventDispatcher.C:76 0x81066eb in main() at IGC_Process.C:559
Notice that the signal occurred in the fifth frame, but we can't determine where in Linux. Here's the same stack trace on HP-UX. Instead of file names and line numbers, the parameters of each function are displayed.
IGC_Process 1800 IGCSignalClient.C 55 BUG: Signal 11 (Segmentation violation) received. 0x13c9dc in IGCSignalClient::shutdown(bool, int) 0x925630 in SignalCatcher::handleSignal(int) 0x927054 in ThreadSignalCatcher::handleSignal(int) 0x925440 in handleSignal(int) 0x484e10 in SatPVImageDepict::ingestData(void) 0x400acc in PVImageDepict::paint(GraphicsTarget const *, CartCoord2D<int>, Depictor const *, ZoomAndDensityInfo const &, Depictable::ComboType) 0x178658 in ImageDepictTuple::getPsuedoData(CartCoord2D<int> const &) 0x178228 in ImageDepictTuple::render(bool) 0x17570c in DepictTuple::displayDrawable(bool, bool) 0x1cc208 in Frame::display(void) 0x1e4a70 in FrameSeq::refresh(short) 0x1e0acc in FrameSeq::load(unsigned long const &, AbsTime const &, long, LoadMode, bool, long) 0x241c58 in DisplayMgr::handleKeyEvent(unsigned long, int) 0x19bcf4 in DisplayPanel::handleDisplayAreaEvent(_XEvent const *) 0x21ad5c in GraphicsDispatcher::dispatchEvent(_XEvent const &) 0x21ab98 in GraphicsDispatcher::handleEvent(void) 0x922094 in dispatchFileEvent(void *, int) 0xa4a234 in FileHandlerEventProc() 0xa3acb0 in Tcl_ServiceEvent() 0xa3b004 in Tcl_DoOneEvent() 0x9226e0 in TclDispatchEngine::dispatch(EventDispatcher *) 0x939284 in EventDispatcher::enterDispatchLoop(EventDispatcher::DispatchType) 0x13743c in main()
Here's a stack trace caused by the same segmentation violation. However this one was triggered by a frame being rendered in the background. The current version of logStream would have suppressed this stack trace, but this version displays it.
IGC_Process 4118 IGCSignalClient.C 55 BUG: Signal 11 (Segmentation violation) received. 0x8106950 in IGCSignalClient::shutdown() at IGCSignalClient.C:57 0x831c864 in SignalCatcher::handleSignal() at SignalCatcher.C:131 0x831d83f in ThreadSignalCatcher::handleSignal() at ThreadSignalCatcher.C:257 0x831c798 in handleSignal() at SignalCatcher.C:94 <Cannot determine where signal occurred> 0x81f7c5e in PVImageDepict::paint() at PVImageDepict.C:119 0x8110885 in ImageDepictTuple::getPsuedoData() at DepictTuple.C:1280 0x811055d in ImageDepictTuple::render() at DepictTuple.C:1191 0x810eb39 in DepictTuple::displayDrawable() at DepictTuple.C:289 0x811af42 in Frame::prepareATuple() at Frame.C:423 0x812be07 in FrameSeq::handleTimerExpiration() at FrameSeq.C:4797 0x831c27d in dispatchTimerEvent() at TclDispatchEngine.C:83 0x83f8b98 in TimerHandlerEventProc() at tclTimer.c:520 0x83f20d0 in Tcl_ServiceEvent() at tclNotify.c:444 0x83f236e in Tcl_DoOneEvent() at tclNotify.c:747 0x831c69a in TclDispatchEngine::dispatch() at TclDispatchEngine.C:204 0x8320f7b in EventDispatcher::enterDispatchLoop() at EventDispatcher.C:76 0x81066eb in main() at IGC_Process.C:559
For the name demangling, I used a library that comes with gcc called libiberty.a. Since this library always comes with gcc, hopefully this should not be a configuration issue for PRC.
The other library is a GNU library called libbfd.a and comes with the BinUtils package. To get it via ftp, click here. We have this package for HP-UX installed on our Auspex file server. For Linux, it comes bundled with Red Hat 6.
I set up the logStream Makefile so that these two libraries are linked directly with the StackTracer module. The pathname of where to find libbfd.a is in the environment variable, GNU_DIR while the pathname of where to find libiberty.a is in the env var, GCC_DIR. It doesn't appear that linking with these two libraries increases the executable sizes substantially.
Fortunately, an executable can be partially stripped, removing the debugging information but keeping the symbol information. A partial strip can net substantial savings in the executable size. And the good news is that function names are still available. Filenames and line numbers are only available with executables that still have complete debugging info. To partially strip an executable with HP gcc, use the argument, -Wl,-G when linking. (For Linux, use, -Wl,-S)
The following table shows the size in megabytes of the IGC_Process on
HP-UX after linked with the various strip options.
| Complete Strip (-s) | 12.92 |
| Partial Strip (-Wl,-G) | 16.46 |
| No strip | 61.91 |
If these enhancements are adopted for build 5.1, I strongly urge that executables running in a production environment be partially stripped.