WASP's thread pools

Way back in 2002, when I was developing ISO8583 servers for PayPoint, I put together a two thread pool server design that has worked so well that many of the servers that I develop today still use variations on the original design. The main idea behind the design was that the threads that do the network I/O should never block and the threads that do the user work can block if they like. This work was done before Windows Vista came along with its changes to what happens to overlapped I/O that is still pending when the thread that issued it exits (see here), so the I/O threads were not allowed to exit. However, to handle peaks and troughs in demand, and operations that could block for various lengths of time (due to database access), it was useful to be able to expand and contract the thread pool that did the actual work. This led to a design where we had a fixed sized pool of I/O threads and a variable sized pool of “business logic” threads. Dispatch to the business logic pool was via a thread safe queue (built using an I/O completion port) and the dispatch was two stage so that the dispatcher could determine when it needed to expand the pool to deal with more work. Due to the way the intra-pool dispatch worked it was easy to instrument the server using performance counters so that support staff could easily visualise how heavily loaded the server was.

WASP uses a variation on this design.

WASP has a fixed sized I/O pool which, by default, is used for all of the work that the server does. You can configure the number of threads in the I/O pool by using the <IOPool> node in the config file.

<?xml version="1.0" encoding="Windows-1252"?>
<Configuration>
  <WASP>
    <IOPool
      NumThreads="4">
    </IOPool>
    <TCP>
      <Endpoints>
        <EndPoint
          Name="Packet Echo Server"
          Port="5050"
          FramingDLL="[CONFIG]\PacketEchoServer.dll"
          HandlerDLL="[CONFIG]\PacketEchoServer.dll">
        </EndPoint>
      </Endpoints>
    </TCP>
  </WASP>
</Configuration>

You should only increase this value if your performance monitoring shows that the I/O threads that you have are always busy. If you don’t specify how many I/O threads to use then WASP will use 2. If you specify 0 then WASP will use 2 x the number of CPU cores that the machine has; whilst this was a reasonable option back when machines had 1 or 2 cores, it’s far less sensible with machines of more than 2 cores. In general 2-4 I/O threads should always be ample and the fewer you can get away with the better.
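As a small illustration of the sizing rule, the fragment below asks WASP to size the I/O pool from the core count by specifying 0. Only the <IOPool> node is shown; the rest of the file is assumed to be the same as in the full configuration. On a 4 core machine this would give 8 I/O threads, which is usually more than you need.

```xml
<!-- Fragment only: sits inside <Configuration><WASP> as in the full example. -->
<!-- NumThreads="0" asks for 2 x the number of CPU cores. -->
<IOPool
  NumThreads="0">
</IOPool>
```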

The configuration above does all work on the I/O threads. You can specify that a two pool design is used by adding a <ThreadPool> node to the configuration.

<?xml version="1.0" encoding="Windows-1252"?>
<Configuration>
  <WASP>
    <IOPool
      NumThreads="4">
    </IOPool>
    <ThreadPool
      NumThreads="4">
    </ThreadPool>
    <TCP>
      <Endpoints>
        <EndPoint
          Name="Packet Echo Server"
          Port="5050"
          FramingDLL="[CONFIG]\PacketEchoServer.dll"
          HandlerDLL="[CONFIG]\PacketEchoServer.dll">
        </EndPoint>
      </Endpoints>
    </TCP>
  </WASP>
</Configuration>

This configuration performs message framing and I/O on the I/O pool and message handling on the business logic threads. The business logic thread pool here is a fixed size, set by NumThreads, which is useful if each item of work takes a small amount of time. A work queue sits between the two pools and the state of all of these objects can be monitored with perfmon using WASP’s performance counters.

The best way to see how this all works is to use this example plugin which is identical to the packet echo server plugin that we developed in a previous tutorial except for the fact that each piece of work takes a random amount of time to perform. You can see how the business logic queue feeds the thread pool by watching the performance counters whilst running tests with differing numbers of clients.

<?xml version="1.0" encoding="Windows-1252"?>
<Configuration>
  <WASP>
    <IOPool
      NumThreads="4">
    </IOPool>
    <ThreadPool
      InitialThreads="4"
      MinThreads="4"
      MaxThreads="8"
      MaxDormantThreads="4"
      PoolMaintPeriod="5000"
      DispatchTimeout="100">
    </ThreadPool>
    <TCP>
      <Endpoints>
        <EndPoint
          Name="Packet Echo Server"
          Port="5050"
          FramingDLL="[CONFIG]\PacketEchoServer.dll"
          HandlerDLL="[CONFIG]\PacketEchoServer.dll">
        </EndPoint>
      </Endpoints>
    </TCP>
  </WASP>
</Configuration>

The configuration above uses a dynamic business logic thread pool. The pool will have a variable number of threads between MinThreads and MaxThreads depending on load. If a work item takes more than 100ms (DispatchTimeout) to process and we’re not currently running at the maximum number of threads then a new thread is started to deal with the work item. If we have more than 4 threads dormant (MaxDormantThreads), i.e. not currently in use, then a thread is stopped. The pool is checked every 5000ms (PoolMaintPeriod) to see if threads are dormant and can be shut down.

The dynamic pool is useful if you have some operations that take considerably longer than others but are not that common, or if each thread that processes messages is expensive in terms of resources. You need to tune the pool carefully to get the best results from it, but generally it’s good if you are hitting a database every so often and you want to operate with the fewest threads possible when you can.

An additional configuration parameter for both the <IOPool> and the <ThreadPool> nodes is the COMThreadingModel to use on the threads. This can be MTA, STA or blank. If it is blank, COM is not initialised on that pool’s threads.
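As a sketch of how this might look, the fragment below assumes that COMThreadingModel is supplied as an attribute on the pool node, alongside NumThreads. Only the parameter name and the MTA, STA and blank values are described here; the attribute placement is an assumption, so check it against your version of WASP.

```xml
<!-- Fragment only: sits inside <Configuration><WASP> as in the full examples. -->
<!-- Assumption: COMThreadingModel is written as an attribute on each pool node. -->
<IOPool
  NumThreads="4"
  COMThreadingModel="">
</IOPool>
<ThreadPool
  NumThreads="4"
  COMThreadingModel="MTA">
</ThreadPool>
```

With these settings the I/O threads would not initialise COM at all, whilst the business logic threads would be initialised for the MTA, which is the sort of split you might want if only your message handlers call COM objects.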