Bug fixes Archives

Bug in multi-buffer writes in 6.7


A bug has been discovered in Release 6.7 in the code that deals with TCP socket writes that involve more than a single buffer. These 'multi-buffer writes' are writes that involves either a buffer chain or a block of data passed as a pointer and a length where the length exceeds the size of the buffer allocator that the connection is using.

The bug prevents the 'multi-buffer write' from being executed as a single atomic write at the network layer and so can cause corruption of a TCP data stream if multiple sockets are writing to the same connection concurrently.

The bug is due to the removal in 6.7 of the code required to support Windows XP. In Windows XP we needed to include sequence numbers in write operations to allow for the way we always marshalled all I/O operations from the calling thread to an I/O thread to prevent I/O cancellation due to thread termination. This write sequencing code had the side effect of also protecting 'multi-buffer writes' from being interrupted by other writes.

The fix does not require the reintroduction of write sequencing but, instead, issues a single scatter/gather style write for the entire buffer chain. This is both efficient and correct.

A related bug also affects atomicity of 'multi-buffer writes' into filter layers, such as the SSL code. Similar fixes have been applied here.

The bug is fixed in Release 6.8 which will be released later today.

Latest release of The Server Framework: 6.7

Version 6.7 of The Server Framework was released today.

This release is mainly a code simplification release which removes support for legacy compilers and operating systems. See here for more details. However, there are some breaking changes where smart buffers have replaced buffer references and this causes function signature changes. In addition there has been considerable churn in the Streaming Media Option Pack with knock on changes in the HTTP library code which needed to be made more complete to deal with serving HLS streams.
Version 6.6.5 of The Server Framework was released today.

This release is mainly a feature release with a few bug fixes.

A while back a client of mine had an issue with a TLS 1.2 connection failing during the TLS handshake. We couldn't see any issues with the code and if we only enabled TLS 1.1 on the server then the connection handshake worked just fine.

Eventually we tracked the issue down to the fact that the certificate in use had been signed with MD5 and that MD5 isn't a valid hash algorithm for TLS 1.2 and so the handshake failed when SChannel attempted to validate the certificate and found that it was unacceptable. There's a Microsoft blog posting about this problem here.

Annoyingly the error message for this is: "The client and server cannot communicate, because they do not possess a common algorithm." which is bang on the money, but doesn't help much until after you've worked out what the problem is. We were already dumping out the enabled algorithms and protocols and couldn't understand why the connection either wasn't working or wasn't downgrading to TLS 1.1.

I've now added some certificate investigation code to the point where my example SChannel servers set up their contexts. This at least allows me to warn and degrade to TLS 1.1 if the certificate is not going to work with TLS 1.2. In production systems I expect we'd just fail to start up.

   CCredentials::Data data;

   data.cCreds = 1;
   data.paCred = &pCertContext;

   // Note that for TLS 1.2 you need a certificate that is signed with SHA1 and NOT MD5.
   // If you have an MD5 certificate then TLS 1.2 will not connect and will NOT downgrade
   // to TLS 1.1 or something else...

   DWORD dataSize = 0;

   if (::CertGetCertificateContextProperty(
      ByteBuffer buffer(dataSize);

      BYTE *pData = buffer.GetBuffer();

      if (::CertGetCertificateContextProperty(
         const _tstring signHashDetails(reinterpret_cast<tchar *>(pData), dataSize);

         if (signHashDetails.find(_T("MD5")) != _tstring::npos)
            if (enabledProtocols == 0 || enabledProtocols & SP_PROT_TLS1_2)
               OutputEx(_T("Certificate is signed with: ") + signHashDetails);
               OutputEx(_T("MD5 hashes do not work with TLS 1.2 and connection downgrade does NOT occur"));
               OutputEx(_T("This results in failure to connect. Disabling TLS 1.2"));

               if (enabledProtocols)
                  enabledProtocols &= ~SP_PROT_TLS1_2;

               if (!enabledProtocols)
                  // if enabledProtocols is zero then all protocols are
                  // enabled, so we need to explicitly enable everything
                  // except TLS1.2

                  enabledProtocols = SP_PROT_SSL3TLS1;

   data.grbitEnabledProtocols = enabledProtocols;

Latest release of The Server Framework: 6.6.4

Version 6.6.4 of The Server Framework was released today.

This release is mainly a bug fix release for clients using WebSockets over SSL.

WSARecv, WSASend and thread safety

Last week I learnt something new, which is always good. Unfortunately it was that for over 15 years I'd been working under a misconception about how an API worked.

Tl;dr When using WSARecv() and WSASend() from multiple threads on a single socket you must ensure that only one thread calls into the API at a given time. Failure to do so can cause buffer corruption.

It all began with this question on StackOverflow and I dived in and gave my usual response. Unfortunately my usual response was wrong and after spending some time talking to the person who had asked the question and running their test code I realised that I've been mistaken about this for a long time.

My view has always been that if you issue multiple WSARecv() calls on a single socket then you need to be aware that the completions may be handled out of sequence by the threads that you have servicing your I/O completion port. This is purely due to thread scheduling and the actual calls to WSARecv() are thread safe in themselves. I wrote about this here back in 2002. My belief was that you could safely call WSARecv() from multiple threads on the same socket at the same time and the only problem would be resequencing the reads once you processed them. Unfortunately this is incorrect as the example code attached to the question shows.

The example code is somewhat contrived for a TCP socket in that it doesn't care about the sequencing of the read completions and it doesn't care about processing the TCP stream out of sequence. It issues multiple WSARecv() calls from multiple threads and the data being sent is simply a series of bytes where the next byte is the value of the current byte plus one and where we wrap back to zero after a 'max value' is reached.

Such a stream with a max value of 7 would look like this: 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x00, 0x01, 0x02, 0x03...

Validating such a TCP stream is as simple as taking any read that completes in any order and simply ensuring that the bytes that are contained in the buffer that has been returned follow the expected pattern. Starting from the first byte, whatever value it is, subsequent bytes must be one greater until the 'max value' is reached at which point the wrap to zero and continue to increment. Assuming my long held beliefs were true then it didn't matter how many threads were issuing WSARecv() calls on the socket at the same time, the resulting slices of the TCP stream should all be valid.

Unfortunately the test program fails to prove this and instead proves that without synchronisation around the WSARecv() call the returned stream slices can be corrupt. That is the data in the buffers returned from the read completion can include bytes that do not follow the expected pattern.

Of course the way that the test program uses the API is of limited use as I can't think of a TCP stream where it would be useful to be able to process the stream in randomly sized chunks with no need to have processed the data that comes before the current data. One of the reasons that I believed that I understood the requirements of the API was that I never used it this way. In systems where multiple reads on TCP streams were allowed I would always increment a sequence number, put the sequence number in the read buffer's meta data and then issue the WSARecv() all as an atomic step with locking around it. This made it possible to correctly sequence the read completions and also had the side effect of preventing multiple threads calling WSARecv() on a single socket at the same time. However, with UDP sockets the broken usage pattern is much more likely and I think that I may have seen the corruption on a client's system in the past - we're testing with fixed code now.

I've yet to fully investigate the WSASend() side of the problem but I'm assuming that the issue is the same. The person who asked the question on StackOverflow has seen data corruption when the receive side is protected by synchronisation and the send side isn't and I've no reason to doubt his analysis. I would like to think that calling WSASend() on one thread and WSARecv(), on another on the same socket, at the same time, is OK, but for now I'm assuming it's not and simply using a single lock per socket around both calls.

The current documentation for WSARecv() has this to say about thread safety: If you are using I/O completion ports, be aware that the order of calls made to WSARecv is also the order in which the buffers are populated. WSARecv should not be called on the same socket simultaneously from different threads, because it can result in an unpredictable buffer order. Which, IMHO, doesn't actually address this issue. It mentions the unpredictable buffer order, which is expected, but not the buffer corruption. Also the documentation for WSASend() has an identical note with "WSARecv" replaced by "WSASend", which, when you think about it, doesn't actually make sense at all. I don't remember seeing these notes when I was first looking at the documentation, but who knows (yes, I'd like a fully preserved set of all revisions to the MSDN docs so that I can diff back to what the docs said when I wrote the code ;) ). Network Programming for Microsoft Windows, Second Edition, doesn't mention any thread safety issues at all in its coverage of WSASend() and WSARecv() as far as I can see from a brief scan.

Apart from the one known case of UDP datagram corruption that may be caused by my misunderstanding I think most of our systems are pretty safe just by their design. However, there will be a fix included in the 6.6.3 release and 7.0 wont be susceptible to this problem at all due to its Activatable Object usage at the socket level.

It's always good to learn something new, more so when it challenges long held beliefs and especially when it proves those beliefs to have been incorrect.
I've just found and fixed a bug in version 6.6 of the OpenSSL Option Pack which relates to how we deal with outbound connection establishment errors. Changes in 6.6 mean that the OpenSSL 'connector' holds a reference to the socket whilst the SSL connection is active. The bug prevents connections which fail to be established from closing and causes a socket leak.

Note that this is ONLY for outbound (client) connections that you establish with The Server Framework. Inbound connections to servers are unaffected.

This fix will be included in release 6.6.1 which will be released later this month. If you need the fix sooner then please get in touch.
I've just found and fixed a bug which has been present in The Server Framework from the very beginning. The problem affects connections which are "sequenced" and which have had more than 2,147,483,647 writes performed on them. The observant amongst you will be thinking that it's a counter wrap bug and you'd be correct. The annoying thing for me is that the code in question has unit tests which explicitly test for correct operation when the sequence number wraps; the tests pass but the bug is still there.

The reason that the bug survived the unit tests is that the tests were testing the obvious place for the problem; the CInOrderBufferList class which ensures that if a series of buffers with sequence numbers in them are added to the collection then they are removed in the sequenced order. The wrap test code checks that the collection doesn't get confused when it has buffers with very high sequence numbers in it at the same time as buffers with low sequence numbers are added. The test works fine and tests the correct aspects of the class under test. Unfortunately the code that generates the sequence numbers that go into the buffers in the real code was out of sync with the code that processes these sequence numbers. The CInOrderBufferList class used the IBuffer::SequenceNumber type, a size_t, throughout, the code that generated the numbers did so using a long. The type mismatch meant that the sequence numbers generated were from 0 to 2,147,483,647 and then back to 0. Unfortunately the consuming code did not wrap from 2,147,483,647 to 0 and so expected a buffer with a sequence number of 2,147,483,648 to come along next and it never would.

The symptoms of the bug are that data stops being written to the network and memory usage increases.

I expect the code was originally OK for 32-bit platforms (where size_t and unsigned long are the same size) as I'm sure the code used to use ::InterlockedIncrement() to increment the sequence numbers in the producer and this would have treated the long as an unsigned long for increment and wrap purposes. Every x64 build has always been broken through and the x86 build was broken as far back as release 5.0

I've added a test which is run once inside the sequence number generator which ensures that the numbers being generated have the same range as the IBuffer::SequenceNumber type. This prevents the broken code from compiling (signed/unsigned mismatch) and throws and exception at run time if the code can compile but the data types are incorrect. The library's unit tests then fail due to the exception.

This fix will be included in release 6.5.8 which also includes the changes for the Windows 8/Server 2012 AcceptEx() bug. This will be released this week.
Be aware that there is a known bug in Windows 8 and all Server 2012 variants which causes AcceptEx() completions to be delayed in some situations. This was confirmed by a Microsoft representative on Microsoft Connect, see the error report ticket here. An example of how to demonstrate this bug, its likely affects and the current know causes can be found here in this Stack Overflow question.

I'm a little disappointed with the official response to this bug report. I wasn't expecting a fix to be included in the imminent release of Server 2012 but it would be nice to get more information back from Microsoft about what the actual cause of the problem is. Right now the Stack Overflow question is the only source of information about what may cause this problem and which API calls it is caused by and affects. Given that it appears to be a bug in how IOCP completions are dispatched for threads that issued I/O requests but are now in some kind of blocking state (possibly something to do with APC dispatch and the fact the blocking thread is not currently in an alertable wait state - but this is all guess work).

Note that it is unlikely that code built with The Server Framework will suffer any problems from this bug. We always suggest that you never do blocking calls on I/O threads and all of our AcceptEx() calls (except the first ones which are usually issued when the server starts up) are made from an I/O thread. As long as you are following our advice and are using a separate thread pool for potentially blocking operations then your code should run just fine even on operating systems where this bug is present.

I will be issuing a bug fix release shortly which will marshal the initial AcceptEx() calls that are made when you call StartAcceptingConnections() off of the calling thread and onto one of the I/O pool threads. This is an easy change which will require no changes to non library code and will remove any chance that these initial AcceptEx() calls could suffer from delayed completion if you happen to call an inappropriate Windows API on that thread (currently just a synchronous ReadFile() or anything that itself calls down to a synchronous ReadFile()).

Note that this bug only affects AcceptEx() completions and so only affects servers that use the CStreamSocketServerEx class.
The recent changes to 6.5.5 so that JetByteTools::Socket::CReadTimeoutStreamSocketConnectionFilter holds a socket reference when the timer is set causes problems when the socket is shutdown. The filter doesn't see the shutdown and fails to cancel the timer which means that a reference is held on the socket until the timeout expires. This delays socket closure and causes your timeout handling code to be run on a socket which has been cleanly shut down. The fix addresses this by notifying the filter chain of the shutdown.

This change will be included in 6.5.6 which is currently in testing. If you need the fix now, please get in touch.

This bug slipped through testing due to a lapse on my part. A manual step in the release process needs to be automated and made part of the build/test/release cycle that runs on my build machines. All of the servers are tested using the test clients that ship as part of the server examples and many of these test clients should have failed during testing of 6.5.5. The tools that are used as part of the build and test, such as the client tests, are built via a script and use the latest version. They are then checked in as part of the release and are used during the build/test/release cycle on the build machines. For 6.5.5 the script wasn't run and so it was tested with the 6.5.4 versions of the test clients which didn't have this problem.
Our Secretive Online Game Company client uses The Server Framework for their custom application server for the games industry. They have thousands of users who run their server on a very diverse set of hardware. This is great for us as it really helps to shake down The Server Framework. There's nothing like running your multi-threaded code on lots of different hardware to help find all of the hidden race conditions and whatever. I'm pleased that we have so few bug reports coming in from our clients. Especially knowing that our Online Game Company client has the latest code out in the field or at least in use internally on their cloud system. Unfortunately one of their clients has recently exposed a latent bug in a rarely used corner of The Server Framework.

One of the features of The Server Framework is that we track new features in various Windows operating systems so that you can take advantage of them simply by upgrading. Often you only need to adjust your Config.h file to enable powerful new features. One of these features is FILE_SKIP_COMPLETION_PORT_ON_SUCCESS, you can read more about it over on Len's blog. This allows some optimisation in thread scheduling and context switching and generally improves performance on busy servers.

Back in April 2011 we updated our FILE_SKIP_COMPLETION_PORT_ON_SUCCESS support to include protection from incompatible networking providers, see this Microsoft Knowledge Base article for details of the potential problem. In addition to the compile time support for FILE_SKIP_COMPLETION_PORT_ON_SUCCESS which can be used to turn the feature on and off in The Server Framework, we added a run time check to ensure that the machine on which the code was running did not have any incompatible networking providers installed.

Unfortunately this code wasn't tested as well as it could be and there was a bug in it which leads to problems on systems that have an incompatible networking provider installed and FILE_SKIP_COMPLETION_PORT_ON_SUCCESS support turned on. This leads to the completions for some operations being processed twice and subsequent reference counting (over-release) problems with the corresponding socket and buffer structures.

We've fixed this issue and it will be included in a 6.5.4 release which is currently in test. If you think you're suffering from the problems caused by this and need the fix immediately then please get in touch.

The offending networking provider in this case was the "AVSDA" provider which, a quick web search suggests, is part of Avira Anti virus.

Bug in CStringConverter::WtoA().

There's a bug in CStringConverter::WtoA() when the default system code page is set to a multi-byte code page. The result is that the call will fail with an exception stating that "The data area passed to a system call is too small". This was due to some naive code page handling (or lack of handling) in the code that determined the required buffer size.

This bug will be fixed in release 6.4 of The Server Framework which does not currently have a release date. If you need the fix sooner please get in touch.

Bug fix for 6.3.x CServiceManager.cpp

There's a stupidly obvious, should have been caught at numerous points during pre-release testing, bug in CServiceManager.cpp.

The code below that starts at line 158:

Should actually look like this, note the inclusion of the missing break; and the exception source correction:

What I find especially annoying about this particular bug is that:
   a) this piece of code has pretty good test coverage but we don't cover this particular case. 
   b) the black box server tests that run as part of the release process on the build machines run the service examples using the /debug switch to run them as normal executables to avoid the requiring admin rights on the build machines.
   c) it's the most common use case for this particular piece of code which means that the client that reported it to me last night is the only person using 6.3 or later with single instance services and all of the service development that I've done recently must have been multiple instance.

The fix above will be released in 6.3.3 which will occur some time next week. Hopefully this is all that it will include, but many clients have upgraded to 6.3.2 from much earlier versions and so there may be a few more issued lurking in there.  
I've been improving my pre-release testing system and now run a lock inversion detector as part of my build machine's build and test cycle for the socket server examples. This lock inversion detector can detect the potential to deadlock without the code ever needing to actually deadlock, so it's a pretty powerful tool. It has detected a lock inversion in the async connectors used by the OpenSSL, SChannel and SSPI Negotiate libraries. I've fixed these problems and there will be a 6.3.2 release next week. 

At present I haven't finished changing all of the build scripts for all of the servers to include the lock inversion detection phase into the test runs and so there may be more issues yet to be discovered.

Note that these lock inversions have not, as far as I know, ever actually caused a deadlock, but the possibility is there if the right timing occurs between concurrent read and write operations.
There's a bug in the OpenSSL stream socket filter which results in I/O buffers being leaked on both inbound and outbound data flow. This causes the memory used by the client or server to grow throughout the life of the process. The bug has been present since release 6.2 when the structure of the filtering code was redesigned.

Note that due to the fact that the filtering code was all redone at that time I expect that the same bug is likely to be present in the SChannel, SSPI and compression filters.

I've fixed the problem in the OpenSSL code and will be running this through my test harnesses over the next few days. If you want the fix now then please get in touch, if you can wait then I expect 6.3.1 will be released next week.

This shows that my pre release testing could still do with some work, automated unit tests and server stress tests can show that the code works but don't, in themselves, necessarily catch everything and of course they're only as good as the test coverage in any case. I discovered this issue whilst building some new performance tests which use perfmon to monitor the process under test and produce reports at the end. I guess I now have to extend the scope of this project so that I can monitor expected max and min values from the counters I'm monitoring - so I can see if the memory is ballooning during my automated tests...

Follow us on Twitter: @ServerFramework

About this Archive

This page is an archive of recent entries in the Bug fixes category.

Winsock Registered I/O is the previous category.

Development is the next category.

I usually write about the development of The Server Framework, a super scalable, high performance, C++, I/O Completion Port based framework for writing servers and clients on Windows platforms.

Find recent content on the main index or look in the archives to find all content.