One Million TCP Connections...

We get the software and then we hold the company to ransom for .... ONE MILLION TCP CONNECTIONS!
It seems that C10K is old hat these days and that people are aiming a little higher. I've seen several questions on StackOverflow.com (and had an equal number of direct emails) asking how people can achieve one million active TCP connections on a single Windows Server box. Or, in a more roundabout way, what the theoretical maximum number of active TCP connections is that a Windows Server box can handle.

I always respond in the same way; basically I think the question is misguided and even if a definitive answer were possible it wouldn't actually be that useful.

What the people asking these questions seem to ignore is that with a modern Windows Server operating system, a reasonable amount of RAM and a decent enough set of CPUs, it's likely that it's YOUR software that is the limiting factor in the equation.
Even assuming that the chief networking wonks at Microsoft could determine the maximum number of 'active' TCP connections for a given hardware spec, I believe that a) that number would be specific to the particular service pack of the operating system AND the network drivers, and b) it would also depend heavily on how you designed the specific piece of server software that is going to service these connections. How much non-paged pool are YOU using? How many pages does your server software have locked in memory for I/O at any one time? If you don't include your software in the calculation then the number is pretty meaningless.

One thing to be sure of is that you won't get more than 16 million concurrent connections, as that seems to be the maximum value that can be set in the registry for configuring the TCP stack (see here). In reality you won't get anywhere near that figure due to all of the other limits, most of which are less well documented and possibly implicit.

On earlier Windows operating systems non-paged pool exhaustion was a real problem, as the amount of non-paged pool memory was highly constrained and increased very slowly in relation to the physical RAM present. Vista x64 and later fix that (see here), but there are still other limits. The I/O page lock limit, for example, limits the number of memory pages that can be locked for I/O at any one time, and whilst this can be tuned to high values you still need to know how your application uses memory for I/O to know what you should set this value to.

Both of these are memory related limitations, but then you have the CPU. We'll assume for a moment that you have designed your application to use I/O completion ports and you're following the rules that I laid out in my earlier blog post about the C10K problem. Oh, have we just included your server software in our calculations?

You see, the problem with asking these kinds of questions is that the answers aren't meaningful. What will you do once you know that you can support 1 million active TCP connections on a single Windows Server 2008 R2 box? What does that fact allow you to do that you couldn't do before? It doesn't tell you what a server running your server software can support, because your software can be arbitrarily complex and use an unknown (and probably equally unmeasurable) amount of resources per connection. Are you looking for a big stick to hit your programmers with? "Microsoft says that this spec box will support 500,000 concurrent TCP connections, why are we only achieving 100,000?" I'm sorry, but the answer to that is the same as if you removed the "Microsoft says" part; we need to profile the server to see... So, since you need to profile the server to answer your questions, knowing the theoretical maximum doesn't really help. I stand by my earlier blog post on concurrent connection testing. You need to do this kind of testing with your real server software from Day 0. There is no simple answer.
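As a rough illustration of what a concurrent connection test can look like, here is a minimal sketch in Python. It is not the test tool from my earlier posts; the names handle_echo, open_and_check and count_concurrent are made up for this example, and the built-in echo handler is only a stand-in so that the sketch runs on its own. In a real test you would point the client side at your real server software, on a separate machine, and keep raising the target.

```python
import asyncio


async def handle_echo(reader, writer):
    """Stand-in server logic (an assumption): echo each line back."""
    try:
        while data := await reader.readline():
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def open_and_check(host, port, i):
    """Open one connection, verify one round trip and leave it open."""
    reader, writer = await asyncio.open_connection(host, port)
    msg = f"ping {i}\n".encode()
    writer.write(msg)
    await writer.drain()
    ok = (await reader.readline()) == msg
    return ok, writer


async def count_concurrent(target):
    """Try to hold `target` connections open at once; return how many worked."""
    server = await asyncio.start_server(
        handle_echo, "127.0.0.1", 0, backlog=target  # port 0: OS picks a port
    )
    host, port = server.sockets[0].getsockname()[:2]

    results = await asyncio.gather(
        *(open_and_check(host, port, i) for i in range(target)),
        return_exceptions=True,
    )
    # Every connection that did not raise is still open at this point.
    opened = [r for r in results if not isinstance(r, BaseException)]
    count = sum(1 for ok, _ in opened if ok)

    for _, writer in opened:
        writer.close()
    await asyncio.gather(
        *(w.wait_closed() for _, w in opened), return_exceptions=True
    )
    server.close()
    return count


if __name__ == "__main__":
    print(asyncio.run(count_concurrent(100)))
```

The number this prints says something about this toy echo handler on this machine and nothing about your server, which is rather the point of the post.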

Given that you're building scalability tests from the start, you'll also learn about the other scalability limits as you grow your server. How are you going to route all of those connections to your server machine or machines? What is your strategy for downtime and server maintenance? Etc.

So, there is a definitive answer to the questions posed at the start of this posting:

Q - "How many active TCP connections can a Windows Server 2008 R2 box support?" A - More than the server software that you're running on it.

Q - "Can a machine of a given spec running Windows Server 2003 support 1 million concurrent TCP connections?" A - Only if your server software can also support that number of connections...   

Separating the hardware and OS part of the question from your custom developed server solution is not meaningful.

6 Comments

Aren't you limited anyway by the number of local ports available in your server? I thought that a single box could only handle less than 65535 concurrent TCP connections. Am I right?

No, that's a common misconception. You're limited by the number of local ports only when making outbound connections, as each outbound connection consumes a local port and, as you point out, there are at most 65535 of them. Once you take into account the ports already in use for other services and any connections currently in TIME_WAIT, the number of ports available for outbound connections is usually at most 50k.

Inbound ports are identified by a tuple that consists of the local ip and port and the remote ip and port and so are not limited in the same way. I've run tests whereby a simple server on very modest hardware supported more than 70,000 concurrent active connections - the test server and client that I used can be found here: http://www.lenholgate.com/archives/000568.html
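To make the tuple point concrete, here is a small sketch in Python (the function name accept_many is made up for this example, and it runs over loopback purely for convenience). It accepts several connections on one listening socket and records both ends of each; every accepted connection reports the same local address and port, and only the remote end differs.

```python
import socket


def accept_many(n):
    """Accept n loopback connections on ONE listening port.

    Returns the listening address and, for each accepted connection,
    a (local_address, remote_address) pair.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(n)
    server_addr = listener.getsockname()

    clients, accepted, tuples = [], [], []
    try:
        for _ in range(n):
            # Each client consumes its own ephemeral local port...
            clients.append(socket.create_connection(server_addr))
            conn, remote = listener.accept()
            accepted.append(conn)
            # ...but on the server side the local port never changes.
            tuples.append((conn.getsockname(), remote))
    finally:
        for s in clients + accepted + [listener]:
            s.close()
    return server_addr, tuples


if __name__ == "__main__":
    addr, pairs = accept_many(5)
    print("listening on", addr)
    for local, remote in pairs:
        print(local, "<-", remote)
```

Because both test ends share one machine here, the clients do use up local ports; with clients on other machines the server itself spends no extra ports on inbound connections.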

I have to completely disagree with this article. Knowing what the upper limit is on particular hardware is essential to understanding what the performance you are measuring means.

For example, say your application can comfortably handle 64,000 connections on a particular OS with a particular CPU and RAM size. If you know that a trivial "echo" server can handle around 85,000 connections on your OS with roughly comparable hardware, that tells you something completely different than if a trivial echo server can handle 1,000,000.

In other words, knowing the limits of the OS and hardware gives you the scale that makes your own measurements meaningful.

Say you don't particularly care what OS you use, but you know that you need to make a very light TCP application that has to handle 75,000 concurrent TCP connections with 4GB of RAM. Knowing that one OS can handle 100,000 such connections with a trivial server and another can handle 50,000 will allow you to rule out the latter OS entirely, which could save you wasted days porting and tuning where failure was a certainty from the beginning.

Very informative article. May I ask how you tested 70,000 concurrent active connections? Since we can open a maximum of 65535 ephemeral ports at a time, did you use two machines to generate the load on the server? (I was wondering if there is any way to generate millions of client connections from a single machine.)

David,

Given the way you've phrased your reply I can assume that you're not the kind of person who asks the question that this article was aimed at...

However, my point is that any limits from the OS and hardware are meaningless unless your "trivial echo server" is also included in the test.

Different APIs on the OS may perform differently and scale differently. It's pointless looking at headline figures for Windows Server 2012 with a specific hardware configuration (which would likely be based on IOCP and/or RIO, as they'd give the best numbers) if you're going to be using WSAAsyncSelect. So you'd need OS/hardware and API usage figures for these to be meaningful. You then get to a point where an API is more scalable if used in one particular way, and so you'd need trivial example servers with each set of OS/hardware/API numbers... And now you're factoring in your application...

Your desire for bare OS/hardware numbers may be more useful for embedded systems, but for something like Windows Server 2012 I stand by my view that it's just not a meaningful figure unless you factor in what your code is doing and how it's doing it.

Yes, I used multiple client machines during the testing.

You could establish more connections from a "single machine" if you used multiple virtual machines or if the "single machine" had multiple NICs and you bound your outbound sockets to a specific NIC. You'd then get around Y x 65535 where Y is the number of NICs that you have.
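A minimal sketch of the binding step, in Python, in case it helps. The helper names connect_from and outbound_ceiling are made up for this example, and the 50,000 usable ports per address is just the rough figure from my earlier reply, not a measured value.

```python
import socket


def connect_from(local_ip, server_addr):
    """Connect to server_addr with the outbound socket bound to local_ip.

    source_address=(local_ip, 0) fixes the local IP address and leaves
    the choice of ephemeral local port to the OS.
    """
    return socket.create_connection(server_addr, source_address=(local_ip, 0))


def outbound_ceiling(local_addresses, usable_ports_per_address):
    """Rough upper bound on outbound connections to one server address and port."""
    return local_addresses * usable_ports_per_address


if __name__ == "__main__":
    # Loopback demonstration only; a real test would pass the address of
    # each NIC in turn and the address of the server under test.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = connect_from("127.0.0.1", listener.getsockname())
    print("connected from", client.getsockname())
    client.close()
    listener.close()
    print("4 addresses at ~50k usable ports each:", outbound_ceiling(4, 50_000))
```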

Why would you want to do this outside of testing though?
