A Hundred Thousand Threads

A lot of the work done by middle-tier web services is requesting data from other services and waiting for a response.

Initially there was a single thread that processed each message one at a time and each request sequentially. As a speed freak I naturally wanted the app to be web scale. Java is most decidedly not web scale out of the box, so I knew it was time to roll up my sleeves and get to work.

Java developers typically use a sophisticated built-in thread pool framework called ExecutorService to write concurrent programs. Performing concurrent tasks might look something like this:

int nThreads = Runtime.getRuntime().availableProcessors();
ExecutorService threadPool = Executors.newFixedThreadPool(nThreads);

for (int i = 0; i < nThreads; i++) {
    threadPool.submit(() -> {
        System.out.println("some unit of work");
    });
}
threadPool.shutdown();

If the "some unit of work" done by each thread is waiting 2 seconds for a response, then even with a hundred threads executing concurrently a single-core CPU is barely utilized. The bible of concurrency, Java Concurrency in Practice, tells us that the optimal pool size is:

Ncpu = Number of CPUs
Ucpu = target CPU utilization, 0 <= Ucpu <= 1
W / C = ratio of wait time to compute time.

Optimal Thread pool size = Ncpu * Ucpu * (1 + W / C)

So on an 8-core CPU in a virtual machine completely dedicated to running the service, the optimal number of threads for my project is:

Ncpu = 8
Ucpu = 1
W / C = 0.99 / 0.01 = 99

Optimal Thread pool size = 8 * 1 * (1 + 99) = 800
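The arithmetic above can be sketched directly (the class and method names here are mine, not from the original; Math.round guards against floating-point truncation when the ratio is computed from decimals like 0.99 / 0.01):

```java
public class PoolSize {
    // Optimal pool size per Java Concurrency in Practice:
    // Nthreads = Ncpu * Ucpu * (1 + W/C)
    static int optimalPoolSize(int nCpu, double uCpu, double waitTime, double computeTime) {
        return (int) Math.round(nCpu * uCpu * (1 + waitTime / computeTime));
    }

    public static void main(String[] args) {
        // The article's numbers: 8 cores, full utilization, 99:1 wait/compute ratio
        System.out.println(optimalPoolSize(8, 1.0, 0.99, 0.01)); // prints 800
    }
}
```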

Common sense told me that 800 threads is a suspiciously large number. At what point between 0 and 800 threads does context switching begin to degrade performance? Can an application even support 800 threads?

Finding out how many threads an app supports is easy: keep creating threads until the program crashes. This gem prints out exactly that:

public class ThreadCounter {
  public static void main(String[] args) {
    System.out.println("max number threads = " + maxThreads());
  }

  static int maxThreads() {
    int maxNumberThreads = 0;
    try {
      while (true) {
        // Park each new thread indefinitely so it stays alive and counts.
        new Thread(() -> {
          try { Thread.sleep(Long.MAX_VALUE); } catch (InterruptedException ignored) {}
        }).start();
        maxNumberThreads++;
      }
    } catch (Throwable t) {
      return maxNumberThreads;
    }
  }
}

Turns out that on a MacBook the number is about 2100 threads. Most OSes can be configured to handle any number of threads so long as the process doesn't run out of memory (each thread consumes at least 0.5 MB for its stack alone). The next question is: does performance deteriorate before 2100? A common means of emulating blocking, non-CPU-bound IO is to put a thread to sleep for a while. What I discovered was that the JVM could handle as many sleeping threads as I could throw at it.
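A minimal sketch of that kind of sleep-based emulation (the class and method names are mine): each task blocks for a fixed interval, and a latch measures how long the whole batch takes. With all threads sleeping concurrently, the batch finishes in roughly one sleep interval rather than nThreads times that.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SleepLoad {
    // Submit nThreads tasks that each block (sleep) for sleepMs,
    // and return how long the whole batch takes in milliseconds.
    static long runBatch(int nThreads, long sleepMs) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        CountDownLatch done = new CountDownLatch(nThreads);
        long start = System.nanoTime();
        for (int i = 0; i < nThreads; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(sleepMs); // stand-in for a blocking API call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        // 100 threads each sleeping 200 ms complete together in
        // roughly 200 ms, not 100 * 200 ms.
        System.out.println("elapsed ms = " + runBatch(100, 200));
    }
}
```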

Mocking the API calls with Thread.sleep() wasn't getting anywhere. So a test server was created that received and responded to the REST requests that the service being built sent out. On an AWS t2.micro instance, the number of messages the app was consuming per second began to drop at 125 threads. On an m3.medium instance it topped out around 500 threads. One of the biggest discoveries of this effort was that AWS EC2 instances are unreliable: performance varied significantly from one test run to another. The second discovery was that creating and destroying TCP connections for each API call over a PoolingClientConnectionManager is computationally expensive. When trying to mock an API call, Thread.sleep() is not a legitimate approximation. If it were, a t2.micro would have performed on par with an m3.medium, since both can handle thousands of threads.
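The remedy for that connection churn is to build the HTTP client (and its underlying connection pool) once and share it across all worker threads, rather than per request. A minimal sketch using the JDK's built-in java.net.http.HttpClient (Java 11+) rather than the Apache client named above; the URL is a placeholder:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class SharedClient {
    // One client instance for the whole service: it maintains its own
    // connection pool, so worker threads reuse warm TCP connections
    // instead of paying the handshake cost on every API call.
    static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))
            .build();

    static HttpRequest request(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = request("http://example.com/api"); // placeholder URL
        System.out.println(req.uri());
    }
}
```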

After gathering some metrics, I settled on 100 threads because it was a reliable number for a weak EC2 instance. The production instance can probably handle 500+ threads, but 100 yields consistent performance. If more horsepower is needed, another instance of the service can be spun up in an autoscaling group within a couple of minutes.





Source: Carl Martensen