
John Cairns

John is an engineer, architect and mentor who focuses on extremely performance-sensitive and ultra-high-volume applications.


With the release of Log4j 2.7, we are happy to announce that Conversant Disruptor is now an available queue strategy for the AsyncAppender. Log4j’s own performance tests show that Conversant Disruptor outperforms all other available queue strategies.
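As a sketch of what selecting the new queue strategy looks like, assuming a Log4j 2.7+ configuration: the AsyncAppender accepts a blocking-queue factory element, and the appender names and file path below are illustrative, not prescribed:

```xml
<Configuration status="warn">
  <Appenders>
    <RandomAccessFile name="File" fileName="logs/app.log">
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    </RandomAccessFile>
    <!-- AsyncAppender wrapping the file appender; the nested element
         selects Conversant Disruptor as the queue strategy. -->
    <Async name="Async">
      <AppenderRef ref="File"/>
      <DisruptorBlockingQueue/>
    </Async>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="Async"/>
    </Root>
  </Loggers>
</Configuration>
```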

Executive Summary

Log4j’s AsyncAppenderLog4j2Benchmark is useful for comparing the throughput of various queuing strategies. I ran the JMH benchmark on several latest-generation Intel Xeon (Haswell) machines.

In all tests Conversant Disruptor outperforms other available strategies. Typical applications can use Conversant Disruptor and expect exceptional performance in a wide variety of circumstances.

Experiments

Results from AsyncAppenderLog4j2Benchmark.throughputSimple:

  1. Xeon 2 cpu, with taskset 1 core vs 1 core:

    This is the best-case scenario not just for Conversant Disruptor but for any queue strategy; all strategies perform better here than in the other setups. Conversant Disruptor consistently shows an improvement over the other approaches.
  2. Xeon 2 cpus, 12 cores, taskset 2 cores vs 2 cores:

    This is a typical server-side setup with a large number of cores and two cores dedicated to log handling. Conversant Disruptor performs well across the board; JCTools MPSC makes a good showing as well.
  3. Xeon 1 cpu, 12 cores:

    In this scenario, taskset was not used to assign cores; instead, numactl was used to schedule the benchmark on cpu 1. The data in this experiment has high error bars and is provided only for comparison and to demonstrate consistency with experiments 1 and 2. Typical applications that do not use taskset can still depend on the high-throughput characteristics of Conversant Disruptor.
  4. “Real World,” similar to #2, but using the RandomAccessFile Appender rather than NoOp Appender, 8 threads:

    In this case the performance is constrained by file I/O but Conversant Disruptor still gives an advantage over other strategies.
Strategy                        JMH Throughput
Conversant Disruptor                    490709
JCTools MPSC                            403534
ArrayBlockingQueue (default)            216281
LinkedTransferQueue                     246532

Throughput higher than what is reported in measurement 4 may not be achievable in a real-world configuration; however, using a high-performance queue can still lower an application’s CPU overhead and latency.
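The shape of these benchmarks — many producer threads publishing log events through a shared queue while a single consumer drains it — can be sketched in plain Java. This is an illustrative multi-producer/single-consumer sketch, not Log4j’s code: ArrayBlockingQueue stands in for any BlockingQueue strategy, and swapping in Conversant Disruptor would change only the queue’s construction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class MpscSketch {
    static final String POISON = "POISON";

    // Runs `producers` producer threads, each publishing `perProducer` events
    // plus a poison pill; a single consumer drains the shared queue.
    // Returns the number of real events the consumer saw.
    static int run(int producers, int perProducer) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        AtomicInteger consumed = new AtomicInteger();

        // Single consumer: drains events until every producer's pill arrives.
        Thread consumer = new Thread(() -> {
            int poisons = 0;
            try {
                while (poisons < producers) {
                    String event = queue.take();   // blocks when the queue is empty
                    if (POISON.equals(event)) poisons++;
                    else consumed.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Many producers, as in the logging benchmark.
        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            Thread t = new Thread(() -> {
                try {
                    for (int i = 0; i < perProducer; i++) {
                        queue.put("event");        // blocks when the queue is full
                    }
                    queue.put(POISON);             // last item from this producer
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) t.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(8, 10_000));  // 8 producers, as in experiment 4: prints 80000
    }
}
```

Because each producer’s poison pill is enqueued after all of its events and the queue is FIFO, the consumer is guaranteed to have seen every event once all pills have arrived.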

Background

I carried out these tests on 2016 Intel Xeon (Haswell) boxes:

/proc/cpuinfo for these cpus gives: model name: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

The boxes are configured with two different base settings: one set up for highest performance and one for general throughput. All of the Dell 430 servers used in testing have 2 physical cpus and 64 GB of memory, with hyperthreading and powersaving disabled. The two configurations are as follows:

  1. 33x clock multiplier, 3.3 GHz, 2 cores per cpu, highest clock rate
  2. 29x clock multiplier, 2.9 GHz, 12 cores per cpu, typical enterprise server

Experiment 1 was carried out with hardware setup 1, and experiments 2, 3 and 4 were carried out on hardware setup 2.

In experiment 1, Conversant Disruptor outperforms across a wide variety of threading scenarios. The reason is simple: Conversant Disruptor is ideally suited to a small number of physical cpus and cores. In this test, increasing the number of threads penalizes performance as thread contention begins to dominate throughput.

In experiment 2, Conversant Disruptor generally outperforms the other implementations. JCTools MPSC is on par with Conversant Disruptor in some threading scenarios. However, as a multi-producer, single-consumer queue, MPSC offers a lower level of concurrency protection than Conversant Disruptor. JCTools does not offer a robust BlockingQueue api, nor does it use best-practice coding techniques. As with LMAX Disruptor, its reliance on ‘sun.misc.Unsafe’ may mean that MPSC stops working at some future date.

Neither JDK-based queue strategy is on par with the Disruptor-like strategies.

Finally, the “real world” test demonstrates that even when file I/O is the constraint, Conversant Disruptor gives a notable improvement in throughput. This is a significant finding: the benefits of Conversant Disruptor go far beyond the theoretical.

When used in log4j, Conversant Disruptor greatly improves performance in a wide variety of use cases. Moreover, Conversant Disruptor outperforms the competition by providing the best performance with the fewest caveats and sacrifices. Finally, Conversant Disruptor is a native Java implementation of the BlockingQueue interface. Conversant Disruptor does not resort to ‘sun.misc.Unsafe’, nor does it require specialized tuning or API customization out of the box. Conversant Disruptor is simply the fastest, safest, and most stable BlockingQueue available in log4j. It is available on GitHub.
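Because Conversant Disruptor implements the standard java.util.concurrent.BlockingQueue interface, adopting it in existing code is a one-line change at the point of construction. A minimal sketch — ArrayBlockingQueue is used here only so the snippet is self-contained, and the factory method name is illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueFactory {
    // Swap point: with the Conversant Disruptor artifact on the classpath,
    // this line would instead read
    //   return new com.conversantmedia.util.concurrent.DisruptorBlockingQueue<>(capacity);
    // Everything downstream codes only against the BlockingQueue interface.
    static <T> BlockingQueue<T> newLogQueue(int capacity) {
        return new ArrayBlockingQueue<>(capacity);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = newLogQueue(1024);
        queue.put("log event");            // blocks if the queue is full
        System.out.println(queue.take());  // blocks until an element arrives
    }
}
```

Coding against the interface rather than a concrete queue class is what makes the strategies in the benchmark above interchangeable.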