R2DBC provides non-blocking, reactive APIs for relational database access in Java. Like JDBC, it is an open specification. JDBC, however, dedicates a thread to each connection, while R2DBC can handle more connections using fewer threads.
In this blog I ran several tests on REST services with a database back-end. I varied
- the number of cores assigned to the load generator and the service
- the connection pool size, and with/without a connection pool for R2DBC
- the driver (JDBC or R2DBC)
- the framework (Spring, Quarkus)
and measured
- response times
- CPU usage
- memory usage
When does R2DBC outperform JDBC?
What is there to gain in theory?
Threads consume resources
Using fewer threads means
- using less memory: each thread requires its own stack
- using less CPU: fewer context switches
Thus, in theory, you get higher performance with the same resources at high concurrency.
Java threads have their own stack and thus require memory. Using fewer threads means your process will use less memory.
In Java 8, a single thread caused around 1 MB of memory to be reserved and committed (read here). In Java 11 and higher this has improved; memory allocation for threads has become less aggressive. Around 1 MB per thread is still reserved, but it is no longer directly mapped to actual RAM, meaning that RAM can be used for other things (it is only claimed when actually used), which is a definite improvement. I would expect applications that use many threads on Java 8 to benefit, in terms of memory usage, from moving to Java 11.
Having a large number of concurrent threads running also has an additional CPU cost due to context switching (read here). CPUs consist of cores, and each core can host a fixed number of hardware threads (see here), usually 2 per core when hyper-threading is used. My laptop has 6 cores, so my system can run 12 threads simultaneously. Applications, however, are not limited to 12 threads. A scheduler assigns a slice of CPU time to an application thread, and after that slice has passed, another thread gets a turn. This switch has a CPU cost. The more threads you have, the more of these switches take place. There is usually an optimum number of application threads at which the benefit of concurrency outweighs the additional CPU cost of context switches. Beyond that optimum, adding more application threads reduces overall performance.
When you can handle more connections with fewer threads, you save CPU time that would otherwise be spent on context switches.
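The idea can be sketched with plain java.util.concurrent (the class and method names below are made up for illustration): a small fixed pool can service many concurrent requests, provided the work submitted to it does not block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FewThreadsDemo {

    // Services 'requests' concurrent tasks on only two worker threads and
    // returns the number of completed results. This only works well when the
    // submitted work does not block, which is what R2DBC offers for database
    // access; with blocking JDBC calls you would need a thread per in-flight
    // request instead.
    static long handle(int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<CompletableFuture<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                // supplyAsync stands in for a non-blocking database call
                futures.add(CompletableFuture.supplyAsync(() -> id * 2, pool));
            }
            return futures.stream().map(CompletableFuture::join).count();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(100)); // 100 requests on 2 threads
    }
}
```

The point is not that 2 threads make 100 tasks faster, but that thread count and concurrency are decoupled once the work is non-blocking.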
What did I measure?
I’ve created functionally similar implementations of a service with a database backend (Postgres). Each request to a service returned 10 database records. You can find the sample implementations I used here.
- JAX-RS with RxJava, using JPA with a JDBC driver and the Hikari connection pool
- Quarkus with RESTEasy using a JDBC driver with the AgroalPool connection pool
- Quarkus with RESTEasy using an R2DBC driver with the R2DBC connection pool
- Spring Boot using JPA JDBC and Hikari connection pool
- Spring Boot using Spring REST Data with JPA, JDBC and Hikari connection pool
- Spring Boot WebFlux with Spring Data using an R2DBC driver and no connection pool
- Spring Boot WebFlux with Spring Data using an R2DBC driver and the R2DBC connection pool
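R2DBC is built on the Reactive Streams specification, which is also the model behind the JDK's java.util.concurrent.Flow API. The sketch below, with a hypothetical in-memory publisher standing in for a database driver, illustrates the non-blocking, demand-driven style in which such drivers deliver rows:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class ReactiveRowsDemo {

    // Collects 'rows' pushed asynchronously by a publisher. A real R2DBC
    // driver publishes result rows through the same Reactive Streams
    // contract, without tying up a thread while waiting for the database.
    static List<String> fetchRows(int count) throws InterruptedException {
        List<String> rows = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<String>() {
                public void onSubscribe(Flow.Subscription s) {
                    s.request(Long.MAX_VALUE); // backpressure: request all rows
                }
                public void onNext(String row) { rows.add(row); }
                public void onError(Throwable t) { done.countDown(); }
                public void onComplete() { done.countDown(); }
            });
            for (int i = 1; i <= count; i++) {
                publisher.submit("row-" + i); // stands in for a database record
            }
        } // close() signals onComplete to the subscriber
        done.await();
        return rows;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchRows(10).size());
    }
}
```

With a Reactive Streams driver, the subscriber's `request(n)` call is how backpressure is expressed: the consumer, not the producer, decides how many rows flow at a time.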
I’ve assigned 1, 2, and 4 CPUs to the service and tested with connection pool sizes of 5, 20, and 100. 100 was the maximum number of connections the Postgres database allowed (a default setting I did not change).
I compiled and ran the services on OpenJDK 11 with 2 GB of memory assigned and the G1 garbage collector. The tests did not hit the memory limit, so garbage collection was limited.
I’ve used wrk to perform HTTP benchmarking tests at concurrencies of 1, 2, 4, 10, 25, 50, 75, and 100. wrk uses CPU more efficiently at higher concurrency than, for example, Apache Bench. I also assigned 1, 2, and 4 cores to the load generator (wrk). At the start of each test I first ‘primed’ the service, so it could build up connections, create threads, and load classes, by providing full load for a single second. After that I started the actual test of 60 seconds. From the wrk output I parsed (among other things) throughput and response times. This is described in my blog post here.
I’ve measured response time, throughput, CPU usage and memory usage.
CPU usage is measured using /proc/PID/stat, which is described here. Memory is measured using /proc/PID/smaps, which is described roughly here. Private, virtual, and reserved memory did not differ much, so I mostly looked at private process memory.
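For reference, here is a minimal sketch of reading the CPU fields from /proc/PID/stat (Linux only; field positions follow the proc(5) man page, and the class name is made up for this example):

```java
import java.nio.file.Files;
import java.nio.file.Paths;

public class ProcStatReader {

    // Returns {utime, stime} in clock ticks for the given PID ("self" also
    // works). Dividing by the clock tick rate (usually 100 Hz) gives seconds
    // of user and kernel CPU time consumed by the process.
    static long[] cpuTicks(String pid) throws Exception {
        String stat = Files.readString(Paths.get("/proc/" + pid + "/stat"));
        // The comm field (field 2) may contain spaces, so parse after its
        // closing ')' to be robust.
        String rest = stat.substring(stat.lastIndexOf(')') + 2);
        String[] fields = rest.split(" ");
        // 'rest' starts at field 3 (state), so utime (field 14) is at
        // index 11 and stime (field 15) is at index 12.
        return new long[] { Long.parseLong(fields[11]), Long.parseLong(fields[12]) };
    }

    public static void main(String[] args) throws Exception {
        long[] ticks = cpuTicks("self");
        System.out.println("utime=" + ticks[0] + " stime=" + ticks[1]);
    }
}
```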
What were the results?
I’ve tested all combinations of the variables mentioned above (30 hours of tests of 60 seconds each). You can find the raw data here. For every finding below I could have shown a graph, but that would be too much information. If you want a specific question answered, I recommend loading the data into Excel yourself (it is plain CSV) and playing around with a pivot table + pivot chart (do a bit of data exploration).
Effect of the R2DBC connection pool
I tested with and without an R2DBC connection pool using Spring Boot WebFlux.
- Memory usage when using a connection pool was significantly higher than when not using a connection pool
- CPU usage when using the R2DBC connection pool was significantly higher compared to not using the pool
- The connection pool size did not matter much
- Average latency was a lot higher (around 10x) when not using a pool
- The number of requests which could be processed in 60 seconds when using a pool was a lot higher
- Assigning more or less CPUs to the service or the load generator did not change the above findings
Summary: using an R2DBC connection pool allows higher throughput and shorter response times, at the cost of higher memory and CPU consumption.
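For reference, in recent Spring Boot versions (2.3+, with io.r2dbc:r2dbc-pool on the classpath) the R2DBC pool can be switched on and sized via application.properties; the URL, credentials, and sizes below are placeholder examples:

```properties
spring.r2dbc.url=r2dbc:postgresql://localhost:5432/mydb
spring.r2dbc.username=app
spring.r2dbc.password=secret
# pool settings (require r2dbc-pool on the classpath)
spring.r2dbc.pool.enabled=true
spring.r2dbc.pool.initial-size=5
spring.r2dbc.pool.max-size=20
```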
Blocking Quarkus JDBC vs non-blocking Quarkus R2DBC
Here it became more difficult to reach general conclusions:
- JDBC with a small connection pool processed the most requests during the one-minute test
- At a concurrency of 1 (only one request running at a time), JDBC outperformed R2DBC with about 33% better response times
- There is an optimum concurrency at which R2DBC starts to outperform JDBC in the number of requests processed per minute. Above or below that concurrency, JDBC seems to do better
- When concurrency was increased, R2DBC did better with a large connection pool, while JDBC started to perform worse when the pool size was increased
- Response times of R2DBC were generally worse than those with JDBC
- JDBC took a lot more memory and CPU than R2DBC. This difference became larger at high concurrency.
Perhaps a concurrency of 100 was not enough to make R2DBC shine. R2DBC reacts differently to connection pool sizes than JDBC with respect to response times and throughput. When short on resources, consider R2DBC, since it uses less CPU and memory (likely because it uses fewer threads, or uses the available threads more efficiently).
[Graph: taken at a concurrency of 100; JDBC uses more memory than R2DBC]
Quarkus vs Spring Boot vs Spring Boot WebFlux vs JAX-RS/RxJava
Response times and throughput
- The completely blocking stack Quarkus + RESTEasy + JDBC gives the best response times and throughput at a concurrency of 100.
- When using Spring Boot, you can get best response times and throughput at high concurrency by using WebFlux with an R2DBC driver and pool. This is a completely non-blocking stack which uses Spring Data R2DBC.
- When using Quarkus, JDBC gives best performance at high concurrency. When using Spring Boot Webflux, R2DBC gives best performance at high concurrency.
- Spring Data REST performs worse than ‘normal’ Spring Boot REST services or WebFlux. This is to be expected, since Spring Data REST gives you more functionality, such as Spring HATEOAS.
- A non-blocking service with JAX-RS + RxJava and a blocking backend gives very similar performance to a completely blocking service and backend (Spring Boot JPA using JDBC).
- A statement like ‘a completely non-blocking service and backend performs better at high or low concurrency than a blocking service and backend’ cannot be made based on this data.
Summary: For best response times and throughput in Spring Boot use WebFlux + R2DBC + the R2DBC connection pool. For best response times and throughput in Quarkus use a blocking stack with JDBC.
Memory and CPU usage
- Quarkus with R2DBC uses the least memory, but Quarkus with JDBC uses the most memory at high concurrency
- Memory usage of Spring Boot at high concurrency does not differ much between JDBC and R2DBC, or between normal and WebFlux services
- When CPU is limited, Quarkus is quite efficient with both R2DBC and JDBC. Spring Boot WebFlux without an R2DBC pool, however, uses the least CPU. Spring Data REST uses the most CPU.
Of course, when you want to further reduce resource usage, you can look at native compilation of Quarkus code to reduce memory and disk space used. Spring Framework 5.3 is also expected to support native images; it is expected to be released in October 2020.
To summarize the results:
- For Quarkus
  - stick to JDBC when throughput/response times are important (even at high concurrency)
  - consider R2DBC when you want to reduce memory usage
- For Spring (Boot)
  - consider WebFlux + R2DBC + the R2DBC pool when response times, throughput, and memory usage are important
- R2DBC cannot yet be used in combination with JPA
- If you use an application server, most likely you are tied to JDBC and cannot easily switch to R2DBC
- Currently, there are only R2DBC drivers for a handful of relational databases (Postgres, MariaDB, MySQL, MS SQL Server, H2). An Oracle database driver is noticeably lacking. New versions of the Oracle driver, however, already contain extensions to make reactive access possible. Keep an eye on this thread
- Project Loom will introduce fibers to the Java language. Using fibers, the resources used to service database requests could in theory be reduced further. What will the impact of fibers be on this mix? Will R2DBC adopt fibers? Will JDBC adopt fibers (making R2DBC obsolete)? Will a new standard emerge?