Java Performance Tuning
Tips April 2018
Systems Resiliency: Simple Tomcat Backpressure (Page last updated March 2018, Added 2018-04-29, Author Will Tomlin, Publisher Hotels.com). Tips:
- The bulkhead pattern (prevent faults in one part of a system from taking the entire system down) and circuit breaker pattern (detect failures and prevent the application from repeating the action until it's safe to retry) help protect your service against misbehaving dependencies.
- Backpressure prevents your system from being overwhelmed: when workload demands cannot be met, it pushes back against the normal flow of requests instead of accepting them.
- If you have a queue (e.g. as typically exists at a service gateway) that holds requests for multiple back-end services, the queue can become dominated by some types of requests, causing an effective denial-of-service for other types of requests.
- Instead of queuing requests up to the maximum queue size and rejecting further requests, the service should have a concurrency limit and should immediately reject (fast fail) requests above this limit, letting the system continue to flow and avoiding a buildup of backlogged requests (the client then gets to choose to retry or use an alternate mechanism to complete its request).
- Systems will regularly get subjected to conditions that cause excessive workload demands: transient traffic spikes, deployments, service degradation, service dependency issues and so forth. Using backpressure enables your stack to shed excessive load, stop failure from cascading and improve MTTR.
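The fast-fail concurrency limit described above can be sketched with a `Semaphore`; a minimal illustration (class and method names are mine, not from the article):

```java
import java.util.concurrent.Semaphore;

// Sketch of a fast-fail concurrency limit: requests above the limit are
// rejected immediately instead of queueing up behind backlogged work.
public class ConcurrencyLimiter {
    private final Semaphore permits;

    public ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the request if a permit is free; otherwise rejects immediately. */
    public boolean tryHandle(Runnable request) {
        if (!permits.tryAcquire()) {
            return false;              // fast fail: the client can retry or fall back
        }
        try {
            request.run();
            return true;
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        ConcurrencyLimiter limiter = new ConcurrencyLimiter(2);
        boolean accepted = limiter.tryHandle(() -> { /* handle request */ });
        System.out.println(accepted);   // true: a permit was available
    }
}
```

The key difference from a bounded queue is that a rejected request costs almost nothing, so excess load is shed rather than backlogged.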
Queueing Theory in Practice: Performance Modeling for the Working Engineer (Page last updated November 2017, Added 2018-04-29, Author Eben Freeman, Publisher LISA17). Tips:
- Queueing theory lets you model your system, reason about its behaviour, and understand the system better.
- To determine scaling you can experimentally determine small-system throughput and then model the system. For a simple single-server queue, wait time = throughput × (service time)^2 / (2 × (1 − throughput × service time)).
- Improving (decreasing) service time improves everything. Halving service time will allow more than double the throughput for the same wait times.
- Variability is bad. Uniform tasks at uniform intervals are very simple to handle – so use techniques to minimize variability: batching, fast preemption (stop any one request hogging the server), timeouts, client backpressure, concurrency control.
- For load distribution (by a coordinator), choosing the least busy server optimizes performance. But choosing the least busy server has a (coordination) cost, and this is limited by the Universal Scalability Law. At low parallelism coordination makes latency more predictable, but at high parallelism coordination degrades throughput. A compromise strategy is for the coordinator to choose 2 servers at random and then pick the less busy of the two.
- Unbounded queues produce unbounded latency.
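The single-server wait-time formula above is easy to evaluate numerically; a small sketch (method and variable names are mine) showing why halving service time helps so much:

```java
// Mean wait time for a simple single-server queue, per the formula above:
// wait = throughput * serviceTime^2 / (2 * (1 - throughput * serviceTime))
public class QueueModel {
    static double waitTime(double throughput, double serviceTime) {
        double utilization = throughput * serviceTime;   // must be < 1 for a stable queue
        return throughput * serviceTime * serviceTime / (2 * (1 - utilization));
    }

    public static void main(String[] args) {
        // 80 requests/sec at 10ms service time: 80% utilized
        System.out.println(waitTime(80, 0.010));   // 0.02 (20ms mean wait)
        // Same throughput with service time halved to 5ms: 40% utilized
        System.out.println(waitTime(80, 0.005));   // ~0.00167 (under 2ms mean wait)
    }
}
```

Note the non-linearity: halving service time here cuts mean wait by more than 10x, because utilization drops too.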
Asynchronous API with CompletableFuture: Performance Tips and Tricks (Page last updated October 2017, Added 2018-04-29, Author Sergey Kuksenko, Publisher Oracle). Tips:
- There is an overhead cost to transition a task from one thread to another, and the shorter the task the higher the relative overhead cost
- If you have a task that waits on a notify signal in an executor thread, you should not use the same thread pool to execute the notify tasks, as the pool could be full of waiting threads, in which case the notify tasks will never get executed.
- Avoid blocking inside CompletableFuture chains.
- then*Async() gives predictability (no blocking of the calling or completing thread) but then*() gives performance. Use then*Async(…, executor) to specify which pool executes the task, for predictability.
- You want to avoid: unnecessary extra threads; too few threads; and unnecessary transitions across threads.
- FixedThreadPool can be more efficient than CachedThreadPool because of avoiding extra thread transition overheads from the pool deciding it needs more threads.
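The advice above about controlling which pool runs each stage can be sketched as follows (the pools and stage bodies are illustrative, not from the article):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: thenApplyAsync(fn, executor) makes it predictable which pool
// runs each stage, so blocking work never lands on a computation pool.
public class AsyncPipeline {
    static int fetchValue() { return 42; }   // stands in for a blocking call

    public static void main(String[] args) {
        ExecutorService ioPool  = Executors.newFixedThreadPool(4);  // for blocking stages
        ExecutorService cpuPool = Executors.newFixedThreadPool(2);  // for computation stages

        CompletableFuture<Integer> result =
            CompletableFuture.supplyAsync(AsyncPipeline::fetchValue, ioPool)
                             .thenApplyAsync(v -> v * 2, cpuPool);  // explicit pool: predictable

        System.out.println(result.join());   // 84
        ioPool.shutdown();
        cpuPool.shutdown();
    }
}
```

A plain thenApply(v -> v * 2) would be cheaper (no thread transition) but could run on either the calling thread or the ioPool thread, whichever completes the previous stage.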
Concurrent Java: Low scalability, the risk of deadlocks or garbage creation, you can not avoid all (Page last updated March 2018, Added 2018-04-29, Author Thomas Krieger, Publisher vmlens). Tips:
- Locks lead to low scalability when you use only one, or to the risk of deadlock when you use multiple locks. Compare-and-swap operations and immutable data structures lead to extra object creation, which causes more garbage collection cycles.
- If you have a small data structure, you can use an immutable class. If your data structure is not performance critical, you can use a single lock. In all other cases, you should start with multiple locks, since that is easier to implement than compare-and-swap operations.
- Collections.synchronizedMap has only one lock, ensuring that only one thread at a time accesses the underlying map. This makes sure that threads do not see an inconsistent state and that only one thread at a time modifies the data structure, but it also leads to low scalability, since the threads must wait for each other.
- Guarding with multiple locks allows you to update different parts of a data structure concurrently, but you need to be careful about lock ordering to avoid deadlock – try to minimize the number of potential locking interactions, and follow and document a lock ordering protocol for locks that may be acquired together, and release all locks before invoking unknown code.
- Compare-and-swap checks the value of a variable and only updates it when the value is the same as the expected value. It is a very performant way to make sure that we never overwrite a value written by another thread, but it leads to extra object creation when the update fails.
- Immutable classes do not change their value in the middle of an operation and have no need for synchronized blocks which makes them efficient – you are always working with an unchangeable consistent state so you avoid race conditions. But modifying immutable state consists of copying the immutable object, so each modification leads to the creation of garbage.
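The compare-and-swap plus immutable-copy approach, and its garbage cost, can be sketched with `AtomicReference` (the class is my illustration, not code from the article):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Sketch: compare-and-swap over an immutable snapshot. Each update copies
// the map, and a failed CAS retries with a fresh copy - the retries and
// discarded copies are exactly the extra garbage the article warns about.
public class CasMap {
    private final AtomicReference<Map<String, Integer>> ref =
        new AtomicReference<>(Collections.emptyMap());

    public void put(String key, Integer value) {
        Map<String, Integer> current, updated;
        do {
            current = ref.get();
            updated = new HashMap<>(current);   // the copy is the garbage cost
            updated.put(key, value);
        } while (!ref.compareAndSet(current, updated));   // retry if another thread won
    }

    public Integer get(String key) {
        return ref.get().get(key);   // read a consistent immutable snapshot, no lock
    }

    public static void main(String[] args) {
        CasMap m = new CasMap();
        m.put("a", 1);
        System.out.println(m.get("a"));   // 1
    }
}
```

Readers never block and never see a half-updated map, which is the race-condition benefit; the price is a full copy per write, so this suits read-heavy, small structures.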
Last Updated: 2018-04-29
Copyright © 2000-2018 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Trouble with this page? Please contact us.