JEP 143: Improve Contended Locking
Author | Dan Daugherty |
Owner | Daniel Daugherty |
Type | Feature |
Scope | Implementation |
Status | Closed / Delivered |
Release | 9 |
Component | hotspot / runtime |
Discussion | hotspot dash runtime dash dev at openjdk dot java dot net |
Effort | M |
Duration | L |
Reviewed by | Karen Kinnear |
Endorsed by | Mikael Vidstedt |
Created | 2011/11/30 20:00 |
Updated | 2017/03/06 11:34 |
Issue | 8046133 |
Summary
Improve the performance of contended Java object monitors.
Goals
Improve the overall performance of contended Java object monitors as measured by the following benchmarks and tests:
- CallTimerGrid (though more of a stress test than a benchmark)
- Dacapo-bach (was dacapo2009)
  - avrora
  - batik
  - fop
  - h2
  - luindex
  - lusearch
  - pmd
  - sunflow
  - tomcat
  - tradebeans
  - tradesoap
  - xalan
- DerbyContentionModelCounted
- HighContentionSimulator
- LockLoops-JSR166-Doug-Sept2009 (was LockLoops)
- PointBase
- SPECjbb2013-critical (was specjbb2005)
- SPECjbb2013-max
- specjvm2008
- volano29 (was volano2509)
Non-Goals
It is not a goal of this project to address any performance improvements for internal VM monitors or mutexes; Java monitors and internal VM monitors/mutexes are implemented by different code. While some of the concepts in this project might be applicable to internal VM monitors/mutexes, the code is not directly applicable.
It is not a goal of this project to improve contended Java monitor performance on every benchmark or test; in some cases there may be a performance degradation in a specific benchmark or test. That performance degradation might be considered acceptable in order to gain a performance improvement on another benchmark or test.
Success Metrics
This project will be considered a success if there are demonstrable performance gains as measured by the above benchmarks without offsetting significant performance regressions.
There must not be a non-trivial performance regression for uncontended locks.
Motivation
Improving contended locking will significantly benefit real-world applications, in addition to industry benchmarks such as Volano and DaCapo.
Description
This project will explore performance improvements in the following areas related to contended Java Monitors:
- Field reordering and cache line alignment
- Speed up PlatformEvent::unpark()
- Fast Java monitor enter operations
- Fast Java monitor exit operations
- Fast Java monitor notify/notifyAll operations
The original body of work also included changes for "faster hashcode"; since Java object hashcode support is not directly related to contended Java monitors, that work will not be included in this project.
This project will also generate fixes for various bugs discovered during the course of the work; these bug fixes will be managed independently of the performance improvement work so that the fixes can be integrated sooner.
This project is covered by the following "umbrella" bug for administrative simplicity:
JDK-6607129 Reduce L2$ coherence miss traffic in contended lock spin loop, specifically for derby on ctn-family
However, as sub-tasks or bug fixes are completed, each piece of work will be integrated under its own bug ID. This allows the entire project to be referred to via one bug ID (JDK-6607129) while allowing incremental improvements to be made available without waiting for the entire project to complete.
Testing
Functional testing
There does not appear to be a specific set of functional tests exclusively for Java monitors, nor is one necessary. Java monitors are so widely used by even the simplest of Java programs that almost any functional breakage in Java monitors should be obvious.
Stress Tests
There needs to be a set of well-known stress tests for Java monitors. These can be targeted stress tests for specific Java monitor scenarios, or tests generally known to be heavy users of Java monitors that are run with specific stress-inducing options.
Note: Use '-XX:-UseBiasedLocking -XX:+UseHeavyMonitors' to bypass both biased locking and stack-based locking; this forces the use of ObjectMonitor objects.
Field reordering and cache line alignment sub-task stress tests
Stress tests should focus on generating high numbers of active ObjectMonitor objects. The targets of the stress testing are peak ObjectMonitor usage, the ObjectMonitor block allocation algorithm, and the ObjectMonitor free list management code. The goals are:
- To have the same or better peak ObjectMonitor usage for small to medium configurations,
- To have no memory leaks, and
- To have no data-structure management failures.
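A load generator for this sub-task could look roughly like the sketch below. It relies on the assumption that, in HotSpot, Object.wait() needs an inflated monitor, so a timed wait on many distinct objects should create many ObjectMonitor instances. The class and method names are hypothetical, and the sketch only generates load and confirms that every wait completed; peak ObjectMonitor usage and leaks would have to be observed with VM-level tooling that is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; the class and method names are hypothetical.
final class MonitorInflationStress {

    // Starts `threads` workers. Each worker allocates `perThread` objects and
    // performs a short timed wait on every one of them. Returns the number of
    // waits that completed.
    static int run(int threads, int perThread) {
        int[] completed = new int[threads];
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers.add(new Thread(() -> {
                // Keep every lock object reachable until the worker finishes.
                List<Object> locks = new ArrayList<>(perThread);
                for (int i = 0; i < perThread; i++) {
                    locks.add(new Object());
                }
                try {
                    for (Object lock : locks) {
                        synchronized (lock) {
                            lock.wait(1); // nobody notifies; the wait simply times out
                        }
                        completed[id]++;
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread w : workers) {
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // join makes each worker's counter visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        int total = 0;
        for (int c : completed) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("completed waits: " + run(8, 200));
    }
}
```

Raising the thread and object counts scales the number of monitors the VM has to allocate and later recycle, which is the part of the code this sub-task targets.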
Speed up PlatformEvent::unpark() sub-task stress tests
Stress tests should focus on high numbers of concurrent waiters and/or concurrent enter-exit threads. The mix of enter-wait-exit and enter-exit threads should be configurable. The target of the stress testing is the successor mechanism.
Goal: no hangs due to lost unpark operations.
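One possible shape for such a test, sketched at the Java level, is shown below. The number of enter-wait-exit threads and enter-exit threads is configurable, and a hang is reported when any thread fails to finish before a deadline. Everything here (class name, the single-permit handoff protocol, the timeout) is an assumption made for illustration; it exercises wakeups only indirectly through wait() and notify() and does not call PlatformEvent::unpark() itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; names and protocol are hypothetical.
final class LostWakeupStress {
    private final Object lock = new Object();
    private int permits;     // guarded by lock; always 0 or 1
    private long remaining;  // guarded by lock; permits still to be consumed

    // Returns true if every thread finished before the deadline, false if a
    // thread appears to be hung. Needs at least one enter-exit thread whenever
    // there are waiter threads, otherwise nobody ever issues a permit.
    static boolean run(int waiterThreads, int enterExitThreads,
                       int roundsPerWaiter, long timeoutMillis) {
        LostWakeupStress s = new LostWakeupStress();
        s.remaining = (long) waiterThreads * roundsPerWaiter;
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < waiterThreads; i++) {
            threads.add(new Thread(() -> s.enterWaitExit(roundsPerWaiter)));
        }
        for (int i = 0; i < enterExitThreads; i++) {
            threads.add(new Thread(s::enterExit));
        }
        for (Thread t : threads) {
            t.setDaemon(true); // a hung thread must not keep the JVM alive
            t.start();
        }
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            for (Thread t : threads) {
                long leftMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                t.join(Math.max(1, leftMs)); // join(0) would wait forever
                if (t.isAlive()) {
                    return false;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return true;
    }

    // Enter-wait-exit thread: blocks until a permit is available, consumes it.
    private void enterWaitExit(int rounds) {
        try {
            for (int i = 0; i < rounds; i++) {
                synchronized (lock) {
                    while (permits == 0) {
                        lock.wait();
                    }
                    permits--;
                    remaining--;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Enter-exit thread: contends for the monitor and issues one permit at a
    // time, waking a single waiter, until all permits have been consumed.
    private void enterExit() {
        while (true) {
            synchronized (lock) {
                if (remaining == 0) {
                    return;
                }
                if (permits == 0) {
                    permits = 1;
                    lock.notify();
                }
            }
            Thread.yield();
        }
    }

    public static void main(String[] args) {
        System.out.println("finished without hang: " + run(8, 4, 5_000, 60_000));
    }
}
```

Capping the permit count at one forces waiters to block repeatedly, so nearly every round depends on a wakeup being delivered. The protocol itself tolerates barging and spurious wakeups, so a timeout points at the wakeup path rather than at the test logic.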
Fast Java monitor enter operations sub-task stress tests
Stress tests should focus on correctness of enter-exit operations with a scalable number of parallel threads. The target of the stress testing is Java monitor ownership.
Goal: No ownership conflicts where more than one thread thinks it owns the Java monitor.
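A minimal sketch of an ownership check follows. Each thread marks itself as "inside" with an atomic counter that is independent of the monitor under test; if the counter ever exceeds one, two threads believed they owned the monitor at the same time. The class name and parameters are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; the class and method names are hypothetical.
final class MonitorOwnershipStress {

    // Runs `threads` workers that each enter and exit the same monitor
    // `iterations` times. Returns the number of ownership conflicts seen.
    static int run(int threads, int iterations) {
        Object lock = new Object();
        AtomicInteger insiders = new AtomicInteger();  // threads inside the block
        AtomicInteger conflicts = new AtomicInteger(); // times insiders exceeded 1

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    synchronized (lock) {
                        if (insiders.incrementAndGet() != 1) {
                            conflicts.incrementAndGet();
                        }
                        insiders.decrementAndGet();
                    }
                }
            });
        }
        for (Thread w : workers) {
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return conflicts.get();
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors() * 2;
        System.out.println("ownership conflicts: " + run(threads, 100_000));
    }
}
```

The thread count is the scaling knob; the atomics are used so that conflict detection does not rely on the memory effects of the monitor being tested.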
Fast Java monitor exit operations sub-task stress tests
Should be covered by the stress tests for the "speed up PlatformEvent::unpark()" and "fast Java monitor enter operations" sub-tasks.
Fast Java monitor notify/notifyAll operations sub-task stress tests
Stress tests should focus on correctness of enter-wait-exit operations with a scalable number of parallel threads. The target of the stress testing is Java monitor ownership after wait() completes and the Java monitor is re-entered.
Goal: No ownership conflicts where more than one thread thinks it owns the Java monitor.
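The same counter-based check can be applied at the point where wait() returns, as in the sketch below. Waiters use a short timed wait so that the test cannot hang by construction, while one extra thread alternates between notify() and notifyAll(). All names are hypothetical; Thread.holdsLock() is the standard JDK method for asking whether the current thread owns a monitor.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; the class and method names are hypothetical.
final class WaitReentryStress {

    // Records that the current thread believes it owns `lock`. Counts a
    // conflict if it does not hold the monitor or another thread is inside.
    private static void claim(Object lock, AtomicInteger insiders, AtomicInteger conflicts) {
        boolean held = Thread.holdsLock(lock);
        int inside = insiders.incrementAndGet();
        if (!held || inside != 1) {
            conflicts.incrementAndGet();
        }
    }

    // Runs `waiters` enter-wait-exit threads for `rounds` rounds each, plus one
    // notifier thread. Returns the number of ownership conflicts seen.
    static int run(int waiters, int rounds) {
        Object lock = new Object();
        AtomicInteger insiders = new AtomicInteger();
        AtomicInteger conflicts = new AtomicInteger();
        AtomicInteger active = new AtomicInteger(waiters);

        Thread[] threads = new Thread[waiters + 1];
        for (int i = 0; i < waiters; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int n = 0; n < rounds; n++) {
                        synchronized (lock) {
                            lock.wait(1); // releases the monitor, re-enters before returning
                            claim(lock, insiders, conflicts);
                            insiders.decrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        threads[waiters] = new Thread(() -> {
            boolean single = true;
            while (active.get() > 0) {
                synchronized (lock) {
                    claim(lock, insiders, conflicts);
                    if (single) {
                        lock.notify();
                    } else {
                        lock.notifyAll();
                    }
                    insiders.decrementAndGet();
                }
                single = !single;
                Thread.yield();
            }
        });
        for (Thread t : threads) {
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return conflicts.get();
    }

    public static void main(String[] args) {
        System.out.println("ownership conflicts after wait(): " + run(8, 2_000));
    }
}
```

Because the notifier also claims the monitor while it holds it, a waiter that returned from wait() without truly re-acquiring the monitor would collide with the notifier or with another waiter and be counted.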