JEP draft: JWarmup: precompile hot Java methods at application startup
Author | Wenqian Pei |
Type | Feature |
Scope | JDK |
Status | Draft |
Component | hotspot / compiler |
Reviewed by | Tobias Hartmann, Vladimir Kozlov |
Endorsed by | Vladimir Kozlov |
Created | 2018/05/25 16:52 |
Updated | 2022/12/05 18:04 |
Issue | 8203832 |
Summary
JWarmup addresses the Java application warmup performance problem that arises when JIT compiler threads compete with application threads for CPU resources, that is, when the application request load peaks at the same time the JIT compiler kicks in with compilation tasks. By precompiling hot Java methods during warmup, JWarmup reduces the performance degradation otherwise seen at peak time.
Goals
Pre-compile hot Java methods during startup so that the application reaches peak performance sooner and uses less CPU at peak load time.
Non-Goals
It is not a goal to speed up startup. It is not a goal to have optimal generated code from the start; pre-compiled methods may be recompiled later. It is not guaranteed that all desired hot methods will be pre-compiled.
Success Metrics
In a normal run of the application, collect compilation information and record it to a file. In a subsequent run with this recorded file, compile the recorded hot methods into native code ahead of peak load time, so that these methods execute as fast native code instead of first running in the interpreter. A successful run should not throw exceptions or crash; it should behave like a normal run but with lower CPU usage.
Motivation
For a Java method to be compiled into native code, C2, the JIT server compiler, uses profile data collected on the method at runtime and decides when to compile it. For a large Java application, the load usually arrives in a large burst within a short period of time, which in turn causes many hot methods to be compiled by the JIT into fast native versions at once. The JIT threads then take many more CPU cycles for compilation tasks exactly when the application threads need them, leaving fewer CPU resources for the application. When this happens, application throughput drops and response time grows noticeably. The solution is to pre-compile the hot Java methods before the real load arrives.
Description
Enabling JWarmup involves two phases: a pre-run and a normal run. The pre-run is usually executed with heavy test load; its purpose is to record compilation information (profile data) for the hot Java methods and store it in a file on disk. In a normal run the application starts with the previously recorded file, and the JIT threads first compile the methods listed in the file into native code, so from startup on those methods execute as native code rather than in the interpreter.
The pre-run also logs the class initialization order. This data is used to avoid compilation failures caused by classes being initialized in a different order: there are dependencies between application classes, and the recorded initialization order is consulted during warmup compilation. Not all classes have such dependencies, but the recorded data helps avoid class-loading issues.
Warmup compilation starts when the application notifies the VM through an API call. It should start once most of the application classes have been loaded, and the user controls this by calling the notification API, for example JWarmUp.notifyApplicationStartUpIsDone().
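For illustration, a minimal sketch of how an application might issue this notification is shown below. Only the method name JWarmUp.notifyApplicationStartUpIsDone() comes from this JEP; the fully qualified class name used in the reflective lookup and the surrounding application methods are placeholders.

    // Minimal sketch: notify the VM that startup is done so warmup compilation
    // can begin. The location of the JWarmUp class is not specified in this
    // draft, so the call is made reflectively here; on a JWarmup-enabled JDK
    // the application could simply call JWarmUp.notifyApplicationStartUpIsDone().
    public class WarmupAwareServer {

        public static void main(String[] args) {
            loadApplicationClasses();   // placeholder: loads most application classes
            notifyWarmupIfAvailable();  // tell the VM that startup is done
            serveTraffic();             // placeholder: normal request handling
        }

        private static void notifyWarmupIfAvailable() {
            try {
                // Assumed class name; adjust to the actual package of the JWarmup build.
                Class<?> jwarmup = Class.forName("com.alibaba.jwarmup.JWarmUp");
                jwarmup.getMethod("notifyApplicationStartUpIsDone").invoke(null);
            } catch (ReflectiveOperationException e) {
                // Running on a JDK without JWarmup: continue without warmup compilation.
            }
        }

        private static void loadApplicationClasses() { /* application-specific setup */ }

        private static void serveTraffic() { /* application-specific main loop */ }
    }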
Flags:
-XX:+CompilationWarmUp enables JWarmUp
-XX:CompilationWarmUpRecordMinLevel= sets the minimal record level
-XX:+CompilationWarmUpRecording starts JWarmUp recording
-XX:CompilationWarmUpLogFile= sets the log file path
-XX:CompilationWarmUpRecordTime= sets when the recorded log is flushed to the file; the default is at VM exit
-XX:+PrintCompilationWarmUpDetail prints detailed information
Typical usage: a pre-run enables recording with -XX:+CompilationWarmUpRecording, sets the recording file with -XX:CompilationWarmUpLogFile, and sets how long to record with -XX:CompilationWarmUpRecordTime. For example, -XX:+CompilationWarmUpRecording -XX:CompilationWarmUpLogFile=jwarmup.log -XX:CompilationWarmUpRecordTime=1200 records compilation information for 1200 seconds and stores it in jwarmup.log.
With the recorded file available, running the application with -XX:+CompilationWarmUp -XX:CompilationWarmUpLogFile=jwarmup.log will pre-compile the methods recorded in the log file after startup.
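For illustration, the two phases might be run end to end as follows; app.jar and com.example.Main are placeholder names, and only the -XX flags come from this JEP.

    # Pre-run: record compilation information for 1200 seconds into jwarmup.log
    java -XX:+CompilationWarmUpRecording \
         -XX:CompilationWarmUpLogFile=jwarmup.log \
         -XX:CompilationWarmUpRecordTime=1200 \
         -cp app.jar com.example.Main

    # Normal run: pre-compile the recorded methods after startup
    java -XX:+CompilationWarmUp \
         -XX:CompilationWarmUpLogFile=jwarmup.log \
         -cp app.jar com.example.Main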
Alternatives
Similar work was done in a student's bachelor thesis at the Swiss Federal Institute of Technology, Zurich: "Integrating Profile Caching into the HotSpot Multi-Tier Compilation System".
Azul has ReadyNow! technology, which also reuses compilation data from previous runs.
Since JDK 9, OpenJDK has had AOT (Ahead-of-Time) compilation, which can generate native code for Java methods. It can improve startup performance and partially avoids the problem described in this JEP. Why, then, develop another technique to solve the same problem?
AOT in the JDK is based on the Graal compiler, whereas JWarmup builds on existing JIT compilers such as C2, so it can be ported to older JDK releases. AOT and native images also impose runtime limits, such as constraints on GC policy and other VM settings; JWarmup could tolerate mismatched runtime options, depending on how it is implemented. In addition, JWarmup is driven by profile data, so it knows the hot methods, the inline tree, and branch profiles, which help generate more optimized code than an AOT compiler can.
Based on these considerations, we believe JWarmup is a complement to AOT.
The normal run may behave differently from the testing pre-run, so the recorded profile data can be inaccurate; we found that some JWarmup-compiled methods were deoptimized because of this, which should be avoided in a high-load environment. For the same reason, a JWarmup-compiled method is not as optimized as a JIT-compiled method generated at a higher compilation level. To prevent deoptimization at peak load time, a control flag lets the user decide when deoptimization may happen (-XX:CompilationWarmUpDeoptTime=), and another controls how many methods may be deoptimized per iteration (-XX:CompilationWarmUpDeoptNumOfMethodsPerIter=). With these flags, the user can choose a time roughly after the peak to allow deoptimization to take place.
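As an illustration, the deoptimization controls might be combined with the warmup flags as shown below. The numeric values and application names are placeholders; the assumption that CompilationWarmUpDeoptTime is given in seconds (like CompilationWarmUpRecordTime) and the '=N' form of the per-iteration flag are inferred from the prose above, not confirmed by this draft.

    # Normal run that defers deoptimization of warmup-compiled methods until
    # roughly after the load peak, at most 10 methods per iteration
    java -XX:+CompilationWarmUp \
         -XX:CompilationWarmUpLogFile=jwarmup.log \
         -XX:CompilationWarmUpDeoptTime=1800 \
         -XX:CompilationWarmUpDeoptNumOfMethodsPerIter=10 \
         -cp app.jar com.example.Main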
Testing
The implementation is platform independent, so it applies to all platforms. Besides the existing HotSpot JIT tests, the change includes multiple test cases specific to this JEP.
Risks and Assumptions
JWarmup uses the C2 compiler to generate native code from the recorded profiling information. The generated code may not be of the same quality as code compiled by C2 in a normal run. Real applications are complicated and so are their class relationships, so the class-loading order is recorded; a method is compiled during warmup only once the class-loading progress in the current run has reached (is greater than or equal to) the recorded order number. This may leave many methods uncompiled at startup if classes load in a different order.
Dependencies
None.