JEP 328: Flight Recorder
Authors | Markus Grönlund, Erik Gahlin |
Owner | Erik Gahlin |
Type | Feature |
Scope | JDK |
Status | Closed / Delivered |
Release | 11 |
Component | hotspot / jfr |
Discussion | hotspot dash dev at openjdk dot java dot net |
Effort | L |
Duration | M |
Reviewed by | David Holmes, Karen Kinnear, Mikael Vidstedt |
Endorsed by | Mikael Vidstedt |
Created | 2017/12/12 19:13 |
Updated | 2018/09/09 16:43 |
Issue | 8193393 |
Summary
Provide a low-overhead data collection framework for troubleshooting Java applications and the HotSpot JVM.
Goals
- Provide APIs for producing and consuming data as events
- Provide a buffer mechanism and a binary data format
- Allow the configuration and filtering of events
- Provide events for the OS, the HotSpot JVM, and the JDK libraries
Non-Goals
- Provide visualization or analysis of collected data
- Enable data collection by default
Success Metrics
- At most 1% performance overhead out-of-the-box on SPECjbb2015
- No measurable performance overhead when not enabled
Motivation
Troubleshooting, monitoring and profiling are integral parts of the development lifecycle, but some problems occur only in production, under heavy load involving real data.
Flight Recorder records events originating from applications, the JVM and the OS. Events are stored in a single file that can be attached to bug reports and examined by support engineers, allowing after-the-fact analysis of issues in the period leading up to a problem. Tools can use an API to extract information from recording files.
Description
JEP 167: Event-Based JVM Tracing added an initial set of events to the HotSpot JVM. Flight Recorder will extend the ability to create events to Java.
JEP 167 also added a rudimentary backend, where data from events are printed to stdout. Flight Recorder will provide a single high-performance backend for writing events in a binary format.
Modules:
- jdk.jfr
  - API and internals
  - Requires only java.base (suitable for resource constrained devices)
- jdk.management.jfr
  - JMX capabilities
  - Requires jdk.jfr and jdk.management
Flight Recorder can be started on the command line:
$ java -XX:StartFlightRecording ...
Recordings may also be started and controlled using the bin/jcmd tool:
$ jcmd <pid> JFR.start
$ jcmd <pid> JFR.dump filename=recording.jfr
$ jcmd <pid> JFR.stop
The same functionality is also available remotely over JMX, which is useful for tools such as Mission Control.
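As a rough illustration of the JMX route, the sketch below drives a recording through the FlightRecorderMXBean exposed by the jdk.management.jfr module. It looks the bean up on the local platform MBean server for simplicity; a remote tool would reach the same bean through a JMX connection. The class name JmxRecordingSketch, the method recordTo, the output file name, and the choice of the "default" predefined configuration are assumptions made for this example only.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import jdk.management.jfr.FlightRecorderMXBean;

// Hypothetical helper class, not part of the JDK.
class JmxRecordingSketch {

    // Records for roughly `millis` milliseconds, then copies the data to `outputFile`.
    static void recordTo(String outputFile, long millis)
            throws IOException, InterruptedException {
        // Local lookup; a remote client would obtain this bean over a JMX connection.
        FlightRecorderMXBean bean =
                ManagementFactory.getPlatformMXBean(FlightRecorderMXBean.class);

        long id = bean.newRecording();                    // roughly: prepare JFR.start
        bean.setPredefinedConfiguration(id, "default");   // assumed configuration name
        bean.startRecording(id);
        try {
            Thread.sleep(millis);                         // stand-in for real work
            bean.stopRecording(id);                       // roughly: JFR.stop
            bean.copyTo(id, outputFile);                  // roughly: JFR.dump filename=...
        } finally {
            bean.closeRecording(id);                      // release recording resources
        }
    }

    public static void main(String[] args) throws Exception {
        recordTo("recording.jfr", 1000);
    }
}
```

The numeric recording id plays the role that the recording name or id plays in the jcmd commands, so one bean can manage several recordings at once.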
Producing and consuming events
There is an API for users to create their own events:
import jdk.jfr.*;

@Label("Hello World")
@Description("Helps the programmer getting started")
class HelloWorld extends Event {
    @Label("Message")
    String message;

    public static void main(String... args) {
        // Create the event, fill in its payload, and write it to the recording
        HelloWorld event = new HelloWorld();
        event.message = "hello, world!";
        event.commit();
    }
}
Data can be extracted from recording files using classes available in jdk.jfr.consumer:
import java.nio.file.*;
import jdk.jfr.consumer.*;
Path p = Paths.get("recording.jfr");
for (RecordedEvent e : RecordingFile.readAllEvents(p)) {
System.out.println(e.getStartTime() + " : " + e.getValue("message"));
}
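The producing and consuming halves can be combined into one round trip. The following is a minimal sketch, assuming a JDK 11 or later runtime: it starts an in-process recording with jdk.jfr.Recording, commits one custom event, dumps the recording to a file, and reads the message back. The class name HelloRoundTrip, the method recordAndRead, and the event name "demo.HelloWorld" (set with @Name so it can be filtered reliably) are made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical demo class, not part of the JDK.
class HelloRoundTrip {

    @Name("demo.HelloWorld")          // explicit event name, chosen for this example
    @Label("Hello World")
    static class HelloWorld extends Event {
        @Label("Message")
        String message;
    }

    // Records one HelloWorld event into `file` and returns the messages read back.
    static List<String> recordAndRead(Path file) throws IOException {
        try (Recording recording = new Recording()) {
            recording.start();

            HelloWorld event = new HelloWorld();
            event.message = "hello, world!";
            event.commit();

            recording.stop();
            recording.dump(file);     // write the recorded data to disk
        }

        List<String> messages = new ArrayList<>();
        for (RecordedEvent e : RecordingFile.readAllEvents(file)) {
            // The file may contain other event types, so filter by name.
            if (e.getEventType().getName().equals("demo.HelloWorld")) {
                messages.add(e.getString("message"));
            }
        }
        return messages;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("jfr-demo").resolve("hello.jfr");
        System.out.println(recordAndRead(file));
    }
}
```

Filtering on the event type name matters because a recording file generally holds many event types, and only the custom event carries a "message" field.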
Buffer mechanism and binary data format
Threads write events, lock-free, to thread-local buffers. Once a thread-local buffer fills up, it is promoted to a global in-memory circular buffer system which maintains the most recent event data. Depending on configuration, the oldest data is either discarded or written to disk allowing the history to be continuously saved. Binary files on disk have the extension .jfr and are maintained and controlled using a retention policy.
The event model is implemented in a self-describing binary format in which integer values are encoded in little-endian base 128 (LEB128), except for the file header and some additional sections. The binary data format is not to be used directly as it is subject to change. Instead, APIs will be provided for interacting with recording files.
As an illustrative example, the class load event contains a time stamp describing when it occurred, a duration describing the timespan, the thread, a stack trace, and three event-specific payload fields: the loaded class and its two associated class loaders (defining and initiating). The size of the event is 24 bytes in total.
<memory address>: 98 80 80 00 87 02 95 ae e4 b2 92 03 a2 f7 ae 9a 94 02 02 01 8d 11 00 00
- Event size [98 80 80 00]
- Event ID [87 02]
- Timestamp [95 ae e4 b2 92 03]
- Duration [a2 f7 ae 9a 94 02]
- Thread ID [02]
- Stack trace ID [01]
- Payload [fields]
  - Loaded Class: [0x8d11]
  - Defining ClassLoader: [0]
  - Initiating ClassLoader: [0]
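To make the base 128 encoding concrete, here is a small sketch that decodes the 24 example bytes with a plain unsigned LEB128 reader: each byte contributes its low 7 bits, least significant group first, and the high bit signals that another byte follows. This is an illustration of the general technique only; the actual Flight Recorder format is subject to change and may treat edge cases differently, and the class name Leb128Sketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; decodes plain unsigned LEB128 values.
class Leb128Sketch {

    // The 24 bytes of the class load event shown in the example.
    static final byte[] CLASS_LOAD_EVENT = {
        (byte) 0x98, (byte) 0x80, (byte) 0x80, 0x00,               // event size
        (byte) 0x87, 0x02,                                         // event ID
        (byte) 0x95, (byte) 0xae, (byte) 0xe4, (byte) 0xb2,
        (byte) 0x92, 0x03,                                         // timestamp
        (byte) 0xa2, (byte) 0xf7, (byte) 0xae, (byte) 0x9a,
        (byte) 0x94, 0x02,                                         // duration
        0x02,                                                      // thread ID
        0x01,                                                      // stack trace ID
        (byte) 0x8d, 0x11,                                         // loaded class
        0x00,                                                      // defining class loader
        0x00                                                       // initiating class loader
    };

    // Decodes consecutive LEB128 values until the input is exhausted.
    static List<Long> decodeAll(byte[] data) {
        List<Long> values = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            long value = 0;
            int shift = 0;
            int b;
            do {
                b = data[pos++] & 0xFF;
                value |= (long) (b & 0x7F) << shift;   // low 7 bits carry data
                shift += 7;
            } while ((b & 0x80) != 0 && pos < data.length); // high bit: more bytes follow
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        List<Long> fields = decodeAll(CLASS_LOAD_EVENT);
        System.out.println("fields=" + fields.size()
                + " size=" + fields.get(0)
                + " id=" + fields.get(1));
    }
}
```

The first decoded value is 24, matching the stated event size: the size field is padded to four bytes using continuation bits, which presumably lets a writer reserve the slot before the final size is known.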
Configure and filter events
Events can be enabled, disabled, and filtered to reduce overhead and the amount of space needed for storage. This can be accomplished using the following settings:
- enabled - whether the event should be recorded
- threshold - the duration below which an event is not recorded
- stackTrace - whether the stack trace from the Event.commit() method should be recorded
- period - the interval at which the event is emitted, if it is periodic
There are two configuration sets that are tailored to configure Flight Recorder for the low-overhead, out-of-the-box use case. A user can easily create their own specific event configuration.
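The settings above can also be applied programmatically. The sketch below is one possible way to do it, assuming a JDK 11 or later runtime: it loads a predefined configuration by name and then overrides individual settings through Recording.enable and Recording.disable. The configuration name "default" and the event names jdk.FileRead, jdk.CPULoad, and jdk.SocketRead are assumptions not stated in this document, and the class SettingsSketch is hypothetical.

```java
import java.io.IOException;
import java.text.ParseException;
import java.time.Duration;
import java.util.Map;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Hypothetical demo class, not part of the JDK.
class SettingsSketch {

    // Builds a recording from a predefined configuration, overrides a few
    // settings, and returns the resulting settings map without starting it.
    static Map<String, String> configuredSettings() throws IOException, ParseException {
        Configuration base = Configuration.getConfiguration("default"); // assumed name
        try (Recording recording = new Recording(base)) {
            // enabled + threshold + stackTrace for a duration event
            recording.enable("jdk.FileRead")
                     .withThreshold(Duration.ofMillis(20))
                     .withoutStackTrace();
            // enabled + period for a periodic event
            recording.enable("jdk.CPULoad")
                     .withPeriod(Duration.ofSeconds(1));
            // enabled = false
            recording.disable("jdk.SocketRead");
            return recording.getSettings();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> settings = configuredSettings();
        for (String key : new String[] {
                "jdk.FileRead#enabled", "jdk.FileRead#threshold",
                "jdk.FileRead#stackTrace", "jdk.CPULoad#period",
                "jdk.SocketRead#enabled" }) {
            System.out.println(key + " = " + settings.get(key));
        }
    }
}
```

Settings are stored as strings keyed by event name and setting name, so a tool can inspect or persist an effective configuration without starting a recording.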
OS, JVM and JDK library events
Events will be added covering the following areas:
- OS
  - Memory, CPU Load and CPU information, native libraries, process information
- JVM
  - Flags, GC configuration, compiler configuration
  - Method profiling event
  - Memory leak event
- JDK libraries
  - Socket IO, File IO, Exceptions and Errors, modules
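One way to see which events a given runtime actually provides is to ask Flight Recorder for its registered event types. The following sketch, assuming a JDK 11 or later runtime, lists them with their categories; the exact set of names depends on the JDK build, and the class name EventCatalogSketch is hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import jdk.jfr.EventType;
import jdk.jfr.FlightRecorder;

// Hypothetical demo class, not part of the JDK.
class EventCatalogSketch {

    // Returns the names of all event types currently registered, sorted.
    static List<String> eventNames() {
        return FlightRecorder.getFlightRecorder().getEventTypes().stream()
                .map(EventType::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Print each event type with its category path, for example OS, JVM or JDK areas.
        for (EventType type : FlightRecorder.getFlightRecorder().getEventTypes()) {
            System.out.println(type.getCategoryNames() + " " + type.getName());
        }
    }
}
```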
Alternatives
An alternative to Flight Recorder is logging. Although JEP 158: Unified JVM Logging provides some level of uniformity across subsystems in the HotSpot JVM, it does not extend to Java applications and the JDK libraries. Logging traditionally lacks an explicit model and metadata, which makes it free-form, with the consequence that consumers must be tightly coupled to internal formats. Without a relational model, it is difficult to keep data compact and normalized.
Flight Recorder maintains a typed event model where consumers are decoupled from internals by using an API.
Testing
Performance testing will be required to ensure acceptable levels of overhead.
Risks and Assumptions
Vendor-specific backends might have been developed on top of JEP 167; the working assumption is that the Flight Recorder infrastructure ought to cover most of the existing use cases. Vendors are encouraged to engage in discussion in the context of this JEP about the feasibility of moving to a single backend as suggested.
Flight Recorder has existed for many years and was previously a commercial feature of the Oracle JDK. This JEP moves the source code to the open repository to make the feature generally available. Hence, the risk to compatibility, performance, regressions and stability is low.