JEP 279: Improve Test-Failure Troubleshooting

Owner: Igor Ignatyev
Type: Feature
Scope: Implementation
Status: Closed / Delivered
Release: 9
Discussion: hotspot dash dev at openjdk dot java dot net, core dash libs dash dev at openjdk dot java dot net
Effort: XS
Duration: XS
Relates to: JEP 228: Add More Diagnostic Commands; JEP 102: Process API Updates
Reviewed by: Aleksandre Iline, Brian Goetz
Endorsed by: Mikael Vidstedt
Created: 2015/03/20 17:09
Updated: 2023/03/14 16:05
Issue: 8075621

Summary

Automatically collect diagnostic information that can be used for further troubleshooting when tests fail or time out.

Goals

Gather the following information to help diagnose test failures and timeouts:

We will develop a library that provides this functionality and co-locate the library sources with the product code.

Motivation

It is difficult to troubleshoot intermittent test failures when there is no information about the testing environment. Such failures often depend on test execution order and concurrency, which makes them extremely difficult to reproduce.

Description

Currently, there are two extension points in the jtreg test harness. The first one is the timeout handler, which jtreg runs when a test times out. The second one is the observer, which implements the observer design pattern to track different events in a test run. We will use these extension points to gather diagnostic information and develop a custom observer and timeout handler for jtreg.
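The division of labor between the two extension points can be sketched as follows. This is only an illustration under stated assumptions: the `TestObserver` and `TimeoutAction` interfaces, their method signatures, and the `FailureGatherer` class are hypothetical stand-ins invented for this sketch, not the jtreg API, and the "gathering" is reduced to recording a label.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a harness observer; this is NOT the jtreg API.
interface TestObserver {
    void finishedTest(String testName, boolean passed);
}

// Hypothetical stand-in for a timeout handler; this is NOT the jtreg API.
interface TimeoutAction {
    void onTimeout(String testName, long pid);
}

// Gathers diagnostics only when a test fails or times out.
class FailureGatherer implements TestObserver, TimeoutAction {
    // Records what would be gathered, in order; a real library would write files.
    final List<String> gathered = new ArrayList<>();

    @Override
    public void finishedTest(String testName, boolean passed) {
        if (!passed) {
            // The test process is gone by now, so only the environment is inspected.
            gathered.add("environment:" + testName);
        }
    }

    @Override
    public void onTimeout(String testName, long pid) {
        // The test process is still running, so it can be inspected directly.
        gathered.add("process:" + testName + ":" + pid);
    }
}

class ObserverDemo {
    public static void main(String[] args) {
        FailureGatherer gatherer = new FailureGatherer();
        gatherer.finishedTest("PassingTest", true);   // ignored
        gatherer.finishedTest("FailingTest", false);  // recorded
        gatherer.onTimeout("HangingTest", ProcessHandle.current().pid());
        gatherer.gathered.forEach(System.out::println);
    }
}
```

Keeping both callbacks in one class is a choice made for brevity here; they could equally be separate classes sharing a common gathering component.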

Information about the environment and non-Java processes will be collected by running platform-specific commands. Information about Java processes will be gathered via the available diagnostic commands, which are heavily extended by JEP 228, e.g., the print_vm_state command, which collects information similar to that in hs_err files. The gathered information will be stored together with the test results for later inspection. The observer will collect the information on finishedTest events when tests fail.
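Running a diagnostic command and storing its output next to the test results might look like the minimal sketch below. The class name `CommandGatherer`, its method names, and the timeout values are assumptions made for illustration. The sketch uses the `Thread.print` diagnostic command through `jcmd` rather than the command named above, and it assumes `jcmd` is located in the `bin` directory under `java.home`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.TimeUnit;

class CommandGatherer {
    // Runs one command and stores its combined stdout/stderr in outputFile.
    // Returns the exit code, or -1 if the command did not finish in time.
    static int run(List<String> command, Path outputFile, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)            // merge stderr into stdout
                .redirectOutput(outputFile.toFile())  // keep output for later inspection
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            // A diagnostic tool must never hang the test run itself.
            process.destroyForcibly();
            return -1;
        }
        return process.exitValue();
    }

    // Builds a jcmd invocation that prints the thread stacks of the given JVM.
    // Assumption: jcmd lives in <java.home>/bin, as in a full JDK image.
    static List<String> threadDump(long pid) {
        String jcmd = Paths.get(System.getProperty("java.home"), "bin", "jcmd").toString();
        return List.of(jcmd, Long.toString(pid), "Thread.print");
    }

    public static void main(String[] args) throws Exception {
        // Demo: dump the threads of the current JVM; throws IOException if jcmd is absent.
        Path out = Files.createTempFile("threads-", ".txt");
        int exit = run(threadDump(ProcessHandle.current().pid()), out, 30);
        System.out.println("exit=" + exit + ", output in " + out);
    }
}
```

The same `run` method could serve platform-specific commands for environment information; which commands to run on each platform is left open here.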

Since tests may create other processes, information about test processes and their child processes will be collected. To find such processes, the library will create a process tree with the original test process at the root.
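One way to build such a process tree is sketched below, assuming the `ProcessHandle` API introduced by the related JEP 102 is available (Java 9 or later). The `ProcessTree` class is a hypothetical name for this illustration. The tree is a snapshot, so processes that start or exit during construction may be missed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// A snapshot of a process and all of its descendants.
class ProcessTree {
    final ProcessHandle handle;
    final List<ProcessTree> children;

    ProcessTree(ProcessHandle handle) {
        this.handle = handle;
        // children() returns only direct children; recursing builds the subtrees.
        this.children = handle.children()
                .map(ProcessTree::new)
                .collect(Collectors.toList());
    }

    // Returns all pids in the tree in pre-order, so the root pid comes first.
    List<Long> pids() {
        List<Long> result = new ArrayList<>();
        collect(result);
        return result;
    }

    private void collect(List<Long> out) {
        out.add(handle.pid());
        for (ProcessTree child : children) {
            child.collect(out);
        }
    }

    public static void main(String[] args) {
        // Demo: use the current JVM as the root in place of a test process.
        ProcessTree tree = new ProcessTree(ProcessHandle.current());
        System.out.println(tree.pids());
    }
}
```

Each pid returned by `pids()` could then be passed to whatever per-process gathering step is in use.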

Library sources will be placed in the test directory in the top-level repository, and makefiles will be updated to build the library and include it in the test bundles.

Testing

We will schedule regular testing which uses this library. When the results and test execution become stable, we will extend the use of the library to other components.

Risks and Assumptions