JEP draft: Safer Process Launch by ProcessBuilder and Runtime.exec

Author: rriggs
Owner: Roger Riggs
Type: Feature
Scope: JDK
Status: Draft
Release: tbd
Component: core-libs / java.lang
Effort: S
Duration: M
Created: 2021/03/16 19:15
Updated: 2022/03/24 16:40
Issue: 8263697

Summary

Improve safety of process launch by ProcessBuilder and Runtime.exec on Windows. The arguments are checked to ensure they can be passed to the launched application without the possibility of splitting or combining arguments.

Motivation

The java.lang.Runtime.exec and java.lang.ProcessBuilder APIs are used to launch an application in a separate operating system process. On Windows, the arguments of the caller are encoded into a single command line, to be decoded by the application. It is natural to expect the list of arguments supplied to be passed to the child application intact. While this is true for most operating systems, on Windows it is not always possible to pass an arbitrary string due to the details of command line encoding and decoding.

The details of how Runtime.exec and ProcessBuilder encode arguments have previously been described only as implementation specific, leaving it to developers to discover, sometimes by trial and error, what works for their particular program. For example, if an argument contains or may contain a space or tab, applications find it appealing to add double-quotes around the argument. ProcessBuilder already handles the encoding of arguments with space and tab to provide some measure of portability across operating systems. The application-supplied quotes are not necessary and create an ambiguity about whether the quotes themselves are intended to be passed to the application or are only present to ensure the argument is passed as a single string.

On Windows, the arguments are assembled and encoded into a single string passed to the new process when it is created using CreateProcess. In the newly created program, the arguments are parsed from the command line using one of the common conventions for the meaning of quotes, backslashes and special characters. This can lead to cases where the intent is unclear and raises the possibility of a mismatch between the encoding of the arguments and the parsing of the command line into the corresponding arguments. To resolve the ambiguity, the encoding should be well-defined with respect to quotes and special characters and a good match to the parsing of the command line in the application.

For example, quotes are typically balanced, but if there are unmatched quotes in an argument and the argument is added to the command line, the matching quote may not be present and subsequent arguments may be merged into the malformed argument. For example, a list of three arguments { "abc", "\"def", "xyz" } could be naively inserted into the command line as: abc "def xyz and would later be parsed as: {"abc", "def xyz"} joining three arguments into two.
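The merging described above can be reproduced with a minimal sketch. The class and method names are hypothetical, and the parser is deliberately simplified (a double quote only toggles quoting, and backslashes get no special treatment); it is not the parser of any particular C runtime and not the ProcessBuilder implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: naive joining followed by a simplified quote-aware split.
class NaiveJoinDemo {
    // Join the arguments with single spaces, adding no quoting or escaping.
    static String naiveJoin(List<String> args) {
        return String.join(" ", args);
    }

    // Simplified parser (an assumption of this sketch): a double quote toggles
    // "inside quotes"; space and tab separate arguments only outside quotes.
    static List<String> simpleParse(String commandLine) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false;
        for (char c : commandLine.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;   // the quote itself is not copied
                hasToken = true;
            } else if ((c == ' ' || c == '\t') && !inQuotes) {
                if (hasToken) {
                    result.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(c);
                hasToken = true;
            }
        }
        if (hasToken) {
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] unused) {
        String line = naiveJoin(List.of("abc", "\"def", "xyz"));
        System.out.println(line);              // abc "def xyz
        System.out.println(simpleParse(line)); // [abc, def xyz]
    }
}
```

Three arguments go in and two come out, because the unmatched quote in the second argument swallows the separator before the third.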

Similarly, if an argument is File with Space and BackSlash\ (without quotes), it should be quoted in the resulting command line to keep it as a single string. With first and last quotes added, the string becomes: "File with Space and BackSlash\". This seems reasonable, but we need to look at how it will be parsed by the application. One of the common parsers for the command line treats a backslash before a double quote as a literal quote instead of an opening or closing quote. In this case, the resulting command line is parsed as one argument, File with Space and BackSlash", which ends with a quote instead of a backslash.
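The effect of a backslash before a quote can be illustrated with a small sketch of the widely used C/C++ convention: a run of 2n backslashes followed by a double quote yields n backslashes and the quote acts as a delimiter, while 2n+1 backslashes followed by a double quote yield n backslashes and a literal quote. The class and method names are hypothetical, and other details of real parsers (such as doubled quotes inside a quoted region) are intentionally left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the backslash-before-quote rule; not a complete parser.
class BackslashQuoteParse {
    static List<String> parse(String commandLine) {
        List<String> args = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        boolean started = false;
        int i = 0;
        int n = commandLine.length();
        while (i < n) {
            char c = commandLine.charAt(i);
            if (c == '\\') {
                // Count the run of consecutive backslashes.
                int count = 0;
                while (i < n && commandLine.charAt(i) == '\\') {
                    count++;
                    i++;
                }
                if (i < n && commandLine.charAt(i) == '"') {
                    cur.append("\\".repeat(count / 2));
                    if (count % 2 == 1) {
                        cur.append('"');      // odd run: literal quote
                    } else {
                        inQuotes = !inQuotes; // even run: quote is a delimiter
                    }
                    i++;
                } else {
                    cur.append("\\".repeat(count)); // not before a quote: literal
                }
                started = true;
            } else if (c == '"') {
                inQuotes = !inQuotes;
                started = true;
                i++;
            } else if ((c == ' ' || c == '\t') && !inQuotes) {
                if (started) {
                    args.add(cur.toString());
                    cur.setLength(0);
                    started = false;
                }
                i++;
            } else {
                cur.append(c);
                started = true;
                i++;
            }
        }
        if (started) {
            args.add(cur.toString());
        }
        return args;
    }

    public static void main(String[] unused) {
        // Command line text: "File with Space and BackSlash\"
        System.out.println(parse("\"File with Space and BackSlash\\\""));
    }
}
```

Under these rules the single trailing backslash is consumed as an escape, so the parsed argument ends in a quote character rather than the backslash the caller supplied.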

The most common parsing syntax, used by applications implemented in C, C++, C# and others, allows arguments with spaces, special characters, literal quotes, and backslashes to be encoded so that the application decodes them to the original argument strings. Other programs, such as .cmd and .bat scripts, are processed by a command shell such as cmd.exe and use simpler rules that do not have an effective way to encode literal quotes, so some argument strings containing embedded quotes cannot be encoded. The command shell, cmd.exe, that handles .cmd and .bat scripts also implicitly enables interpretation of special characters for redirection that allow file access and pipelines that invoke other programs. Both of these cases of dubious encoding pose a security risk if not carefully used and reviewed.

In JDK 18 and earlier on Windows, the default encoding of arguments by ProcessBuilder and Runtime.exec is quite lenient, allowing unmatched quotes and unquoted special characters that can merge or split arguments. The absence of checks allows various forms of ambiguous commands that cannot be reliably parsed to recover the original arguments. Stricter checking and argument encoding are supported in earlier versions, but the safety features are opt-in, requiring the application to explicitly set a system property or to use a security manager. The safer modes help applications apply the recommendations of the Secure Coding Guidelines for Java SE to avoid risks such as injection attacks and unintended execution. But most developers do not take advantage of the additional safety checks. Changing the default to be more secure can reduce the risk of unintended execution and file access.

Description

We are changing the default for ProcessBuilder and Runtime.exec to require quoted arguments to have properly balanced quotes and to guard against splitting or merging of arguments. For scripts such as .cmd and .bat, executed by shell programs, the encoding of special characters such as < > & | is modified to prevent implicit access to shell features such as redirection and pipelines. This restriction is not applicable when the shell, for example cmd.exe, is explicitly invoked. Most existing programs work as before, as the argument encoding is straightforward and unchanged. To opt out of checking for an individual argument, wrap it in triple double-quotes. An application that disables the argument checking must be carefully reviewed to avoid potential security risks.

The specific argument checks and encoding depend on whether the executable is an .exe or not. The executable is recognized by Windows and ProcessBuilder as an .exe if the file name ends in .EXE (case-insensitive) or does not have a dot in the filename. The special characters that are to be quoted for .exe and non-.exe executables are those defined by Windows for C++ command line arguments and for Cmd arguments, respectively.
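The recognition rule stated above can be sketched as a small predicate. This is only an illustration of the rule as worded here; the class and method names are hypothetical, and the handling of path separators is an assumption of the sketch.

```java
import java.util.Locale;

// Hypothetical helper illustrating the ".exe or no dot in the file name" rule.
class ExeRecognition {
    static boolean looksLikeExe(String path) {
        // Consider only the last path component (assumes '\' or '/' separators).
        int sep = Math.max(path.lastIndexOf('\\'), path.lastIndexOf('/'));
        String name = path.substring(sep + 1);
        return name.indexOf('.') < 0
                || name.toLowerCase(Locale.ROOT).endsWith(".exe");
    }

    public static void main(String[] unused) {
        System.out.println(looksLikeExe("java"));               // true (no dot)
        System.out.println(looksLikeExe("C:\\tools\\App.EXE")); // true
        System.out.println(looksLikeExe("log.cmd"));            // false
    }
}
```

A name such as log.cmd or build.bat falls into the non-.exe category and is therefore subject to the additional quoting of shell special characters.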

ProcessBuilder and Runtime.exec handle argument encoding and passing without the application needing to add or modify the argument except in a few unusual cases. It is strongly recommended to supply arguments without attention to double-quotes or special characters, allowing ProcessBuilder to handle any necessary encoding.

Argument encoding is performed as follows:

For non-.exe commands, the required quoting of special characters (<, >, &, |) prevents the implicit use of redirection and pipelines. For use cases where the application requires explicit use of the shell capabilities, the application can invoke cmd.exe /C with the executable and its arguments. The arguments are checked and encoded as above for .exe applications. The same technique has been supported in earlier versions. Extra precautions should be taken in creating the argument passed to a shell to avoid unintentional and possibly risky side effects.

Note that on Linux and macOS, there is no checking for redirection and pipeline characters in arguments. Those characters are only interpreted by command shells such as sh, bash, or zsh and pose a risk only if the executable is a shell. Note that the risk of reading or writing files can arise with any command through a normal argument or command option. The protection afforded by restricting, on Windows, redirection and pipeline characters is minimal and not consistent across operating systems. ProcessBuilder does not do any checking or encoding for specific programs on any operating system.

The setting of a security manager does not have any effect on the interpretation or encoding of command line arguments. This is a change from earlier versions that use a safe mode similar to jdk.lang.Process.allowAmbiguousCommands=false when a security manager is enabled. If a security manager is enabled, the permission to execute the program is checked; this is unchanged from previous versions.

Examples

The motivation for these changes showed some cases where the current command line encoding using the lenient mode may put the application at risk if it incorporates input from the environment or untrusted sources. The examples below show how an application can use ProcessBuilder to avoid or mitigate those risks.

Using a List of Arguments instead of a Single String

To run a java Hello program with a string containing spaces use separate arguments.

An array of strings is easier to use and more reliable than a single command line. When using a single string, it must be carefully encoded such that it can be decoded back into the individual arguments. The application is more complex because it must be aware of spaces, special characters, and encoding of quotes. The following may fail if spaces are dropped, added, or special characters are not quoted.

String cmd = "java" + " " + "Hello" + " " + "\"Now is the time.\"";
Process p = Runtime.getRuntime().exec(cmd);

Compare with using an argument list. The runtime handles the encoding of command and arguments containing spaces and quotes to keep arguments separate and distinct.

String[] args = {"java", "Hello", "Now is the time."};
Process p = Runtime.getRuntime().exec(args);
-or-
Process p = new ProcessBuilder(args).start();
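The fragility of the single-string form follows from how Runtime.exec(String) is specified to split the command: it uses new StringTokenizer(command), which splits on whitespace and has no notion of quotes. The sketch below (class and method names are hypothetical) applies the same tokenizer to the single-string example, without launching any process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo: how a single command string is broken into tokens.
class SingleStringTokens {
    static List<String> tokenize(String command) {
        List<String> tokens = new ArrayList<>();
        // Default delimiters are whitespace; quotes are ordinary characters.
        StringTokenizer st = new StringTokenizer(command);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] unused) {
        String cmd = "java" + " " + "Hello" + " " + "\"Now is the time.\"";
        for (String token : tokenize(cmd)) {
            System.out.println("[" + token + "]");
        }
    }
}
```

The quoted phrase is split into four tokens, the first and last still carrying a quote character, so the intended three-element argument list is only recovered if the later re-encoding and the child's parsing happen to reassemble it.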

Enabling redirection and pipelines

To run a command dir and use a shell pipe to display the contents.

When invoking a normal executable, such as dir, the special characters are not quoted and are passed to the program as is. In this example, the arguments are passed to the dir command. dir is not a shell and does not handle redirection or pipeline special characters. The output of dir will say it can't find the file "|more".

List<String> args = List.of("dir", "|more");
Process p = new ProcessBuilder(args).start();

To paginate the directory listing, cmd.exe is used as the executable. cmd.exe invokes the dir command and interprets the special character to pipe the output to more.

List<String> args = List.of("cmd.exe", "/C", "dir", "|more");
Process p = new ProcessBuilder(args).start();

Redirection and pipelines with .cmd and .bat scripts

To run a .cmd or .bat script and redirect the output to a file.

If the executable is a .cmd or .bat script, not an .exe, the arguments containing special characters are quoted to avoid implicitly causing redirection, pipelines, or group execution. In this case, the script will be passed the string ">log.out" and redirection will not occur.

List<String> args = List.of("log.cmd", ">log.out");
Process p = new ProcessBuilder(args).start();

To redirect the log output, cmd.exe is used as the executable shell and is passed log.cmd as the command to run, followed by the requested redirection.

List<String> args = List.of("cmd.exe", "/C", "log.cmd", ">log.out");
Process p = new ProcessBuilder(args).start();

The Default Command Line Encoding is Good Enough

Passing a directory name with spaces to an application.

A typical directory path may or may not contain spaces and may end in a backslash ("\"). The straightforward code works whether the application is an .exe or non-.exe. Note: a double backslash in the source is a single backslash in the Java string.

List<String> args = List.of("cmd.exe", "/C", "do.cmd", "C:\\Program Files\\");
Process p = new ProcessBuilder(args).start();

ProcessBuilder encodes the argument (because of the space) by surrounding it with double-quotes. The number of backslashes before the closing quote is doubled, as required by the C++ command line encoding, to ensure the quote is considered the matching final quote and not a literal quote. The argument in the command line is now: "C:\\Program Files\\\\"

In the case where the application is an .exe executable, the command line is parsed and the doubling of backslashes before the final quote is reversed, yielding the original argument "C:\\Program Files\\".

In the case where the application is a .cmd script, the argument string contains the quotes and the additional backslash, "\"C:\\Program Files\\\\\"". Since the argument is a Windows directory path, the addition of the backslash is benign and ignored when doing file operations.
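The quoting step described in this example can be sketched as follows. This is a simplified illustration, not the ProcessBuilder implementation: the class and method names are hypothetical, only space and tab trigger quoting, and arguments with embedded double quotes are rejected outright to keep the sketch small.

```java
// Hypothetical sketch: quote an argument and double the backslashes before the closing quote.
class TrailingBackslashQuoting {
    static String quoteIfNeeded(String arg) {
        if (arg.indexOf('"') >= 0) {
            // Simplification of this sketch only; embedded quotes are out of scope here.
            throw new IllegalArgumentException("embedded quotes not handled in this sketch");
        }
        if (arg.indexOf(' ') < 0 && arg.indexOf('\t') < 0) {
            return arg; // no space or tab: passed through unchanged
        }
        // Count the backslashes at the end of the argument.
        int trailing = 0;
        while (trailing < arg.length()
                && arg.charAt(arg.length() - 1 - trailing) == '\\') {
            trailing++;
        }
        // Appending the same number again doubles the run before the closing quote.
        return "\"" + arg + "\\".repeat(trailing) + "\"";
    }

    public static void main(String[] unused) {
        // The Java string is C:\Program Files\ (one trailing backslash).
        System.out.println(quoteIfNeeded("C:\\Program Files\\"));
    }
}
```

Doubling the trailing run is what lets a parser that follows the C/C++ convention treat the final quote as a delimiter and recover exactly one trailing backslash.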

Compatibility

Command arguments that do not contain double-quotes or special characters work as before; there are no changes and no special encoding is needed. Arguments containing spaces or tabs also work as before. The most common case for existing applications is lenient mode, in which there has been no checking for balanced quotes or use of special characters. The only encoding done was to add quotes to arguments that contain space or tab. While most applications have well-formed arguments, with these changes exceptions are thrown for malformed arguments; the application should be corrected to balance the quotes.

In the lenient mode of previous releases, there has been no difference between the encoding of arguments of .exe and non-.exe programs with respect to special characters. With these changes, arguments for non-.exe programs containing any of the characters space tab & < > [ ] | { } ^ = ; ! ' + , ~ are quoted. Existing double quotes in arguments, as long as they are balanced, are fine as is. Arguments containing embedded quotes that should be passed as a literal quote cannot be encoded and an exception will be thrown. An additional fallback is to invoke the shell, such as cmd.exe, and pass the arguments as described above.
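As a rough illustration of the character set listed above, the following sketch checks whether an argument for a non-.exe program would need quoting. The class and method names are hypothetical, and the sketch assumes the colon in the list above is punctuation rather than a member of the set.

```java
// Hypothetical check against the special characters listed for non-.exe programs.
class CmdSpecialChars {
    // space, tab, and & < > [ ] | { } ^ = ; ! ' + , ~
    private static final String SPECIAL = " \t&<>[]|{}^=;!'+,~";

    static boolean needsQuoting(String arg) {
        for (int i = 0; i < arg.length(); i++) {
            if (SPECIAL.indexOf(arg.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] unused) {
        System.out.println(needsQuoting(">log.out")); // true
        System.out.println(needsQuoting("|more"));    // true
        System.out.println(needsQuoting("log.out"));  // false
    }
}
```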

For programs that cannot be updated, backward compatibility with the more lenient mode is achieved by setting the system property jdk.lang.Process.allowAmbiguousCommands=true. The property must be set on the command line; it cannot be changed using System.setProperty, and it applies to every use of ProcessBuilder.

Existing applications can be checked for compatibility using current releases by setting jdk.lang.Process.allowAmbiguousCommands=false on the command line. On JDK 18 and earlier, this setting performs stricter checking of quotes and special characters. Note: the set of special characters for non-.exe programs is expanded with this proposal.

On Linux and macOS operating systems, the jdk.lang.Process.allowAmbiguousCommands system property is unused and arguments are passed literally.

Testing

Existing tests will be updated to verify the new encodings. Compatibility tests will confirm that lenient mode, enabled with jdk.lang.Process.allowAmbiguousCommands=true, has the same behavior as previous JDK versions.

Commands will be tested for .exe, non-.exe, and cmd.exe /C use cases.

Risks and Assumptions

The argument passing to start a process is heavily dependent on the interpretation of quotes and special characters in application parameters, the subsequent encoding of those parameters to create a Windows command line and corresponding parsing of the command line to recover the arguments. There is a risk that making the interpretation stricter may throw exceptions in cases that previously were allowed or the command line may be encoded differently.

The interpretation of quotes and special characters is very close to the lenient mode, the most frequent use case in the absence of a security manager. The lenient mode can be re-enabled by setting the jdk.lang.Process.allowAmbiguousCommands system property.