Stack Overflow handling in HotSpot JVM

As you probably know, HotSpot JVM has a one-to-one mapping of Java threads to OS threads. Each thread is associated with its own stack. Sounds simple… until you realize that in Java there are at least 3 different notions of a stack:

- Java Virtual Machine Stack, which stores local variables and keeps track of method invocations. As the specification says, this stack does not need to be contiguous and may be heap allocated. E.g. the Java ME virtual machine (that I worked on at Sun Microsystems) indeed had chunked stacks allocated in the Java heap.
- Operand Stack, which holds operands for bytecode instructions.
- Native Stack, the classical “C” stack required for native method execution.

JVM stack layout

HotSpot JVM uses one contiguous stack for all the above purposes. Java methods and native methods share the same stack with Java and native frames interleaved. Unlike C, where stack space can be allocated dynamically with alloca(), the maximum stack usage of any Java method is known beforehand, but may vary depending on whether the method is interpreted or compiled. The compiled frame is often smaller than the interpreter frame for the same method – this seems intuitive, since the optimizing compiler does not need to have all the local variables and all the monitors on the stack.

Stack frames

Each thread stack has limited capacity. The stack size is configured with the -Xss option and defaults to 1 MB on a 64-bit system. This is usually enough to place several thousand average frames.

When a call chain becomes too deep and there is no free space on the stack to create more frames, Java obviously throws StackOverflowError. But how does JVM know when it’s time to throw an error? It simply performs a stack overflow check on every method invocation. There are some interesting details though.
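To get a feeling for how many frames actually fit, here is a minimal sketch that recurses until StackOverflowError and reports the depth reached. The class and method names are made up for illustration. It relies on the Thread constructor that accepts a stackSize argument, which the Javadoc describes as a hint the VM may ignore, so the numbers will vary between platforms and runs (and between interpreted and compiled code).

```java
final class StackDepthProbe {
    private static int depth;

    // Unbounded recursion; each call adds one Java frame.
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Runs the recursion on a fresh thread and returns the depth reached before the overflow. */
    static int maxDepth(long stackSizeBytes) {
        int[] result = new int[1];
        Thread t = new Thread(null, () -> {
            depth = 0;
            try {
                recurse();
            } catch (StackOverflowError e) {
                // The stack has been unwound by now, so it is safe to do work here.
                result[0] = depth;
            }
        }, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("256 KB stack: " + maxDepth(256 * 1024) + " frames");
        System.out.println("8 MB stack:   " + maxDepth(8L * 1024 * 1024) + " frames");
    }
}
```

The same experiment can be run on the main thread with different -Xss values; a dedicated thread just makes the stack size explicit in the code.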

Interpreter vs. Stack banging

The interpreter performs a straightforward check in a method prologue. It takes the address of the stack limit for the current thread, adjusts it by the size of the frame about to be created, and compares the result to the stack pointer in the RSP register, see TemplateInterpreterGenerator::generate_stack_overflow_check.

  mov    %edx,%eax
  shl    $0x3,%rax         ; rax = number_of_locals * 8 bytes
  add    $0x58,%rax        ; + constant frame overhead
  add    0x410(%r15),%rax  ; + current_thread.stack_limit
  cmp    %rax,%rsp
  ja     0x7fffe054e55f    ; if rsp > rax, continue normally
                           ; otherwise throw StackOverflowError

Even though the check is relatively simple, it still involves at least one memory load, a few arithmetic instructions and a conditional jump on every method invocation. That would be too much for hot JIT-compiled code.
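The comparison made by the interpreter prologue can be restated as a toy model in Java. This is only an illustration of the arithmetic in the listing, not HotSpot code: the class and method names are invented, and the constants 8 and 0x58 are simply copied from the assembly above.

```java
final class InterpreterCheckModel {
    static final long SLOT_SIZE = 8;          // bytes per local slot (shl $0x3)
    static final long FRAME_OVERHEAD = 0x58;  // constant frame overhead from the listing

    /**
     * Mirrors "cmp %rax,%rsp; ja ...": returns true when the stack pointer
     * is still above stackLimit plus the size of the frame to be created.
     */
    static boolean hasRoom(long rsp, long stackLimit, int numberOfLocals) {
        long required = numberOfLocals * SLOT_SIZE + FRAME_OVERHEAD + stackLimit;
        // "ja" is an unsigned comparison, so compare the addresses as unsigned values.
        return Long.compareUnsigned(rsp, required) > 0;
    }

    public static void main(String[] args) {
        long limit = 0x7000_0000_0000L;
        System.out.println(hasRoom(limit + 4096, limit, 10)); // true: plenty of space
        System.out.println(hasRoom(limit + 100, limit, 10));  // false: would overflow
    }
}
```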

The compiler’s strategy is different. In most cases, the stack overflow check in C1- or C2-compiled code is just a single store instruction:

  mov    %eax,-0x14000(%rsp)

It writes a meaningless value from the EAX register to an address 0x14000 bytes beyond the current stack pointer (a lower address, since the stack grows downward). If there is enough free space beyond the current frame, this instruction is harmless. Otherwise it touches one of the guard pages, and a hardware exception is triggered. This technique is called stack banging.

Guard pages

Guard zone

When JVM allocates a stack for a new thread, it reserves a few pages on the top of the stack – the guard zone. All the guard pages are initially protected from both reading and writing, so that any access to these pages causes Segmentation Fault (SIGSEGV). However, these SIGSEGVs are managed by the JVM, i.e. the signal handler detects which guard page has been accessed, and acts accordingly.

- Yellow zone is used for detecting recoverable stack overflows. If a stack banging instruction hits the yellow zone, SIGSEGV happens, and the signal handler resumes the current thread from the continuation that throws StackOverflowError. Meanwhile the yellow zone is disabled (i.e. the pages are unprotected) to give some extra stack space for the exception throwing code. The yellow pages are reguarded later during stack unwinding.
- Red zone is for handling unrecoverable stack overflows. This is the last line of defence. Normally, the red pages should never be touched, but if this happens for any reason, JVM treats this as a fatal error and dies, leaving a final crash report. The red zone is unguarded only to provide some stack space for writing hs_err_pid.log.
- Reserved zone was added in JDK 9 to address the problem when StackOverflowError occurs in a critical section, e.g. while executing ReentrantLock.lock(). If an error is thrown in the middle of the lock() or unlock() method, the ReentrantLock object may remain in an inconsistent state, so no other thread would be able to acquire the lock again.

The problem and the solution are described in JEP 270. The idea is to reserve an additional stack page for execution of such critical methods. The reserved zone acts like a yellow zone with one exception: when stack banging hits the reserved zone during the execution of a critical method, JVM unprotects the reserved pages, allowing the critical method to finish. StackOverflowError is deferred until the critical method returns.

The current HotSpot implementation treats a method as critical if it is annotated with @jdk.internal.vm.annotation.ReservedStackAccess. As of JDK 12, there are only a few of them:
- acquire / release methods in ReentrantLock.Sync and ReentrantReadWriteLock.Sync
- lock / unlock methods in StampedLock
- AccessController.wrapException

When a stack overflow happens in one of the above methods, JVM emits a warning and allows the method to complete:

OpenJDK 64-Bit Server VM warning: Potentially dangerous stack overflow in ReservedStackAccess annotated method
java.util.concurrent.locks.ReentrantLock$FairSync.tryAcquire(I)Z

Although JEP 270 makes lock and unlock methods somewhat atomic, it does not guarantee the invocation of these methods will always succeed. Unfortunately, the simplest lock-unlock pattern remains fragile:

  lock.lock();
  try {
      deepCall();
  } finally {
      lock.unlock();
  }

If StackOverflowError happens here inside deepCall, JVM will attempt to execute the finally block, but the call to the unlock method may result in another StackOverflowError, leaving the object locked forever. For some reason, the ReentrantLock.unlock() method itself is not annotated with @ReservedStackAccess.
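One way to poke at this fragility is a small probe that takes a reentrant lock at every recursion level and releases it in finally, then checks how many holds are left after the overflow has been unwound. This is only a sketch with invented names. Whether any holds actually leak depends on the JDK version, the stack size and the JIT state, so treat the printed number as an observation rather than a guaranteed result.

```java
import java.util.concurrent.locks.ReentrantLock;

final class LockOverflowProbe {
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Each level follows the usual lock / try / finally-unlock pattern,
    // so lock() and unlock() are called right at the edge of the stack.
    private static void lockedRecursion() {
        LOCK.lock();
        try {
            lockedRecursion();
        } finally {
            LOCK.unlock();
        }
    }

    /** Returns the number of holds the current thread still has after the overflow. */
    static int leakedHolds() {
        try {
            lockedRecursion();
        } catch (StackOverflowError e) {
            // Expected: the recursion is unbounded.
        }
        return LOCK.getHoldCount();
    }

    public static void main(String[] args) {
        // A non-zero value means at least one unlock() never completed.
        System.out.println("Leaked holds: " + leakedHolds());
    }
}
```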

Shadow pages

Remember that HotSpot has Java and native frames on the same stack? This means, a stack overflow may also happen inside a native method. The problem is that JVM knows nothing about the layout of non-Java frames, and of course, cannot unwind the native part of the stack.

How to deal with that? The idea is to ensure the stack overflow does not happen in the native code at all! For this purpose HotSpot reserves yet another bunch of stack pages. This shadow zone is a normal unprotected stack area, except that it can be used exclusively by the native or VM code, but not by Java methods. The shadow zone must be large enough to accommodate the deepest VM call or the deepest native function from the standard JDK library. The typical example is Java_java_net_SocketOutputStream_socketWrite0, which allocates a 64 KB buffer on the stack.

The default number of shadow pages on Linux x64 is 20 – this means, native methods can use up to 80 KB of the stack without a risk of JVM crash. This is where the offset -0x14000(%rsp) in the bang instruction comes from.

0x14000 = 80 KB = 20 pages  -- exactly the size of the shadow zone

Okay, can I create a JNI method that will allocate more than 80 KB on the stack? It’s better not to, or I won’t be protected from the crash anymore. If I really need so much stack space (perhaps, because of a 3rd party native library out of my control), I can always add more shadow pages with -XX:StackShadowPages=N.
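The arithmetic behind the shadow zone is easy to check in a few lines. The sketch below is a simplified model with invented names: it assumes 4 KB pages and ignores everything a native frame needs besides the buffer itself (saved registers, return addresses, nested calls), so a real native method needs some headroom on top of what this model reports.

```java
final class ShadowZoneModel {
    static final int PAGE_SIZE = 4 * 1024;       // assumed page size
    static final int DEFAULT_SHADOW_PAGES = 20;  // Linux x64 default quoted above

    /** Size of the shadow zone, which is also the offset used by the bang instruction. */
    static int shadowZoneBytes(int shadowPages) {
        return shadowPages * PAGE_SIZE;
    }

    /** True if a native stack allocation of the given size fits into the shadow zone. */
    static boolean fitsInShadowZone(int nativeStackBytes, int shadowPages) {
        return nativeStackBytes <= shadowZoneBytes(shadowPages);
    }

    public static void main(String[] args) {
        System.out.printf("bang offset: 0x%x%n", shadowZoneBytes(DEFAULT_SHADOW_PAGES)); // 0x14000
        System.out.println(fitsInShadowZone(64 * 1024, DEFAULT_SHADOW_PAGES));  // true
        System.out.println(fitsInShadowZone(128 * 1024, DEFAULT_SHADOW_PAGES)); // false
        System.out.println(fitsInShadowZone(128 * 1024, 33));                   // true
    }
}
```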

public class NativeStackOverflow {

    private static void recursion(int depth, Runnable nativeCall) {
        nativeCall.run();
        recursion(depth + 1, nativeCall);
    }

    public static void main(String[] args) {
        System.loadLibrary("deepNative");

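        // Fails with StackOverflowError; catch it so that the second experiment can run
        try {
            recursion(0, NativeStackOverflow::deepNative64K);
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError with 64 KB native frames");
        }

        // Crashes JVM, unless StackShadowPages > 32
        recursion(0, NativeStackOverflow::deepNative128K);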
    }

    // JNIEXPORT void JNICALL
    // Java_NativeStackOverflow_deepNative64K(JNIEnv* env, jclass cls) {
    //     char buf[64 * 1024];
    //     memset(buf, 1, sizeof(buf));  // touch the whole buffer
    // }
    private static native void deepNative64K();

    // JNIEXPORT void JNICALL
    // Java_NativeStackOverflow_deepNative128K(JNIEnv* env, jclass cls) {
    //     char buf[128 * 1024];
    //     memset(buf, 1, sizeof(buf));  // touch the whole buffer
    // }
    private static native void deepNative128K();
}

Hit the mark

By default, the yellow zone consists of two 4 KB pages (-XX:StackYellowPages=2). But what if a Java frame is larger than that? A very exotic, but still valid case. Couldn't stack banging accidentally jump over the yellow zone? To ensure the overflow check does not miss the yellow zone, HotSpot touches all the involved pages from the bottom to the top. E.g. before entering a Java method with a 10 KB frame, the compiler issues 3 bang instructions:

  mov    %eax,-0x14000(%rsp)
  mov    %eax,-0x15000(%rsp)
  mov    %eax,-0x16000(%rsp)
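The three offsets can be reproduced with a simplified model: one bang per page that the frame covers, starting at the shadow zone size. This is a sketch under the assumptions of 4 KB pages and 20 shadow pages, with invented names; the actual code generator in HotSpot may differ in details.

```java
import java.util.ArrayList;
import java.util.List;

final class BangOffsets {
    static final int PAGE_SIZE = 4096;             // assumed page size
    static final int SHADOW_ZONE = 20 * PAGE_SIZE; // default shadow zone, 0x14000

    /** Offsets below RSP that get touched for a frame of the given size. */
    static List<Integer> bangOffsets(int frameSizeBytes) {
        // Round the frame size up to whole pages; even a tiny frame gets one bang.
        int pages = Math.max(1, (frameSizeBytes + PAGE_SIZE - 1) / PAGE_SIZE);
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < pages; i++) {
            offsets.add(SHADOW_ZONE + i * PAGE_SIZE);
        }
        return offsets;
    }

    public static void main(String[] args) {
        for (int offset : bangOffsets(10 * 1024)) {
            System.out.printf("mov %%eax,-0x%x(%%rsp)%n", offset);
        }
    }
}
```

For a 10 KB frame the model yields 0x14000, 0x15000 and 0x16000, matching the listing above.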

Not all Java methods require a stack overflow check though. HotSpot has an optimization to skip stack banging for small leaf methods, i.e. the methods with no Java calls and with the frame size less than 1/8 page, see Compile::need_stack_bang.
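The rule for skipping the check can be written down as a one-line predicate. The sketch below models only the two conditions named above (no Java calls, frame smaller than 1/8 of a page). The names are invented, and the real Compile::need_stack_bang may take more factors into account and treat the exact boundary differently.

```java
final class StackBangPolicy {
    static final int PAGE_SIZE = 4096; // assumed page size

    /** A method needs a stack bang unless it is a small leaf method. */
    static boolean needsStackBang(int frameSizeBytes, boolean hasJavaCalls) {
        return hasJavaCalls || frameSizeBytes >= PAGE_SIZE / 8;
    }

    public static void main(String[] args) {
        System.out.println(needsStackBang(64, false));   // false: small leaf method
        System.out.println(needsStackBang(64, true));    // true: calls other Java methods
        System.out.println(needsStackBang(1024, false)); // true: frame too large
    }
}
```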

How large is the guard zone

All the guard and shadow zones can be resized with the corresponding JVM options. The defaults for Linux x64 are:

-XX:StackRedPages=1
-XX:StackYellowPages=2
-XX:StackReservedPages=1
-XX:StackShadowPages=20

These values affect the minimum stack size a thread can have. Given that HotSpot requires a minimum usable stack of 40 KB, the lower bound of the actual -Xss in JDK 11/12 is

min_stack_size = 40 KB + (1 + 2 + 1 + 20) * 4 KB = 136 KB

If I try to set a lower value, I get an error, as expected:

$ /usr/java/jdk-12.0.2/bin/java -Xss100k

The Java thread stack size specified is too small. Specify at least 136k
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
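The 136 KB lower bound can be recomputed for other zone settings with a tiny helper. This is just the formula above expressed in Java with invented names. It assumes 4 KB pages and the 40 KB usable minimum mentioned earlier, both of which may differ on other platforms or JDK versions.

```java
final class MinStackSize {
    /** Minimum -Xss in KB: usable stack plus all guard and shadow pages. */
    static int minStackKb(int usableKb, int redPages, int yellowPages,
                          int reservedPages, int shadowPages, int pageKb) {
        return usableKb + (redPages + yellowPages + reservedPages + shadowPages) * pageKb;
    }

    public static void main(String[] args) {
        // Linux x64 defaults: 1 red, 2 yellow, 1 reserved, 20 shadow pages of 4 KB each.
        System.out.println(minStackKb(40, 1, 2, 1, 20, 4) + " KB"); // 136 KB
    }
}
```

Raising -XX:StackShadowPages, for instance, raises this lower bound by 4 KB per extra page in the model.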

HotSpot JVM is not the only one that uses guard pages. Glibc also reserves a page or a few at the end of the stack when creating a new thread. However, since JVM already handles stack overflows on its own, an additional glibc guard would be a waste. That’s why HotSpot explicitly disables the glibc guard for all Java threads by calling pthread_attr_setguardsize with a zero argument.

Conclusion

HotSpot employs a very efficient stack overflow check, which is usually a single harmless store instruction. A further optimization skips the overflow check in the compiled code for small leaf methods.

Special mprotect’ed guard pages at the top of the stack are responsible for detecting overflows. Yellow pages detect regular Java stack overflows. Red pages are for handling fatal errors. Starting from JDK 9, an additional page is reserved below the yellow zone for graceful handling of overflows that occur in java.util.concurrent lock methods; however, the mechanism is not bullet-proof.

For more information about the Java stack, I invite you to watch my presentation
Everything you wanted to know about Stack Traces and Heap Dumps