With Pith

Ethan Petuchowski

Brief Overview of Garbage Collection in the HotSpot JVM

Disclaimer: This post only claims to represent my current understanding of the workings of the HotSpot VM’s garbage collector. This understanding comes from reading the references listed below, as well as making some assumptions. It is not necessarily fully correct.

Brief Overview

The garbage collector separates the heap into 5 sections of 4 types. I don’t know whether these spaces are “virtual” (i.e. remapped to discontiguous “pages” like Linux’s virtual memory addresses) or not. I suspect they are not.

  1. Eden — where objects go when they’re first allocated in the running program
  2. Survivors 1 & 2 — objects in Eden are moved here if they survive a minor (“young”) GC. Eden and the two survivor spaces together make up the “young generation”.
  3. Tenured (a.k.a. Old) — long-lived objects (those that have survived a configurable number of minor GCs) are moved here and stay here
    • We can tell the JVM to allocate all objects larger than n bytes directly into the old space (I believe the flag is -XX:PretenureSizeThreshold, and that not every collector honors it).
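  4. Permanent — this is where the JVM’s own metadata lives (e.g. loaded classes and their methods). JIT-compiled code, as far as I can tell, lives in a separate code cache rather than here. The permanent space is collected together with the tenured space during a full GC. (Java 8 replaced this space with Metaspace, which lives in native memory.)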
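To see how a running JVM actually names and splits these spaces, here is a minimal sketch using the standard java.lang.management API. The class name HeapPools and the output format are my own choices for illustration. The pool names it prints depend on the JVM version and the collector in use, so don't expect them to match the list above word for word.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of the JDK): lists the JVM's memory pools.
class HeapPools {
    // Returns one formatted line per memory pool the running JVM reports.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            // getUsage() may return null if the pool is no longer valid.
            MemoryUsage usage = pool.getUsage();
            long usedKb = usage == null ? 0 : usage.getUsed() / 1024;
            lines.add(String.format("%-35s %-8s used=%d KB", pool.getName(), kind, usedKb));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Running this under different collectors (for example with -XX:+UseSerialGC versus the default) is a quick way to check which of the spaces exist on a given JVM.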

The GC is arranged “generationally” because, according to the “generational hypothesis”, most objects die young, and the longer an object has already lived, the longer it is expected to keep living. So if we move the older objects into a separate bin, we can do quick, efficient, high-yield “minor GCs” in which we only collect garbage from Eden and the Survivor spaces.

Minor GC is triggered when Eden becomes full. Starting from the root references, it traces the set of live objects and moves them from Eden and one survivor space into the other survivor space (a.k.a. “copying” or “scavenging” collection, as opposed to “mark-and-sweep”, which leaves live objects in place). I guess this means we’re physically moving each object in RAM, because we have to update every reference to it to point at the new location.
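The copy-and-update-references step can be sketched with a toy model. This is not how HotSpot is implemented: Obj, minorGc, and the forwarding map are all names I made up, and real collectors copy raw memory rather than Java objects. The sketch only shows the idea that live objects are copied into a fresh space, unreachable ones are simply left behind, and every reference is redirected to the new copy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a copying ("scavenging") collection. All names are hypothetical.
class CopyingSketch {
    // A toy heap object: a name plus outgoing references.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Copies everything reachable from the roots into a fresh "to" space and
    // returns that space. The roots list is rewritten to point at the copies.
    static List<Obj> minorGc(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        Map<Obj, Obj> forwarding = new IdentityHashMap<>(); // old object -> its copy
        Deque<Obj> pending = new ArrayDeque<>();            // copied, children not yet scanned

        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), forwarding, toSpace, pending));
        }
        while (!pending.isEmpty()) {
            Obj old = pending.poll();
            Obj fresh = forwarding.get(old);
            for (Obj child : old.refs) {
                // Each reference in the copy points at the child's new location.
                fresh.refs.add(copy(child, forwarding, toSpace, pending));
            }
        }
        return toSpace; // anything never copied was garbage
    }

    private static Obj copy(Obj old, Map<Obj, Obj> forwarding,
                            List<Obj> toSpace, Deque<Obj> pending) {
        Obj existing = forwarding.get(old);
        if (existing != null) return existing; // already moved, reuse the copy
        Obj fresh = new Obj(old.name);
        forwarding.put(old, fresh);
        toSpace.add(fresh);
        pending.add(old);
        return fresh;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), garbage = new Obj("garbage");
        a.refs.add(b);
        b.refs.add(a);       // a cycle is fine: the forwarding map breaks it
        garbage.refs.add(a); // pointing at a live object does not keep you alive
        List<Obj> roots = new ArrayList<>(List.of(a));
        for (Obj survivor : minorGc(roots)) System.out.println(survivor.name);
    }
}
```

Note that the cost of this pass depends on the number of live objects, not on the amount of garbage, which is why it pays off when most objects die young.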

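Root references for minor GC come from the stack (I think this means the entire stacks of all running threads), from static fields, and from objects in the old space that point into the young spaces. HotSpot uses “dirty cards” as an optimization: the old space is divided into small fixed-size “cards”, and whenever a reference field in an old object is written, its card is marked dirty. A minor GC then only has to scan the dirty cards for references into the young spaces, rather than tracing through every object in the old space.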
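Here is a rough sketch of the card-marking idea, under my own assumptions. CardTableSketch, recordWrite, and drainDirtyCards are invented names, addresses are modeled as plain integer offsets into the old space, and the 512-byte card size is what I understand HotSpot's default to be, so treat it as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Toy card table: one dirty flag per fixed-size chunk ("card") of old space.
class CardTableSketch {
    static final int CARD_SIZE = 512; // bytes of old space per card (assumed value)
    private final boolean[] dirty;

    CardTableSketch(int oldSpaceBytes) {
        // Round up so a partial last card still gets a flag.
        dirty = new boolean[(oldSpaceBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    // Stand-in for the write barrier: call this whenever a reference field
    // at the given old-space offset is stored to.
    void recordWrite(int oldSpaceOffset) {
        dirty[oldSpaceOffset / CARD_SIZE] = true;
    }

    // What a minor GC would do: collect the indices of the dirty cards
    // (the only old-space regions it needs to scan) and mark them clean.
    List<Integer> drainDirtyCards() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                cards.add(i);
                dirty[i] = false;
            }
        }
        return cards;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096); // 8 cards
        table.recordWrite(0);
        table.recordWrite(100);  // same card as offset 0
        table.recordWrite(1030); // falls in card 2
        System.out.println(table.drainDirtyCards()); // prints [0, 2]
    }
}
```

The trade-off is a tiny cost on every reference store in exchange for a minor GC that scans a handful of cards instead of the whole old space.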

Full GC would intuitively be triggered by running out of space in the “tenured” or “permanent” bins, but this is not necessarily the case.
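One way to check when minor and full collections actually happen is to ask the JVM how many times each collector has run. The sketch below uses the standard GarbageCollectorMXBean API; the class name GcCounts is my own, and the collector names it prints vary by JVM version and flags. Which bean corresponds to “minor” and which to “full” is something you have to infer from the pools each one manages.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of the JDK): reports per-collector run counts.
class GcCounts {
    // Returns one line per collector: its name, how many collections it has
    // run so far (-1 if the JVM does not report it), and the pools it manages.
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", pools=" + String.join(", ", gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

Calling snapshot() before and after a burst of allocation shows which collector's count moved, which is a cheap way to test guesses about what triggers a full GC.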