If your GC is a moving collector, then absolutely this is something to watch out for.
There are, however, a number of runtimes that will leave memory in place. They are effectively just calling `malloc` for the objects and `free` when the GC algorithm detects an object is dead.
Go, the CLR, Ruby, Python, Swift, and I think node(?) all fit in this category. The JVM has a moving collector.
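For what it's worth, a minimal sketch of that non-moving malloc/free style, assuming a toy heap keyed by object id (`Obj`, `heap`, `roots`, and `collect` are invented names for illustration, not any runtime's actual code):

```python
# Toy non-moving mark-and-sweep: objects keep their addresses (here, dict
# keys) for their whole lifetime; dead ones are released in place, much
# like calling free() on a malloc'd block.
class Obj:
    def __init__(self, refs=()):
        self.refs = list(refs)    # outgoing references (ids of other objects)
        self.marked = False

heap = {}       # id -> Obj; entries never move
roots = set()   # ids reachable from stacks / globals

def collect():
    # Mark phase: trace everything reachable from the roots.
    stack = list(roots)
    while stack:
        obj = heap[stack.pop()]
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)
    # Sweep phase: anything unmarked is garbage; drop it without moving
    # any survivor.
    for oid in [i for i, o in heap.items() if not o.marked]:
        del heap[oid]
    for o in heap.values():
        o.marked = False
```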
Every garbage collector has to constantly sift through the entire reference graph of the running program to figure out which objects have become garbage. Generational GCs can trace the oldest generations less often, but that's about it.
Tracing garbage collectors solve a single problem really, really well: managing a complex, possibly cyclical reference graph. That problem is inherent to some domains, which makes GC irreplaceable there, but tracing collectors are just about terrible with respect to every other system-level or performance-related factor you might evaluate.
> Every garbage collector has to constantly sift through the entire reference graph of the running program to figure out what objects have become garbage.
There's a lot of "it depends" here.
For example, a reference-counting (RC) collector (like Swift and Python?) doesn't ever trace through the graph.
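As a concrete illustration (CPython-specific, since refcounting is an implementation detail there): acyclic garbage dies the instant its refcount hits zero, and only cycles need the tracing cycle collector.

```python
import gc
import weakref

class Node:
    pass

# Acyclic garbage is reclaimed the moment its refcount hits zero;
# no tracing pass is involved.
n = Node()
probe = weakref.ref(n)
del n
print(probe() is None)   # True: reclaimed by refcounting alone

# A reference cycle keeps both refcounts above zero, so only the
# (tracing) cycle collector can get it back.
a, b = Node(), Node()
a.other, b.other = b, a
probe = weakref.ref(a)
del a, b
print(probe() is None)   # False: unreachable but still alive
gc.collect()
print(probe() is None)   # True: the cycle collector reclaimed it
```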
The reason I brought up moving collectors is that, by their nature, they take up a lot more heap space, at least 2x what they actually need. The advantage of non-moving collectors is that they are much more prompt about returning memory to the OS. The JVM in particular has issues here because it has pretty chunky objects.
> The reason I brought up moving collectors is by their nature, they take up a lot more heap space, at least 2x what they need.
If the implementer cares about memory use, it won't. There are ways to compact objects that are a lot less memory-intensive than copying the whole graph from A to B and then deleting A.
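For instance, a sliding ("Lisp-2"-style) mark-compact pass keeps everything in the one region and only needs a forwarding table, not a full second semispace. A toy sketch, with `heap`, `live`, and `compact` all made up for illustration:

```python
# Toy sliding compaction. Live cells slide toward the start of the same
# region and references are rewritten through a forwarding table, so the
# collector never needs a second, full-size copy of the heap.
# heap: list of cells like {"data": ..., "refs": [indices into heap]}
# live: parallel list of booleans produced by a marking pass
def compact(heap, live):
    # Pass 1: assign each live cell its forwarding (new) index.
    forward = {}
    next_free = 0
    for i, is_live in enumerate(live):
        if is_live:
            forward[i] = next_free
            next_free += 1
    # Pass 2: slide cells down (new index <= old index, so nothing live is
    # overwritten before it has been copied) and rewrite their references.
    for old in sorted(forward):
        cell = heap[old]
        heap[forward[old]] = {
            "data": cell["data"],
            "refs": [forward[r] for r in cell["refs"]],
        }
    del heap[next_free:]   # everything past the live prefix is now free
    return heap
```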
It doesn't matter. The GC does not know which heap allocations are in memory vs. in swap, and since you don't write applications thinking about that, running a VM with a moving GC on swap is a bad idea.
Yeah, but in practice I'm not sure that really works well with any GCs today? I've tried this with modern JVM and Node VMs, and it always ended up with random multi-second lockups. Not worth the time.
MemBalancer is a relatively new analysis paper that argues having swap allows maximum performance by permitting small excesses, which avoids needing to over-provision RAM instead. The kind of GC doesn't matter, since data spends very little time in that state, and on the flip side, most of the time the application has access to twice as much memory to use.
Python’s not a mover, but the cycle breaker will walk through every object in the VM.
Also, since the refcounts are inline, adding a reference to a cold object will write to that object. IIRC Swift has the latter issue as well (unless the heap object’s RC was moved to the side table).
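On the CPython side, `gc.freeze()` (3.7+) is the usual way to keep the cycle collector from repeatedly walking cold, long-lived objects; it doesn't help with the inline refcount writes, though. A quick sketch:

```python
import gc

# A large, cold, long-lived structure (e.g. a cache loaded at startup).
cold_cache = {i: ("long", "lived", "payload", i) for i in range(100_000)}

gc.collect()                  # do one full collection while quiescent
gc.freeze()                   # park everything currently tracked in the
                              # permanent generation; later collections skip it
print(gc.get_freeze_count())  # how many objects were frozen

# Caveat: this only avoids the collector's traversal. Taking a new reference
# to one of these objects still writes its inline refcount, dirtying the page.
```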
A moving collector has to move data to somewhere and, generally by its nature, it's constantly moving data all across the heap. That's what makes it end up touching a lot more memory while also requiring more memory. On minor collections it'll move memory between 2 different locations, and on major collections it'll end up moving the entire old gen.
It's that "touching" of all the pages controlled by the GC that ultimately wrecks swap performance. But there's also the fact that moving collectors like to hold onto memory, since downsizing is pretty hard to do efficiently.
Non-moving collectors generally sit on top of C allocators, which are fairly good at avoiding fragmentation. Not perfect, and not as fast as a moving collector, but also fast enough for most use cases.
Java's G1 collector would be the worst example of this. It's constantly moving blocks of memory all over the place.
> It's that "touching" of all the pages controlled by the GC that ultimately wrecks swap performance. But there's also the fact that moving collectors like to hold onto memory, since downsizing is pretty hard to do efficiently.
The memory that's now not in use, but still held onto, can be swapped out.