
Almost invariably, "GC is bad" assumes (1) lots of garbage and (2) long pause times. Go has idiomatic value types (so it generates much less garbage) and a low-latency garbage collector. People who argue against GC are almost always arguing against some Java GC circa 2005.


What do you consider a long pause time?

In userland I consider anything above maybe 1 or 2 milliseconds to be a long pause time. The standards only get higher when it's something like a kernel.

A kernel with even a 0.2ms pause time can be unacceptable when working with ultra low-latency audio for example.


The criticisms I've heard typically reference pause times in the tens or hundreds of milliseconds. Agreed that different domains and applications have different requirements for pause times. I would be very interested to see histograms of pause times for different GCs, but I'm pretty sure the specific results would vary a lot depending on the specific corpus of applications under benchmark. If your 99%-ile pause time is tens of microseconds, is that good enough for video games? Audio kernels?


This is the no true scotsman argument. I mean, no true modern GC. And it's bullshit. Let's be topical and pick on Go since that's the language in the title:

https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...

30% of CPU spent on GC, individual GC pauses already in the milliseconds, despite a tiny half-gig heap in 2019. For gamedev, a single millisecond in the wrong place can be enough to miss vsync and have unacceptable framerate stutter. In the right place, it's "merely" 10% of your entire frame budget, for VR-friendly, nausea-avoiding framerates near 100fps. Or perhaps 90% if "single digit milliseconds" might include 9ms.

Meanwhile, the last professional project I worked on had 100ms pauses every 30 seconds because we were experimenting with duktape, which is still seeing active commits. Closer to a 32GB heap for that project, but most of that was textures. Explicit allocation would at least show where the problematic garbage/churn was in any profiler, but garbage collection meant a single opaque codepath for all garbage deallocation... without even the benefit of explicit static types to narrow down the problem.


From your link (which I remember reading at the time):

> So by simply reducing GC frequency, we saw close to a ~99% drop in mark assist work, which translated to a ~45% improvement in 99th percentile API latency at peak traffic.

Did you look at the actual article? (Because it doesn't support your point). They added a 10GB memory ballast to keep the GC pacer from collecting too much. That is just a bad heuristic in the GC, and should have a tuning knob. I'd argue a tuning knob isn't so bad, compared to rewriting your entire application to manually malloc/free everything, which would likely result in oodles of bugs.
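
(For anyone who hasn't read it: the ballast is, if memory serves, just a huge allocation that's never touched, so the pacer sizes collections against a much bigger "live" heap and runs far less often. A rough sketch, not their exact code:)

    package main

    import "runtime"

    func main() {
        // A huge, never-touched allocation. The pacer sizes the next collection
        // relative to the live heap, so this raises the target heap size and
        // makes collections far less frequent. Untouched pages are typically
        // only virtual memory, not physical RAM.
        ballast := make([]byte, 10<<30) // 10 GiB

        // ... run the actual server here ...

        // Keep the ballast reachable so it's never collected.
        runtime.KeepAlive(ballast)
    }

Which is exactly why it reads as a workaround for a missing knob rather than a real fix.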

Also:

> And it's bullshit.

Please, we can keep the temperature of the conversation down a bit by just keeping to facts and leaving out a few of these words.


> Did you look at the actual article? (Because it doesn't support your point).

I did and it does for the point I intended to derive from said article:

>> However, the GC pause times before and after the change were not significantly different. Furthermore, our pause times were on the order of single digit milliseconds, not the 100s of milliseconds improvement we saw at peak load.

They were able to improve times via tuning. Individual GC pause times were still in the milliseconds. Totally acceptable for Twitch's API servers (and in fact drowned out by the several hundred millisecond response times), but those numbers mean you'd want to avoid doing anything at all in a gamedev render thread that could potentially trigger a GC pause, because said GC pause will trigger a vsync miss.

> I'd argue a tuning knob isn't so bad, compared to rewriting your entire application to manually malloc/free everything, which would likely result in oodles of bugs.

Memory debuggers and RAII tools have ways to tackle this.

I've also spent my fair share of time tackling oodles of bugs from object pooling, meant to work around performance pitfalls in GCed languages, made worse by the fact that said languages treated manual memory allocation as a second-class citizen at best, providing inadequate tooling for tackling the problem vs languages that treat it as a first-class option.
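
(For concreteness, the pooling workaround usually looks something like this Go sketch - hypothetical handler, and the classic bug is the reset-before-reuse step:)

    package main

    import (
        "bytes"
        "fmt"
        "sync"
    )

    // A pool of reusable buffers, so request handling doesn't allocate
    // (and later collect) a fresh buffer every time.
    var bufPool = sync.Pool{
        New: func() interface{} { return new(bytes.Buffer) },
    }

    func handle(payload string) string {
        buf := bufPool.Get().(*bytes.Buffer)
        buf.Reset() // forgetting this reset is the classic pooling bug: stale state leaks between users
        defer bufPool.Put(buf)

        buf.WriteString("processed: ")
        buf.WriteString(payload)
        return buf.String() // String() copies, so returning before Put is safe
    }

    func main() {
        fmt.Println(handle("hello"))
    }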


> but those numbers mean you'd want to avoid doing anything at all in a gamedev render thread that could potentially trigger a GC pause, because said GC pause will trigger a vsync miss.

You might want to take a look at this:

https://queue.acm.org/detail.cfm?id=2977741


I have; it's a decent read - although somewhat incoherent. E.g. they tout the benefits of GCing when idle, then trash the idea of controlling GC:

> Sin two: explicit garbage-collection invocation. JavaScript does not have a Java-style System.gc() API, but some developers would like to have that. Their motivation is proactively to invoke garbage collection during a non-time-critical phase in order to avoid it later when timing is critical. [...]

So, no explicitly GCing when a game knows it's idle. Gah. The worst part is these are entirely fine points... and somewhat coherent in the context of webapps and webpages. But then one attempts to embed v8 - as one does - and suddenly you, the developer, are the one who might be attempting to time GCs correctly. At least then you have access to the appropriate native APIs:

* https://v8docs.nodesource.com/node-7.10/d5/dda/classv8_1_1_i...
* https://v8docs.nodesource.com/node-7.10/d5/dda/classv8_1_1_i...
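
To make the pattern concrete in the thread's language of choice, "explicitly GC when the game knows it's idle" is roughly the following sketch - hypothetical level hooks, with Go's runtime.GC standing in for whatever explicit API the embedded engine exposes:

    package main

    import "runtime"

    // unloadLevel and loadLevel are hypothetical placeholders for game code.
    func unloadLevel() { /* drop references to the old level's objects */ }
    func loadLevel()   { /* allocate the next level */ }

    func main() {
        unloadLevel()
        // Collect now, behind a loading screen, so the pause doesn't land
        // mid-frame later when a vsync deadline is in play.
        runtime.GC()
        loadLevel()
    }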

A project I worked on had a few points where it had to explicitly call GC multiple times back to back. Intertwined references from C++ -> Squirrel[1] -> C++ -> Squirrel meant the first GC would finalize some C++ objects, which would unroot some Squirrel objects, which would allow some more C++ objects to be finalized - but only one layer at a time per GC pass.

Without the multiple explicit GC calls between unrooting one level and loading the next, the game had a tendency to "randomly"[2] ~double its typical memory budget (thanks to uncollected dead objects and the corresponding textures they were keeping alive), crashing OOM in the process - the kind of thing that would fail console certification processes and ruin marketing plans.

[1]: http://squirrel-lang.org/

[2]: quite sensitive to the timing of "natural" allocation-triggered GCs, and what objects might've created what reference cycles etc.
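
The same onion-layer effect can be reproduced inside a single GC'd runtime whenever finalizers are what unroot the next layer. A minimal Go sketch (hypothetical types; the sleeps just give the finalizer goroutine a chance to run between passes):

    package main

    import (
        "fmt"
        "runtime"
        "time"
    )

    // Hypothetical stand-ins for the C++ -> Squirrel -> C++ layering:
    // finalizing the outer object is what drops the only reference to the inner one.
    type inner struct{ name string }
    type outer struct{ in *inner }

    func main() {
        in := &inner{name: "inner layer"}
        runtime.SetFinalizer(in, func(i *inner) { fmt.Println("finalized:", i.name) })

        out := &outer{in: in}
        runtime.SetFinalizer(out, func(o *outer) { fmt.Println("finalized: outer layer") })

        // Drop our roots; only the finalizer machinery keeps these alive now.
        in, out = nil, nil

        // One pass is not enough: the first GC queues outer's finalizer, and only
        // after it runs can a later pass see that inner is unreachable too --
        // hence GC'ing multiple times back to back.
        for i := 0; i < 3; i++ {
            runtime.GC()
            time.Sleep(10 * time.Millisecond) // let the finalizer goroutine run
        }
    }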


> So, no explicitly GCing when a game knows it's idle.

I mean, that is literally what the idle time scheduler in Chrome does. It has a system-wide view of idleness, which includes all phases of rendering and whatever else concurrent work is going on.

> Intertwined references from C++ -> Squirrel[1] -> C++ -> Squirrel meant the first GC would finalize some C++ objects, which would unroot some Squirrel objects, which would allow some more C++ objects to be finalized - but only one layer at a time per GC pass.

This is a nasty problem and it happens a lot interfacing two heaps, one GC'd and one not. The solution isn't less GC, it's more. That's why Chrome has GC of C++ (Oilpan) and is working towards a unified heap (this may already be done). You put the blame on the wrong component here.


> I mean, that is literally what the idle time scheduler in Chrome does. It has a system-wide view of idleness, which includes all phases of rendering and whatever else concurrent work is going on.

Roughly, but it has poor insight into a game's definition of "idle". Probably fine for most web games, but prioritizing, say, "non-idle" game-driven prefetching and background decompression over GC can be the wrong tradeoff.

> This is a nasty problem and it happens a lot interfacing two heaps, one GC'd and one not. The solution isn't less GC, it's more. That's why Chrome has GC of C++ (Oilpan) and is working towards a unified heap (this may already be done). You put the blame on the wrong component here.

Two non-GCed heaps don't have this problem, nor do two "GC"ed heaps if we use an expansive definition of GC that includes refcounting-only systems - it only arises when using multiple heaps where at least one of them is the deferred, scanny, finalizer-laden style of GC. While you're correct that "more GC" is a solution, it's not the only solution, it has its drawbacks, and holding GC blameless here is disingenuous when it's the only common element. That these GCs compose poorly with other heaps is a drawback of said GCs.

If I try mixing Squirrel and C#, I'll risk running into the same issues, despite both being GC based. I'm sure you'll agree that attempting to coax them into using the same heap is a nontrivial endeavor. I've been in the position of having two different JavaScript engines (one for "gameplay", one for UI) in the same project - but they were fortunately used for separate enough things to not create these kinds of onion layers of intertwined garbage. While silly from a clean-room technical standpoint, it's the kind of thing that can arise when different organizational silos end up interfacing - a preexisting reality that has been inflicted on me more than once.


> Please, we can keep the temperature of the conversation down a bit by just keeping to facts and leaving out a few of these words.

Sure. Let's avoid some of these words too:

> unsupported, tribalist, firmly-held belief that is unsupported by hard data.

Asking for examples is fine and great, but painting broad strokes of the dissenting camp before they have a chance to respond does nothing to help keep things cool.


I don't think you know what "no true scotsman" means--I'm not asserting that Go's GC is the "true GC" but that it is one permutation of "GC" and it defies the conventional criticisms levied at GC. As such, it's inadequate to refute GC in general on the basis of long pauses and lots of garbage, you must refute each GC (or at least each type/class of GC) individually. Also, you can see how cherry-picking pathological, worst-case examples doesn't inform us about the normative case, right?


>> And it's bullshit.

> cherry picks worst-case examples and represents them as normative

Neither of my examples is anywhere near worst-case. All texture data bypassed the GC entirely, for example, contributing to neither live object count nor GC pressure. I'm taking numbers from a modern GC with value types - one that you yourself say should be fine - and pointing out that, hey, it's actually pretty not OK for anything that might touch the render loop in modern game development, even if it's not being used as the primary language's GC.


> I don't think you know what "no true scotsman" means--I'm not asserting that Go's GC is the "true GC"

At no point in invoking https://en.wikipedia.org/wiki/No_true_Scotsman does one bother to define what a true scotsman is, only what it is not by way of handwaving away any example of problems with a category by implying the category excludes them. It's exactly what you've done when you state "People who argue against GC are almost always arguing against" some ancient, nonmodern, unoptimized GC.

Modern GCs have perf issues in some categories too.

> As such, it's inadequate to refute GC in general on the basis of long pauses and lots of garbage, you must refute each GC (or at least each type/class of GC) individually.

I do not intend to refute the value of GCs in general. I will happily use GCs in some cases.

I intend to refute your overbroad generalization of the anti-GC camp, for which specific examples are sufficient.

> Also, you can see how cherry-picking pathological, worst-case examples doesn't inform us about the normative case, right?

My examples are neither pathological nor worst-case. They need not be normative - but for what it's worth, they do exemplify the normative case of my own experiences in game development across multiple projects with different teams at different studios, when language-level GCs were used for general purposes, despite being bypassed for bulk data.

It's also exactly what titzer was complaining was missing upthread:

> I've seen this sentiment a lot, and I never see specifics. "GC is bad for systems language" is an unsupported, tribalist, firmly-held belief that is unsupported by hard data.


> At no point in invoking https://en.wikipedia.org/wiki/No_true_Scotsman does one bother to define what a true scotsman is, only what it is not by way of handwaving away any example of problems with a category by implying the category excludes them. It's exactly what you've done when you state "People who argue against GC are almost always arguing against" some ancient, nonmodern, unoptimized GC.

Yes, that is how an NTS works, which is why you should be able to see that my argument isn't one. If I had said the common criticism of GCs ("they have long pause times") is invalid because the only True GC is a low-latency GC, then I would have made an NTS argument. But instead I argued that the common criticism of GCs doesn't refute GCs categorically as GC critics believe, it only refutes high-latency GCs. I'm not arguing that True GCs are modern, optimized, etc, only that some GCs are modern and optimized and the conventional anti-GC arguments/assumptions ("long pause times") don't always apply.


Not to detract from your general point, but I believe that this specific situation was addressed in Go 1.18's GC pacer rework: https://github.com/golang/proposal/blob/master/design/44167-....
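
For completeness: newer Go releases also expose the knob directly, which (as far as I can tell) is the sanctioned replacement for the ballast hack. A minimal sketch, assuming Go 1.19+ for the soft memory limit; GOGC and GOMEMLIMIT are the environment-variable equivalents:

    package main

    import "runtime/debug"

    func main() {
        // Rough equivalent of the ballast, done explicitly: let the heap grow
        // toward a fixed soft limit instead of collecting proportionally to
        // the live set.
        debug.SetGCPercent(-1)         // turn off the proportional (GOGC) pacer
        debug.SetMemoryLimit(10 << 30) // collect as the 10 GiB soft limit is approached

        // ... run the application ...
    }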



