
Malloc has the exact same structure: it's hella fast until you run out of RAM and it returns an allocation failure.

The only difference between the two is that with GC you expect to run out of RAM and you expect the system to not just go down in screaming flames when that happens. And yes, that is harder to arrange. But it's not because there is anything inherently harder about GC. The only difference is the expectations about what will happen in the long run, and how much work the programmer has to do to prevent running into the worst-case scenario.



I can’t make any sense out of your equivalency assertion. GC won’t magically reclaim memory that is still referenced; there still comes a point where if you can’t demand-page new memory in, you’re going to fail.

Are you assuming the malloc-based system hasn’t been debugged and has severe memory leaks? Are you assuming a compacting collector will allow space to be reclaimed that free() wouldn’t?


> GC won’t magically reclaim memory that is still referenced

I never said it would. What I said was that making GC hard-real-time is no more difficult than making malloc/free hard-real-time. You can do manual memory management in a GC'd system. If you do that, the two systems are literally the same.

The only difference between GC and manual memory management is that in a GC'd system, unreferenced memory will eventually be reclaimed and in a manual system it won't. If you lose all references to an allocated block in a manual system, it's gone forever, whereas in a GC'd system you can get it back. This enables a programming style that puts less burden on the coder to prevent memory leaks. But none of this has anything to do with being hard-real-time. Hard-real-time is hard, period, end of story. GC is completely orthogonal to this. Hard-real-time is hard in a GC'd system and it is exactly as hard in a non-GC'd system. The only difference between the two is the same in both hard-real-time and non-hard-real-time cases: in a non-GC'd system, the coder has an extra burden to write code that doesn't leak. That's it.


GC is inherently harder. It has nothing to do with expectations.

You have all the proof burden of using malloc (total size of live objects must fit in memory and WCET of allocation can’t suck) plus the additional proof burden that the collector’s cycle is schedulable.


> GC is inherently harder.

You need to pay closer attention to the claim I am actually making. Of course GC is harder than malloc/free. But malloc/free is not hard-real-time by default [1], and making malloc/free hard-real-time is just as hard as (indeed is pretty much the same problem as) making GC hard-real-time. Hard-real-timeness is a hard problem that is almost entirely independent of the allocation/deallocation strategy.

[1] https://militaryembedded.com/avionics/safety-certification/j...


That linked article recommends using a custom allocation/deallocation strategy, because malloc/free do not know about your application's specific memory management requirements.


IIRC, malloc() is generally banned; instead, "database" approaches such as object pools (possibly partitioned, possibly not), or in fact GC, are used.
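
For concreteness, here is a minimal sketch of the kind of fixed-size object pool that tends to replace malloc() in those code bases. The type name, pool size, and free-list layout are made up for illustration; real systems usually have one pool per object type.

    #include <stddef.h>

    /* Hypothetical message type; a real system would define one pool per
       application-specific object type. */
    typedef struct msg {
        struct msg *next;           /* free-list link, unused while allocated */
        char payload[64];
    } msg_t;

    #define POOL_SIZE 128

    static msg_t pool[POOL_SIZE];   /* all storage reserved up front */
    static msg_t *free_list;        /* singly linked list of free slots */

    /* Run once at startup, before any deadlines apply. */
    void pool_init(void) {
        free_list = NULL;
        for (size_t i = 0; i < POOL_SIZE; i++) {
            pool[i].next = free_list;
            free_list = &pool[i];
        }
    }

    /* O(1): pop the head of the free list, or report exhaustion. */
    msg_t *pool_alloc(void) {
        msg_t *m = free_list;
        if (m != NULL)
            free_list = m->next;
        return m;                   /* NULL means the pool is empty */
    }

    /* O(1): push the slot back onto the free list. */
    void pool_free(msg_t *m) {
        m->next = free_list;
        free_list = m;
    }

Both operations are a couple of pointer moves, so their WCET is trivially bounded; that, rather than raw speed, is the property malloc() has trouble promising.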


I’m paying extremely close attention to the repeated claim that malloc is just as hard as GC for hard real time. That claim is false. See my previous comments for the reason why.


Here is your previous comment in its entirety:

> GC is inherently harder. It has nothing to do with expectations.

> You have all the proof burden of using malloc (total size of live objects must fit in memory and WCET of allocation can’t suck) plus the additional proof burden that the collector’s cycle is schedulable.

I'm sorry, but I don't find that at all persuasive. You haven't even presented an argument. All you've done is proclaim that I have the burden of proof. Sorry, but I have no such obligation. I am providing information that I believe in good faith to be true based on my experience, but it comes with no warranty of any kind. I may well be wrong. In particular, there may have been a research breakthrough in real-time malloc/free that I am unaware of. It has been quite a while (>20 years) since I kept up with this literature. But back when I was an expert there were fundamental reasons why it wasn't any easier to make malloc/free hard-real-time than GC, so a new development on this front would be a major breakthrough. It's possible that such a breakthrough has occurred and I just haven't heard about it. But it seems unlikely. It's not like I've been living in a cave, and a web search doesn't yield any recent new results.

But the appropriate response if you want to challenge me on this would be to point me to the publication where this alleged simple solution to hard-real-time malloc/free is described, not to simply proclaim that I have the burden of proof.


The proof burden isn’t on you. The proof burden is on someone trying to ship a hard real time system with a GC. That person has to convince themselves that their system meets its deadlines. In a system that uses GC, the GC is either:

- stop the world. Then you have to prove that the GC pause will never be so long that it causes you to miss deadlines and that the pause can’t happen so frequently (even if small) that you miss deadlines (a crude version of that obligation is sketched below). That’s impractical to do unless your heap is tiny. But some success stories of GC in hard RT did take the “lots of tiny heaps” approach, though I don’t know if they went as far as meeting the kind of proof burden that is demanded by the hardest of real time. With malloc, you could do tiny heaps if you wanted to, but you’d never be forced to do it just to appease malloc.

- the GC is a task. Then you have to prove that the task will meet its deadline. You have to do that for any task in a real time system. Unfortunately, the GC task’s deadline depends on allocation rate, which depends on the schedule of the entire rest of your system, plus a worst case allocation rate (bytes allocated per second) analysis. You don’t have to think about allocation rates if you’re using malloc. You just have to know how fast malloc is. With GC, you also have to know how fast it can allocate, and you need all that other stuff. It’s more work to get right.

- Work based incremental GC. These can avoid the schedulability analysis, sort of, but come with a lot of other baggage. The worst misfeature of these collectors is that the execution time of allocation is hella jittery. Sometimes very fast, other times very slow, though always O(1). That’s the kind of thing you could almost use in a hard RT system so long as the worst case isn’t too bad, but in practice it’s a nightmare because if you really assume that every allocation will perform as badly as the worst allocation case in one of these GCs then you’ll have to overprovision to the point of absurdity.

Most folks use a concurrent GC with some element of work-based collection. So, the GC is a task, and its schedule is somehow tightly coupled to the current allocation rate. That’s good enough for soft RT and it can be good enough for some things on the softer side of hard RT. But even that’s hard to pull off and involves a beast of a collector, tight integration between RTOS and GC, and tight integration of the GC and test plan. No matter how you do it, it’s way harder than malloc. You have all the problems of malloc plus other problems that malloc doesn’t have.
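
To make the stop-the-world option’s proof obligation a bit more concrete (a sketch, not a full response-time analysis): with C_i the WCET of hard task i, D_i its deadline, and P_max a bound on any single pause, you need at least something like

    C_i + P_{max} \le D_i \quad \text{for every hard task } i

under the assumption that at most one pause can land inside any one job’s window. Bounding how often pauses occur, so that assumption actually holds, is where the allocation-rate analysis creeps back in even for this option.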


It's true that malloc/free does not have these problems. It has different problems. A standard implementation of malloc/free is going to sometimes coalesce adjacent free blocks, or search a free list for a block of the right size, or do something else that isn't going to be O(1), and making it O(1) is not going to be significantly easier than making a GC O(1). And it is not true that GC has all the problems of malloc. GC can relocate objects. Malloc can't.
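
A schematic version of that free-list search, just to show where the non-constant cost comes from (the block layout is invented, and splitting, coalescing, and free() are all omitted):

    #include <stddef.h>

    /* Minimal free-list node: a size header followed by the payload. */
    typedef struct block {
        size_t size;                /* payload bytes in this free block */
        struct block *next;
    } block_t;

    static block_t *free_head;      /* head of the free list */

    /* First fit: walk the list until some block is big enough.  The walk
       length depends on how fragmented the list currently is, so the
       worst case is O(number of free blocks), not O(1). */
    void *first_fit_alloc(size_t size) {
        block_t **prev = &free_head;
        for (block_t *b = free_head; b != NULL; prev = &b->next, b = b->next) {
            if (b->size >= size) {
                *prev = b->next;            /* unlink the whole block */
                return (void *)(b + 1);     /* payload starts after the header */
            }
        }
        return NULL;                        /* no sufficiently large block */
    }

Production allocators use size-class bins and other tricks to make the common case fast, but the worst case is exactly the kind of thing a hard-real-time argument has to bound.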


The fact that GC can move objects is a plus for GC only if you don’t care about pauses. If you do care about pauses, then moving objects makes everything harder.


We seem to be talking past each other. Let's go back to basics:

Neither GC nor malloc/free (which I am going to start abbreviating as MF) are 100% guaranteed not to fail under all circumstances. As a trivial example, if you try to allocate more memory than you actually have physically available, both GC and MF will fail. That is not the only example of a circumstance under which they will fail, but that is irrelevant. The point is simply that there are circumstances under which both GC and MF can fail. This is true regardless of any real-time considerations. So the possibility of failure is a fact of life that cannot be discharged even with arbitrarily clever allocation technologies, and even in the absence of hard-real-time constraints.

Likewise, there are circumstances where both GC and MF are essentially equivalent. If (for example) you only ever allocate a finite amount of memory, and that amount is less than what is available, and you never free anything, and you never produce any garbage, then both GC and MF will work. As an ancillary observation, even the most naive implementation of GC and MF will return results in O(1) time, and the multiplicative constant will be very low. In other words, under these circumstances, both GC and MF can be reasonably considered to be "hard-real-time". Their behaviors will be essentially indistinguishable.

The only circumstance under which the question of GC vs MF becomes interesting at all is in more complicated situations, where you have complex allocation and de-allocation patterns, and the total amount of memory you are allocating over the lifetime of the program is more than the amount of physical memory available. Under those circumstances you have to be clever and re-use memory that has been previously allocated and freed, either manually, or by a collector.

Under those circumstances, the devil is very much in the details. It is not hard, for example, to come up with an allocation/deallocation pattern that will (almost certainly) cause an allocator that cannot relocate objects to fail, but which would succeed using a relocating allocator.
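
A deliberately contrived example of such a pattern: fill memory with small blocks, free every other one, then ask for one large block. The sizes below are arbitrary, and on a desktop allocator with a huge address space the big request will usually still succeed (it can grab fresh pages); the failure matters on the fixed-size heaps real-time systems actually use, where a non-moving allocator can have plenty of free bytes but no contiguous run, while a compacting collector could slide the survivors together.

    #include <stdio.h>
    #include <stdlib.h>

    #define N      1024
    #define SMALL  1024             /* 1 KiB blocks */

    int main(void) {
        static void *p[N];

        /* Fill the heap with small blocks... */
        for (int i = 0; i < N; i++)
            p[i] = malloc(SMALL);

        /* ...then free every other one: half the memory is now free, but
           it is scattered in 1 KiB holes. */
        for (int i = 0; i < N; i += 2)
            free(p[i]);

        /* One request for half the total: enough free bytes exist, but no
           single hole is big enough for a non-moving allocator. */
        void *big = malloc((size_t)(N / 2) * SMALL);
        printf("large allocation %s\n", big ? "succeeded" : "failed");
        return 0;
    }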

So although you are correct when you say that "if you care about pauses then moving objects makes everything harder," that is missing the point rather badly. The alternative to "making everything harder" is accepting failure in circumstances under which they would not have failed if you had "made everything harder".

Which brings me full circle to my original point: making things hard-real-time is hard, full stop. Hard-real-time GC is hard, but it is not the GC part that makes it hard, it is the hard-real-time part that makes it hard. And yes, of course you can make it less hard, but at the cost of having a system that is less capable, and which will fail in more circumstances. Making those kinds of trade-offs is part of what makes this problem hard. The differences between GC and MF are just part of the details. They are not the fundamental drivers of the difficulty of this problem.


Why not put option 2 on another CPU? That way you would only get minimal interruptions to the main workload.

Is that a done thing?


Sure. But that doesn’t solve the schedulability problem. You’d still have to prove that the GC task completes soon enough. What does soon enough mean? All of the hard things I mentioned. Parallelism may help with provisioning but it doesn’t reduce the proof burden.


Malloc does not have a constant or guaranteed runtime.

It leads to memory fragmentation and requires non-constant searches to find a free spot for the next malloc.


It’s extremely difficult to make anything in GC have constant time guarantees.

Although non real time GCs can defrag the heap, it’s a huge problem to do it in a real time GC.


Same problems as malloc and free. And because GC is abstracted away and has a faster amortized time per allocation, you have more resources to work with.

Look at the Metronome GC from IBM.

You basically have to guarantee a certain level of resources or a certain allocation rate. Those are things you have to do anyway in non-GC'd languages.

This whole conversation is moot. Hard real time does not mesh with dynamic allocation, unless your allocation behaviour is very well understood and predictable, in which case it ain't hard to design a GC that works for it.

Malloc/free are not free. They have their own limitations and performance characteristics.


In a real time system you probably shouldn't run out of RAM (or it should be a recoverable error, like dropping a packet).

But GC pause is inevitable if the GC can't reliably keep up (like a spike in workload).


> GC pause is inevitable

No, it isn't. That's the magic of hard-real-time GC. It isn't easy, but it is possible.

[EDIT: A more accurate claim is that GC pause is no more inevitable than an allocation failure in a malloc/free system.]


GC pauses are inevitable unless:

- you think it’s ok to crash your program with an OOM just because the GC couldn’t keep up with the allocation rate. To my knowledge, nobody ever thinks this is ok, so:

- you inevitably pause. Maybe it’s rare. Maybe you are “real time” only in the sense that you want responsiveness 99.99% of the time rather than 100% of the time.

- you prove schedulability. The “good” alternative that I would assume few people do is prove that the GC can keep up (i.e. prove that it is schedulable). To do that you need to have WCET and WCAR (worst case allocation rate) analyses and you need to feed those into a schedulability analysis (the core inequality is sketched below). Then there’s a bunch of ridiculous math. If you’re lucky, you can prove you’re schedulable, but it seems like a strictly worse dev plan than just writing C/Ada/Rust/whatever code that doesn’t GC.

I list the third option because I know it’s theoretically possible to do it, but I don’t recommend it and I don’t consider it to be a serious option if you’re building something that lives will depend on.
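
For what it’s worth, the memory side of that analysis compresses into one inequality. With L_max the worst-case live-set size, A_max the worst-case allocation rate (the WCAR above), T_gc an upper bound on how long one full collection cycle can take under the chosen schedule, and H the heap size, you roughly need

    H \;\ge\; L_{max} + A_{max} \cdot T_{gc}

i.e. the heap must hold everything that can be live plus everything the mutator can allocate while a cycle is in flight (fragmentation and floating garbage only tighten this). Establishing trustworthy bounds on A_max and T_gc in the first place is most of the ridiculous math.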


I agree it's very hard to do that when your code needs 90% of the CPU to keep up.

But lots of systems need less than 10% for the real-time code. Or less than 1%. In that case, it can be easy to convince yourself that straightforward incremental GCs can keep up.

Perhaps you'd argue those systems are over-provisioned, if their CPU is under 10% busy. But there may be good reasons to have extra speed, like supporting a management UI (running at non-RT priority). Or it may be a worthwhile tradeoff for reduced engineering effort. Gigaflop CPUs are like $50 these days, so if your realtime code only needs a megaflop you're likely to end up with very low CPU utilization.


> But lots of systems need less than 10% for the real-time code. Or less than 1%. In that case, it can be easy to convince yourself that straightforward incremental GCs can keep up.

Basically this, although it applies to memory as well, not just CPU. If you've got enough headroom, it's easy to prove that the application physically cannot allocate fast enough to outpace the GC. (Along the lines of: each call to cons does 1/N of a mark-sweep cycle, and you have more than N+M cons cells available, where M is the maximum the application ever has live simultaneously.) Reducing headroom from there is just a performance optimization.

(Proving the program never uses more than M units of memory is nontrivial, but you'd have to do that with malloc anyway, so GC doesn't cost any extra in that regard.)
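
In symbols, the claim is that something like

    \text{cons cells available} \;>\; N + M

suffices, where N is set by how much collector work each cons performs and M is the worst-case live set; the exact constant depends on how much garbage the collector lets float across a cycle boundary (a conservative version looks more like M + 2N). The useful property is that the bound is a static function of N and M, not of timing or scheduling, which is what makes it easy to check once you have headroom.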


If you’re running on the softer side of real time then it’s acceptable to just run an experiment that confirms that you really only use a small percentage of CPU and that to your knowledge the GC never blew up and made you miss a deadline.

But on the harder side of real time you’re going to have to prove this. And that is super hard even if you’re overprovisioning like crazy.


Agreed, for a sufficiently hard definition of hard real time.

Hard real time gets a lot of theoretical attention because it's sometimes provable. But whether the decisions you're making by the deadline are sensible, and not, say, cranking the horizontal stabilizer all the way down when one of the angle-of-attack sensors is broken, is far more consequential in most systems than missing a tick.

Companies definitely shoot themselves in the foot by focusing so hard on the hard real time constraint that they make it 10x harder to reason about the actual behavior of the system, and then they get the more important thing wrong. I've seen this in a few systems, where it's almost impossible to discover what the control loop behavior is from reading the code.


The problem with GC is that its preferred mode of operation is to stop the world for much more than one tick.

Scanning a respectable-size heap on a respectably fast machine, sans fancy GC optimizations, could easily take 30 seconds. Modern production GCs rarely pause you for 30 seconds. Real time GCs certainly try very hard to avoid ever doing that. But:

- The optimizations that make a GC run faster than 30 sec are speculative: they target empirically found common cases. Not common cases of something the programmer has control over, but common cases of heap shape, which is a chaotic function of the GC itself, the way the OS lays out memory, the program, the program’s input, and lots of other stuff. Those common case optimizations are successful enough that GC cycle times often look more like 30 milliseconds than 30 seconds. So, the terrifying thought if you’re using a GC in real time is: what if at the worst time possible, the heap shape becomes something that isn’t common case, and the GC that you thought took 30 ms now takes 30 sec.

- Real time GCs can let the program do work while the GC is happening, so even if it takes 30 seconds, the program can keep chugging along. But here’s the catch: no memory is being reclaimed until the GC reaches the end, and some or all memory allocated during the GC cycle will remain unfreed until the next GC cycle. So if your 30ms concurrent collector decides to take 30sec instead, you’ll either run out of memory and crash or run out of memory and pause the world for 30sec.

Basically, the more you know about RTGC, the less you’ll want to use them when lives matter.


Isn’t this where the “cruise missile program will run out of RAM in an hour but the maximum flight time before detonation is 30m so we good” story goes?


> with GC you expect to run out of RAM and you expect the system to not just go down in screaming flames when that happens

No, you don't. The main point of GC is that you don't have to think about memory as much as with manual memory management. In fact, most GC languages don't even have an API to explicitly handle an out-of-memory condition: the program or thread will just crash or abort. With malloc(), you can at least properly react to allocation failures [1], which is crucial for hard real-time systems.

[1] Like storing the runtime state to flash and restarting the software.


That's not a property of malloc()/free()[1], that's a property of the runtime system which can be satisfied both in GC'ed languages and those with manual memory management.

[1] To the point that code ported from Linux is rife with bugs when it runs on systems where malloc() can actually fail (99.9% of Linux setups never report malloc() failure)


You can only react to OOM when the OS gives you the necessary tools: under ordinary OSs, malloc will not fail even if it couldn’t actually allocate, since actual allocation only happens when one writes to the location.

But it doesn’t really apply to the topic at hand, just wanted to add.
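
Right, and the usual workaround on such systems (overcommitting Linux in particular) is to allocate everything up front, touch it, and lock it before the real-time loop starts, so the lazy page allocation you describe happens outside the deadline-critical path. A minimal sketch of that idiom, with an arbitrary arena size and the error handling trimmed:

    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    #define ARENA_BYTES (16u * 1024u * 1024u)   /* example size, 16 MiB */

    int main(void) {
        /* Reserve everything before real-time operation begins. */
        char *arena = malloc(ARENA_BYTES);
        if (arena == NULL)
            return 1;               /* the NULL check alone is not enough... */

        /* ...because with overcommit the pages may not be backed yet:
           touch every byte so the faults happen now, not inside the loop. */
        memset(arena, 0, ARENA_BYTES);

        /* Pin current and future pages so they cannot be paged out later. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            return 1;

        /* ...real-time work uses only the pre-faulted arena from here on... */

        free(arena);
        return 0;
    }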


> Malloc has the exact same structure: it's hella fast until you run out of RAM and it returns an allocation failure.

Doesn't that depend on the allocator, though? Some malloc()s will perform some cleanup depending on their internal structure, which is not deterministic.


Sure. But in general both malloc and cons are pretty zippy most of the time.



