Lisp can be “hard” real time [pdf] (2000)

willis936 · on Aug 25, 2022

Okay, now prove that your program written in Lisp will not miss cycles regardless of the data sent to it for at least the length of the mission. The issue is in being able to prove a program is deterministic. Wildcards like schedulers and garbage collectors sharing resources with code that needs to be real time are notoriously difficult to properly set bounds on behavior.

If you aren't doing that then it's not hard real time.

ajross · on Aug 25, 2022

That's... not really a correct description of "hard real time". All system architectures have performance edge cases. The idea of a "hard" real time system is that those be known and fully characterized, and that they can be mapped to equally-fully-characterized application requirements. That way you know all the failure modes ahead of time.

Not all aspects of a real time system need latency control like you're imagining. It's entirely possible to imagine, say, an aircraft autopilot or self-driving system (something that needs ~100ms round-trip latencies to approximate a human pilot) being implemented using a garbage collected runtime, as long as you can guarantee those GC pauses are acceptably small (lots of such systems exist).

The same would be a very hard ask of a piston engine controller, where the timing needs are closer to 100us. Nonetheless the (again to pick an arbitrary example) logging/telemetry subsystem of that engine controller doesn't need that. So as long as you have an OS that can partition the engine tasks from the low priority stuff in a reliable way, you can still imagine a Lisp/Java/.NET/Go/whatever runtime being used there.

Real time is a design philosophy, not a tech stack.

kaba0 · on Aug 25, 2022

Why would it be hard? If anything a fat runtime makes it easier - see hard-real time JVMs used in military tech.

Hard real time is not too hard in general as it is usually combined with quite low performance-requirements. Boundedness itself is the important property, as opposed to soft-real time which is much more finicky (video games, audio).

pizlonator · on Aug 25, 2022

Hard real time GC is hard.

The RTGC applications in military tech that I had heard of were on the soft end of hard real time. It was mission critical, not safety critical. Maybe there’s something newer that I hadn’t heard about though.

Some of the really bad problems are:

- How to make GC operations like allocation have the level of determinism that hard RT programmers want.

- How to prove that your GC can keep up with allocation rate while at the same time also only taking a deterministic amount of your time.

Not saying it can’t be done. It’s just hard.

lisper · on Aug 25, 2022

> Hard real time GC is hard.

Hard real time is hard, full stop. If you have malloc and free you will encounter the exact same real-time challenges as you will with GC.

Heck, you don't even need malloc and free. All you need is a WHILE loop to make hard-real-time extremely challenging.

pizlonator · on Aug 25, 2022

GC makes hard real time harder. Way worse than malloc.

GC usually works by amortization. 99.99% of allocations are hella fast and then there’s that one that triggers a one second GC that stops the world. There are algorithms to “spread out” the GC cycle over some span of time during which you interleave the GC and your actual code. But then proving that this is schedulable in the worst case is hell. So much hell. I get that real time is hard anyway already. GC just makes it worse.

lisper · on Aug 25, 2022

Malloc has the exact same structure: it's hella fast until you run out of RAM and it returns an allocation failure.

The only difference between the two is that with GC you expect to run out of RAM and you expect the system to not just go down in screaming flames when that happens. And yes, that is harder to arrange. But it's not because there is anything inherently harder about GC. The only difference is the expectations about what will happen in the long run, and how much work the programmer has to do to prevent running into the worst-case scenario.

KerrAvon · on Aug 25, 2022

I can’t make any sense out of your equivalency assertion. GC won’t magically reclaim memory that is still referenced; there still comes a point where if you can’t demand-page new memory in, you’re going to fail.

Are you assuming the malloc-based system hasn’t been debugged and has severe memory leaks? Are you assuming a compacting collector will allow space to be reclaimed that free() wouldn’t.

lisper · on Aug 25, 2022

> GC won’t magically reclaim memory that is still referenced

I never said it would. What I said that making GC hard-real-time is no more difficult than making malloc/free hard-real-time. You can do manual memory management in a GC'd system. If you do that, the two systems are literally the same.

The only difference between GC and manual memory management is that in a GC'd system, unreferenced memory will eventually be reclaimed and in a manual system it won't. If you lose all references to an allocated block in a manual system, it's gone forever, whereas in a GC'd system you can get it back. This enables a programming style that puts less burden on the coder to prevent memory leaks. But none of this has anything to do with being hard-real-time. Hard-real-time is hard, period, end of story. GC is completely orthogonal to this. Hard-real-time is hard in a GC'd system and it is exactly as hard in a non-GC'd system. The only difference between the two is the same in both hard-real-time and non-hard-real-time cases: in a non-GC'd system, the coder has an extra burden to write code that doesn't leak. That's it.

pizlonator · on Aug 25, 2022

GC is inherently harder. It has nothing to do with expectations.

You have all the proof burden of using malloc (total size of live objects must fit in memory and WCET of allocation can’t suck) plus the additional proof burden that the collector’s cycle is schedulable.

lisper · on Aug 25, 2022

> GC is inherently harder.

You need to pay closer attention to the claim I am actually making. Of course GC is harder than malloc/free. But malloc/free is not hard-real-time by default [1], and making malloc/free hard-real-time is just as hard as (indeed is pretty much the same problem as) making GC hard-real-time. Hard-real-timeness is a hard problem that is almost entirely independent of the allocation/deallocation strategy.

[1] https://militaryembedded.com/avionics/safety-certification/j...

pharrington · on Aug 25, 2022

That linked article recommends using a custom allocation/deallocation strategy, because malloc/free do not know about your application's specific memory management requirements.

p_l · on Aug 25, 2022

IIRC, malloc() is generally banned, instead of "database" approaches of object pools (possibly partitioned, possibly not), or in fact GC are used.

pizlonator · on Aug 25, 2022

I’m paying extremely close attention to repeated claim that malloc is just as hard as GC for hard real time. That claim is false. See my previous comments for the reason why.

lisper · on Aug 25, 2022

Here is your previous comment in its entirety:

> GC is inherently harder. It has nothing to do with expectations.

> You have all the proof burden of using malloc (total size of live objects must fit in memory and WCET of allocation can’t suck) plus the additional proof burden that the collector’s cycle is schedulable.

I'm sorry, but I don't find that at all persuasive. You haven't even presented an argument. All you've done is proclaim that I have the burden of proof. Sorry, but I have no such obligation. I am providing information that I believe in good faith to be true based on my experience, but it comes with no warranty of any kind. I may well be wrong. In particular, there may have been a research breakthrough in real-time malloc/free that I am unaware of. It has been quite a while (>20 years) since I kept up with this literature. But back when I was an expert there were fundamental reasons why it wasn't any easier to make malloc/free hard-real-time than GC, so a new development on this front would be a major breakthrough. It's possible that such a breakthrough has occurred and I just haven't heard about it. But it seems unlikely. It's not like I've been living in a cave, and a web search doesn't yield any recent new results.

But the appropriate response if you want to challenge me on this would be to point me to the publication where this alleged simple solution to hard-real-time malloc/free is described, not to simply proclaim that I have the burden of proof.

pizlonator · on Aug 25, 2022

The proof burden isn’t on you. The proof burden is on someone trying to ship a hard real time system with a GC. That person has to convince themselves that their system meets its deadlines. In a system that uses GC, the GC is either:

- stop the world. Then you have to prove that the GC pause will never be so long that it causes you to miss deadlines and that the pause can’t happen so frequently (even if small) that you miss deadlines. That’s impractical to do unless your heap is tiny. But some success stories of GC in hard RT did take the “lots of tiny heaps” approach, though I don’t know if they went as far as meeting the kind of proof burden that is demanded by the hardest of real time. With malloc, you could do tiny heaps if you wanted to, but you’d never be forced to do it just to appease malloc.

- the GC is a task. Then you have to prove that the task will meet its deadline. You have to do that for any task in a real time system. Unfortunately, the GC task’s deadline depends on allocation rate, which depends on the schedule of the entire rest of your system, plus a worst case allocation rate (bytes allocated per second) analysis. You don’t have to think about allocation rates if you’re using malloc. You just have to know how fast malloc is. With GC, you also have to know how fast it can allocate, and you need all that other stuff. It’s more work to get right.

- Work based incremental GC. These can avoid the schedulability analysis, sort of, but come with a lot of other baggage. The worst misfeature of these collectors is that the execution time of allocation is hella jittery. Sometimes very fast, other times very slow, though always O(1). That’s the kind of thing you could almost use in a hard RT system so long as the worst case isn’t too bad, but in practice it’s a nightmare because if you really assume that every allocation will perform as badly as the worst allocation case in one of these GCs then you’ll have to overprovision to the point of absurdity.

Most folks use a concurrent GC with some element of work based. So, the GC is a task, and its schedule is somehow tightly coupled to the current allocation rate. That’s good enough for soft RT and it can be good enough for some things on the softer side of hard RT. But even that’s hard to pull off and involves a beast of a collector, tight integration between RTOS and GC, and tight integration of the GC and test plan. No matter how you do it, way harder than malloc. You have all the problems of malloc plus other problems that malloc doesn’t have.

lisper · on Aug 26, 2022

It's true that malloc/free does not have these problems. It has different problems. A standard implementation of malloc/free is going to sometimes do heap defragmentation, or search a free list for a block of the right size, or something that isn't going to be O(1), and making it O(1) is not going to be significantly easier than making a GC O(1). And it is not true that GC has all the problems of malloc. GC can relocate objects. Malloc can't.

pizlonator · on Aug 26, 2022

the fact that GC can move objects is a plus for GC only if you don’t care about pauses. If you do care about pauses then moving objects makes everything harder.

lisper · on Aug 26, 2022

We seem to be talking past each other. Let's go back to basics:

Neither GC nor malloc/free (which I am going to start abbreviating as MF) are 100% guaranteed not to fail under all circumstances. As a trivial example, if you try to allocate more memory than you actually have physically available, both GC and MF will fail. That is not the only example of a circumstance under which they will fail, but that is irrelevant. The point is simply that there are circumstances under which both GC and MF can fail. This is true regardless of any real-time considerations. So the possibility of failure is a fact of life that cannot be discharged even with arbitrarily clever allocation technologies, and even in the absence of hard-real-time constraints.

Likewise, there are circumstances where both GC and MF are essentially equivalent. If (for example) you only ever allocate a finite amount of memory, and that amount is less than what is available, and you never free anything, and you never produce any garbage, then both GC and MF will work. As an ancillary observation, even the most naive implementation of GC and MF will return results in O(1) time, and the multiplicative constant will be very low. In other words, under these circumstances, both GC and MF can be reasonably considered to be "hard-real-time". Their behaviors will be essentially indistinguishable.

The only circumstance under which the question of GC vs MF becomes interesting at all is in more complicated situations, where you have complex allocation and de-allocation patterns, and the total amount of memory you are allocating over the lifetime of the program is more than the amount of physical memory available. Under those circumstances you have to be clever and re-use memory that has been previously allocated and freed, either manually, or by a collector.

Under those circumstances, the devil is very much in the details. It is not hard, for example, to come up with an allocation/deallocation pattern that will (almost certainly) cause an allocator that cannot relocate objects to fail, but which would succeed using a relocating allocator.

So although you are correct when you say that "if you care about pauses then moving objects makes everything harder," that is missing the point rather badly. The alternative to "making everything harder" is accepting failure in circumstances under which they would not have failed if you had "made everything harder".

Which brings me full circle to my original point: making things hard-real-time is hard, full stop. Hard-real-time GC is hard, but it is not the GC part that makes it hard, it is the hard-real-time part that makes it hard. And yes, of course you can make it less hard, but at the cost of having a system that is less capable, and which will fail in more circumstances. Making those kinds of trade-offs is part of what makes this problem hard. The differences between GC and MF are just part of the details. They are not the fundamental drivers of the difficulty of this problem.

ungamedplayer · on Aug 26, 2022

Why not put option 2 on another cpu, that way you will only get minimalistic interrupts to the main workload?

Is that a done thing ?

pizlonator · on Aug 26, 2022

Sure. But that doesn’t solve the schedulability problem. You’d still have to prove that the GC task completes soon enough. What does soon enough mean? All of the hard things I mentioned. Parallelism may help with provisioning but it doesn’t reduce the proof burden.

admax88qqq · on Aug 25, 2022

Malloc does not have constant nor guaranteed runtime.

It leads to memory fragmentation and requires non constant searches to find a free spot for the next malloc.

pizlonator · on Aug 25, 2022

It’s extremely difficult to make anything in GC have constant time guarantees.

Although non real time GCs can defrag the heap, it’s a huge problem to do it in a real time GC.

admax88qqq · on Aug 26, 2022

Sane problems as malloc and free. And due to GC being abstracted and faster amortized time per allocation you have more resources to work with.

Look at the Metronome GC from IBM

You basically have to guarantee a certain level of resources or a certain allocation rate. Things you have to do anyways in non GCd languages.

This whole conversation is moot. Hard real time does not mesh with dynamic allocation, unless your allocation behaviour is very well understood and predictable, in which case it ain't hard to design a GC that works for it.

Malloc/free are not free. They have their own limitations and performance characteristics.

silon42 · on Aug 25, 2022

In a real time system you probably shouldn't run out of RAM (or it should be a recoverable error, like dropping a packet).

But GC pause is inevitable if the GC can't reliably keep up (like a spike in workload).

lisper · on Aug 25, 2022

> GC pause is inevitable

No, it isn't. That's the magic of hard-real-time GC. It isn't easy, but it is possible.

[EDIT: A more accurate claim is that GC pause is no more inevitable than an allocation failure in a malloc/free system.]

pizlonator · on Aug 25, 2022

GC pauses are inevitable unless:

- you think it’s ok to crash your program with an OOM just because the GC couldn’t keep up with allocation rate. To my knowledge, nobody ever thinks this is ok, so:

- you inevitably pause. Maybe it’s rare. Maybe you are “real time” only in the sense that you want responsiveness 99.99% of the time rather than 100% of the time.

- you prove schedulability. The “good” alternative that I would assume few people do is prove that the GC can keep up (I.e. prove that it is schedulable). To do that you need to have WCET and WCAR (worst case allocation rate) analyses and you need to feed those into a schedulability analysis. Then there’s a bunch of ridiculous math. If you’re lucky, you can prove you’re schedulable, but it seems like a strictly worse dev plan than just writing C/Ada/Rust/whatever code that doesn’t GC.

I list the third option because I know it’s theoretically possible to do it, but I don’t recommend it and I don’t consider it to be a serious option if you’re building something that lives will depend on.

tlb · on Aug 25, 2022

I agree it's very hard to do that when your code needs 90% of the CPU to keep up.

But lots of systems need less than 10% for the real-time code. Or less than 1%. In that case, it can be easy to convince yourself that straightforward incremental GCs can keep up.

Perhaps you'd argue those systems are over-provisioned, if their CPU is under 10% busy. But there may be good reasons to have extra speed, like supporting a management UI (running at non-RT priority). Or it may be a worthwhile tradeoff for reduced engineering effort. Gigaflop CPUs are like $50 these days, so if your realtime code only needs a megaflop you're likely to end up with very low CPU utilization.

a1369209993 · on Aug 25, 2022

> But lots of systems need less than 10% for the real-time code. Or less than 1%. In that case, it can be easy to convince yourself that straightforward incremental GCs can keep up.

Basically this, although it applies to memory as well, not just CPU. If you've got enough headroom, it's easy to prove that the application physically cannot allocate fast enough to outpace to GC. (Along the lines of: each call to cons does 1/N of a mark-sweep cycle, and you have more than N+M cons cells available, where M is the maximum the application ever has live simultaneously.) Reducing headroom from there is just a performance optimization.

(Proving the program never uses more than M units of memory is nontrivial, but you'd have to that with malloc anyway, so GC doesn't cost any extra in that regard.)

pizlonator · on Aug 25, 2022

If you’re running on the softer side of real time then it’s acceptable to just run an experiment that confirms that you really only use a small percentage of CPU and that to your knowledge the GC never blew up and made you miss a deadline.

But on the harder side of real time you’re going to have to prove this. And that is super hard even if you’re overprovisioning like crazy.

tlb · on Aug 25, 2022

Agreed, for a sufficiently hard definition of hard real time.

Hard real time gets a lot of theoretical attention, because it's sometimes provable. Whether the decisions you're making by the deadline are sensible and not, say, cranking the horizontal stabilizer all the way down when one of the angle-of-attack sensors is broken, are far more consequential in most systems than missing a tick.

Companies definitely shoot themselves in the foot by focusing so hard on the hard real time constraint that they make it 10x harder to reason about the actual behavior of the system, and then they get the more important thing wrong. I've seen this in a few systems, where it's almost impossible to discover what the control loop behavior is from reading the code.

pizlonator · on Aug 25, 2022

Problem with GC is that its preferred mode of operation is to stop the world for much more than one tick.

Scanning a respectable-size heap on a respectably fast machine, sans fancy GC optimizations, could easily take 30 seconds. Modern production GCs rarely pause you for 30 seconds. Real time GCs certainly try very hard to avoid ever doing that. But:

- The optimizations that make a GC run faster than 30 sec are speculative: they target empirically found common cases. Not common cases of something the programmer has control over, but common cases of heap shape, which is a chaotic function of the GC itself, the way the OS lays out memory, the program, the program’s input, and lots of other stuff. Those common case optimizations are successful enough that GC cycle times often look more like 30 milliseconds than 30 seconds. So, the terrifying thought if you’re using a GC in real time is: what if at the worst time possible, the heap shape becomes something that isn’t common case, and the GC that you thought took 30 ms now takes 30 sec.

- Real time GCs can let the program do work while the GC is happening, so even if it takes 30 seconds, the program can keep chugging along. But here’s the catch: no memory is being reclaimed until the GC reaches the end, and some or all memory allocated during the GC cycle will remain unfreed until the next GC cycle. So if your 30ms concurrent collector decides to take 30sec instead, you’ll either run out of memory and crash or run out of memory and pause the world for 30sec.

Basically, the more you know about RTGC, the less you’ll want to use them when lives matter.

bombcar · on Aug 26, 2022

Isn’t this where the “cruise missile program will run out of RAM in an hour but the maximum flight time before detonation is 30m so we good” story goes?

adwn · on Aug 25, 2022

> with GC you expect to run out of RAM and you expect the system to not just go down in screaming flames when that happens

No, you don't. The main point of GC is that you don't have to think about memory as much as with manual memory management. In fact, most GC languages don't even have an API to explicitly handle an out-of-memory condition: the program or thread will just crash or abort. With malloc(), you can at least properly react to allocation failures [1], which is crucial for hard real-time systems.

[1] Like storing the runtime state to flash and restarting the software.

p_l · on Aug 25, 2022

That's not a property of malloc()/free()[1], that's a property of the runtime system which can be satisfied both in GC'ed languages and those with manual memory management.

[1] To the point that porting code originating from Linux is rife with bugs when you run on systems that have the possibility of malloc() failing (99.9% setups of linux never report malloc() failure)

kaba0 · on Aug 26, 2022

You can only react to OOM when the OS gives you the necessary tools - under ordinary OSs malloc will not fail, even if it couldn’t actually allocate - actual allocation only happens when one writes to the location.

But it doesn’t really apply to the topic at hand, just wanted to add.

alfiedotwtf · on Aug 25, 2022

> Malloc has the exact same structure: it's hella fast until you run out of RAM and it returns an allocation failure.

Doesn't that depend on the allocator though? Some malloc()s will perform some cleanup depending on their internal structure, which is not deterministic

lisper · on Aug 26, 2022

Sure. But in general both malloc and cons are pretty zippy most of the time.

agumonkey · on Aug 25, 2022

I wonder if people in RT domains just skip imperative programming to go into reactive (a la lucid/lustre/esterel) signals.

samatman · on Aug 25, 2022

Lua gets used on more embedded systems than you might expect, since it's small, fast, and doesn't need floats.

You turn the garbage collector off, tune the allowed step size, and trigger collection in steps when it's allowed. Having the Lua VM and GC around is not the hard part of getting hard real time guarantees out of such a system.

With Lua, those parts get written in C for the most part. With Common Lisp it's possible to use one language for both parts. At least as I'm told, by those who should know, I've not tried writing any CL which wasn't comfortably high level.

outworlder · on Aug 25, 2022

Lua is one of those amazing technologies that one wonders why it isn't just about everywhere.

Complex configuration files, embedded scripts, even browser scripting should all have been done in Lua.

(Or Lisp, but people have less of a visceral reaction when they see Lua syntax)

Theoretically some Common Lisp implementations could have GC disabled. Like you, I have not tried that in practice.

Karrot_Kream · on Aug 26, 2022

In my experience, Lua is everywhere, for some definition of everywhere :P I see it all over the place in so many different settings, embedded, high level scripting, gaming, etc etc

pdimitar · on Aug 26, 2022

I agree and I already used Lua successfully for scripting.

My gripe with it is: no easy parallelism and I don't mean OS threads. I suppose it's not a priority for them, sadly.

melony · on Aug 25, 2022

Wdym Lua has no floats? Lua defaults to doubles (big floats) not integers.

samatman · on Aug 25, 2022

What I said is that Lua doesn't need floats.

You can build Lua on a 32 bit chip which doesn't have a native float type, without floating point support. Which is handy.

LeifCarrotson · on Aug 25, 2022

As of Lua 5.3 (from 2015), Lua has both doubles and integers. The famous book "Programming in Lua", still often used and quoted, did apply to 5.2, which had only doubles.

samatman · on Aug 25, 2022

Worth adding that the only 5.1 anyone still uses (not you Warcraft), LuaJIT, has native support for C numeric types, including complex numbers. They can be referenced literally as 12LL, you can do arithmetic on them, but have to call `tonumber` to use them where Lua expects a 'number'.

lisper · on Aug 25, 2022

> notoriously difficult to properly set bounds on behavior

Actually it's trivial: just put a bound on the amount of available memory and do a worst-case analysis to GC it all.

The hard part is not engineering the system to have a bound, it is engineering the system so that the bound is reasonable for practical use.

OliverM · on Aug 25, 2022

Are there proofs for other languages/runtimes available? I'd love to see the level of rigour needed to demonstrate that

WJW · on Aug 25, 2022

One thing I know of is the CoPilot language/library/thingy from NASA (no relation to Githubs copilot), which can takes in code in a Haskell DSL and outputs code that is guaranteed to run in constant memory and uses an equal number of cycles each loop. See it here: https://copilot-language.github.io/

Also if you ever find yourself needing to write hard real-time code for your arduino, check out my blog post about the arduino-copilot library at https://wjwh.eu/posts/2020-01-30-arduino-copilot.html :)

OliverM · on Aug 25, 2022

Thank you, your blog post was a pleasure to read. CoPilot looks great!

Ono-Sendai · on Aug 25, 2022

My programming language Winter has bounded memory and space usage: https://github.com/glaretechnologies/winter, for at least a subset of the language.

naasking · on Aug 25, 2022

Deterministic programs are great, but that's not a prerequisite for hard realtime. A hard realtime system only needs to be guaranteed to satisfy a schedule.

jacquesm · on Aug 25, 2022

I read that as 'deterministic within the context of the hard real time portion' and that is a prerequisite otherwise you may end up overrunning your slot leading to a possible cascade of problems.

naasking · on Aug 25, 2022

It's not a prerequisite in that case either, as long as all possible execution paths fall within the time budget you're fine. Making the execution path deterministic is one way to do that and makes analysis simpler, but it's not the only way.

jacquesm · on Aug 25, 2022

A single loop with an undetermined end condition and you're toast. So that's why having deterministic behavior is a thing to strive for because it allows you to make a guarantee. If all of your code paths are 'straight through' you just tally up the longest sections and if that's below your budget then you know you are going to be ok in all situations.

pgorczak · on Aug 25, 2022

… if you’re running a single task at a time. Usually multiple tasks need exclusive access to shared resources. And they might be triggered by external inputs which are non-deterministic, making the number of possible system states you need to check intractable.

jacquesm · on Aug 25, 2022

Yes, hard real time is hard. So don't multi-task, use as simple a scheduler as you can afford and make sure all your time critical paths are limited.

I've built quite a few hard real time systems (machine control, mostly) and the penalty for messing that up would be either a toolstrike, loss of a workpiece, loss of a machine or loss of limb or life so I'm aware of the pitfalls and would definitely not want a multi-tasking OS as the core for anything hard real time. For soft real time that works fine, but for hard real time you need absolute control. In an extreme case that can mean a separate CPU running with all interrupts disabled (and NMI tied) so that you will never be surprised. Another CPU can handle concurrent stuff.

keithnz · on Aug 26, 2022

I've also done a bunch of machine control, and I generally really like the multi CPU architecture for real time control, if you are making your own hardware, I don't think it is too extreme. I've also found a lot of the systems end up being a mix of hard/soft realtime, where the bits that directly monitor/control the machine are hard, and then there is soft "decision" type stuff, farming them out to their own CPUs is often much easier to make guarantees of the hard real time bits.

hajile · on Aug 25, 2022

The halting problem says that you cannot make this guarantee for ANY sufficiently complex program.

Even a mere 9kloc L4 kernel (of intentionally not-so-dense C-code) took many man years to prove.

EDIT: Has anyone ever offered such a proof for a program that is just 100kloc? What about 1mloc? Do people actually believe that most non-trivial "realtime" programs are provably realtime?

tralarpa · on Aug 25, 2022

That might be the reason why the JPL coding standard says:

"All loops shall have a statically determinable upper-bound on the maximum number of loop iterations. It shall be possible for a static compliance checking tool to affirm the existence of the bound. An exception is allowed for the use of a single non-terminating loop per task or thread where requests are received and processed. Such a server loop shall be annotated with the C comment: /* @non-terminating@ */."

hajile · on Aug 25, 2022

It took around 5-10 man-years to verify the L4 kernel using a theorem prover and that's just 8700 LOC plus 600 lines of assembly.

Without such a theorem solver, it is all just "best practices" then hoping for the best.

ducktective · on Aug 25, 2022

Is there a tech stack or language that allows us to write formally proven and real-time (bounded-time tasks) for embedded systems?

What if we can be lenient on these conditions (like soft real time, 500ms bounds)?

I'm looking for something like shell scripting (easy and functional) + busybox set of utilities + language built-ins to write concurrent applications. Like an OS with a specialized shell scripting language. Does something like this exist?

deterministic · on Aug 28, 2022

I am currently playing around with Dafny and it is very promising. I have use it to generate proven correct high-performance C# code without too much efforts. It’s great fun actually. Like a really cool puzzle game.

psd1 · on Aug 26, 2022

cgi-bin

ablob · on Aug 25, 2022

Yet I can give you an estimate of the runtime of multiple programs I use to an accuracy within seconds (order of magnitude is hours). The halting problem itself is a good estimate for computational complexity (i.e. can my thing simulate a TM) and makes statements for the general case, but that's just it: the general case.

There are a whole class of programs/languages that are decidable (i.e. they will terminate) and that can be proven as well by automated means.

Now this does depend on what you mean with sufficiently complex, but for special programs you can prove a lot of things (by hand); the halting problem does not prevent that. However, a side effect is that you may not be able to prove that property for the program you want.

hajile · on Aug 25, 2022

> Yet I can give you an estimate of the runtime of multiple programs I use to an accuracy within seconds (order of magnitude is hours).

Can you do this for all possible inputs? If not, then your informal pinky swear that it really will get things done on time isn't worth very much.

monocasa · on Aug 25, 2022

Part of why L4 took man years to prove is that the proof contains all sorts of stuff like the formal model of ARM machine code. Ideally we'd get to a point where those models can be shared across projects, just like for running software we don't start each project by writing a compiler. There's still work to get there for sure, but just wanted to add context as someone who wants to see more formally proven software.

hajile · on Aug 25, 2022

That's an even bigger problem.

Running through every possible x86 instruction is out of the question. Even provably creating every possible instruction scenario would be all but impossible.

Caching is a very hard problem. Add in a bunch of neural networks to control those caches and programmers aren't even sure what's actually happening (let alone actually having formal verification).

Large, Out-of-order chips use millions of lines of code to implement their algorithms in hardware. They do lots of unit testing and some formal verification of the smaller bits, but the sum of the parts isn't formally verified.

It takes a half-decade to create a chip without formal verification. With complete formal verification, we might not even have the first Pentium ready to ship yet (if ever).

While we're on the topic, very few programming language implementations are formally verified. For example, L4 was verified using Isabelle which is written in StandardML and compiled using the PolyML compiler. I don't believe the PolyML compiler is formally verified. This means that Isabelle isn't formally verified when compiled which means that L4 isn't formally verified either (technically speaking). And of course, even if all of these were verified, there's the "trusting trust" issue and the fact that the compiler must be running on a formally verified CPU.

I appreciate striving for formal verification, but code verifiable from the bottom-up is a very long way away from happening.

typon · on Aug 25, 2022

No, that's a misunderstanding. It says you can't come up with a general algorithm for arbitrary programs to detect whether they will halt, it says nothing about particular programs.

hajile · on Aug 25, 2022

That is true in theory, but not in practice.

A large program is itself essentially composed of many subprograms. You must prove every single aspect of those subprograms (hard in and of itself) then you must prove that every possible combination of those subprograms is also provable. As there is no general formula for the halting problem, each combination must be uniquely proven.

This essentially means that if you have N possible branches, then the difficulty proving for each unique case would be the factorial of N.

Proving 9kloc of code for L4 took 5-10 man years. Adding just 1kloc of similarly dense code would likely add decades of man-years. What about 100kloc or 1mloc?

Even if you could find tricks to optimize similar branches, the fundamental factorial problem doesn't go away (as making it go away would mean that you've solved the provably insolvable halting problem).

In practice, only the most trivial of programs are completely provable before the heat death of the universe.

willis936 · on Aug 25, 2022

It's a high standard, but it is the standard for safety-critical systems. Nothing has changed about the problem, so the goal posts should stay where they are despite what new tools we have.

deterministic · on Aug 28, 2022

> The halting problem says that you cannot make this guarantee for ANY sufficiently complex program.

That is correct but also completely irrelevant. In practice all software we actually use and need can be written in languages that are slightly less powerfully than Turing complete languages. Look at CompCert, seL4, Dafny, Coq, LEAN, F-Star etc. etc.

umanwizard · on Aug 25, 2022

No it doesn’t. It says there are some programs for which you can’t make this guarantee.

freilanzer · on Aug 25, 2022

The emphasis being on "hard" real time. Does anyone know of Lisp in (hard) real time applications in current use?

Keyframe · on Aug 25, 2022

https://corecursive.com/lisp-in-space-with-ron-garret/

freilanzer · on Aug 25, 2022

Cool, but: "The year is 1988."

beanjuiceII · on Aug 25, 2022

does that really change anything? other than ya people were doing this in lisp way back when

dunefox · on Aug 25, 2022

The question was specifically about current usage.

varjag · on Aug 25, 2022

For hard RT you need both the OS and the language runtime to maintain certain constraints, and I am not aware of Lisp implementations satisfying that. Nothing to be concerned about tho, it's a fairly small niche that is not addressed by most of the popular languages today too.

(Doing soft-RT work in Common Lisp though)

rurban · on Aug 25, 2022

Lisps in Space are hard real time

freilanzer · on Aug 25, 2022

But AFAIK Lisp isn't in space anymore, at least actively, right? Wasn't even the remote control agent rewritten in C++?

vindarel · on Aug 25, 2022

It is, at least, in James Webb https://www.stsci.edu/contents/newsletters/2018-volume-35-is... for the scheduling system.

p_l · on Aug 25, 2022

Scheduling runs ground side as normal Unix app

omginternets · on Aug 25, 2022

Surely that doesn't matter? Seems like they can be real-time, even if they are no longer?

What am I missing?

freilanzer · on Aug 25, 2022

You're missing that I asked for current projects in Lisp.

omginternets · on Aug 25, 2022

That's completely irrelevant, though.

- Claim: "it is possible to write hard-real-time lisp, as evidenced by people having done it in the past."

- You: "nobody is doing it right now"

That's not a counterpoint.

freilanzer · on Aug 26, 2022

I have no idea what you are even talking about. I'm asking about current projects in Lisp for real time, so it's very relevant. You might need to work on your reading comprehension.

discreteevent · on Aug 25, 2022

Is there any evidence for that? Robotics is not necessarily hard real time. Hard real time means that something must complete within a tight deadline every time it executes no matter what.

ducktective · on Aug 25, 2022

>within a tight deadline

Isn't hard real-time only about upper-bound of tasks? Like if every task could be proven to run under t seconds (not necessary small), it is eligible for being called hard real-time.

0xcde4c3db · on Aug 25, 2022

Formally, yes. Practically, once t is large enough, violations of the constraint are much more likely to involve "acts of god" (power outages, rogue backhoe operators, natural disasters, wars, etc.) even if conventional best-effort techniques are applied.

user3939382 · on Aug 26, 2022

Can someone post this as HTML :’(

whage · on Aug 25, 2022

In every paper there comes a line at which I completely lose what they are talking about. If I'm lucky, this happens after the abstract/intro. In this case, this was it: "A real-time symbolic processing system on a single processor should have the following four features". What?

singlow · on Aug 25, 2022

real-time - being concerned with predictable latency rather than overall throughput

symbolic processing - symbolic expressions (s-expressions) being the building blocks of lisp programs

single processor - not considering multitasking or multiprocessor concerns

whage · on Aug 25, 2022

I appreciate the clarification but what I didn't get is what "symbolic processing" has to do with "hard real-time" (who cares what the language looks like?) and similarly, how is "single processor" related to "hard real-time"

martincmartin · on Aug 25, 2022

This terminology was common in the 1990s, before AI meant "machine learning" (aka adjusting parameters). Symbolic processing involves having an object in memory for things in the real world, and manipulating them. There's some vague relation to object oriented programming. The point is, it involves creating & destroying objects, memory allocation, that sort of thing. As opposed to purely numerical computing: for a neural net with a fixed architecture, there are a fixed number of operations (multiplying, adding) to get an answer. It's easy to do that in real time, since the computation is not only bounded, but of constant time. Symbolic manipulations can take variable amounts of time, and depend on the particulars of the inputs & problem being solved.

Lisp was championed by people in AI working on symbolic reasoning back in the 1990s.

Jtsummers · on Aug 25, 2022

Symbolic reasoning based AI goes back to at least the 1950s. You're missing almost a half-century of history there.

Jtsummers · on Aug 25, 2022

"A real-time symbolic processing system on a single processor should have the following four features"

The language they're developing in/for is Lisp, which is used (a lot) for symbolic processing. That's the "symbolic processing" part. "real-time" is the constraint they're trying to achieve and relates to system performance. "single processor" is just another constraint. If you permit multiple processors then some of the problems making Lisp real-time go away or are mitigated as you can do things like shove the GC bit into a second processor and execute it in parallel. Being constrained to a single processor means you can't do this, when the GC is going it's using the full CPU and taking time away from other computations (symbolic processing, in their case).

kazinator · on Aug 25, 2022

> who cares what the language looks like

I would guess, the paper cares what the language looks like since it's about Lisp being real-time.

Lisp has been dubbed as a "symbolic processing language" since its early days, since MacCarthy used that language in some papers.*

Basically it means the ability to juggle complex data structures that can represent formulas; e.g. solving algebraic equations (something easily hacked up in Lisp) is an example of symbolic processing.

This is entirely relevant to the topic because the dynamic allocation used in symbolic processing presents challenges to real-time processing, like garbage collection pauses, and whatnot. What if you want asynchronous interrupts to be able to allocate cons cells and other objects?

In a Lisp program, memory-allocating expressions casually appear throughout a program. Something like (list a b c), if we translate it to C, would call a malloc-like routine three times. Some people doing some kinds of embedded work even today eschew dynamic allocation.

---

* Like: Recursive Functions of Symbolic Expressions and Their Computation by Machine. Memo 8, Artificial Intelligence Project, RLE and MIT Computation Center, March 13, 1959, 19 pages. https://dspace.mit.edu/handle/1721.1/6096

_19qg · on Aug 25, 2022

"symbolic processing" means computing with symbols, not numbers.

Like: (simplify '(2 * a + 4 * a)) -> (6 * a).

An application for symbolic processing could be a rule-based system. The application then consists of a lot of rules and facts. Facts also could be symbols, not just numbers. Like a physical pressure value could be a number, but it could also be a qualitative value like low-pressure.

A relatively old application (originating in the mid 80s) is real-time control with a rule-based system in technical systems. Technical systems could be a fleet of satellites, a chemical plant, a cement production plant, ...

The software might get sensor inputs and based on rules may cause physical actions like opening or closing a valve, while getting more live sensor inputs, then possibly triggering other rules.

Typically Lisp systems have some kind of GC and most of the time this will not allow real-time applications. There are are some specialized Lisp systems which have no GC and the application code is written in a way to not dynamically allocate memory during runtime - one says that the style is non-consing, where CONS is the basic operation to allocate a new list cell. This way they avoid garbage collections. This allows some real-time applications, like the mentioned rule-based process control software. This is rare and exotic stuff. It's also considered 'old AI' technology.

The paper describes also something exotic: a concurrent Garbage Collector for a Japanese Lisp Machine, which runs a Lisp operating system - that was a thing when Japan was investing a lot into Artificial Intelligence in the 'Fifth Generation Project'. Such a GC would be useful so that the machine's network stack could be written directly in Lisp, but also for applications (similar to above control of external physical systems): robots, machines, production plants, ... It was thought that an important application domain for expert systems could be in industrial control systems.

Test0129 · on Aug 25, 2022

To be clear for OP hard real-time means that missing a response guarantee constitutes a total system failure.

srcreigh · on Aug 25, 2022

https://student.cs.uwaterloo.ca/~cs452/S22/

ducktective · on Aug 25, 2022

Thanks a bunch!

You got any more of these "Real-time Programming" courses? Or books?

auggierose · on Aug 25, 2022

It's from May 2000 (https://shibaura.pure.elsevier.com/en/publications/lisp-can-...)

mhdhn · on Aug 25, 2022

Thank you. Side note: who thinks it's a great idea not to put any publication date on a paper? And is there a rule for HN to put at least the year in the title, especially if it's not clearly visible at the top of the linked paper. There oughta be.

capableweb · on Aug 25, 2022

Yeah, it's stupid not to include it. Sometimes you can guess it (poorly) by looking at the references, if there is more than a few. If the latest one is at 1998, you can guess the paper is from somewhere around 2-5 years after.

In this particular case, latest publish reference is at 1998 and one marked as "to be published" at 2000 so that makes it a bit easier to guess pretty exactly :)

czbond · on Aug 25, 2022

Thank you. I was bewildered why one would go to the effort in 2022 to LISP hard, or LISP real time.

dreamcompiler · on Aug 25, 2022

I've built hard realtime systems in Lisp and I did it because it was easier than doing the whole thing in e.g. C. The extra effort for me would have been not using Lisp.

dunefox · on Aug 25, 2022

Sounds interesting. Can you elaborate?

dreamcompiler · on Aug 29, 2022

Basic idea is to divide the problem into two parts: The Fast and Dumb part and the Slow and Smart part. You can write the F&D part in assembler or C and implement it as a hard real time process. You can write the S&S part as a Lisp process that only has soft real time constraints. The two processes have to communicate somehow, e.g. message passing etc.

As long as the F&D process is simple enough and of bounded length, it can continue to operate even if the Lisp process is delayed by GC.

Not every real time problem can be subdivided in this way, but for those that can, Lisp can work well.

gibsonf1 · on Aug 25, 2022

This is very much outdated as current sbcl.org compiled lisp is in many cases faster than any other language.

LordGrey · on Aug 26, 2022

I love lisp and I currently use SBCL now that CCL is fading away, but I'm a bit skeptical of your claim. Can you provide sources to back that up?

(This is aside from the assertion that "faster than language x" is not the same as "adheres to hard realtime constraints" which is what the overall topic is about.)

gibsonf1 · on Aug 26, 2022

A good example here: https://github.com/fukamachi/woo and here: https://news.ycombinator.com/item?id=26465857

LordGrey · on Aug 26, 2022

Thanks, I'll check out your links.