My point is that this limitation can be bypassed if you are smart enough. The Linux kernel is more complex than any single BPF program, it has side effects.
For example, let say that I am an attacker who which to execute a loop to brute force a key using a new side channel from kernel space. eBPF programs are now available to userspace with "insert new shiny here". I add one as a performance tracepoint, then one in the packet filter, then one modifying a vtable somewhere. Now, I manage to make them oscillate and call each other while never leaving kernel space. I now "deadlock" the kernel with my small BPF programs until I manage to extract what I am after, then let the "deadlock" go away.
It take only 1 bug to make this happen. Each BFP program individually return, but only to be called again. If you let users put these in the kernel, this will happen eventually. This kind of tricks has been used over and over with other technologies.
And the recursion, even tail recursion, in BPF is limited for that very reason.
All eBPF programs have a statically verifiable worst case execution time.