Jets exist without a JIT. In fact, the whole point of distinguishing "jets" as an idea is that they're a potential feature of naive bytecode interpreters, rather than of JITing interpreters or of (potentially optimizing) compilers.
I mean, you can think of an interpreter with jets, but without a JIT, as an interpreter that passes the code it loads through a specific kind of JIT, one that does native codegen only for explicitly specified patterns, has no "generic" path, and leaves everything else as calls back into the interpreter.
But that's not actually what's happening, because there is no JIT in such interpreters. JITing necessarily happens either during code-loading or asynchronously on a profiling thread. Jets, meanwhile, happen within the instruction-stream decode stage of a VM, where the VM has enough of a read-ahead buffer that it can decode a long sequence of plain instructions matching a given pattern into a single intrinsic. Jets are "recognized" within the VM's instruction pipeline itself, each time the pipeline's state matches a given bytecode-sequence pattern. They're a register-transfer-level optimization.
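To make that concrete, here's a minimal C sketch of the idea (the opcodes and the fused pattern are made up for illustration, not taken from any real VM): the dispatch loop reads ahead in the instruction stream, and when the upcoming opcodes match a known pattern, it executes a native intrinsic and skips past the whole sequence.

```c
/* Toy bytecode interpreter with one "jet" (hypothetical opcodes):
   the pattern PUSH x; DUP; ADD is recognized at decode time and
   fused into a single native "push 2*x" step. */
#include <stdint.h>
#include <string.h>

enum { OP_PUSH, OP_DUP, OP_ADD, OP_HALT };

static const uint8_t DOUBLE_PAT[] = { OP_DUP, OP_ADD };

void run(const uint8_t *code, size_t len, int64_t *stack) {
    size_t ip = 0, sp = 0;
    while (ip < len) {
        switch (code[ip]) {
        case OP_PUSH:
            /* the jet: read ahead, match the pattern, run the intrinsic */
            if (ip + 4 <= len &&
                memcmp(&code[ip + 2], DOUBLE_PAT, sizeof DOUBLE_PAT) == 0) {
                stack[sp++] = 2 * (int64_t)(int8_t)code[ip + 1];
                ip += 4;  /* consume the entire recognized sequence */
                continue;
            }
            stack[sp++] = (int8_t)code[ip + 1];
            ip += 2;
            break;
        case OP_DUP: stack[sp] = stack[sp - 1]; sp++; ip++; break;
        case OP_ADD: sp--; stack[sp - 1] += stack[sp]; ip++; break;
        case OP_HALT: return;
        }
    }
}
```

Note that no code is generated anywhere: the match happens anew each time the decoder passes over that spot, which is exactly what distinguishes this from a JIT.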
In essence, jets are an implementation technique for bytecode interpreter optimization, alternative and complementary to a full JIT.
---
Also, I'm speaking about VMs here, but jets are something you can do just as well in a hardware CPU design, too.
An example of a common "hardware jet" is recognizing a multi-byte no-op sequence (that is, not a multi-byte NOP intrinsic encoded as one long instruction, but rather a sequence of single-byte NOPs) and making it have the same effect as a multi-byte no-op intrinsic of the same size (i.e., given a NOP sequence of length N, the replacement frees up the ALU and other later pipeline stages for N-1 cycles).
(There's probably another name this technique is known by in the hardware world—I'm not a hardware guy. I'm just highlighting the equivalence.)
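A software model of the same idea (toy decoder; the opcode value is x86-style, assumed purely for illustration): scan the run of single-byte NOPs up front and retire it in one step, rather than one per cycle.

```c
/* Sketch of NOP-run collapsing in a decoder model (illustrative only):
   a run of N single-byte NOPs is consumed as if it were one N-byte
   no-op, so nothing is issued downstream for it. */
#include <stddef.h>
#include <stdint.h>

#define OP_NOP 0x90  /* single-byte NOP, x86-style encoding assumed */

/* Returns the length of the NOP run starting at ip; the caller
   advances ip by that much in a single decode step. */
size_t nop_run_length(const uint8_t *code, size_t ip, size_t len) {
    size_t n = 0;
    while (ip + n < len && code[ip + n] == OP_NOP)
        n++;
    return n;
}
```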
And this isn't just a particular kind of microcode expansion, either. Microcode expansion is effectively a kind of cheap, throw-away JIT: CPUs expand their ISA instructions to microcode not during the decode stage of their pipeline, but rather when the instruction pointer's movement causes the CPU to copy a new chunk of a code page into a cache line. The cache line is expanded as a whole, and the result stored in a per-core microcode buffer. This works for regular instructions, since regular instructions are necessarily cache-line aligned; but the sort of composite, multi-instruction patterns a CPU might want to recognize aren't guaranteed to be cache-line aligned, so they won't get caught by this pass. A "hardware jet", on the other hand, can optimize these patterns just fine, since it happens at the register-transfer level. So CPU designers use both.
Thank you (seriously) for the detailed explanation.
JITs already do this, to my mind. What else are you going to call it when HotSpot sees bytecode for a loop initializing an array and replaces it with a call to memset? Special-casing instruction patterns is applicable whether or not you're doing native codegen. The big difference is that you have these VMs which have decided not to JIT, so they need to fix their perf problems somewhere else, because (and Nock is especially bad about this) a naive interpreter is unacceptably slow.
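For reference, the kind of rewrite being described looks roughly like this (a sketch over a toy loop representation; not HotSpot's actual IR or mechanism):

```c
/* Illustrative only: if pattern analysis has already shown that a
   loop just stores one constant byte across a contiguous array,
   replace the whole loop with a single memset call. */
#include <stddef.h>
#include <string.h>

typedef struct {
    char  *base;           /* array being initialized                 */
    size_t count;          /* loop trip count, in bytes               */
    int    value;          /* constant stored on each iteration       */
    int    is_simple_fill; /* set by an earlier pattern-matching pass */
} LoopInfo;

void lower_loop(const LoopInfo *loop) {
    if (loop->is_simple_fill) {
        /* the special case: one library call instead of `count` stores */
        memset(loop->base, loop->value, loop->count);
        return;
    }
    /* otherwise the loop runs (or compiles) the ordinary way */
}
```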