It's highly dependent on both workload and instruction set.
I'm sure it could be automatically optimized in theory, even without the solution being AI-complete, but I don't think we have any idea how to do it right now.
No, not unless you're reflashing an FPGA. You'd have better luck sharing subcores for threadlets, I think.