OpenSSL and similar libraries spend most of their time processing short packets. For example, encrypting a few hundred bytes using AES these days should take only a few hundred CPU cycles. This means that the overhead of calling the crypto code should be minimal, preferably 0. This is in part what I meant by "first-class". Perhaps I should have written "zero-overhead" instead.
I googled around just now for some benchmarks on the overhead of FFIs. I found this project [1], which measures the FFI overhead of a few popular languages. Java and Go do not look competitive there; Lua, surprisingly, came out on top, probably by inlining the call.
Before you retort that a few cycles do not matter that much, remember that OpenSSL does not run only on laptops and servers; it runs everywhere. What might be a small speed bump on x86 can be a significant performance problem elsewhere, so this cannot simply be ignored.
Those linked tests are extremely disingenuous; they only show the fixed cost of FFIs.
Considering that in C the plusone call takes 4 or so cycles and the Java example is 5 times slower, that's only 20 or so cycles. If the function we're FFIing into takes 400 cycles, that's only about a 5% decrease in speed. I'm willing to pay that price if it means not having to wake up to everything being vulnerable every couple of months.
This project attempts to measure the overhead of calling out from $LANGUAGE and into C, which is the reverse of what's necessary to solve the problem stated here — to write a low-level library in a high-level language.
There are other means of achieving a secure implementation, for example programming in a very high-level language such as Cryptol and compiling to a low-level language:
No, it's the exact problem we're faced with here: Calling OpenSSL from the outside is something you do a handful of times. The OP was concerned about parts of OpenSSL that require direct hardware access (thus, should be written in C). Because those parts of the code are extremely hot, having to cross FFI boundaries to reach them might be prohibitively expensive.
[1] https://github.com/dyu/ffi-overhead