
One cool advantage of having multiple compilers for a language is that you can use one as a check on the other.

For example, if you're worried that one of the compilers might be malicious, you can use the other compiler to check on it: https://dwheeler.com/trusting-trust

Even if you're not worried about malicious compilers, you can generate code, compile it with multiple compilers, feed the resulting programs the same inputs, and see where their outputs differ. This has been used as a fuzzing technique to detect subtle errors in compilers.
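A rough sketch of that comparison loop, in case it helps. Everything here is made up for illustration: `build_a` and `build_b` are plain Python functions standing in for the same program built by two different compilers, and in real use each one would wrap running an actual binary.

```python
from typing import Callable, Dict, Iterable, List, Tuple

def find_disagreements(
    builds: Dict[str, Callable[[int], int]],
    inputs: Iterable[int],
) -> List[Tuple[int, Dict[str, int]]]:
    """Return (input, outputs-by-build) for every input where the builds disagree."""
    disagreements = []
    for value in inputs:
        outputs = {name: run(value) for name, run in builds.items()}
        if len(set(outputs.values())) > 1:
            disagreements.append((value, outputs))
    return disagreements

# Hypothetical stand-ins for one program compiled by two compilers.
def build_a(x: int) -> int:
    return x * 2

def build_b(x: int) -> int:
    # Simulated miscompilation: wrong answer for one specific input.
    return x * 2 if x != 7 else 15

report = find_disagreements(
    {"compiler_a": build_a, "compiler_b": build_b},
    range(10),
)
```

A disagreement only tells you that at least one build is wrong (or that the spec allows both answers), not which one. What you get is a small reproducer to start from.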



> For example, if you're worried that one of the compilers might be malicious, you can use the other compiler to check on it: https://dwheeler.com/trusting-trust

This still requires the use of a trusted compiler though. Comparing two arbitrary compilers shows whether there is consensus; it does not give guarantees about correctness.

From the link.

    In the DDC technique, source code is compiled twice: once with a second
    (trusted) compiler (using the source code of the compiler’s parent), and then
    the compiler source code is compiled using the result of the first
    compilation. If the result is bit-for-bit identical with the untrusted
    executable, then the source code accurately represents the executable.


First, I forgot to disclose: I am the author of https://dwheeler.com/trusting-trust .

As discussed in detail in that dissertation, if you are using diverse double compiling to look for malicious compilers, the trusted compiler does not have to be perfect or even non-malicious. The trusted compiler could be malicious itself. The only thing you're trusting is that the trusted compiler does not have the same triggers or payloads as the compiler it is testing. The diverse double compiling check merely determines whether or not the source code matches the executable given certain assumptions. The compiler could still be malicious, but at that point the maliciousness would be revealed in its source code, which makes the revelation of any malicious code much, much easier.
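Here is a toy model of the check, to make the two stages concrete. It is only a sketch under loud assumptions: an "executable" is a byte string laid out as `style|source[|backdoor]`, `load` is a hypothetical stand-in for running a compiler binary, and the compiler is assumed to be deterministic and able to regenerate itself. None of this is the tooling from the dissertation.

```python
from typing import Callable

# A "compiler" here is any callable mapping source text to executable bytes.
Compiler = Callable[[str], bytes]

def load(executable: bytes) -> Compiler:
    """Toy stand-in for running a compiler binary (layout: style|source[|backdoor])."""
    fields = executable.decode().split("|")
    behaviour_source = fields[1]       # the source this binary was built from
    infected = len(fields) > 2         # a hidden payload that copies itself forward
    out_style = behaviour_source.split("=")[1]  # "style=A" -> emits "A"-style code

    def compile_source(source: str) -> bytes:
        parts = [out_style, source] + (["backdoor"] if infected else [])
        return "|".join(parts).encode()

    return compile_source

def trusted_compiler(source: str) -> bytes:
    # A different compiler: same meaning, different bytes ("T"-style output).
    return f"T|{source}".encode()

def ddc_matches(untrusted_exe: bytes, compiler_source: str, trusted: Compiler) -> bool:
    stage1 = trusted(compiler_source)        # compile the source with the trusted compiler
    stage2 = load(stage1)(compiler_source)   # compile it again with the stage-1 result
    return stage2 == untrusted_exe           # bit-for-bit comparison

SOURCE = "style=A"
honest_exe = b"A|style=A"
subverted_exe = b"A|style=A|backdoor"   # regenerates itself, so self-compilation hides it
```

In this toy, `ddc_matches(honest_exe, SOURCE, trusted_compiler)` is true and the subverted binary fails the comparison, even though each binary reproduces itself exactly when it compiles `SOURCE`. The trusted compiler only needs to lack that particular payload.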

You're absolutely right about the general case merely showing consistency, not correctness. I completely agree. But that still is useful. If two compilers agree on something, there is a decent chance that their behavior is correct. If two compilers disagree on something, perhaps that is an area where the spec allows disagreement, but if that is not the case then at least one of the compilers is wrong. The check by itself won't tell you which one is wrong, but at least it will tell you where to look. In a lot of compiler bugs, having some sample code that causes the problem is the key first step.


Ha, I didn't even notice the username! I agree consensus (or lack thereof) is a useful property to demonstrate. I think I may have been a bit of a pedant in my prior comment.


Sounds fascinating. Are there real-world examples of malicious compilers?


Yes, there was a malicious compiler system for Apple iOS that was released in China a few years back and subverted a large number of mobile applications, including apps used in the US and Europe. There was also a subverted Delphi compiler a number of years back, though I don't think the subversion was dangerous; it was more like a test case. And of course, Ken Thompson demonstrated the attack in the 1980s. There may be others, but I remember those offhand.


IIRC this was feasible because people in China are behind the GFW which throttles/blocks the mac app store, so most people download from in-country caches, which circumvents a lot / all of the app signing that Apple uses.


i read a story about a compiler adding malware to the compiled binary once.

they kept getting owned until they supposedly found a pretty dumb hack which just appended the backdoor to the final compilation on the build server...

no clue if it was just a story though, as i personally haven't experienced anything like that before.


I don't think this is what you're looking for, but Coding Machines[1] is a great little story in which the Ken Thompson hack[2] plays a role.

[1]https://www.teamten.com/lawrence/writings/coding-machines/

[2]https://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html


Yes, that's right, that's another story about a subverted compiler. I don't have any way to verify it, but I have no reason to doubt the story. It is quite possible, and not even that difficult to do if you want to be that malicious. I don't have a URL for it, maybe someone else can provide that.


Neat. Reduces attacks to conspiracies.


Please don't quote with code blocks. Makes reading on mobile very difficult.

The quote reformatted:

> In the DDC technique, source code is compiled twice: once with a second (trusted) compiler (using the source code of the compiler’s parent), and then the compiler source code is compiled using the result of the first compilation. If the result is bit-for-bit identical with the untrusted executable, then the source code accurately represents the executable.


Yep! This is a very good property, and part of why mrustc is a big deal.



