Being Sneaky in C (codersnotes.com)
158 points by alexggordon on June 12, 2015 | hide | past | favorite | 88 comments


I like the original Ken Thompson sneakiness in C (http://electronicdesign.com/dev-tools/thompson-ritchie-and-k...):

"Also in his Turing Award lecture, he described how he had incorporated a backdoor security hole in the original UNIX C compiler. To do this, the C compiler recognized when it was recompiling itself and the UNIX login program. When it recompiled itself, it modified the compiler so the compiler backdoor was included. When it recompiled the UNIX login program, the login program would allow Thompson to always be able to log in using a fixed set of credentials."


That really isn't the same thing, though. That's language-agnostic conceptual sneakiness that happened to be implemented in C.



OpenBSD specifically modified malloc() a few years ago to prevent this sort of sneakiness (http://www.tw.openbsd.org/papers/eurobsdcon2009/otto-malloc.... [pdf]). So they route their malloc() calls through mmap() which returns randomized pages, and free() immediately returns memory to the kernel rather than leaving it mapped in the current process.

I'd be surprised if these changes haven't made it into FreeBSD, but afaik Linux doesn't work this way (by default, anyway).


I tested it on Ubuntu 14.04 x64, and the problem occurs there.

And of course, some programs use replacements like dlmalloc and do all their own allocation management anyway.


> And of course, some programs use replacements like dlmalloc and do all their own allocation management anyway.

Yeah. I wrote my own allocator in C++ a long time ago. I wouldn't be surprised if there were quite a few other bits of software out there doing the same thing.


Wasn't Heartbleed due to exactly that - using their own local allocator instead of calling malloc/free?


Partly. They were using their own allocator (openssl_malloc()), but even then they would've been OK if it weren't for the off-by-one error elsewhere in the heartbeat implementation. If they were using an OS-supplied malloc() instead of openssl_malloc(), the bug would've still been exploitable on some operating systems, but not others.

Either way, "don't write your own allocator" is a good lesson to learn.

Unless, of course, you're doing it for fun. In which case, efficient heap management really is a neat exercise.



If your claim is that OpenBSD routes every call to malloc/free through mmap/munmap, it is simply not believable.

Do you have a URL to a git ... hahaha, pardon me, CVS repo?


http://bxr.su/OpenBSD/lib/libc/stdlib/malloc.c#omalloc ^Not the CVS repo, but close enough for those who want to check OpenBSD's malloc implementation. The malloc.conf manpage is also an interesting read: http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man5/...

And some context around malloc and the various techniques OpenBSD implements to make exploitation of bugs more difficult: http://www.openbsd.org/papers/dev-sw-hostile-env.html

The tl;dr is: if your code compiles and runs on OpenBSD, chances are it will run fine on other Unices. The opposite is not necessarily true, though.


I don't think that source refutes the parent's point: only large allocations go directly to mmap/munmap as opposed to being cached. Of course, there are other anti-exploitation measures too...



Well you could read the paper he just linked to.


It also makes the assumption that it's a little-endian system. On a big-endian system, the high order byte of the timestamp would be modified, which would probably be too obvious.


"In C/C++, you can use bugs in one part of a program to cause trouble in another. That’s pretty darn underhanded."

I would argue every language has that property. But with C/C++ being so closely tied to the ABI of the machine perhaps they are more underhanded than others. But to me, this branding does feel a bit unfair.

Still, a fun contest and an interesting read.


You can always cause that kind of bug, but I would argue that C/C++ is massively more underhanded than other languages.


The description of the bug in surveil.txt in the source archive was a bit easier for me to understand, really nifty :)


Would setting the malloc'd memory back to the original message before freeing it solve the problem?


Yes. You could also memset any memory to zero once you're done with it, which isn't a bad idea for applications with high security requirements.


...Except, of course, the compiler can and will optimize away any such memset.

Do not attempt to write secure software with C / C++. It is a Bad Idea(TM). Because what you wrote is not what gets run.

For instance: there is no way to write a portable secure memset in portable C / C++. (You can write a "secure" memset that works in all current compilers, but that is not the same thing. What doesn't get optimized now can and will be optimized tomorrow.)


> ...Except, of course, the compiler can and will optimize away any such memset.

It's dependent on the function attributes of memset (e.g. __attribute__((pure))) - it won't always be optimized out.

https://www.cs.auckland.ac.nz/references/c/gcc4.7/Function-A...


I am not talking about whether it does currently optimize away a function call to an alternative to memset; I am talking about whether it is allowed to optimize it away.

The compiler is allowed to optimize away that function call regardless of whether it is memset or your own alternative.

Note that this is not the same thing as saying it does currently optimize away that function call.


I'm not aware of any portable language that offers semantics that guarantee data is never rendered accessible to the outside world (I could well believe that current implementations don't, but I'm not aware of anything that specifies it). It always comes down to platform-specific APIs. But that's not a reason to not use C - unless you can suggest a better alternative?


Quite frankly, assembly currently is about it. And that's not the most portable thing out there, I know.

I think that C or C++ could, without too much effort, support semantics that would allow for this sort of thing. Something as simple as a "secure" keyword that could be applied to variables (where it means "leak as little as possible about this variable when it goes out of scope") or functions (where it means the same, but for the function itself and all locals of the function).



...I do not think the word "portable" means what you think it means.


I don't get it, the compiler operates within the memory model specified by the language. If it "optimizes" a memset it does not change the behavior of the program (or it is a bug in the compiler which is a different topic).


Common misconception with C. A pointer does not mean a pointer to a sequence of capacitors in your RAM. It really means a pointer to an abstract and temporary variable. How this abstract variable is realized on your hardware is implementation specific. Everything except input, output and explicitly defined side effects (volatile) is of no interest.

Really, you could print a C program on a piece of paper and ask someone to "execute" the program in their head given some input x. How they "implement" memset will surely differ from what a computer would do, and if you only ask for the output y, they will surely see that this memset doesn't affect y at all and skip doing it.


You're correct - the compiler operates within the memory model of the language. But C / C++'s memory model is broken w.r.t. security.

There is no way to ensure that something is actually overwritten, because under the memory models of C and C++ you cannot ever read that memory again, even though in actuality you can.


I believe volatile forces the compiler to issue real writes to memory that can't be optimized away.


Except, of course, that the compiler can perfectly legally copy the variable behind the scenes and not overwrite the copy.

I mean it when I say you cannot.


I'm not sure what you mean here. If you have a volatile pointer that points to a memory buffer returned by malloc, how can the compiler prevent a write through the pointer from happening?

Edit: unless your point is a temporary copy can be spilled in memory and this copy will stay in memory and won't be overwritten?


Yes, your edit is correct.


Why not generate random data then read it? Or even a constant, then read it?


Because the compiler will optimize it out. Even if you return the random data / constant the compiler will optimize out the store to the variable and just pass it through directly.


The problem is that the memory model specified by the language is a subset of the memory model specified by the hardware. This leads to exploitable systems when you lift those blinders.


So what is someone working with embedded devices for IoT supposed to do? Much of that work is based on ARM C/C++ Compilers.


Use a compiler that has extensions that _do_ guarantee that memory gets erased. That's what the gcc function attributes are for.

Alternatively, use a library that the C compiler doesn't know enough about to attempt removing calls into it.

If you copy your standard library's memset in a separate DLL that is not the standard C library, the compiler will not even see the code during compilation, so it has to compile a function call.

The linker (or a JIT in your C runtime) is allowed to remove calls to the function, if it can prove that it doesn't have side effects. However, to prove that, it has to look at the assembly of the function; it cannot use the far simpler heuristic of "it came from <memory.h> and is called memset".


> If you copy your standard library's memset in a separate DLL that is not the standard C library, the compiler will not even see the code during compilation, so it has to compile a function call.

Although I'd like things to behave this way, I don't think this is true. The C standard library was incorporated into the language spec for C89. The behaviors of the named functions within it are specified, and the compiler is allowed to inline its own version (ignoring your custom code) and then optimize out the inlined portion.

So while it's possible that the external linkage approach still works with certain compilers, it's not portable. I believe you are OK with the external approach if you use a non-standard name (my_secure_memset_pretty_please()), but that just shifts the problem to forcing the compiler to generate your external function without making the same dangerous optimizations.

In the end, I fear you are left with three options: blind faith, non-standard language extensions, or switching to a more secure language (likely assembly). If there are other options, I'd love to hear about them.


In practice, of course, memset actually works, because it's a function and the compiler's usage tracing is nowhere near able to spot that you don't reference those zeros that you write.

(IoT security is doom for other reasons though, mostly UI, updatability and cloud services)


Really? In my experience most compilers treat it pretty much like an intrinsic and generate specialized set of instructions if they can.


For now.

And then that code you wrote now silently becomes deadly a few years down the line.


I would be very interested to see the compiler that can optimise away memset across a shared library boundary.


LLVM can and will do it. It will assume it knows what a function named "memcpy" (for example) does and optimize accordingly. (Look at TargetLibraryInfo.cpp and grep for LibFunc::memset in, for example, SimplifyLibCalls.cpp.)

(That said, I think TheLoneWolfling is being too strong with his/her claims. You can get modern compilers to avoid dangerous optimizations; it's just not for the faint of heart.)


I never said you couldn't get a particular compiler to. Or indeed, all current compilers.

I am saying that it's impossible to do so and remain in the realm of portable C / C++.

There is a distinction.


Also: isn't that a bug? Is there something in a C / C++ standard that states that a function named "memcpy" (for example) is necessarily the normal function?


Compilers have been doing this for a long time. The optimizations that this enables are essential for performance. They shouldn't stop; if the spec prohibits it, the spec should change (and if it doesn't, the compilers should ignore the spec).


Good to know about the first and last part.

And as for the second part... Meh. I don't see any optimizations enabled by hard-coding knowledge of something named "memcpy" (or whatever) that couldn't also be enabled by looking at the actual code that gets linked. Albeit with more difficulty.


A JITter would be able to do that. And the JVM can (and will) do the equivalent for Java code.

Remember: there is nothing that specifies that C / C++ needs to be compiled.

Also: you could have said the same twenty years ago about many of the optimizations that currently compilers do.


1. Compile without optimization (maybe just the crypto parts, if that's possible). 2. Write all crypto stuff in assembly, link in as static binary.


Neither of those work in general.

W.r.t. 1, the compiler's definition of no optimization today is not the same thing as it was last version, or will be next version. For instance, on IA-64 there are things the compiler has to do that are typically considered optimizations.

W.r.t. 2, you have to make sure there is no link-time optimization happening.


LTO does not work on assembly; it only works if some IR is stored in the .o files (like GIMPLE for GCC), iirc.


Currently.

However, that is not an inherent restriction - that is only a restriction on current compilers. It is entirely possible for a compiler to read the assembly of things being linked and optimize based on that.


You could also dynamically link.


That does not solve the problem. That only hides it and means it will be deadly later.

For instance, when someone runs it in an emulator for backwards compatibility purposes. Or when someone runs it in a JITter. Or even just if the compiler decides to special-case for the existing link target.


Either:

* Use a specific compiler and verify.

* Don't use C / C++.

* Panic.


C11 actually adds the memset_s function, which is guaranteed by the language spec not to be optimized away:

> memset may be optimized away (under the as-if rules) if the object modified by this function is not accessed again for the rest of its lifetime. For that reason, this function cannot be used to scrub memory (e.g. to fill an array that stored a password with zeroes). This optimization is prohibited for memset_s: it is guaranteed to perform the memory write.

http://en.cppreference.com/w/c/string/byte/memset


Except, of course, that memset_s is still not enough.

The compiler can and will copy things around, and it is not required to memset_s said copies away.


> For instance: there is no way to write a portable secure memset in portable C / C++.

Of course there is, you just use the volatile keyword. volatile guarantees that all reads/writes have corresponding memory accesses and cannot be optimized away.

It's not going to be as fast as memset but it's definitely portable and it won't be THAT slow. Then for platforms that have memset_s defer to that instead, otherwise fallback to the totally portable volatile + for loop.


> Of course there is, you just use the volatile keyword.

Colin Percival of FreeBSD disagrees:

http://www.daemonology.net/blog/2014-09-04-how-to-zero-a-buf...


And that doesn't even get into the other aspect of it:

Namely that C / C++ allows temporary copies of variables that are not cleared afterwards. The most obvious case of this being things being temporarily copied into registers / stack, but there are other examples as well.


Not really. He just says the compiler can optimize it away prior to it being declared volatile.


If only it were so simple.

But that does not work. Full stop. The compiler can optimize in ways that still leak the contents of the thing that was supposed to be memset-ted away.


You're talking about a different problem now.

But I'm pretty sure you're just aggressively anti-C/C++ so whatever.


You seem to misconstrue me.

C / C++ are very good languages in all sorts of ways. However, there are components that currently have... flaws. This being one of them. As such, I complain about said flaws, in the hopes that someone will take notice, and/or someone will point me in the direction of things that contain the good parts of C / C++ without said flaws.

I have already learned a fair bit about bounds checking, SIMD instructions, etc, etc from this. And I always want to know more.

*

And no, it is the same problem. Namely, that the memory models of C and C++ doesn't match with the underlying hardware, and the mismatch is such that things that are trivial to do on the underlying hardware are literally impossible to do with C and C++.

Part of this is for compatibility purposes, but there are ways to keep the compatibility that don't present this sort of problem.


Volatile should only be used with hardware registers. It doesn't do exactly what you want here. It will guarantee that memory will be accessed, but it doesn't guarantee the ordering, which can lead to some really nasty behaviour.

The only place that keyword should be used is as a qualifier for member functions or in an embedded context. It's not well defined outside of that scope.


volatile's designed purpose is for memory-mapped things, not hardware registers.


You could just compile with optimizations off.


There is no such thing in portable C / C++.

There are some ISAs that pretty much require optimizations, for instance.


cheat. after resetting the memory, copy the first and last bytes and log them out somewhere. that'll prevent any optimizations


If only it were so simple. You underestimate the evilness of compilers.

The compiler can (and will!) just propagate the values through directly and skip the memset.


If you're modifying memory immediately before freeing it (i.e. after the last time you read it), don't you have to be extra super careful to do so in a way that the compiler won't optimize the operation into nothingness? (I don't program in compiled languages very much, so I don't know the details about this.)


Yes, you're correct. See e.g.: http://www.eliteraspberries.com/blog/2012/10/zero-and-forget... (which has a link to some HN discussion at the end of the article).


Actually yeah, that's a very good point. Sometimes compilers are just too clever.

I think some systems have a "secure memset" function that can be used for things like this - i.e. one that's guaranteed not to be optimized out.


Yes, memset_s(). Which makes me think there should be a free_s().


memset_s, or a mythical free_s, is not enough.

The compiler can, and will, make copies of data behind the scenes. And not erase said copies.

What we really need is a keyword / modifier that says that when X passes out of scope no state related to X may be leaked. Ideally, that can be applied to a function / block as well as a variable.

(Or rather, not necessarily no state. Read "as little state as possible", preferably with modifiers that panic unless the compiler can ensure specific things.)


The C standard works at an abstraction level that makes it unsuitable for security applications, so I would advocate for a new language here. It needs serious PL research with information flow reasoning; what we need is a new kind of language that is much more machine-aware (yes, more low-level) than C.


Agreed on some level.

On the other hand, something as simple as a keyword marking a variable as "as secure as possible given hardware constraints (read: should wipe any temporary copies and the variable itself after it goes out of scope, should attempt to prevent it from being written to non-volatile storage, that sort of thing)" (sort of like how inline works), with compilers required to bail if the constraint cannot be done to the level specified, would be a massive step in the right direction.


Yes.

There is, quite literally, no way to ensure data is not leaked (namely, that data is zeroed / etc) in portable C / C++.


Or in assembly, for that matter. You never know what data the CPU is duplicating behind your back: store buffers, prediction, etc.


At least with assembly most of the time you can't ever get that data back. Although not always.

With C it's far too easy to get the data back.

Also, you missed the worst example: CPU cache.


For key material, you also ought to page-lock it in memory to prevent it ever being swapped out.


No.

The compiler would just optimize that set right back out.


Would there be a way to do this automatically? Like a "sneaky pre-compiler"?


Looking at the source, this is where the alarm bells should go off in a reviewer's head:

    memcpy(filter->buffer, output->piu_text_utf8, sizeof(output->piu_text_utf8));
1. memcpy is less safe than memmove and strncpy. strncpy should be used.

2. The two character arrays should use the same constant in defining their length, and that constant should be used both in the struct definitions and here in the copy operation.

3. The code is written in C in spite of it being 2014 at the time.


The code is from an entry in the Underhanded C Contest, it is pretty much a given that the entries will be written in C.


I am aware of that.



