Being Sneaky in C (codersnotes.com)
158 points by alexggordon on June 12, 2015 | hide | past | favorite | 88 comments


I like the original Ken Thompson sneakiness in C (http://electronicdesign.com/dev-tools/thompson-ritchie-and-k...):

"Also in his Turing Award lecture, he described how he had incorporated a backdoor security hole in the original UNIX C compiler. To do this, the C compiler recognized when it was recompiling itself and the UNIX login program. When it recompiled itself, it modified the compiler so the compiler backdoor was included. When it recompiled the UNIX login program, the login program would allow Thompson to always be able to log in using a fixed set of credentials."


That really isn't the same thing, though. That's language-agnostic conceptual sneakiness that happened to be implemented in C.



OpenBSD specifically modified malloc() a few years ago to prevent this sort of sneakiness (http://www.tw.openbsd.org/papers/eurobsdcon2009/otto-malloc.... [pdf]). So they route their malloc() calls through mmap() which returns randomized pages, and free() immediately returns memory to the kernel rather than leaving it mapped in the current process.

I'd be surprised if these changes haven't made it into FreeBSD, but afaik Linux doesn't work this way (by default, anyway).


I tested it on Ubuntu 14.04 x64, and the problem occurs there.

And of course, some programs use replacements like dlmalloc and do all their own allocation management anyway.


> And of course, some programs use replacements like dlmalloc and do all their own allocation management anyway.

Yeah. I wrote my own allocator in C++ a long time ago. I wouldn't be surprised if there were quite a few other bits of software out there doing the same thing.


Wasn't Heartbleed due to exactly that - using their own local allocator instead of calling malloc/free?


Partly. They were using their own allocator (openssl_malloc()), but even then they would've been OK if it weren't for the off-by-one error elsewhere in the heartbeat implementation. If they were using an OS-supplied malloc() instead of openssl_malloc(), the bug would've still been exploitable on some operating systems, but not others.

Either way, "don't write your own allocator" is a good lesson to learn.

Unless, of course, you're doing it for fun. In which case, efficient heap management really is a neat exercise.



If your claim is that OpenBSD routes every call to malloc/free through mmap/munmap, it is simply not believable.

Do you have a URL to a git ... hahaha, pardon me, CVS repo?


http://bxr.su/OpenBSD/lib/libc/stdlib/malloc.c#omalloc ^Not the CVS repo, but close enough for those who want to check OpenBSD's malloc implementation. The malloc.conf manpage is also an interesting read: http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man5/...

And some context around malloc and the various techniques OpenBSD implements to make exploitation of bugs more difficult: http://www.openbsd.org/papers/dev-sw-hostile-env.html

The tl;dr is: if your code compiles and runs on OpenBSD, chances are it will run fine on other Unices. The opposite is not necessarily true, though.


I don't think that source refutes the parent's point: only large allocations go directly to mmap/munmap as opposed to being cached. Of course, there are other anti-exploitation measures too...



Well you could read the paper he just linked to.


It also makes the assumption that it's a little-endian system. On a big-endian system, the high order byte of the timestamp would be modified, which would probably be too obvious.


"In C/C++, you can use bugs in one part of a program to cause trouble in another. That’s pretty darn underhanded."

I would argue every language has that property. But with C/C++ being so closely tied to the ABI of the machine perhaps they are more underhanded than others. But to me, this branding does feel a bit unfair.

Still, a fun contest and an interesting read.


You can always cause that kind of bug, but I would argue that C/C++ is massively more underhanded than other languages.


The description of the bug in surveil.txt in the source archive was a bit easier for me to understand, really nifty :)


Would setting the malloc'd memory back to the original message before freeing it solve the problem?


Yes. You could also memset any memory to zero once you're done with it, which isn't a bad idea for applications with high security requirements.


...Except, of course, the compiler can and will optimize away any such memset.

Do not attempt to write secure software with C / C++. It is a Bad Idea(TM). Because what you wrote is not what gets run.

For instance: there is no way to write a portable secure memset in portable C / C++. (You can write a "secure" memset that works in all current compilers, but that is not the same thing. What doesn't get optimized now can and will be optimized tomorrow.)


> ...Except, of course, the compiler can and will optimize away any such memset.

It's dependent on the function attributes of memset (e.g. __attribute__((pure))) - it won't always be optimized out.

https://www.cs.auckland.ac.nz/references/c/gcc4.7/Function-A...


I am not talking about whether it does currently optimize away a function call to an alternative to memset; I am talking about whether it is allowed to optimize it away.

The compiler is allowed to optimize away that function call regardless of whether it is memset or your own alternative.

Note that this is not the same thing as saying it does currently optimize away that function call.


I'm not aware of any portable language that offers semantics that guarantee data is never rendered accessible to the outside world (I could well believe that current implementations don't, but I'm not aware of anything that specifies it). It always comes down to platform-specific APIs. But that's not a reason to not use C - unless you can suggest a better alternative?


Quite frankly, assembly currently is about it. And that's not the most portable thing out there, I know.

I think that C or C++ could, without too much effort, support semantics that would allow for this sort of thing. Something as simple as a "secure" keyword that could be applied to variables (where it means "leak as little as possible about this variable when it goes out of scope") or functions (where it means the same, but for the function itself and all locals of the function).



...I do not think the word "portable" means what you think it means.


I don't get it, the compiler operates within the memory model specified by the language. If it "optimizes" a memset it does not change the behavior of the program (or it is a bug in the compiler which is a different topic).


Common misconception with C. A pointer does not mean a pointer to a sequence of capacitors in your RAM. It really means a pointer to an abstract and temporary variable. How this abstract variable is realized on your hardware is implementation specific. Everything except input, output and explicitly defined side effects (volatile) is of no interest.

Really, you could print a C program on a piece of paper and ask someone to "execute" the program in their head given some input x. How they "implement" memset will surely differ from what a computer would do, and if you only ask for the output y, they will surely see that this memset doesn't affect y at all and skip doing it.


You're correct - the compiler operates within the memory model of the language. But C / C++'s memory model is broken w.r.t. security.

There is no way to ensure that something is actually overwritten, because under the memory models of C and C++ you cannot ever read that memory again, even though in actuality you can.


I believe volatile forces the compiler to issue real writes to memory that can't be optimized away.


Except, of course, that the compiler can perfectly legally copy the variable behind the scenes and not overwrite the copy.

I mean it when I say you cannot.


I'm not sure what you mean here. If you have a volatile pointer that points to a memory buffer returned by malloc, how can the compiler prevent a write through the pointer from happening?

Edit: unless your point is a temporary copy can be spilled in memory and this copy will stay in memory and won't be overwritten?


Yes, your edit is correct.


Why not generate random data then read it? Or even a constant, then read it?


Because the compiler will optimize it out. Even if you return the random data / constant the compiler will optimize out the store to the variable and just pass it through directly.


The problem is that the memory model specified by the language is a subset of the memory model specified by the hardware. This leads to exploitable systems when you lift those blinders.


So what is someone working with embedded devices for IoT supposed to do? Much of that work is based on ARM C/C++ Compilers.


Use a compiler that has extensions that _do_ guarantee that memory gets erased. That's what the gcc function attributes are for.

Alternatively, use a library that the C compiler doesn't know enough about to attempt removing calls into it.

If you copy your standard library's memset in a separate DLL that is not the standard C library, the compiler will not even see the code during compilation, so it has to compile a function call.

The linker (or a JIT in your C runtime) is allowed to remove calls to the function, if it can prove that it doesn't have side effects. However, to prove that, it has to look at the assembly of the function; it cannot use the far simpler heuristic of "it came from <memory.h> and is called memset".


> If you copy your standard library's memset in a separate DLL that is not the standard C library, the compiler will not even see the code during compilation, so it has to compile a function call.

Although I'd like things to behave this way, I don't think this is true. The C standard library was incorporated into the language spec for C89. The behaviors of the named functions within it are specified, and the compiler is allowed to inline its own version (ignoring your custom code) and then optimize out the inlined portion.

So while it's possible that the external linkage approach still works with certain compilers, it's not portable. I believe you are OK with the external approach if you use a non-standard name (my_secure_memset_pretty_please()), but that just shifts the problem to forcing the compiler to generate your external function without making the same dangerous optimizations.

In the end, I fear you are left with three options: blind faith, non-standard language extensions, or switching to a more secure language (likely assembly). If there are other options, I'd love to hear about them.


In practice, of course, memset actually works, because it's a function and the compiler's usage tracing is nowhere near able to spot that you don't reference those zeros that you write.

(IoT security is doom for other reasons though, mostly UI, updatability and cloud services)


Really? In my experience most compilers treat it pretty much like an intrinsic and generate specialized set of instructions if they can.


For now.

And then that code you wrote now silently becomes deadly a few years down the line.


I would be very interested to see the compiler that can optimise away memset across a shared library boundary.


LLVM can and will do it. It will assume it knows what a function named "memcpy" (for example) does and optimize accordingly. (Look at TargetLibraryInfo.cpp and grep for LibFunc::memset in, for example, SimplifyLibCalls.cpp.)

(That said, I think TheLoneWolfling is being too strong with his/her claims. You can get modern compilers to avoid dangerous optimizations; it's just not for the faint of heart.)


I never said you couldn't get a particular compiler to. Or indeed, all current compilers.

I am saying that it's impossible to do so and remain in the realm of portable C / C++.

There is a distinction.


Also: isn't that a bug? Is there something in a C / C++ standard that states that a function named "memcpy" (for example) is necessarily the normal function?


Compilers have been doing this for a long time. The optimizations that this enables are essential for performance. They shouldn't stop; if the spec prohibits it, the spec should change (and if it doesn't, the compilers should ignore the spec).


Good to know about the first and last part.

And as for the second part... Meh. I don't see any optimizations enabled by hard-coding knowledge of something named "memcpy" (or whatever) that couldn't also be enabled by looking at the actual code that gets linked. Albeit with more difficulty.


A JITter would be able to do that. And the JVM can (and will) do the equivalent for Java code.

Remember: there is nothing that specifies that C / C++ needs to be compiled.

Also: you could have said the same twenty years ago about many of the optimizations that currently compilers do.


1. Compile without optimization (maybe just the crypto parts, if that's possible). 2. Write all crypto stuff in assembly, link in as static binary.


Neither of those work in general.

W.r.t. 1, the compiler's definition of no optimization today is not the same thing as it was last version, or will be next version. For instance, on IA-64 there are things the compiler has to do that are typically considered optimizations.

W.r.t. 2, you have to make sure there is no link-time optimization happening.


LTO does not work on assembly; it only works if some IR is stored in the .o files (like GIMPLE for GCC), iirc.


Currently.

However, that is not an inherent restriction - that is only a restriction on current compilers. It is entirely possible for a compiler to read the assembly of things being linked and optimize based on that.


You could also dynamically link.


That does not solve the problem. That only hides it and means it will be deadly later.

For instance, when someone runs it in an emulator for backwards compatibility purposes. Or when someone runs it in a JITter. Or even just if the compiler decides to special-case for the existing link target.


Either:

* Use a specific compiler and verify.

* Don't use C / C++.

* Panic.


C11 actually adds the memset_s function, which is guaranteed by the language spec not to be optimized away:

> memset may be optimized away (under the as-if rules) if the object modified by this function is not accessed again for the rest of its lifetime. For that reason, this function cannot be used to scrub memory (e.g. to fill an array that stored a password with zeroes). This optimization is prohibited for memset_s: it is guaranteed to perform the memory write.

http://en.cppreference.com/w/c/string/byte/memset


Except, of course, that memset_s is still not enough.

The compiler can and will copy things around, and it is not required to memset_s said copies away.


> For instance: there is no way to write a portable secure memset in portable C / C++.

Of course there is, you just use the volatile keyword. volatile guarantees that all reads/writes have corresponding memory accesses and cannot be optimized away.

It's not going to be as fast as memset but it's definitely portable and it won't be THAT slow. Then for platforms that have memset_s defer to that instead, otherwise fallback to the totally portable volatile + for loop.


> Of course there is, you just use the volatile keyword.

Colin Percival of FreeBSD disagrees:

http://www.daemonology.net/blog/2014-09-04-how-to-zero-a-buf...


And that doesn't even get into the other aspect of it:

Namely that C / C++ allows temporary copies of variables that are not cleared afterwards. The most obvious case of this being things being temporarily copied into registers / stack, but there are other examples as well.


Not really. He just says the compiler can optimize it away prior to it being declared volatile.


If only it were so simple.

But that does not work. Full stop. The compiler can optimize in ways that still leak the contents of the thing that was supposed to be memset-ted away.


You're talking about a different problem now.

But I'm pretty sure you're just aggressively anti-C/C++ so whatever.


You seem to misconstrue me.

C / C++ are very good languages in all sorts of ways. However, there are components that currently have... flaws. This being one of them. As such, I complain about said flaws, in the hopes that someone will take notice, and/or someone will point me in the direction of things that contain the good parts of C / C++ without said flaws.

I have already learned a fair bit about bounds checking, SIMD instructions, etc, etc from this. And I always want to know more.

*

And no, it is the same problem. Namely, that the memory models of C and C++ doesn't match with the underlying hardware, and the mismatch is such that things that are trivial to do on the underlying hardware are literally impossible to do with C and C++.

Part of this is for compatibility purposes, but there are ways to keep the compatibility that don't present this sort of problem.


Volatile should only be used with hardware registers. It doesn't do exactly what you want here. It will guarantee that memory will be accessed, but it doesn't guarantee the ordering, which can lead to some really nasty behaviour.

The only place that keyword should be used is as a qualifier for member functions or in an embedded context. It's not well defined outside of that scope.


volatile's designed purpose is for memory-mapped things, not hardware registers.


You could just compile with optimizations off.


There is no such thing in portable C / C++.

There are some ISAs that pretty much require optimizations, for instance.


cheat. after resetting the memory, copy the first and last bytes and log them out somewhere. that'll prevent any optimizations


If only it were so simple. You underestimate the evilness of compilers.

The compiler can (and will!) just propagate the values through directly and skip the memset.


If you're modifying memory immediately before freeing it (i.e. after the last time you read it), don't you have to be extra super careful to do so in a way that the compiler won't optimize the operation into nothingness? (I don't program in compiled languages very much, so I don't know the details about this.)


Yes, you're correct. See e.g.: http://www.eliteraspberries.com/blog/2012/10/zero-and-forget... (which has a link to some HN discussion at the end of the article).


Actually yeah, that's a very good point. Sometimes compilers are just too clever.

I think some systems have a "secure memset" function that can be used for things like this - i.e. one that's guaranteed not to be optimized out.


Yes, memset_s(). Which makes me think there should be a free_s().


memset_s, or a mythical free_s, is not enough.

The compiler can, and will, make copies of data behind the scenes. And not erase said copies.

What we really need is a keyword / modifier that says that when X passes out of scope no state related to X may be leaked. Ideally, that can be applied to a function / block as well as a variable.

(Or rather, not necessarily no state. Read "as little state as possible", preferably with modifiers that panic unless the compiler can ensure specific things.)


The C standard works at an abstraction level that makes it unsuitable for security applications, so I would advocate for a new language here. It needs serious PL research with information flow reasoning; what we need is a new kind of language that is much more machine-aware (yes, more low-level) than C.


Agreed on some level.

On the other hand, something as simple as a keyword marking a variable as "as secure as possible given hardware constraints (read: should wipe any temporary copies and the variable itself after it goes out of scope, should attempt to prevent it from being written to non-volatile storage, that sort of thing)" (sort of like how inline works), with compilers required to bail if the constraint cannot be done to the level specified, would be a massive step in the right direction.


Yes.

There is, quite literally, no way to ensure data is not leaked (namely, that data is zeroed / etc) in portable C / C++.


Or in assembly, for that matter. You never know what data the CPU is duplicating behind your back: store buffers, prediction, etc.


At least with assembly most of the time you can't ever get that data back. Although not always.

With C it's far too easy to get the data back.

Also, you missed the worst example: CPU cache.


For key material, you also ought to page-lock it in memory to prevent it ever being swapped out.


No.

The compiler would just optimize that set right back out.


Would there be a way to do this automatically? Like a "sneaky pre-compiler"?


Looking at the source, this is where the alarm bells should go off in a reviewer's head:

    memcpy(filter->buffer, output->piu_text_utf8, sizeof(output->piu_text_utf8));
1. memcpy is less safe than memmove and strncpy. strncpy should be used.

2. The two character arrays should use the same constant in defining their length, and that constant should be used both in the struct definitions and here in the copy operation.

3. The code is written in C in spite of it being 2014 at the time.


The code is from an entry in the Underhanded C Contest, it is pretty much a given that the entries will be written in C.


I am aware of that.



