You probably should, but the problem is still there because std::vector implemen...

saurik · on Aug 30, 2014

Actually, I don't think it is possible to implement vector in the way you describe (so I'd argue that in fact there is not even a single working implementation that does that): using new[] will cause a default construction of all the elements, but std::vector must copy construct (or, in C++11, move construct, but never assign) the elements from the old array to the new array. AFAIK all implementations of std::vector thereby have a separate phase of allocating memory for the underlying storage of the array (which is done using an allocator, and which essentially cannot be implemented in terms of new) and constructing the elements (which is done using the placement new operator in a loop as it copies the data).

The real issue here is that std::allocator doesn't support realloc. (Note that you couldn't really use realloc as spec'd, though, because in the case where it fails to reallocate in place it blows away the old memory area and copies the data itself, which would have incorrect semantics: this is actually discussed in the article from Facebook as the "notorious design of realloc()" which made the usage of jemalloc required for this task. However, std::allocator could still have features for handling this situation, and implements based on malloc would simply have to not support the reallocate functionality. I imagine if more allocators actually supported the idea of reallocation this could be proposed to the committee.)

That said, the statement that std::vector can't grow the chunk in place is still way too heavy-handed: std::vector will grow its size in place within the memory area allocated to it by deconstructing and constructing elements within the over-sized memory area it has allocated. In general, this memory area grows by doubling in size whenever the capacity is breached, which leads to vector having amortized constant time complexity to insert a new element.

nly · on Aug 30, 2014

> which is done using an allocator, and which essentially cannot be implemented in terms of new

Actually the standard says std::allocator<T> just calls the global operator new function[0]. When you call construct(), placement-new is then used to construct the objects in place. This makes it easy to replace the memory allocator globally as well as on a per-container basis.

[0] http://en.cppreference.com/w/cpp/memory/new/operator_new

bnegreve · on Aug 30, 2014

It's the same with realloc: there is no guarantee that it will grow the chunk in place.

xroche · on Aug 30, 2014

No. Modern realloc are efficient, when moving large memory blocks, because they rely on the kernel ability to quickly relocate memory regions without involving memcpy() (through mremap() on Linux).

Edit: shamelessly citing my blog entry on this subject: http://blog.httrack.com/blog/2014/04/05/a-story-of-realloc-a...

davidtgoldblatt · on Aug 30, 2014

This isn't true for either of the common high performance mallocs. tcmalloc doesn't support mremap at all, and jemalloc disables it in its recommended build configuration. glibc will remap, but any performance gains that get realized from realloc tricks would probably be small compared to the benefits of switching to a better malloc implementation.

aktau · on Aug 30, 2014

Fantastic blog post! Now I don't have to start digging myself :). I've always thought realloc could do a few optimizations, glad to find out the details.

marksamman · on Aug 30, 2014

guarantee != possibility. There's no guarantee with realloc, but there's no possibility with new[], copy and delete[].

You can't grow the size you allocated with new[] in-place, and because you need to retain the existing data it's not safe to delete[] the old buffer, call new[] and hope that it points to the previous memory address and assume that the existing data remains intact.

A realloc implementation can try to grow the buffer if there's contiguous space, and if it succeeds it doesn't need to deallocate or copy anything. I haven't had a look at realloc implementations so I don't know if that, or other optimizations are done in practice, but I assume that realloc's worst performance case is somewhere around new[], copy and delete[]'s best case.

The copy mechanism in std::vector may also have a significant overhead over realloc if it has to call the copy constructor (or ideally move constructors in C++11) of every object in the vector, although I can imagine a C++ equivalent of realloc doing so too.

CJefferson · on Aug 30, 2014

The main issue is that you can't say to realloc "if you can't expand in place, just leave it"

This is occasionally annoying in C, but not a great issue. In C++, just memcpying classes leads to awful things happening.

bnegreve · on Aug 30, 2014

In reply to sibling posts.

- if you need dynamic storage you should use a std::vector instead of new [].

- with std::vector-s, if you want to save the cost of the copy you can redefine the vector allocator to use realloc or even mremap.

Edit @xroche: For example, you can create an allocator that manages an initial chunk of memory (allocated with malloc) and grows it when it's required using realloc inside the 'allocate' call.

Edit2: So I've implemented an allocator that behaves (almost) like a described above but I admit that it's a bit more hacky than I though. Drop me an email if you're interested.

xroche · on Aug 30, 2014

I'm skeptical with the ability to have movable objects with just a dedicated allocator. How do you handle that with only allocate() and deallocate() ?

Edit² @bnegreve: but how do you handle moved blocks then ? And this would involve copying all objects anyway, because this is the std::vector's logic to allocate+memcpy+deallocate when the capacity is reached. The only solution is to ... rewrite std::vector, which is precisely what FB did :)