
When the request for growth comes about, the vector (assuming no in-place resizing, see the appropriate section in this document) will allocate a chunk next to its current chunk

This is assuming a "next-fit" allocator, which is not always the case. I think this is why the expansion factor of 2 was chosen - because it's an integer, and doesn't assume any behaviour of the underlying allocator.

I'm mostly a C/Asm programmer, and dynamic allocation is one of the things that I very much avoid if I don't have to - I prefer constant-space algorithms. If it means a scan of the data first to find out the right size before allocating, then I'll do that - modern CPUs are very fast "going in a straight line", and realloc costs add up quickly.

Another thing I've done, which I'm not entirely sure would be possible in "pure C++", is to adjust the pointers pointing to an object when reallocation moves it: add the difference between the old and new base pointers to each reference to the object. In theory I believe this involves UB - so it might not be "100% standard C" either - but in practice it works quite well.



I solved a "realloc is really costly" problem by ditching the memory-is-contiguous notion, paying a little more (really just a few cycles) for each access rather than spending tons of time shuffling stuff around in memory. This eliminated nearly all reallocations. The extra bit of computation was invisible in the face of cache misses.

I'm guessing that most customers of std::vector don't really need contiguous memory, they just need something that has fast linear access time. In this sense, std::vector is a poor design.


No, that's quite intentional. If you don't need contiguous memory, you can consider using std::deque. http://www.cplusplus.com/reference/deque/deque/


There are weasel words in the standard that let implementations still be pretty inefficient. The problem is that memory reallocation is a leaky abstraction, and an interface that doesn't make guarantees about memory behavior can't be relied upon at scale.

The implementation of std::deque I just read uses a circular queue, resized to a power of two upon growth. It would exhibit the same bad performance as std::vector.


Are you sure? You're not allowed to invalidate iterators or pointers to elements in a deque, so it shouldn't be reallocating memory (aside from its underlying map of nodes, which will need to grow very rarely).

Libstdc++ describes how it's implemented here: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-...

Libc++ doesn't have as descriptive a comment, but it's implemented basically the same way here: https://github.com/llvm-mirror/libcxx/blob/master/include/de...


You might not want to use deque because of how much memory it can use while still small, e.g. libc++'s implementation uses a 4KiB page:

http://rextester.com/VIB96468

In the GNU libstdc++ implementation the object itself is pretty large (and they use a page size of 512 bytes):

http://rextester.com/RHYKB83240


I implemented what you describe using the BSR (BitScanReverse) instruction as the basis for indirect addressing into exponentially growing buckets.

Essentially, instead of direct addressing like V[i], you do VV[log2(i)][i], where log2() is implemented with a single BSR instruction.

The addressing table for the first level is very small in size, and it can be assumed always in cache.

Here's the .h/.c source: https://github.com/amadvance/tommyds/blob/master/tommyds/tom...

https://github.com/amadvance/tommyds/blob/master/tommyds/tom...

Did you use something different?



Contiguity, a semblance of control over what stays in cache, and the potential for vectorization are precisely the reasons I use std::vector when I do. I cannot emphasize this enough. Getting the cache behavior right pretty much makes or breaks my code, in the sense that it often dominates performance characteristics once I have the algorithm correct. One can see speedups of several tens of multipliers. You may work on a project where performance is irrelevant, but that does not mean performance is universally irrelevant.

In short, std::vector has served me well; I don't want it to lose performance.


There is a major use case for std::vector having contiguous memory, which is passing a reference to the first element of a vector into a function expecting a pointer. This is fine assuming the vector is not empty.


std::vector is supposed to work in the common cases, so you don't need to manually realloc for small/medium-sized vectors. If you need to handle a lot of data, avoiding realloc is still an issue, and you should pre-compute the length and use vector::resize (or the length-aware constructor).


The latter nicely relates to the recent HN discussion about "Friendly C"; may I ask what kind of software you are building, incidentally? (...don't tell me it's medical :O)



