How fast should your dynamic arrays grow?

When programming in Java or C++, your arrays have fixed sizes. So if you have an array of 32 integers and you need one with 33 integers, you may need to create a whole new array. This is inconvenient. Thus, both Java and C++ provide dynamic arrays: in C++, people most commonly use the STL vector, whereas in Java, the ArrayList class is popular.

Dynamic arrays are simple. They use an underlying array that might be larger than needed. As the dynamic array grows, the underlying array eventually becomes too small. At that point, we must enlarge it: we allocate a new, bigger array, copy all of the data over, and free the old array. That is a relatively expensive operation. To minimize the overall running time, we typically grow the underlying array by a factor x (e.g., if x is 2, the underlying array always doubles in size).
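
To make the copying mechanism concrete, here is a minimal toy sketch in C++ (my own illustration, not the code of any real library nor of the benchmark below) of a dynamic array of integers that doubles its capacity whenever it runs out of room:

    #include <cstddef>  // size_t
    #include <cstdlib>  // malloc, free
    #include <cstring>  // memcpy

    // Toy dynamic array of ints: doubles the underlying array when full.
    struct toy_vector {
        int *data = nullptr;
        size_t size = 0;      // number of elements currently stored
        size_t capacity = 0;  // size of the underlying array

        void push_back(int value) {
            if (size == capacity) {
                // Grow by a factor x = 2: allocate, copy everything, free the old array.
                size_t newcapacity = (capacity == 0) ? 4 : 2 * capacity;
                int *newdata = static_cast<int *>(malloc(newcapacity * sizeof(int)));
                if (size > 0) memcpy(newdata, data, size * sizeof(int));
                free(data);
                data = newdata;
                capacity = newcapacity;
            }
            data[size++] = value;
        }

        ~toy_vector() { free(data); }
    };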

A nice result from computer science is that even if we append elements one at a time, N times in total, the overall running time remains linear because of the particular way we grow the underlying array. Indeed, roughly speaking, we have to copy about N + N/x + N/x² + … or N x / (x − 1) elements to construct a dynamic array of N elements (for N large). Hence, the running time is linear in N.
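
For the curious, that expression is just the sum of a geometric series; in LaTeX notation (and ignoring rounding to whole elements):

    N + \frac{N}{x} + \frac{N}{x^2} + \cdots = N \sum_{k=0}^{\infty} x^{-k} = \frac{N}{1 - 1/x} = \frac{N x}{x - 1}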

However, the complexity still depends on x. Clearly, the larger x is, the fewer elements you need to copy.

  • When x is 3/2, we need to copy about 3 N elements to create a dynamic array of size N.
  • When x is 2, we need to copy only about 2 N elements.
  • When x is 4, we need to copy only about 1.3 N elements.

It would seem best to pick x as large as possible. However, larger values of x might also grow the underlying array faster than needed. This wastes memory and might slow you down.

So how fast do people grow their arrays?

  • In Java, ArrayList uses x = 3/2.
  • In C++, the GCC STL implementation uses x = 2.

The Java engineers are more conservative than the C++ hackers. But who is right? And does it matter?

To investigate the problem, I wrote a small benchmark in C++. First, I create a large static array and set its integer values to 0, 1, 2, … Then I do the same thing with dynamic arrays using various growth factors x. I report the speeds in millions of integers per second (mis) on an Intel Core i7 with GCC 4.7.

  x              speed (mis)
  static array   2500
  1.5            240
  2              340
  4              480
Of course, this test is only an anecdote, but it does suggest that

  • dynamic arrays can add significant overhead
  • and that a small growth factor might be particularly slow if you end up constructing a large array.

To alleviate these problems, both the C++ STL vector and the Java ArrayList allow you to reserve a large capacity for the underlying array up front.
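
For instance, in C++ you can call reserve on a vector before filling it (the Java equivalent would be ensureCapacity or the ArrayList(int initialCapacity) constructor); a minimal sketch:

    #include <cstddef>
    #include <vector>

    // Pre-allocating the capacity avoids all intermediate growth and copying.
    std::vector<int> build(std::size_t n) {
        std::vector<int> v;
        v.reserve(n);  // one allocation up front
        for (std::size_t i = 0; i < n; i++)
            v.push_back(static_cast<int>(i));
        return v;
    }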

Of course, people writing high performance code know to avoid dynamic arrays. Still, I was surprised at how large the overhead of a dynamic array was in my tests.

My source code is available.

Note: Yes, if you run your own benchmarks, the results will differ. Also, I am deliberately keeping the mathematical details to a minimum. Please do not nitpick my theoretical analysis.

Update: Elazar Leibovich pointed me to an alternative to the STL vector template created by Facebook engineers. The documentation is interesting. Gregory Pakosz pointed me to another page with a related discussion about Java.

8 thoughts on “How fast should your dynamic arrays grow?”

  1. A problem I stumbled over is that if you have millions of entries in your list, you end up allocating more than double the current size even if you just need one more entry. For my graphhopper project I created a segmented array, which entirely avoids copying and bounds the wasted capacity (at most one segment). It still has the disadvantage of slightly slower access, since you need modulo operations to index into the 2D array, though with bit operations you can make it a bit faster. I didn’t study the performance, as memory requirements forced me to introduce it, but I would nevertheless be interested if you could add this to your benchmarks, plus one more benchmark for access time? https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/storage/RAMDataAccess.java

  2. Thank you for that article, I enjoyed reading it.

    Note that the Windows implementation of the STL uses a growth factor of 3/2, unlike the GCC implementation.

    Also, on reallocation, std::vector does not simply move the data to the new memory block using, for example, memcpy; rather, it supports non-POD types and calls the move constructor (or the copy constructor) for each element, one by one. Doing the same with C++11 disabled could be even more expensive.

    It could be interesting to be able to specify which growth factor to use depending on how large the array already is, and also to set a flag telling the vector that we are storing Plain Old Data and that constructors need not be called, enabling block copies instead.

    To Peter:
    The problem with segmented arrays is that since their memory is not allocated as one contiguous block, they lose cache locality. Reading one such array from beginning to end would jump through memory between segments, forcing the CPU to fetch the data from main memory instead of hitting the CPU cache.

  3. @ Valrandir

    For my tests, I did not use the STL vector template. So the numbers I give use memcpy. I did benchmark with vector and it was indeed slower, probably for the reason you gave. I did not try STL with C++11 enabled.

    As for segmented arrays… intuitively, if the segments are large enough, it should not cause much harm if you are scanning the whole vector from beginning to end.

  4. One thing I noticed in the source code is that the access pattern might favor the static array, since the OS might reuse the same memory block each time the static array is allocated and deallocated.
