In C++, we might implement dynamic lists using the vector template. The int-valued constructor of the vector template allocates at least enough memory to store the provided number of elements in a contiguous manner. How much memory does the following code use?
std::vector<uint8_t> v1(10); std::vector<uint8_t> v2(1000000);
The naive answer is 1000010 bytes, or slightly less than 1 MB. But if you think about it for a moment, you quickly realize that 1000010 bytes is only a lower bound: the vectors may allocate more memory than requested, and there is unavoidable overhead for each vector instance.
Thankfully, this is easy to measure. I wrote a little C++ program that reports actual memory usage in terms of pages allocated to the process. We find that we use far more memory (2x to 4x more) than the naive analysis suggests.
| | start of the program | after the first vector | at the end |
|---|---|---|---|
| ARM-based macOS | 1.25 MB | 1.25 MB | 2.25 MB |
| Intel-based Linux | 1.94 MB | 1.94 MB | 4.35 MB |
Interestingly, reserving memory may not use any new memory, as pointed out by a reader (Martin Leitner-Ankerl). In my tests, adding the following two lines did not change memory usage:
std::vector<uint8_t> v3; v3.reserve(1000000000);
Further reading: Measuring memory usage: virtual versus real memory
Tooling: Ivica Bogosavljevic recommends that Linux users try heaptrack to better understand memory usage. Aleksey Kladov prefers Bytehound.
I find it interesting that when you replace
std::vector v2(1000000);
with
std::vector v2;
v2.reserve(1000000);
memory usage doesn’t change at all after the reserve() call. Linux only commits allocated pages when they are actually touched.
It is a good point and I have updated the blog post accordingly.
Yes, this is why I pre-allocate a large memory footprint and then touch/pin every allocated page at application start time in order to prevent latency hiccups from runtime page faults (which actually increase *real* memory usage).
See also: https://github.com/MattPD/cpplinks/blob/master/performance.tools.md#memory---profiling
It can be a fun learning exercise, but isn’t it a legal requirement to cover Valgrind whenever discussing C++ and memory? 😛
A quick look at Heaptrack suggests that people primarily sing its praises for speed. Perhaps I’ll try it sometime, as I did have an issue a few months ago where the code took a few hours to reach the pain point when run under Valgrind. The concerns with new tooling are, of course, 1) getting used to it and 2) wondering how buggy it is, since it hasn’t been as battle-tested.
It also doesn’t inspire much confidence that, when asked about related tools and why he made Heaptrack, he replied, “Hey Erwan, the simple answer is that I was not aware of these solutions…” Usually one would like to know something about what is available before diving in headfirst.
The Cherno over on YouTube has some nice videos on rolling a little bit of your own memory tracking.
Valgrind 3.18 does not support these instructions. I have not tried with Valgrind 3.19.
As with the dynamic library performance article, understanding what you’re actually measuring is often useful before making too many statements about the results.
> Ivica Bogosavljevic recommends that Linux users try heaptrack to better understand memory usage.
I can also recommend https://github.com/koute/bytehound, for me it gave much better visibility than heaptrack (the data collection algorithm is similar between the two, but in bytehound I can in one click get a flamegraph of live allocations at a given point in time, which I’ve found the most useful visualization)
Easy, guys, stop discussing.
The NSA just “prohibited” C++ precisely because of memory concerns, so all this discussion became virtual = 0.
The good news is you can switch to Rust, yay!
The bad news is you won’t be devs anymore:
You’ll become the compiler! 🤣
On a serious note, endure: our wages will mostly grow tenfold over time👌🏻