In the C programming language, we typically manage memory manually. A typical heap allocation is a call to malloc followed by a call to free. In C++, you have more options, but the same routines are at work under the hood.
```c
// allocate N kB
data = malloc(N * 1024);
// do something with the memory
// ...
// release the memory
free(data);
```
It stands to reason that if your program just started and the value of N is large, then the call to malloc will increase memory usage by about N kilobytes. And indeed, it is the case.
So what is the memory usage of your process after the call to “free”? Did the N kilobytes return to the system?
The answer is that, in general, it is not the case. I wrote a small program under Linux that allocates N kilobytes and then frees them. It then measures the RAM usage after the call to free. The exact results will depend on your system, standard library and so on, but I give my results as an illustration.
As you can observe in the table, the memory does sometimes get released, but only when it is a large block of over 30 MB in my tests. It is likely because in such cases a different code path is used (e.g., calling mmap, munmap). Otherwise, the process holds on to its memory, never again releasing it to the system.
| memory requested | memory usage after free |
|---|---|
| 1 kB | 630 kB |
| 100 kB | 630 kB |
| 1,000 kB | 2,000 kB |
| 10,000 kB | 11,000 kB |
| 20,000 kB | 21,000 kB |
| 30,000 kB | 31,000 kB |
| 40,000 kB | 1,200 kB |
| 50,000 kB | 1,300 kB |
| 100,000 kB | 1,300 kB |
Of course, there are ways to force the memory to be released to the system (e.g., malloc_trim may help), but you should not expect that it will do so by default.
Though I use C/C++ as a reference, the exact same effect is likely to occur in a wide range of programming languages.
What are the implications?
- You cannot easily measure the memory usage of your data structures from the amount of memory that the process uses.
- It is easy for a process that does not presently hold any data to appear to be using a lot of memory.
Further reading: glibc malloc inefficiency
15 thoughts on “Will calling “free” or “delete” in C/C++ release the memory to the system?”
This seems … expected? People who care about this are going to override operator new (with tcmalloc or whatever suits their needs). People who don’t override evidently don’t care. People who want to know how much space it takes to allocate their data structure would do well by using nallocx.
People who know about tcmalloc will find my blog post unsurprising, but they are not the market for this post.
This behavior is not specific to tcmalloc: any heap allocator has liberty to pre-allocate; returning memory to the system after free is even more rare. So the answer to the question in the title is NO and an implication is that heap memory commonly gets over-committed to save on the number of system calls, as your analysis demonstrates.
Notably, malloc is not even a system call; brk is. So when a user process calls malloc, there may be no expectation whatsoever that the allocator turns around and calls sbrk. Things can get weirder still: the allocator may choose to mmap a page far away from the brk threshold instead, a technique commonly used when a large chunk of memory is requested.
IMO, your conclusion is spot on. You are absolutely correct that the memory usage cannot be calculated from the sizeof alone, even when page alignment is taken into consideration.
GLIBC has several tunables which allow you to decide how much memory is overallocated, at which size mmap will be used, and how quickly freed memory is returned to the system.
If every malloc/free required a system call, programs would run 1000x slower! GLIBC even checks whether the current process is single-threaded and bypasses atomic instructions if so. It is much faster to check this flag on each call than to just use atomics even if uncontended.
Thanks for the great link.
I don’t think it makes sense to talk about this topic without discussing the malloc library being used. C++ has nothing to do with anything happening here. Same goes for the previous alloc-related posts.
Of course, the specific results will depend on many different factors, but my point here is that you cannot be certain that the memory will be returned to the system. This is a general statement that I can make without specifying the details of my system.
Agreed. Maybe highlight how many C++ memory operations have nothing to do with the C++ language per se but rather are highly influenced by the malloc library and OS features. This way the user learns directly what’s influenced by the language and what’s influenced by the environment.
P.S Your compression/optimization posts + your papers are amazing!
In my view, this is part of the C++ language, in the sense that the C++ specification does not require that memory be given back to the system. So if we ever have this expectation, we are making an unwarranted inference.
I’d go so far as to say that when teaching C++ programming one should explicitly state that “free” does not necessarily release the memory to the system and that new and malloc may claim much more memory from the system than the code suggests.
This is similar to how people who learn Java should know about JIT compilation and garbage collection.
There are usually very sound reasons not to release memory back to the OS, particularly in a multithreaded program. Each such release causes a “TLB shootdown”, in which threads on other cores are blocked while the cores’ “translation lookaside buffers”, caches of page mappings, are cleared, and further stalls as their entries are re-filled.
This is another reason to prefer single-threaded processes, which are less subject to such shootdowns, with less-coupled forms of parallelism.
Besides the TLB potholes, releasing memory means that the next time it is requested, the OS is obliged to zero it before the process gets to see it again. Furthermore, each page will be marked read-only, causing a trap the first time it is touched, and then zeroed lazily.
As a result, freeing memory to the OS should only be essayed with the support of a great deal of measurement of the consequences.
If the memory is not necessarily released why bother with the hassle (and the danger of dangling references) of an explicit free and not just use the Boehm-Demers-Weiser garbage collector?
I have been using it for more than 10 years with no trouble.
You are right that whether you use a garbage collector or not, you typically do not have a tight control on how much RAM your program is using.
What happens if you allocate, free, allocate, and then free the memory again?
I would be interested in the results for your test program that allocated/freed 30,000 kB. If it does this twice, does it end up using 31,000 kB or 61,000 kB?
My program actually does precisely what you suggest. I run through a loop to make sure I get stable results.
Any freed memory is used by subsequent allocations. So the memory is reused within the same application. It’s just not aggressively returned to the system.
It is possible for freed memory to become fragmented. For example, allocate 101 blocks of 32 bytes, do some work, then free all except for one randomly chosen block. There are 3200 bytes of free memory which can be reused. However, if you now try to allocate a single block of 3200 bytes, it won’t fit, so more memory is needed.
Most programs only use a few different block sizes, making such fragmentation in long running processes rare.