Most C++ programmers rely on “STL” for their data structures. The most popular data structure is probably vector, which is just a dynamic array. The set and the map are other useful ones.
The STL data structures are a minimalist design. You have relatively few methods. All of them allow you to compute the size of the data structure, that is, how many elements it contains, via the size() method. In recent C++ (C++11), the size() method must have constant-time complexity for all containers. To put it in clearer terms, the people implementing the containers can never scan the content to find out the number of elements.
These containers also have another method called empty() which simply returns true of the container is… well… empty. Obviously, an equivalent strategy would be to compare the size with zero: mystruct.size() == 0.
Determining whether a data structure is empty is conceptually easier than determining its size. Thus, at least in theory, calling empty() could be faster.
Inspecting the assembly output, I find that recent versions of GCC produce nearly identical code for the comparison of the size and the empty call. The exception being the list data structure where the assembly is slightly different, but not in a manner that should affect performance.
Of course, there are different implementations of C++ and it is possible that other implementations could provide more efficient code when calling empty(). An interesting question is whether effort is needed from the compiler.
Travis Downs wrote a list data structure by hand, but with a size() function that is linear time. He then implemented the empty function naively:
struct node { struct node *next; int payload; }; int count_nodes(const node* p) { int size = 0; while (p) { p = p->next; size++; } return size; } bool is_not_empty(const node* p) { return count_nodes(p) > 0; }
Amazingly, we find that the GCC compiler is able to compile Travis’ is_not_empty C++ function to constant-time code. The compiler inlines the count_nodes function into is_empty. Then the compiler figures out that as soon as you enter the loop once with count_nodes, then size is going to be greater than zero, so there is no need to keep looping.
However, the optimisation is not robust. Suppose that I wish instead to return an unsigned type instead of Travis’ int value. The problem with an unsigned type is that I might overflow if the list is very long. With a signed integer, the compiler is allowed to assume that overflows do not happen. It could be difficult for the compiler to tell whether count_nodes() return 0 or not, if the compiler must handle overflows. To handle this potential issue, I can forcefully bound the return value of count_nodes() to be no more than 1000. If I change the code to return a standard size_t type, like so…
#include <cstddef> struct node { struct node *next; int payload; }; size_t count_nodes(const node* p) { size_t size = 0; while (p) { p = p->next; size++; if(size == 1000) { return 1000; } } return size; } bool is_not_empty(const node* p) { return count_nodes(p) > 0; }
Sadly, GCC is now unable to optimize away the call. Maybe compilers are not yet all-powerful beings?
The lesson is that it is probably wise to get in the habit of calling directly empty() if you care about performance. Though it may not help much with modern STL data structures, in other code it could be different.
Of course, another argument is that the call to empty() is shorter and cleaner.
Credit: This blog post was motivated by a tweet by Richard Startin.
Still disappointed C++ and Rust containers don’t have a non_empty() method, and the Rust discussion fizzled out.
Have you considered writing `not array.empty()`?
The is_empty function in this blog post has an obvious mistake 🙂
> the compiler figures out that as soon as you enter the loop once with count_nodes, then size is going to be greater than zero, so there is no need to keep looping.
It could result in an endless loop, is it allowed to optimize that away? Also, it could wrap (to 0 for unsigned, or negative for signed). With 32 bit integers, it would be quite easy; with 64 bit, not sure how long it would take / memory size needed. Anyway, to me it looks like an incorrect optimization.
The other case, with a limit on 1000, on the other hand, can be optimized away.
Ah I see, integer overflow causes undefined behaviour. And for endless loops: GCC assumes that a loop with an exit will eventually exit (option -ffinite-loops)
-ffinite-loops (or conversely -O2 -fno-finite-loops) does not affect the assembly output of the optimizable example in this article.
In C++, an infinite loop without side effects is Undefined Behavior, so the compiler is allowed to optimize it away.
Maybe a typo? In the end of paragraph 2:
> to find out the number of containers
should probably be
> to find out the number of elements
Yes. Thanks.
Better?
bool is_empty(const node* p) {
return p ;
}
Er. return !p
I’m not sure what the point of this article is. The author explicitly states “the people implementing the containers can never scan the content to find out the number of elements.”
Then continues on to test against exactly what he described as against the rules. Its not hard to keep track of the size of a container via a class member variable. There is literally no need to scan the items. The empty() probably just returns the size as a bool. This entire discussion and article is pointless.
The author explicitly states “the people implementing the containers can never scan the content to find out the number of elements.”
For STL containers.
Here is the conclusion:
The assumption is that you will not just use STL containers in all of the code you are relying upon. Of course, if you are quite sure to only use recent STL containers, then you may consider the blog post pointless, but what makes you so sure?
thanks for bringing the awareness to us
I have spent a bit too much time in Python, but I wish the STL containers had an implicit conversion to bool. That would be even less code than calling empty().
I still prefer clarity and readability. I use empty() to show the intent to, well, check emptiness.
FYI: https://godbolt.org/z/bh3PMe8nG
Seems like “<=" allows more optimizations than a simple "!="
With that implementation, the count_nodes function returns 0 for an empty list and 1000 for a non-empty list. That is not what was meant for the overflow-preventing case.