When programming in C, one has to allocate and de-allocate memory by hand, an error-prone process. In contrast, newer languages like Java often manage their memory automatically: Java relies on garbage collection. In effect, the programmer allocates memory as needed, and Java later determines that some piece of data is no longer needed and reclaims the corresponding memory. The garbage collection process is fast and safe, but it is not free: despite decades of optimization, it can still cause major headaches for developers.
Java has native arrays (e.g., the int[] type). These arrays are typically allocated on the “Java heap”. That is, they are allocated and managed by Java as dynamic data, subject to garbage collection.
Java also has Buffer types such as the IntBuffer. These are high-level abstractions that can be backed by native Java arrays but also by other data sources, including data that is outside of the Java heap. Thus you can use Buffer types to avoid relying so much on the Java heap.
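To make this concrete, here is a small sketch of my own (not from the original post) of an IntBuffer that is simply a view over an ordinary Java array, obtained with IntBuffer.wrap; writes through the buffer are visible in the array and vice versa.

import java.nio.IntBuffer;

public class WrapExample {
    public static void main(String[] args) {
        int[] data = new int[] {1, 2, 3, 4};
        // The buffer is backed by the array itself: no copy is made.
        IntBuffer view = IntBuffer.wrap(data);
        view.put(0, 42);              // writes through to data[0]
        System.out.println(data[0]);  // prints 42
    }
}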
But my experience is that Buffer types come with some performance penalty compared to native arrays. I would not say that Buffers are slow. In fact, given a choice between a Buffer and a stream (DataInputStream), you should strongly favour the Buffer. However, in my experience, Buffers are not as fast as native arrays.
I can create an array of 50,000 integers, either with “new int[50000]” or as “IntBuffer.allocate(50000)”. The latter should essentially create an array (on the Java heap) but wrapped with an IntBuffer “interface”.
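You can check that claim directly; the following sketch is my own and not part of the benchmark. A buffer created with IntBuffer.allocate reports that it is backed by an accessible heap array.

import java.nio.IntBuffer;

public class AllocateExample {
    public static void main(String[] args) {
        IntBuffer buffer = IntBuffer.allocate(50000);
        // Heap buffers expose their backing int[] through hasArray()/array().
        System.out.println(buffer.hasArray());      // true
        System.out.println(buffer.array().length);  // 50000
    }
}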
A possible intuition is that wrapping an array with a high-level interface should be free. Though it is true that high-level abstractions can come with no performance penalty (and sometimes even performance gains), whether they do is an empirical matter. You should never just assume that your abstraction comes for free.
Because I am making an empirical statement, let us test it out empirically with the simplest test I can imagine. I am going to add one to every element in the array/IntBuffer.
// int[]: increment every element of the native array
for(int k = 0; k < s.array.length; k++) { s.array[k] += 1; }
// IntBuffer: the same increment, through indexed get/put
for(int k = 0; k < s.buffer.limit(); k++) { s.buffer.put(k, s.buffer.get(k) + 1); }
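For context, these loops are written against a JMH state object named s; the actual harness is in the benchmark mentioned below, but a minimal sketch of what such a benchmark might look like (class and field names are my assumptions) is:

import java.nio.IntBuffer;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class MyBenchmark {

    @State(Scope.Benchmark)
    public static class BenchmarkState {
        int[] array = new int[50000];
        IntBuffer buffer = IntBuffer.allocate(50000);
    }

    @Benchmark
    public int[] array(BenchmarkState s) {
        // Increment every element of the native array.
        for (int k = 0; k < s.array.length; k++) { s.array[k] += 1; }
        return s.array;
    }

    @Benchmark
    public IntBuffer buffer(BenchmarkState s) {
        // The same increment, through the IntBuffer's indexed get/put.
        for (int k = 0; k < s.buffer.limit(); k++) { s.buffer.put(k, s.buffer.get(k) + 1); }
        return s.buffer;
    }
}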
I get the following results on my desktop (OpenJDK 14, 4.2 GHz Intel processor):
int[]     | 2.5 µs
IntBuffer | 12 µs
That is, arrays are over 4 times faster than IntBuffers in this test.
You can run the benchmark yourself if you’d like.
My expectation is that many optimizations that Java applies to arrays are not applied to Buffer types.
Of course, this tells us little about what happens when Buffers are used to map values from outside of the Java heap. My experience suggests that things can be even worse.
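For readers who have not used them, an off-heap IntBuffer is typically obtained by viewing a direct ByteBuffer as an IntBuffer. The following sketch is my own (not from the post); it runs the same increment loop over memory allocated outside the Java heap.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

public class DirectExample {
    public static void main(String[] args) {
        // 50,000 ints = 200,000 bytes allocated outside the Java heap.
        IntBuffer direct = ByteBuffer.allocateDirect(50000 * Integer.BYTES)
                                     .order(ByteOrder.nativeOrder())
                                     .asIntBuffer();
        // The same increment loop as before, now touching off-heap memory.
        for (int k = 0; k < direct.limit(); k++) {
            direct.put(k, direct.get(k) + 1);
        }
        System.out.println(direct.get(0)); // direct buffers start zeroed, so this prints 1
    }
}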
Buffer types have not made native arrays obsolete, at least not as far as performance is concerned.
Hi Daniel,
Thanks for benchmarking this! In my environment (Zulu14.28+21-CA, macOS 11.0.1), the performance of the native byte-order buffer (buffer_crazy) is quite close to that of the int[] array. An off-heap version of the same (via allocateDirect), shown as buffer_direct below, fares much worse though:
Benchmark                  Mode  Cnt   Score   Error  Units
MyBenchmark.array          avgt   15   3.801 ± 0.143  us/op
MyBenchmark.buffer         avgt   15  18.480 ± 1.664  us/op
MyBenchmark.buffer_crazy   avgt   15   4.087 ± 0.256  us/op
MyBenchmark.buffer_direct  avgt   15  25.111 ± 0.536  us/op
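(For readers wondering how the two extra variants above might be built: based on Viktor's description, buffer_crazy is a heap buffer with native byte order and buffer_direct is the off-heap version of the same. The exact code is in the tweet and repository linked further down in this thread, so the following is only a plausible sketch with names of my choosing.)

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

public class BufferVariants {
    // buffer_crazy: a heap ByteBuffer with native byte order, viewed as an IntBuffer.
    IntBuffer bufferCrazy = ByteBuffer.allocate(50000 * Integer.BYTES)
                                      .order(ByteOrder.nativeOrder())
                                      .asIntBuffer();

    // buffer_direct: the same thing allocated off-heap via allocateDirect.
    IntBuffer bufferDirect = ByteBuffer.allocateDirect(50000 * Integer.BYTES)
                                       .order(ByteOrder.nativeOrder())
                                       .asIntBuffer();
}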
The differences seem to stem from writes, as modifying the benchmark to do reads only (summing array values to a long, consumed by Blackhole) yields the same performance for all IntBuffer variants:
Benchmark                   Mode  Cnt   Score   Error  Units
SumBenchmark.array          avgt   15  16.227 ± 0.536  us/op
SumBenchmark.buffer         avgt   15  16.879 ± 1.179  us/op
SumBenchmark.buffer_crazy   avgt   15  16.659 ± 0.643  us/op
SumBenchmark.buffer_direct  avgt   15  17.774 ± 1.067  us/op
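(A read-only benchmark along the lines Viktor describes, summing values into a long and handing the result to JMH's Blackhole so the JIT cannot discard the loop, might look roughly like the sketch below; the names are my assumptions, not his actual code.)

import java.nio.IntBuffer;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class SumBenchmark {
    int[] array = new int[50000];
    IntBuffer buffer = IntBuffer.allocate(50000);

    @Benchmark
    public void array(Blackhole bh) {
        long sum = 0;
        for (int k = 0; k < array.length; k++) { sum += array[k]; }
        bh.consume(sum); // the Blackhole keeps the result "live"
    }

    @Benchmark
    public void buffer(Blackhole bh) {
        long sum = 0;
        for (int k = 0; k < buffer.limit(); k++) { sum += buffer.get(k); }
        bh.consume(sum);
    }
}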
Regards,
Viktor
Indeed, the “buffer_crazy” workaround was born from a discussion on Twitter, and it is expected to be on par with the array: https://twitter.com/lemire/status/1333466140178313216
Mr. Lemire simply did not update the post.
There is now a bug filed on the IntArray version, too: https://bugs.openjdk.java.net/browse/JDK-8257531
That is correct, the post has not been updated.
Thanks for linking to the tweet; I should have done so in the updated code. I did link to the tweet in a GitHub issue.
Is anyone here aware of a similar benchmark for C#, comparing the “traditional” methods of arrays, unsafe pointers, and the Read/Write methods of the System.Runtime.InteropServices.Marshal class with the new Memory and Span types?
Note that the issue has been picked up: https://bugs.openjdk.java.net/browse/JDK-8257531 and is being fixed: https://github.com/openjdk/jdk/pull/1618/files
If all goes well, Java 16 will correctly vectorize Buffer (and MemorySegment) accesses too, if applicable. The only missing piece for now is direct Buffers, but at least that is something they are aware of now, too.
Just to highlight: the results here are rather specific to one particular VM implementation. I decided to check how this performed on Azul Zing (which uses a very different top-tier compiler), and as I’d somewhat expected, saw a very different performance picture.
JDK 11.0.9.1, OpenJDK 64-Bit Server VM, 11.0.9.1+1-Ubuntu-0ubuntu1.20.04
Benchmark                  Mode  Cnt   Score   Error  Units
MyBenchmark.array          avgt   15   3.151 ± 0.005  us/op
MyBenchmark.buffer         avgt   15  18.463 ± 0.006  us/op
MyBenchmark.buffer_crazy   avgt   15   3.161 ± 0.003  us/op
MyBenchmark.buffer_direct  avgt   15  89.536 ± 0.049  us/op
JDK 11.0.9.1-internal, Zing 64-Bit Tiered VM, 11.0.9.1-zing_99.99.99.99.dev-b3323-product-linux-X86_64
Benchmark                  Mode  Cnt   Score   Error  Units
MyBenchmark.array          avgt   15   3.683 ± 0.475  us/op
MyBenchmark.buffer         avgt   15   3.871 ± 0.395  us/op
MyBenchmark.buffer_crazy   avgt   15   3.856 ± 0.386  us/op
MyBenchmark.buffer_direct  avgt   15  16.594 ± 0.002  us/op
These results were collected on an AWS EC2 c5.4xlarge instance (which is a Skylake server part). This was run on the “feature preview” (i.e., the weirdly named beta) of Zing, downloadable from here: https://docs.azul.com/zing/zing-quick-start-tar-fp.htm
Disclaimer: I used to work for Azul, and in particular on the Falcon JIT compiler that Zing uses. I’m definitely biased here.