The absurd cost of finalizers in Go

The Go programming language makes it easy to call C code. Suppose you have the following C functions:

#include <stdlib.h>

char* allocate() {
  return (char*)malloc(100);
}

void free_allocated(char *c) {
  free(c);
}

Then you can call them from Go as follows:

c := C.allocate()
C.free_allocated(c)

It works well.
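
For completeness, here is a minimal sketch of the cgo plumbing these snippets assume. The single-file layout is my own choice for brevity; in real code the C functions would more likely live in separate .c/.h files next to the Go package.

// main.go — minimal sketch of the cgo setup assumed above.
// The C code sits in the comment immediately preceding `import "C"`.
package main

/*
#include <stdlib.h>

char* allocate() {
  return (char*)malloc(100);
}

void free_allocated(char *c) {
  free(c);
}
*/
import "C"

func main() {
  c := C.allocate()   // *C.char owned by the C heap
  C.free_allocated(c) // must be released manually
}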

You might argue that my functions are useless, but I designed them to be trivial on purpose. In practice, you will call C code to do something actually useful. Importantly, there is an allocation followed by a necessary deallocation: this is typical in realistic code.

Reasonably, Go programmers are likely to insist that the memory allocated by the C code be released automatically. Thankfully, Go has a mechanism for this purpose. You put the C data in a Go structure which is subject to automatic garbage collection. Then you tie the instances of this structure to a “finalizer” function which will be called before the Go structure is garbage collected. The code might look as follows:

type Cstr struct {
  cpointer *C.char
}

func AllocateAuto() *Cstr {
  answer := &Cstr{C.allocate()}
  runtime.SetFinalizer(answer, func(c *Cstr) {
    C.free_allocated(c.cpointer)
    runtime.KeepAlive(c)
  })
  return answer
}
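
As an aside, here is a minimal usage sketch of my own (not from the post), assuming imports "runtime" and "time" in addition to the cgo setup above. The runtime makes no guarantee about when, or even whether, a finalizer runs; forcing a collection and yielding briefly is only a heuristic to coax it along.

func useAndRelease() {
  s := AllocateAuto()
  // ... use s.cpointer with other C functions ...
  _ = s // placeholder for real work

  // After its last use, s is unreachable; the garbage collector may now
  // schedule the finalizer, which releases the C memory.
  runtime.GC()                 // request a collection
  time.Sleep(time.Millisecond) // give the finalizer goroutine a chance to run
}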

So far so good. Go is doing very well up until now.

But what is the performance impact? We are comparing these two routines. First, the inconvenient version where you have to manually free the allocated memory (Allocate and Free are thin Go wrappers around the C calls; a sketch follows below)…

p := Allocate()
Free(p)

and then the version which relies on Go’s memory management…

AllocateAuto()

Let us benchmark it. My benchmarking code is available; your results will differ from mine, but I only care about the big picture.
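
For concreteness, here is a rough sketch of what the wrappers and benchmark functions might look like. The names and bodies are my guesses based on the result labels below; the author’s actual benchmarking code may differ.

// In the package next to AllocateAuto (cgo cannot be used directly in
// _test.go files, so the thin wrappers live with the cgo code):
func Allocate() *Cstr {
  return &Cstr{C.allocate()}
}

func Free(p *Cstr) {
  C.free_allocated(p.cpointer)
}

// In a separate _test.go file (import "testing"):
func BenchmarkAllocateAuto(b *testing.B) {
  for i := 0; i < b.N; i++ {
    AllocateAuto()
  }
}

func BenchmarkAllocateFree(b *testing.B) {
  for i := 0; i < b.N; i++ {
    p := Allocate()
    Free(p)
  }
}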

In my case, the automated version is nearly ten times slower.

AllocateAuto 650 ns
Allocate-Free 75 ns

The 650 ns result is silly: at a clock frequency of a few GHz, it amounts to roughly two thousand CPU cycles.

Maybe it is the overhead due to garbage collection? Thankfully, Go allows us to disable garbage collection with GOGC=off:

AllocateAuto (no GC) 580 ns
Allocate-Free (no GC) 75 ns

So the numbers are slightly better, but barely so.
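
For reference, a run with the collector disabled can be reproduced with something like the following (assuming the benchmark is named BenchmarkAllocateAuto, as in the profiling command below):

GOGC=off go test -bench BenchmarkAllocateAuto -run -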

We can try to see where the problem lies with profiling:

go test -cpuprofile gah -bench BenchmarkAllocateAuto -run -
go tool pprof gah
> top

We find that most of the processing time is spent in runtime.cgocall:

     1.67s 70.17% 70.17%      1.67s 70.17%  runtime.cgocall
     0.23s  9.66% 79.83%      0.23s  9.66%  runtime.usleep
     0.12s  5.04% 84.87%      0.12s  5.04%  runtime.pthread_cond_signal

What if I try a dummy finalizer?

func AllocateDummy() *Cstr {
  answer := &Cstr{C.allocate()}
  runtime.SetFinalizer(answer, func(c *Cstr) {})
  return answer
}

I get the same poor performance, suggesting that it is really the finalizer that is expensive.

This is seemingly consistent with Java, which also has finalizers:

Oh, and there’s one more thing: there is a severe performance penalty for using finalizers. On my machine, the time to create and destroy a simple object is about 5.6ns. Adding a finalizer increases the time to 2,400ns. In other words, it is about 430 times slower to create and destroy objects with finalizers. (Effective Java 2nd Edition: Item 7: Avoid finalizers)

Maybe there is a way to do better; I hope there is, but I suspect not.

Further reading. Some notes on the cost of Go finalizers (in Go 1.20) by Chris Siebenmann.

Published by Daniel Lemire, a computer science professor at the University of Quebec (TELUQ).

26 thoughts on “The absurd cost of finalizers in Go”

  1. I’m not sure about Go’s GC, but in .NET and Java finalizers are run during GC. No CPU cycles are spent before that.

    1. In newer incarnations of Java, finalizers may not even be called. Finalizers should be replaced with try-finally or AutoCloseables. It’s really a bad idea to depend on finalizers.

        1. In Go? io.Closer and defer. Or at least just defer. E.g.:

          func whatever() {
            c := C.allocate()
            defer C.free_allocated(c)

            // my code
          } // c is deallocated

  2. Have you tried measuring a dummy finalizer that does not call into C? The title says “the absurd cost of finalizers” but, judging by your profiling results, it might be the absurd cost of CGo function calls (CGo is well known to be slow, though by how much I don’t know).

  3. How are finalizers run if garbage collection is disabled? If Go is anything like Java, then finalizers are called by GC when it determines the object is no longer in use. The Go docs seem to back this up. (But I’m not a Go expert so something else may be going on.)

    https://pkg.go.dev/runtime#SetFinalizer

      1. Ah, I see, you are benchmarking only the setup of the finalizer, not waiting for the object’s finalizer to be called. Yes, that overhead is surprising. I believe that the scenario from Effective Java includes the time waiting for the finalizer to be called, which includes GC latency, so it isn’t directly comparable.

  4. In Go, a typical pattern is

    r := createMyResource()
    // could check errors if that’s a thing for r
    defer closeMyResource(r)
    // use the resource r without fears of leaks for the scope of the function

    Defer calls used to be expensive, but they were optimized a few years back. It would be interesting to see how they compare to finalizers for your use case. I expect they would be more expensive than the manual call, but safer in the face of panics or multiple return points in the calling code.

      1. A big part of the Go standard library is already designed around the “defer close()” pattern: file.Close(), bufio.Flush(), httpResponse.Body.Close(), mutex.Unlock(), close(channel), … So devs get used to it anyway.
        You can’t force the caller to check the error returned by your function. That doesn’t mean it is “not a functional equivalent for exceptions” – it is “equivalent enough”. And all Go devs know what to do with “err” (even if beginners make mistakes). I think “if err != nil” is conceptually the same thing as “defer close()” – no compiler guarantees, manual work required, and that’s fine.
        We force devs to add “defer dbTransaction.Rollback()” with a custom linter: https://github.com/ledgerwatch/erigon/blob/devel/rules.go#L31

  5. In Go, a typical pattern is … defer

    this was also my first thought. Why not just defer C.free_allocated(c)?

    I cannot force the caller to use a defer in Go

    can you maybe give an example where this is a problem?

  6. For some context, Java’s Object.finalize() has been deprecated since Java 9 (2017), and it was not recommended for years before then. It was deprecated for removal ~last year, in Java 18. Performance and non-determinism are just two of the reasons why.

    There’s a very detailed description of this issue at https://openjdk.org/jeps/421, not just from a Java perspective but a more generic VM and GC perspective.

    For Java specifically, though, try-with-resources and java.lang.ref.Cleaner are two of the recommended replacements.

  7. The quoted Java measurement is likely apples and oranges, and something similar may apply to the Go code.
    Such languages depend on inlining small objects to lessen the cost of memory management.
    Very likely the use of finalizers kills that inlining (because one does not really need finalizers in cases where small objects can be inlined), as finalizable objects will be put in a memory-management queue.
    IMHO a more realistic benchmark would allocate a list of 1000 such objects and then free them, so you don’t measure “boxing + finalizer” when you want to measure finalizers only.

  8. Finalizers should be the last option. If a design finds itself reaching for finalizers, then perhaps the design needs to be rethought to use defer or to avoid using C to manage resources. I’m not saying finalizers (or something similar) aren’t needed sometimes, but as the blog points out, there is a cost.

  9. (Slightly off topic)

    IMHO this illustrates a more fundamental issue we have bought into with high-level programming languages – while they allow easier, more convenient programming by abstracting away many details, these conveniences come at a price and increase the distance to the machine and to a proper understanding of its real work. Or to put it differently – if Go offers a finalizer, it will get used, no matter how bad it is.

    To illustrate that further, IMHO the broad adoption of GCs into languages has led to a programming style where devs don’t care much about memory anymore, resulting in data being moved around excessively (e.g. copy constructors, fire-and-forget memory).

    There was a paper from Google years ago stating that memcpy operations account for a very high percentage of their server load (I don’t remember the exact numbers anymore, I think it was >30%). IMHO this will only get worse with the ongoing broader adoption of languages that rely heavily on that fire-and-forget memory model, like JavaScript/NodeJS and Python.
