Optimizing the Go garbage collector

Last Updated Apr 07, 2023

In systems with high memory usage, garbage collection can cause performance problems. Heavy garbage collection cycles can add significant latency to a request.

What is a garbage collector?

When software runs, it needs to consume memory (Chrome, for example).

A garbage collector is a program that automatically removes unwanted data held temporarily in memory during processing. In other words, it is a way to destroy unused objects. It helps clean up memory by finding and removing data that is no longer being used, so that the program has more memory available for new data.


To optimize the Go garbage collector, we need to learn how it works.

How does garbage collection work in Go?

The implementation of Go's collector has changed and evolved with every release of Go. As of version 1.19, the Go programming language uses a concurrent, tricolor mark-and-sweep algorithm for garbage collection. This algorithm allows the GC to run concurrently with the main program without a significant impact on performance.

I'm not suggesting that the Go garbage collector is inherently slow or inefficient, but it's important to recognize that garbage collectors in general can always be faster. As developers, we are always looking for ways to improve the performance of our programs, right? By optimizing our code to minimize the amount of garbage generated, we can improve the efficiency of our programs. But first, let's talk a bit about the algorithm that Go uses: the concurrent, tricolor mark-and-sweep algorithm.

Tricolor Mark and Sweep Algorithm

I have a bunch of PlayStation 5 and Xbox games in my room. Some of them are my favorites that I'm currently playing, like Persona 5, while others are games I don't play anymore.

When my wife asked me to clean up my games and sell the ones I don't play to make some money, I thought of the tricolor mark and sweep algorithm. So, I decided to follow this algorithm to make my task easier.

First, I grabbed three bags of different colors: black, white, and gray. I put all the game discs that I'm still playing into the black bag, and the white bag is for all the other games that I'm no longer interested in.

However, there are many games, and sometimes I can be a bit lazy. So, after working for about five minutes, I put all the remaining games that I haven't checked yet into the gray bag.

Finally, I sold the games in the white bag, got some cash, and was ready to purchase new games. My room is clean, and my wife is happy. All in all, everything worked out well.

The "tricolor" part of the algorithm refers to the three colors the collector uses to mark data: white for objects it has not visited yet, gray for objects that are reachable but whose references still need to be scanned, and black for objects that are reachable and fully scanned. The "mark and sweep" part refers to the fact that the collector first marks the live data, and then sweeps through memory to free whatever is still white.

You can read more about the algorithm on Wikipedia if you want to dig deeper: https://en.wikipedia.org/wiki/Tracing_garbage_collection
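
To make the three colors concrete, here is a minimal, single-threaded sketch of one mark and sweep pass over a toy object graph. All of the names here (object, white, gray, black, the sample "heap") are made up for illustration; the real collector works on actual heap memory and runs concurrently with your program.

package main

import "fmt"

// object is a toy heap object; refs holds the indexes of the objects it points to.
type object struct {
    name string
    refs []int
}

const (
    white = iota // not visited yet: a candidate for collection
    gray         // reachable, but its references have not been scanned yet
    black        // reachable and fully scanned: survives the sweep
)

func main() {
    // a tiny "heap": config -> logger, plus an unreachable cache -> entry pair
    heap := []object{
        {name: "config", refs: []int{1}},
        {name: "logger"},
        {name: "orphan cache", refs: []int{3}},
        {name: "orphan entry"},
    }
    roots := []int{0} // objects reachable directly from the program (stack, globals)

    // mark phase: everything starts white, the roots become gray
    color := make([]int, len(heap))
    work := append([]int{}, roots...)
    for _, r := range roots {
        color[r] = gray
    }
    for len(work) > 0 {
        i := work[0]
        work = work[1:]
        for _, ref := range heap[i].refs {
            if color[ref] == white {
                color[ref] = gray // discovered: scan it later
                work = append(work, ref)
            }
        }
        color[i] = black // all of i's references have been scanned
    }

    // sweep phase: anything still white is unreachable and can be freed
    for i, obj := range heap {
        if color[i] == white {
            fmt.Println("freeing:", obj.name)
        } else {
            fmt.Println("keeping:", obj.name)
        }
    }
}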

So, now we know how it works under the hood. Time for us to do the real work: optimizing the Go garbage collector.

Optimize the Go garbage collector

There are 4 things we could do:

  1. reducing the amount of garbage generated by our code
  2. using smaller data structures
  3. avoiding circular references
  4. tuning garbage collection settings

Reducing the amount of garbage generated by our code

Less garbage is always better; it improves the performance of our program, for sure!

One way to do this is to set a slice's capacity upfront when we know how many elements it will hold. Let's take an example:

package main

import "fmt"

// define the capacity of the slice upfront if you can
var slice = make([]int, 0, 10)

func main() {
    for i := 0; i < 10; i++ {
        slice = append(slice, i+1)
    }
    fmt.Println(slice)
}

By setting the capacity upfront, we avoid repeated reallocations of the backing array as the slice grows, which reduces the amount of garbage generated.
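
To see the difference, compare with a slice that starts with no capacity: every time the backing array fills up, append has to allocate a bigger one and the old array becomes garbage. (The exact growth pattern is a runtime implementation detail, so the numbers may differ between Go versions.)

package main

import "fmt"

func main() {
    // no capacity set upfront: append reallocates the backing array as the slice grows
    grown := []int{}
    lastCap := cap(grown)
    for i := 0; i < 10; i++ {
        grown = append(grown, i+1)
        if cap(grown) != lastCap {
            fmt.Printf("len=%d: reallocated, cap %d -> %d\n", len(grown), lastCap, cap(grown))
            lastCap = cap(grown)
        }
    }

    // capacity set upfront: one allocation, no garbage from intermediate arrays
    preallocated := make([]int, 0, 10)
    for i := 0; i < 10; i++ {
        preallocated = append(preallocated, i+1)
    }
    fmt.Printf("preallocated: cap stayed at %d\n", cap(preallocated))
}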

Another way is to cache frequently allocated objects. For this we can use sync.Pool, a thread-safe cache that stores a set of objects that can be reused across multiple goroutines.

package main

import "sync"

// node modules are the heaviest objects in the universe!
type NodeModule struct {
    // your fields...
}

// initialize the pool with a constructor for new objects
var pool = sync.Pool{
    New: func() interface{} {
        return &NodeModule{} // create a new object when the pool is empty
    },
}

func main() {
    // get an object from the pool (or a freshly created one if the pool is empty)
    o := pool.Get().(*NodeModule)
    // put the object back into the pool when we are done, thanks to defer!
    defer pool.Put(o)

    // use the object ...
}

sync.Pool also has its own performance cost. Only use it when your object is big and you have to create it frequently.

Using smaller data structures

If you only need to represent numbers between 0 and 255, you can use a uint8 instead of an int.
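
A quick way to see the effect is to compare struct sizes with unsafe.Sizeof. The Wide and Narrow types below are just hypothetical examples, and the exact sizes depend on your platform and struct padding.

package main

import (
    "fmt"
    "unsafe"
)

// Wide stores small values in int fields; on 64-bit platforms each int is 8 bytes.
type Wide struct {
    age   int
    score int
}

// Narrow stores the same 0-255 values in uint8 fields, one byte each.
type Narrow struct {
    age   uint8
    score uint8
}

func main() {
    fmt.Println(unsafe.Sizeof(Wide{}))   // typically 16 on 64-bit platforms
    fmt.Println(unsafe.Sizeof(Narrow{})) // typically 2
}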

Also avoid unnecessarily copying slices. For example:

    beep := []int{1, 2, 3, 4, 5}
    // boop is a brand new slice with its own backing array
    boop := append([]int{}, beep...)

With append and the spread operator (...) we create a new slice boop with the same elements as beep. Note that this does copy the data into a new backing array, so only do it when you genuinely need an independent copy; otherwise, just reslice beep and keep sharing the existing backing array.
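
If all you need is a view into part of the data, a plain reslice avoids the copy entirely. The trade-off is that both slices then share one backing array, so a write through one is visible through the other. A small sketch:

package main

import "fmt"

func main() {
    beep := []int{1, 2, 3, 4, 5}

    // view shares beep's backing array: no allocation, no copy
    view := beep[1:4]

    // clone has its own backing array: an independent copy
    clone := append([]int{}, beep[1:4]...)

    beep[1] = 42
    fmt.Println(view)  // [42 3 4] - sees the change
    fmt.Println(clone) // [2 3 4]  - unaffected
}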

There is not much more to say on this topic: the smaller, the better.

Avoiding circular references

Circular references occur when two or more objects reference each other, creating a loop, and that can cause memory leaks: as long as anything still points into the loop, every object in it stays reachable and none of them can be collected.


type Person struct {
    name  string
    lover *Person
}

func main() {
    // there are two people
    eva := &Person{name: "Eva"}
    adam := &Person{name: "Adam"}

    // eva loves adam
    eva.lover = adam

    // adam also loves eva
    adam.lover = eva

    // together they create a circular reference!
}

An easy way to break a circular reference is to walk the chain and explicitly clear the links, for example with a method that applies a closure to each object in the loop.

type Person struct {
    name  string
    lover *Person
}

// Breakup walks the chain of lovers starting at p and applies f to each one.
func (p *Person) Breakup(f func(person *Person)) {
    current := p
    for current != nil {
        next := current.lover // remember the next person before f clears the link
        f(current)
        current = next
        if current == p {
            break // back where we started: the whole loop has been visited
        }
    }
}

func main() {
    eva := &Person{name: "Eva"}
    adam := &Person{name: "Adam"}

    eva.lover = adam
    adam.lover = eva

    // break up and move on
    eva.Breakup(func(person *Person) {
        person.lover = nil
    })
}

By applying Breakup, adam and eva can move on, breaking the circular reference!

Tuning garbage collection settings

GOGC: This environment variable controls how much the heap may grow between garbage collection cycles. By default, Go sets it to 100, which means a new collection is triggered when the heap has grown by 100% over the live heap left after the previous collection. Setting a higher value increases memory usage but reduces garbage collection frequency.
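
Besides the environment variable, the same setting can be changed at runtime with debug.SetGCPercent from the runtime/debug package. A minimal sketch (200 is just an example value):

package main

import (
    "fmt"
    "runtime/debug"
)

func main() {
    // equivalent to starting the program with GOGC=200:
    // the heap may grow 200% over the live heap before the next collection
    old := debug.SetGCPercent(200)
    fmt.Println("previous GC percent:", old)
}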

runtime.MemStats: This struct provides detailed information about memory usage, including heap and stack size, garbage collection statistics, and more. It can be used to monitor and optimize memory usage at runtime.
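
For example, a quick snapshot with runtime.ReadMemStats might look like this; which fields matter depends on what you are monitoring:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)

    fmt.Printf("heap in use:    %d bytes\n", m.HeapAlloc)
    fmt.Printf("GC cycles run:  %d\n", m.NumGC)
    fmt.Printf("total GC pause: %d ns\n", m.PauseTotalNs)
}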

runtime.GC(): This function can be used to manually trigger garbage collection cycles. This can be useful in scenarios where memory usage spikes are expected, such as during file uploads or large data processing.
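
As a sketch, here is what forcing a collection after a large temporary allocation looks like; the exact numbers will vary between runs and Go versions:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    var before, after runtime.MemStats

    // simulate a memory spike with a large temporary buffer
    buf := make([]byte, 64<<20) // 64 MiB
    buf[0] = 1
    buf = nil // drop the only reference so the buffer becomes garbage

    runtime.ReadMemStats(&before)
    runtime.GC() // force a collection instead of waiting for the next automatic one
    runtime.ReadMemStats(&after)

    fmt.Printf("heap before GC: %d bytes, after GC: %d bytes\n", before.HeapAlloc, after.HeapAlloc)
}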

Small tweaks here can have a significant impact on performance and memory usage. Be careful: always test and monitor your application after each tuning change.

That's all, see you in the next topic. I hope you are doing well and having a great week!
