6 min read

Golang: Circular Queue from Scratch

What a RingQueue is. How to design it from scratch. And why it's super helpful in solving some performance bottlenecks in your programs.

Hey friend,

As you know, Go has a few fundamental containers built into the language. Although more complex structures can be built on top of those, the standard library's collection of containers is less mature and plentiful than something like Java's or Python's.

Lately, I've been doing a lot of memory-constrained programming and I wanted to look at a very simple and super useful data structure - a Circular queue.

It's also often called a Ring Queue or sometimes a RoundRobin Queue.

I'll do some microbenchmarking at the end and we'll see how this container performs.

Let's dive in :)

What a Ring Queue is and Why use it

So, what is a Ring Queue and why is it useful:

  1. It uses a static / fixed array buffer (no costly allocations)
  2. It does not shift elements on element removal (no costly memcopy)
  3. It can start anywhere in the buffer and loop over the end

You'd want to use an RQ in these cases:

  1. You cannot allocate memory (not enough memory, or no allocator available if you're writing bare-metal code)
  2. Your data elements are large enough that copying them is detrimental to program performance.

Here's a few illustrations of a possible RQ state:

Three states of a circular queue of 10 elements. Top to bottom: an empty queue, start and finish markers are on zero, then no-overlap state, end marker is further than the start marker, lastly an overlap state with the end marker before the start marker.

There is one tricky ambiguous state though, can you spot it? 😉

Because the end marker is exclusive, i.e. it points at the element after the last element in the queue, both the empty-queue state and the fully filled state have the same marker positions. We need to be careful about that in the code! 🤔
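To make the ambiguity concrete, here's a standalone toy sketch (the markers type and length helper are mine, not part of the module we're about to build) showing why a length calculation must consult an explicit full flag when start == end:

```go
package main

import "fmt"

// Minimal sketch: with start == end, the marker positions alone
// cannot distinguish an empty queue from a full one, so we carry a flag.
type markers struct {
	start, end int
	isFull     bool
	capacity   int
}

// length returns the number of stored elements.
func (m markers) length() int {
	if m.isFull {
		return m.capacity // start == end and the buffer is full
	}
	// modulo arithmetic handles the wrap-around case (end < start)
	return (m.end - m.start + m.capacity) % m.capacity
}

func main() {
	empty := markers{start: 0, end: 0, isFull: false, capacity: 10}
	full := markers{start: 0, end: 0, isFull: true, capacity: 10}
	fmt.Println(empty.length(), full.length()) // 0 10
}
```

Without the flag, both queues above would report the same length.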

Two cases of a queue of 10 elements, top one shows an empty queue with both start and end markers at zero, the bottom image shows a full queue with both start and end markers at zero

Container Structure

I'll be creating a new Go module for this container since we can make it generic and reuse it in future projects.

If you want to refresh on how to initialize Go modules (I don't do this often myself, so I look this up every time), head to this page: https://go.dev/doc/tutorial/create-module

Now, we have an empty module:

package ringqueue

import "fmt" // needed later for error values and the Stringer

// ...

Let's define our container structure and a simple constructor that allocates space and sets default values:

type RingQueue[T any] struct {
	data   []T  // container data of a generic type T
	isFull bool // disambiguate whether the queue is full or empty
	start  int  // start index (inclusive, i.e. first element)
	end    int  // end index (exclusive, i.e. next after last element)
}

func NewRingQueue[T any](capacity int) *RingQueue[T] {
	return &RingQueue[T]{
		data:   make([]T, capacity), // buffer allocation
		isFull: false,               // non-full at start
		start:  0,                   // start with 0 index
		end:    0,                   // end with 0 as well
	}
}
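A side note on the constructor: make([]T, capacity) is the only allocation the queue ever performs. It reserves the full buffer up front and zero-initializes every slot, which is exactly the "no costly allocations" property from the list above. A tiny standalone sketch of that behavior:

```go
package main

import "fmt"

func main() {
	// make([]T, capacity) allocates the whole buffer up front and
	// zero-initializes every slot -- no further allocations on Push.
	data := make([]int, 5)
	fmt.Println(len(data), cap(data), data) // 5 5 [0 0 0 0 0]
}
```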

Add and Remove Methods

Now to the interesting part, how we manage the data inside the queue and add/remove elements there.

Let's think of the Add first, when we add:

  1. We add to the end of the queue. This means that our end marker, which points at the next-after-last position in the queue, is exactly where we need to place the new element
  2. Once we add an element at the end position, it becomes the last element in the queue, so we need to shift our end marker further
  3. Markers should not cross each other: if the end and start markers are at the same position, the queue is either empty or full. We're fine with the empty case, but we need to check for the full one, otherwise we'll overwrite existing data.

Let's see how that's done in code:

func (r *RingQueue[T]) Push(elem T) error {
	if r.isFull {
		return fmt.Errorf("out of bounds push, container is full")
	}

	r.data[r.end] = elem              // place the new element on the available space
	r.end = (r.end + 1) % len(r.data) // move the end forward by modulo of capacity
	r.isFull = r.end == r.start       // check if we're full now

	return nil
}

The modulo operation gives us a convenient way to loop over the end of the available space.
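A quick standalone illustration of that wrap-around (a toy snippet, not part of the module):

```go
package main

import "fmt"

func main() {
	capacity := 10
	end := 9                   // marker sits at the last valid index
	end = (end + 1) % capacity // advancing it wraps back around
	fmt.Println(end)           // 0
}
```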

Now, let's implement Remove operation:

func (r *RingQueue[T]) Pop() (T, error) {
	var res T // "zero" element (respective of the type)
	if !r.isFull && r.start == r.end {
		return res, fmt.Errorf("empty queue")
	}

	res = r.data[r.start]                 // copy over the first element in the queue
	r.start = (r.start + 1) % len(r.data) // move the start of the queue
	r.isFull = false                      // since we're removing elements, we can never be full

	return res, nil
}

Let's also make a quick quality-of-life improvement that will help us visualize the queue state and debug it if needed:

func (r *RingQueue[T]) String() string {
	return fmt.Sprintf(
		"[RRQ full:%v size:%d start:%d end:%d data:%v]",
		r.isFull,
		len(r.data),
		r.start,
		r.end,
		r.data)
}

This Stringer method will be called whenever we print the object and gives a more readable representation than a raw data dump.
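One small accessor is worth mentioning now: the benchmark later in the post calls an IsFull method that we haven't defined. In the linked repo it's presumably a trivial getter; a sketch of what it would look like:

```go
package main

import "fmt"

type RingQueue[T any] struct {
	data   []T
	isFull bool
	start  int
	end    int
}

// IsFull reports whether the buffer has no free slots left;
// it simply exposes the disambiguation flag.
func (r *RingQueue[T]) IsFull() bool {
	return r.isFull
}

func main() {
	q := &RingQueue[int]{data: make([]int, 1)}
	fmt.Println(q.IsFull()) // false
}
```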

Just give me the code

Now, you'd want to do some testing to see if this logic works as expected. I won't do that here, but if you're interested, you can find the entire code of this module, along with simple tests (by no means exhaustive), in my GitHub repo:
https://github.com/sombr/go-container-roundrobin.git

RingQueue Performance

Do you like benchmarks? I love them, even though in many cases they are non-exhaustive, relatively synthetic and might give a skewed view of reality :)

So, let's see how our implementation performs against a simple Go array.

Remember, what we're looking for is the array-based queue slowing down due to excessive copying.

I'll run the worst-case scenario as well, where you're always at full capacity and need to shift all the existing elements. This is actually a realistic use case: for example, when you use an array as an event queue in a simulation program and always take the lowest timestamp from the start of the queue and push the highest timestamp to the end.

Three states of a regular array queue are shown: top shows a full queue, middle one a queue with the first element removed, bottom image shows a shift of all elements back to zero and an empty space at the end of the queue.

Note, however, that a Ring Queue has the overhead of tracking the start and end markers and performing the overlap check, which is more operations than just copying an item into an array.

Micro Benchmark

Conveniently, Go offers a simple benchmarking framework. Let's add the following benchmark tests to our test file:

func BenchmarkRR(b *testing.B) {
	rr := NewRingQueue[int](100_000)

	for n := 0; n < b.N; n++ {
		if rr.IsFull() {
			rr.Pop()
		}
		rr.Push(n)
	}
}

func BenchmarkArray(b *testing.B) {
	var ar [100_000]int
	size := 0

	for n := 0; n < b.N; n++ {
		if size >= 100_000 {
			copy(ar[0:], ar[1:])
			size--
		}

		ar[size] = n
		size++
	}
}

Which one do you think will be faster? 😉 Let's run it!

go test -bench=. -benchmem

Here are the results on my Intel NUC Ubuntu Linux machine:

goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz
BenchmarkRR-12          59798865                18.50 ns/op
BenchmarkArray-12        1000000             17321.00 ns/op
PASS

That's exactly what we expected: the array version is almost 1,000 times slower because of continuous copying!

However, there's always a caveat. If I run the same test with a queue size of 10_000 instead of 100_000:

goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz
BenchmarkRR-12          57184480                18.65 ns/op
BenchmarkArray-12        1000000              1068 ns/op
PASS
ok      github.com/sombr/go-container-roundrobin        2.160s

This looks significantly better!

Let's plot the relation of run times and queue lengths:

Here we can clearly see that the RingQueue's performance stays constant regardless of queue size, whereas the array-backed queue slows down linearly as the queue grows.

In this experiment (your results might differ depending on the machine, OS and architecture) the break-even point is around 150-200 elements.

This also gives a good indication that if you purposefully use small queues, they might still be faster than the RingQueue due to its bookkeeping overhead.

That's it :) Please leave a comment if you want to discuss this and follow me for more deep dives into various aspects of system engineering 😉