Skip to content

Adventures in Engine Construction – Rectangle Packing

September 17, 2013
Optimal packing?

Optimal packing… or joptimal packing?

Introduction

Game engines are absolutely one of my greatest passions – although, they’d have to be, being both my job and my hobby. I have been working on a few toy ideas to develop a small engine, and I hope that this is the start of a series of blog posts dedicated to that. So, with no further adieu, let us begin our adventure!

I was considering naming this series “Adventures in Engine Design,” but then I felt that would be somewhat false advertising. Not to mention that my colleagues who read this blog would undoubtedly spend the next few years asking why I never seem to bring any design to their table.

Oh and also I’ve been silent for two years. To the three odd people who keep visiting this page, sorry about that!

The Problem

Quick! Pop quiz! Given an empty rectangular region, how do you allocate space for an arbitrary list of rectangles? Beyond some very basic estimates (let’s say the minimum and maximum possible sizes), devise an algorithm to allocate each rectangle a unique space in your initial region – they cannot overlap and you must allocate a slot immediately, you are not allowed to defer it to later.

This type of problem is called “Online Rectangular Bin Packing” according to a search I literally just did on Google, so while it may not actually be the real name for it, it at least will be the most hip at the time of writing, and if my WordPress telemetry is telling the truth, that is what you guys care about the most! This type of problem – or rather, a solution to it – is a pretty pertinent problem for us poor systems programmers, as several useful engine features end up relying on it.

Glyph Pages

Glyphs are an absolute bastard for most engines. In the good old days*, fonts in games were stored as a single texture, usually the ANSI characters from 0x20 (space) to 0x7E (tilde). After all, we know that once you go beyond the 0x80 Divide, all the characters start turning quite sinister – becoming mocking smiley faces, or, if you weren’t on DOS, letters with dots and other strange things. Besides, they were probably for foreigners, and therefore not worth the extra effort. There are a few holdouts though – if you dig into the DirectX SDK you can find this beauty:

Old school fonts

Old school fonts

This was once used for drawing debug text in D3D, but has since been deprecated by the (all but) loss of D3DX. When you rendered a string, it would either look up (or calculate at load time) the appropriate part of the texture for each letter, generate a quad for each character, and done – problem solved. This solution was pretty much perfect – even a gigantic string could be rendered in one draw call, as all the characters shared a texture. You were limited only by things like vertex count, overdraw, and imagination. Unfortunately, things started getting tricky from then on…

Two things happened, I suppose – people began to expect that their fonts should look nice (at any size!), and people who were inconsiderate enough to not use the same character set as the English alphabet (+ the required punctuation for C++) started buying games. Suddenly, not supporting Swahili was akin to being Hitler Lite**, and systems programmers scrambled for a solution.

Enter the amorphous concept of “glyph pages” or “glyph buffers” or “stuff to keep characters in”. You see, a TTF file like Arial supports basically every language ever made*** as it has a functional Unicode character set to for all the Windows users out there. While the 738kb it weighs in at is fairly large, that is tiny compared to the rendered size of the glyphs, especially given that the outlines in a font file are (essentially) scalable and therefore one TTF has not only all the Latin characters in, but all the East Asian ideograms at all the sizes you could want.****

From that, it’s clear that we need a data structure to support an arbitrary set of rendered characters, so we can just distribute a real font with our game (or use one that is built-in to the OS), draw the characters at whatever size we want, and store them in a texture for fast reuse and get the performance benefit of not having to change textures for each character. Also, we (probably) don’t have the luxury of knowing what character will be needed in advance, so we’ll need a system that is robust enough to take these things in it’s stride.

* Like, the 90s, I suppose.
** Godwin’s law, coincidentally, became enforced at approximately the same time. Also, we don’t actually do Swahili. That would require the speakers to buy games.
*** Within reason. Arial (“Unicode MS”) supports a huge chunk of the Unicode standard.
**** Also, now that the Unicode committee has actually gone certifiably mad, I only use font files that support such useful graphemes such as “Love Hotel” or “Pile of Poo”.

Partitioning and Packing

Now the problem is set up, what do we do? Crucially, what makes this particularly difficult is the online requirement – we can’t take a long time marking out these regions, and we can’t (e.g.,) sort the incoming rectangles to make the problem easier – after all, we have frames to render! While the offline methods are often similar, one would expect them to do a better job than an equivalent online method. As an aside, some code I was working with once did (offline) rectangle packing using a genetic algorithm to drive randomized blitting of textures to the atlas. It was horrific.

The problem can be further divided into a few parts:

  • Keeping track of “empty” or “used” space;
  • Selecting which empty region to place incoming rectangles into;
  • Positioning of the rectangles within the region;
  • Updating the map of used/empty space.

Ideally, all while doing it at a speed that will forever keep it out of your profiling.

Anyway, it turns out that some guys who really love rectangle packing have spent a long time thinking about this, and there are some well-known algorithms to deal with the problem. I’ll run through some very basic approaches before we talk about the meat of the issue.

Algorithm 1: Shelves

This one is simple – pick an axis (let’s choose Y, nobody likes X). Insert a rectangle into the top left of the atlas region. Split the atlas along the axis you picked – anything of equal or lesser size in that axis (depending on the heuristic) can fit next to the rectangle you just placed – anything that doesn’t fit goes into the un-allocated space below. This is called the Shelf structure because it produces “shelves” of similarly sized objects.

The Shelf Partition Technique

The Shelf Partition Technique

An example of a poor heuristic locking out space in the shelf as unusable.

An example of a poor heuristic locking out space in the shelf as unusable.

The algorithm isn’t fantastic at using space, and a good one would need a well tuned heuristic to pick whether to use an existing shelf or make a new one – keen readers will note that I didn’t use a naive shelf algorithm for the image above, otherwise it would look like the one to the right.

On the other hand, it requires evaluation of only N regions to place an object, where N is the number of shelves + the unshelved space. Since the packing within a shelf is trivial (pack to left), this algorithm performs incredibly quickly.

Algorithm 2: Guillotine

A cursory look at the shelf algorithm above leads to two conclusions: that the heuristic, even for a trivial algorithm (add to current shelf or make a new one), makes a big difference, and, that the shelf algorithm doesn’t track the free space well enough. Clearly, an algorithm which keeps more options open for possible packing positions is a better choice. Enter THE GUILLOTINE, which is essentially like having two shelves in parallel – instead of splitting only on one axis, split on two!

Guillotine Algorithm

Guillotine Algorithm

Of course, it’s not quite that easy – even after one split there are two overlapping, unique “free” regions. This is fine, but obviously the algorithm will need to do significantly more work to keep the internal state consistent when more rectangles are added. After all, the last thing you want is to overbook a region!

As the complexity grows, we also need to start looking very closely at the heuristic – after one split, we now have 4 possible places to add new rectangles. In the example above, you can see (hopefully) that the regions H1V0 and V1 are deleted when the new item is packed into H1, as intersecting regions are deleted on item placement – even with that happening, you can see that all the usable area is still “visible” to the algorithm – and some of the smaller regions (such as H2) are part of larger regions and therefore don’t become too small to be useful.

Algorithm 3: MaxRects

If you stare long enough at the images for the guillotine algorithm, you may come to the following conclusions: the problem with this algorithm is that it does not preserve large free regions. Observe that the largest “vertically-oriented” free region encompasses space in H0, H2 and H3, and yet there is no corresponding free region as tracked by the algorithm (I unfortunately messed up the graphic, V3, which looks like the obvious choice, only goes up to the top of H2, because it was constructed from the region H1), so, while there is clearly space for a tall (e.g., 20×256) block, an allocation of that size will in fact fail after the two items in the demonstration have been placed.

We trundle towards the inevitable conclusion that the algorithms essentially get better as you increase the number and size of free rectangles that you track. It may be surprising that the (arguably) “best” algorithm is simply naive in its approach – just track every leftover part of the region. That way you won’t lose any space on your texture sheet! Of course, you will need a cleanup step as well – any rectangle region wholly contained within another region should be discarded, otherwise you will be testing too many regions when you insert things.

MaxRects Algorithm

MaxRects Algorithm

In this case, the algorithm does pretty well – after a significant amount of extra cleanup, it has fewer regions than the guillotine algorithm, which it pays for with the extra runtime cost.

There are some other more exotic algorithms that I haven’t covered at all (Skyline – like shelves only with gravity, and a whole clade of tree/spacial partition based systems), but I decided to try my hand at implementing MaxRects.

MaxRects Algorithm Overview

AIEC-Rects7The algorithm in pseudocode is this:

  1. Check every “free” rectangle to see if your item can fit in it;
  2. Select the best free rectangle using some sort of (hopefully sensible) heuristic;
  3. Place your item in the chosen rectangle;
  4. Iterate over every free rectangle – if the newly placed item intersects the free region, remove it and, generate four new free rectangles for each side (see image);
  5. Insert the new rectangles – removing any zero sized free rectangles or free rectangles entirely contained by other free regions;
  6. Profit

Now, as far as I am concerned, the algorithm only has a few interesting parts. First, as I have alluded to, the heuristic you use in step 2 is pretty crucial, and is where many gains can come. Step 3 is actually a whole new problem and I decided to punt on it entirely – my implementation always places the item in the top left corner of the selected region. The rest of the work is essentially in implementation details.

I tried a few different data structures (since there is so much churning of free rectangles), but in the end – as is often the case – keeping a big vector of elements and using another vector for auxiliary storage (I end up rebuilding the whole list of rectangles from scratch) turned out to be the fastest solution, especially if you can reduce the memory allocation count by using pre-allocated vectors.

Tuning the Heuristic

So this leaves really only one remaining part – the heuristic we use to pick the rectangle. Qualitatively, we want to pick one which leaves as much usable, sensibly proportioned space leftover – ideally, the fewer tiny slivers or islands of unusable space, the better. So, where to begin? Let us start with the simplest method we can:

Naive (first-fit) Allocation

Naive (first-fit) Allocation

First Fit

Jam the item in the first region you find.

Actually this doesn’t even count as a heuristic because we’re not qualitatively deciding between (more than one) solution(s). I have set up a test program which will pack a range of items into a 256×256 box. The random seed it uses is the same for all runs, so we can compare algorithms sensibly.

Tightest Fit

Tightest Fit

Tightest Fit

Calculate the leftover “size” using a metric (I picked manhattan size) and select the region with the minimum remainder. Qualitatively, this works pretty well, but is it as good as we can do?

struct TightestFit
{
	IntegerRectangle Item;

	bool operator()(const IntegerRectangle& a, const IntegerRectangle& b) const
	{
		int x0 = a.Width - Item.Width; // Leftover size in X
		int y0 = a.Height - Item.Height; // Leftover size in Y
		int x1 = b.Width - Item.Width;
		int y1 = b.Height - Item.Height;

		int remainderA = x0 + y0;
		int remainderB = x1 + y1;

		return remainderA < remainderB;
	}
};

Shortest Leftover (Small) and Longest Leftover (Long)

Shortest Leftover Edge

Shortest Leftover Edge

These heuristics look at the remaining leftover axes of the box, and, respectively, select the rectangle with the least remaining short side, or the largest remaining long side. They should make sense from a design perspective – the first is another attempt at getting a tight fit, and the second is a way of measuring how “intact” the remaining free space is. Surprisingly when I animated these, they ended up identical for this set of inputs, so you only get one gif.

struct ShortestLeftover
{
	IntegerRectangle Item;

	bool operator()(const IntegerRectangle& a, const IntegerRectangle& b) const
	{
		int x0 = a.Width - Item.Width;
		int y0 = a.Height - Item.Height;
		int x1 = b.Width - Item.Width;
		int y1 = b.Height - Item.Height;

		int shortest0 = std::min(x0, y0);
		int shortest1 = std::min(x1, y1);

		return shortest0 < shortest1;
 	}
};
struct LongestLeftover
{
	IntegerRectangle Item;

	bool operator()(const IntegerRectangle& a, const IntegerRectangle& b) const
	{
		int x0 = a.Width - Item.Width;
		int y0 = a.Height - Item.Height;
		int x1 = b.Width - Item.Width;
		int y1 = b.Height - Item.Height;
		int longest0 = std::max(x0, y0);
		int longest1 = std::max(x1, y1);

		// Switched to greater-than because support code is looking
		// for the least of A, B
		return longest0 > longest1;
	}
};

Leftover Area

Minimize Area

Minimize Area

The shortest leftover axis heuristic works well – it is consistently as good or better than the “tightest fit” heuristic (although you can’t really see that from this dataset). It also leads us to another conclusion – what if we weight both axes (like the tightest fit), but especially favor results which fill a box in either axis. A simple way to do that is to multiply the leftover sides together, and sort using that.

With some luck, you should be able to see that this algorithm, while not doing any better than the others (item #63 is actually pretty huge in this test), produces quite a well packed result, with large free regions left over.

struct LeastLeftoverArea
{
	IntegerRectangle Item;

	bool operator()(const IntegerRectangle& a, const IntegerRectangle& b) const
	{
		int x0 = a.Width - Item.Width;
		int y0 = a.Height - Item.Height;
		int x1 = b.Width - Item.Width;
		int y1 = b.Height - Item.Height;

		int area0 = x0 * y0;
		int area1 = x1 * y1;

		return area0 < area1;
	}
};

Comedy Option: The Typo Heuristic

This is what happens when your heuristic is wrong.

This is what happens when your heuristic is wrong.

As there isn’t much evidence that these heuristics are terribly important in the above gifs, I thought I’d include this – this is what happens when you implement the Longest Leftover Axis heuristic but forget to switch the comparison, producing exactly the wrong result.

Implementation Notes

I strongly suggest you set up a minimum size for rectangles, and discard any below that size. I set this to 8px, which meant that I lost some area to unusable space, but it was unlikely that I’d get objects smaller than that being inserted into my atlas. Of course, this depends on your set up, but remember – the algorithm does not scale particularly amazingly, so minimizing the number of free rectangles is crucial. I was actually going to include benchmark numbers to demonstrate the difference between having that feature and not, but I got bored waiting for the results of my benchmarking program. Suffice to say, it was many orders of magnitude slower.

I leave you with my two favourite heuristics fighting over an arrangement of annoyingly shaped objects.

Small Small

Is this enough?

I thought I’d quickly mention this – by going from an artist prepared texture to a font file, we’ve gained a lot of flexibility in terms of scaling and internationalization, but we have lost the ability to have a beautiful, custom bitmap font with whatever effects the artists wanted baked straight into our texture. If we want an outline on our fonts, we’ll have to sort that out ourselves (I may cover outlines/glows later); and making a TrueType/OpenType font is not a trivial adventure. It’s not even like we have solved typography in games – Right-to-Left reading is basically unheard of, as are combining character languages like Arabic scripts, but I suppose that is just a matter of time.

Or is it?

Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: