Adventures in Engine Construction – Cloaking the API
Preface
A common issue that engine developers face is dealing with multiple platform/rendering APIs – we have to marry (often quite different) system libraries into a single, consistent interface for the game teams, who are, after all, our clients. The last thing they need/should/want to worry about is whether they’re running on OpenGL or Direct3D9/10/11 (or iPhone/Windows/Linux, whatever). There are many solutions to this problem, with varying degrees of hackiness – most of the simple ones are based on conditional compilation, and, as a result of that, have a whole litany of issues. It is actually a very hard problem to solve without leaking the abstraction in one direction or another.
I’m going to start this by enumerating some of the methods I have seen and explaining what I believe their relative shortcomings are. I’ve come to my own conclusions about what I want to do for my own stuff, which I will cover last, and I am certainly not advocating it as a perfect solution.
The Rogue’s Gallery
To demonstrate the problem, I will be using a fairly simple example, but one that comes up all the time in rendering engines: a simple texture class. Now, most engines (at least those written in C++) will probably have some sort of hierarchy like this: a catch-all Resource type, a general Texture type, and a specific Texture2D type (as opposed to a Texture1/3D or TextureCube). The hierarchy may continue beneath that with, say, dynamic textures, or framebuffers, or something – the point is that this is not a contrived example.
Direct Instances
We’ll start out with some simple solutions, which may be perfect for some libraries. These next few examples require that the implementation and the interface classes are the same. We can start by simply setting it up like this:
    // Texture.h
    #ifdef ENGINE_API_OPENGL
    #include "TextureOpenGL.h"
    #elif defined(ENGINE_API_DIRECTX11)
    #include "TextureDirectX11.h"
    #else
    #error "No valid API!"
    #endif

    // TextureOpenGL.h
    class Texture2D : public TextureBase
    {
    protected:
        GLuint m_Name;
        //...
    };

    // TextureDirectX11.h
    class Texture2D : public TextureBase
    {
    protected:
        ID3D11Texture2D* m_Texture;
        //...
    };
It should be blindingly obvious that this sort of technique is a pain in the arse (duplicated classes everywhere), but it is a perfectly reasonable solution for some systems. I also realize that if that was any more than a contrived snippet, the GLuint in the OpenGL (comedy option) obviously wouldn’t be in the texture class; no, since every god damn thing in OGL is an int we’d be jamming that shit in the base class.
Let’s assume that you have a lot of shared code between your APIs (e.g., your textures have crazy methods for “getting the format” or other outlandish things like that). Let us also assume that you’re caching that stuff in engine land rather than palming resources off to the driver and forgetting about them. Actually, if you are writing an OpenGL implementation, that’s basically a certainty – imagine how well your average (i.e., rubbish) OpenGL driver will deal with people calling glGetTexLevelParameteriv from wherever just because some homeslice UI programmer wanted to know the width of a random image from the loading thread. For those of you who don’t have experience writing OpenGL(ES) on Android/ATI/Intel/iOS devices, … well, just be thankful for that.
This brings some very simple procedural issues with it as well – by having two sets of code, you are almost guaranteed to end up with some divergence happening. So, how can we “improve” this? Maybe we should cast our conditional compilation net a little…. narrower.
    // Texture.h
    class Texture2D : public TextureBase
    {
    public:
        // Public so that renderer-side free functions can name the storage type.
    #ifdef ENGINE_API_OPENGL
        typedef int Texture2DStorage;
    #elif defined(ENGINE_API_DIRECTX11)
        typedef ID3D11Texture2D* Texture2DStorage;
    #else
    #error "No valid API!"
    #endif

    protected:
        // if you want to really mess with people, just make this a void* and access it through macros.
        // you get all the benefits of awful debugging and an extra cache miss for the low price of fewer ifdefs!
        // I have seen it. In real code.
        Texture2DStorage m_Storage;
        //...
    };

    // Somewhere in the bowels of the renderer...
    void FrobnicateTexture(Texture2D::Texture2DStorage texture, float x, float y);
Better? Maybe. Maybe not – every special case will be surrounded by some delicious macros. It’s definitely one of those things that is YMMV, as the kids would say. In truth, I’ve only seen this in platform OS level abstractions, like threadlocks/mutexes. I guess because they tend not to have much code in them – it also leads on to what is a major restriction of this method. The API must be set in stone at compilation time, for all parts of the program. It makes sense for an OS level feature – you’re not going to compile on Windows and expect the EXE to work on Android (another inside joke there – nothing works on Android) – and equally, nobody is going to put an option to hot swap all the CriticalSections for Mutexes in your program’s config file.
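For the OS-level case, a minimal sketch of what this narrow style of conditional compilation tends to look like is below. ThreadLock, NativeLock and GuardedIncrement are names made up for illustration; the only real APIs used are the Win32 critical section calls and the POSIX pthread mutex calls. The choice of primitive is baked in at compile time, which is exactly the restriction described above.

```cpp
#if defined(_WIN32)
  #include <windows.h>
  typedef CRITICAL_SECTION NativeLock;   // Win32 critical section
#else
  #include <pthread.h>
  typedef pthread_mutex_t NativeLock;    // POSIX mutex
#endif

// Hypothetical engine-side lock: one public interface, storage chosen by the preprocessor.
class ThreadLock
{
public:
    ThreadLock()
    {
#if defined(_WIN32)
        InitializeCriticalSection(&m_Lock);
#else
        pthread_mutex_init(&m_Lock, nullptr);
#endif
    }

    ~ThreadLock()
    {
#if defined(_WIN32)
        DeleteCriticalSection(&m_Lock);
#else
        pthread_mutex_destroy(&m_Lock);
#endif
    }

    void Lock()
    {
#if defined(_WIN32)
        EnterCriticalSection(&m_Lock);
#else
        pthread_mutex_lock(&m_Lock);
#endif
    }

    void Unlock()
    {
#if defined(_WIN32)
        LeaveCriticalSection(&m_Lock);
#else
        pthread_mutex_unlock(&m_Lock);
#endif
    }

    ThreadLock(const ThreadLock&) = delete;             // native locks are not copyable
    ThreadLock& operator=(const ThreadLock&) = delete;

private:
    NativeLock m_Lock;
};

// Example caller: bumps a counter while holding the lock and returns the new value.
int GuardedIncrement(ThreadLock& lock, int& counter)
{
    lock.Lock();
    const int value = ++counter;
    lock.Unlock();
    return value;
}
```

The class is small enough that the ifdefs stay tolerable, which is probably why this pattern survives in platform layers and not in renderers.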
Indirect Instances
Let us assume that you are making a more flexible engine, one that can select a renderer on start up, or something that, say, has the API specific code in another module. Now we have a much more difficult problem, because we’ve suddenly decoupled our contractual interface to the game from the implementation. Naively, we might set up something like this:
    // Texture.h
    class Texture2D : public TextureBase
    {
    public:
        // Inherited from TextureBase
        virtual int GetHeight() = 0;
        virtual int GetWidth() = 0;
        virtual bool Is2D() { return true; }

        // Inherited from Resource
        virtual void Release() = 0;
    };

    // Renderer API style:
    void FrobnicateTexture(Texture2D* texture, float x, float y);
No problems, you think. “I’ll just have a factory method that returns my (e.g.,) D3D11Texture2D when someone makes a texture! It will implement all those methods and we can all be happy ever after!” So far, so good. But then something happens; the voice of doubt begins to stir…
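As a rough sketch of that factory idea, assuming a runtime switch between two back ends: FakeD3D11Texture2D, FakeGLTexture2D, RendererApi and CreateTexture2D are all hypothetical stand-ins (a real back end would own an ID3D11Texture2D* or a GL name), and ownership is handled with std::unique_ptr and a virtual destructor here instead of the Release() method from the interface above.

```cpp
#include <memory>

// Game-facing interface, trimmed down from the pure-virtual Texture2D above.
class Texture2D
{
public:
    virtual ~Texture2D() {}
    virtual int GetWidth() const = 0;
    virtual int GetHeight() const = 0;
};

// Hypothetical D3D11-backed implementation (no real D3D calls in this sketch).
class FakeD3D11Texture2D : public Texture2D
{
public:
    FakeD3D11Texture2D(int width, int height) : m_Width(width), m_Height(height) {}
    int GetWidth() const override { return m_Width; }
    int GetHeight() const override { return m_Height; }
private:
    int m_Width;
    int m_Height;
};

// Hypothetical OpenGL-backed implementation.
class FakeGLTexture2D : public Texture2D
{
public:
    FakeGLTexture2D(int width, int height) : m_Width(width), m_Height(height) {}
    int GetWidth() const override { return m_Width; }
    int GetHeight() const override { return m_Height; }
private:
    int m_Width;
    int m_Height;
};

enum class RendererApi { Direct3D11, OpenGL };

// The factory picks the concrete type at run time; the game only ever sees Texture2D.
std::unique_ptr<Texture2D> CreateTexture2D(RendererApi api, int width, int height)
{
    switch (api)
    {
    case RendererApi::Direct3D11:
        return std::unique_ptr<Texture2D>(new FakeD3D11Texture2D(width, height));
    case RendererApi::OpenGL:
        return std::unique_ptr<Texture2D>(new FakeGLTexture2D(width, height));
    }
    return nullptr;   // unknown API value
}
```

Note that the two fake back ends already duplicate their width and height bookkeeping, which is the itch the rest of this section tries to scratch.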
Hey, Timmy (if that is your real name), we could totally write this more efficiently if we had a shared base class for textures, because I am so tired of copying all this stuff around and I’m pretty sure that doing that is suboptimal. PS I still haven’t forgiven you for hiding all that shit behind void pointers before.
You dutifully make a new file in your lovely, isolated, not-at-all leaky renderer files, and start typing:
    // D3D11Resource.h
    class D3D11Resource
    {
    protected:
        TotallySharedVariables m_Shared;
    };
“Now, I just need to make my texture type inherit from it!”
    // D3D11Texture2D.h
    class D3D11Texture2D : public Texture2D
    {
    protected:
        TotallySharedVariables m_Shared;
        ID3D11Texture2D* m_Texture;

    public:
        virtual int GetHeight();
        virtual int GetWidth();
        virtual void Release();
    };
Oh. Looks like you already are inheriting from your interface there, champ. Now, at least one of you readers will probably be thinking “yes! Another chance for my favourite language feature – multiple inheritance – to save the day! I love inheritance so much!” – we’ll get back to you later, but let’s keep looking at just how messed up this problem gets. In the case above, we have a simple inheritance; D3D11Texture2D implements Texture2D, but we want it to inherit from D3D11Resource as well for ~reasons~. What happens when you continue down this path (and this will happen as your inheritance hierarchy grows), and end up with your ideal structure being this:
Okay, well now you’re really in trouble. That isn’t an inheritance hierarchy, that’s a disaster waiting to happen. Let’s see how we can fix this.
C++ Multiple Inheritance
You’d think that a language as endemic as C++ would have a sensible solution to this built in. I mean, half those classes are just interfaces, they’re not even real classes (maybe)! If this was C#, or another language that was designed by people who were a) awake or b) competent, this whole post would end here. Of course you should use multiple inheritance, that’s what it’s for. Unfortunately, you can’t. Well, you can, but you have two options available: make it comically broken, or make it horribly slow. Neither option is what you probably would be aiming for; and if it is, I suggest you try setting your sights a little higher.
Hopefully all of you know that if you simply did it like this:
    // API headers
    class Resource {};
    class TextureBase : public Resource {};
    class Texture2D : public TextureBase {};

    // Implementation headers
    class D3D11Resource : public Resource {};
    class D3D11TextureBase : public D3D11Resource, public TextureBase {};
    class D3D11Texture2D : public D3D11TextureBase, public Texture2D {};
Then you will end up with a generous 3 copies of Resource in your texture. And they will be there – even if it has no data members in it, as each class comes to the party with a vtable. Also, add one vtable for the main implementation hierarchy, as well as a smattering of scope operators to resolve compile time ambiguous function calls. As an extra bonus, if you foolishly put some data in your API class (not completely unheard of), you’ll get three different copies of it. Depending on where you are in the class hierarchy, you get to see different values too (D3D11Resource sees copy #1, D3D11TextureBase copy #2, etc). You can “work around” this by qualifying any duplicated field access with a scope operator, but all this is just icing on the cake of failure.
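If you would rather count the copies than take my word for it, here is a small sketch that mirrors the hierarchy above with empty structs. The global counter and the two helper functions are illustrative additions, not part of any engine.

```cpp
// Counts how many Resource subobjects get constructed.
static int g_ResourceSubobjects = 0;

struct Resource
{
    Resource() { ++g_ResourceSubobjects; }
    virtual ~Resource() {}
};

// API side
struct TextureBase : Resource {};
struct Texture2D : TextureBase {};

// Implementation side, using plain (non-virtual) multiple inheritance
struct D3D11Resource : Resource {};
struct D3D11TextureBase : D3D11Resource, TextureBase {};
struct D3D11Texture2D : D3D11TextureBase, Texture2D {};

// Builds one texture and reports how many Resource constructors ran.
int CountResourceSubobjects()
{
    g_ResourceSubobjects = 0;
    D3D11Texture2D texture;
    (void)texture;
    return g_ResourceSubobjects;
}

// A direct D3D11Texture2D* to Resource* conversion is ambiguous, so each path
// has to be spelled out, and the two paths land on different subobjects.
bool ResourceViewsDiffer()
{
    D3D11Texture2D texture;
    Resource* viaImplementation = static_cast<D3D11Resource*>(&texture);
    Resource* viaInterface = static_cast<Texture2D*>(&texture);
    return viaImplementation != viaInterface;
}
```

One Resource arrives through D3D11Resource, one through the TextureBase inside D3D11TextureBase, and one through Texture2D, so CountResourceSubobjects() comes back with 3.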
Okay, so that covers “broken”, what about “slow”? C++ didn’t get this far without a litany of poorly conceived bandaids, and let me tell you, Bjarne “how about we call it ‘static'” Stroustrup has one itching to go right now! If we simply reuse the keyword virtual in front of our class inheritance declarations, like so:
    // API headers
    class Resource {};
    class TextureBase : virtual public Resource {};
    class Texture2D : virtual public TextureBase {};

    // Implementation headers
    class D3D11Resource : virtual public Resource {};
    class D3D11TextureBase : virtual public D3D11Resource, virtual public TextureBase {};
    class D3D11Texture2D : virtual public D3D11TextureBase, virtual public Texture2D {};
Also, you need all those virtuals, no skimping (or you’ll get compile errors for ambiguous function lookups)! This “works” and “folds in” the classes in the same way that one could “fold in” a toddler into a tyrannosaurus. Let’s just imagine a setup where D3D11TextureBase has a single int (I am regretting not using OpenGL as my example API), and D3D11Texture2D has two. The memory layout of an instantiated D3D11Texture2D looks like this:
That sucker is 56 bytes, too (104 in x64 mode) – remember, it’s just 3 ints and the support stuff for a not-crazy inheritance structure. But wait, it doesn’t stop there! Each function call, and potentially each variable access, has to go through a fixup function to get the real address. It changes a virtual call from one extra indirection to an eye watering four. Not to mention the general amazingness that is the drifting this pointer, the fact you have to use dynamic_cast for everything (a topic for another post), and the ever amusing gigantor member function pointer size.
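A small sketch of the same setup (one int in D3D11TextureBase, two in D3D11Texture2D) shows both halves of the trade. The exact sizes depend on the compiler and ABI, so the code only checks that the object is bigger than its three ints; PlainPayload, DownCast and the counter are names invented for this example.

```cpp
// Counts how many Resource subobjects get constructed.
static int g_ResourceSubobjects = 0;

struct Resource
{
    Resource() { ++g_ResourceSubobjects; }
    virtual ~Resource() {}
};

// API side
struct TextureBase : virtual Resource {};
struct Texture2D : virtual TextureBase {};

// Implementation side
struct D3D11Resource : virtual Resource {};
struct D3D11TextureBase : virtual D3D11Resource, virtual TextureBase
{
    int m_A = 0;
};
struct D3D11Texture2D : virtual D3D11TextureBase, virtual Texture2D
{
    int m_B = 0;
    int m_C = 0;
};

// The actual payload: three ints and nothing else.
struct PlainPayload
{
    int a, b, c;
};

// With virtual bases everywhere, only one Resource subobject is constructed.
int CountResourceSubobjects()
{
    g_ResourceSubobjects = 0;
    D3D11Texture2D texture;
    (void)texture;
    return g_ResourceSubobjects;
}

// static_cast from a virtual base down to a derived class is ill-formed,
// so getting back from Resource* needs a dynamic_cast.
D3D11Texture2D* DownCast(Resource* resource)
{
    return dynamic_cast<D3D11Texture2D*>(resource);
}
```

The duplication is gone, and in exchange every conversion and base access goes through the virtual base machinery, with sizeof(D3D11Texture2D) comfortably larger than sizeof(PlainPayload).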
Now what?
Hopefully I have demonstrated that C++ isn’t going to come in and save you in this case. So what do real engines do? Essentially, there are three options. One – you say “this is too difficult”, and remove the methods from your public classes. With no methods, they can be treated as opaque pointers. Instead of doing:
    Texture2D* tex;
    // ...
    int width = tex->GetWidth();
You can do this:
    Texture2D* tex;
    // ...
    int width = Renderer->GetWidth(tex);
It totally works because the renderer is behind the platform abstraction. It knows that all the textures are, e.g., D3D11Textures, so it just casts them and uses the methods on those types. Now you only have a bunch of dummy classes and a single inheritance chain. Problem solved.
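A minimal sketch of this first option, assuming a single compiled-in back end: the public types are empty dummies, and a hypothetical Renderer class does the casting. The class bodies and the CreateTexture2D/Release helpers are invented for illustration, and D3D11Texture2D here is a stand-in with no real D3D calls.

```cpp
// Game-facing side: method-free dummy types in a single inheritance chain.
class Resource {};
class TextureBase : public Resource {};
class Texture2D : public TextureBase {};

// Renderer side: stand-in for the real D3D11 texture.
class D3D11Texture2D : public Texture2D
{
public:
    D3D11Texture2D(int width, int height) : m_Width(width), m_Height(height) {}
    int GetWidth() const { return m_Width; }
    int GetHeight() const { return m_Height; }
private:
    int m_Width;
    int m_Height;
};

// The renderer knows every Texture2D it handed out is really a D3D11Texture2D,
// so it can static_cast without checking.
class Renderer
{
public:
    Texture2D* CreateTexture2D(int width, int height)
    {
        return new D3D11Texture2D(width, height);
    }

    int GetWidth(const Texture2D* texture) const
    {
        return static_cast<const D3D11Texture2D*>(texture)->GetWidth();
    }

    int GetHeight(const Texture2D* texture) const
    {
        return static_cast<const D3D11Texture2D*>(texture)->GetHeight();
    }

    void Release(Texture2D* texture)
    {
        // Delete through the concrete type; the dummy bases have no virtual destructor.
        delete static_cast<D3D11Texture2D*>(texture);
    }
};
```

The cast is only safe because nothing else is allowed to create a Texture2D, which is the price of the approach: the game loses member functions and has to route everything through the renderer.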
Or, you could take a different approach – simply split the whole thing apart and either store some kind of lookup* to the type or a direct pointer to it, and hide the ugliness of not having direct member functions by doing that in the engine code. It’s the same as the previous approach, just more (from the game’s side, at least) of an object oriented approach.
* Don’t do this, it’s a terrible idea.
Plan B C Q: The Best of All Worlds?
Frankly, the problem should be pretty simple – we have two distinct inheritance hierarchies; where one, the interface layer, just happens to be the communication layer between the game and the underlying system. So what can we do? Can we just jam the classes next to each other? You bet we can.
    template <typename C>
    C* InterfaceToConcrete(const void* v)
    {
        return (C*)((char*)v + sizeof(Resource));
    }

    template <typename I>
    I* ConcreteToInterface(const void* v)
    {
        return (I*)((char*)v - sizeof(Resource));
    }

    template <class Interface, class Concrete>
    class InterfaceWrapper : public Interface
    {
    protected:
        typedef Concrete ConcreteType;
        Concrete m_Concrete;
    };

    template <class Implementation>
    class Texture2DBridge : public InterfaceWrapper<Texture2D, Implementation>
    {
    public:
        // Obviously these could just be m_Concrete.xxx, but I want to make it explicit.
        // (Implementation is used here because names from the dependent base are not visible unqualified.)
        virtual int GetWidth() { return InterfaceToConcrete<Implementation>(this)->GetWidth(); }
        virtual int GetHeight() { return InterfaceToConcrete<Implementation>(this)->GetHeight(); }
    };

    // In your platform specific code:
    typedef Texture2DBridge<D3D11Texture2D> WrappedTexture2D;
So where does that get us? If we provide the game with objects of type WrappedTexture2D from our factory types, we get the following benefits:
- From the game’s side, all the interface classes are simple inheritance structures – they can be static_cast to each other without issue, and we have a straightforward result where the factory returns a Texture2D which is a TextureBase, which is a Resource.
- In the renderer, we can take any arbitrary interface type and convert it to a concrete type. This concrete type pointer can then be used as if it were unrelated to the interface shenanigans – the D3D11Texture2D is a D3D11Resource, etc. Again, because it’s simple inheritance, we don’t have any kind of drifting pointers to deal with.
- If we have a further restriction that the interface classes cannot contain data (only code), we can also go from a concrete type back to an interface type using the ConcreteToInterface helper, which means we can keep all our renderer code using the concrete types, and even if we have to return results to the game, we can return it in a safe interface format.
- The helper methods are trivial and will be optimized out by the compiler, so we only have to pay the cost of an extra two pointer lookups when calling via the interface methods (the second vtable load and the second dispatch). This is a pretty trivial price, but as we can safely store references to the concrete forms, we can avoid that cost on the platform specific side of the divide. Also, as the concrete type is stored immediately after the interface (which may be just a vtable pointer), it’ll already be in a cache line and ready to go.
- This type is completely understandable by debuggers and works without issue. If you’re on the concrete side of the divide, you won’t be able to see the interface data, but that shouldn’t be a big problem given the execution context at that point.
- No back pointers are required.
The only drawback is the requirement to re-wrap every interface method for each concrete type, but that’s mindless work that can be done easily. We can even use std::forward and variadic templates to make multi-argument constructors a trivial addition to the bridge types. Also, if we’re willing to put our shared interface data on the end of our bridge class, we can ensure the “hot” data – the concrete class – comes first to improve cache performance.
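To make the constructor remark concrete, here is a sketch of the bridge with perfect-forwarding constructors. It is a variation on the idea, not a drop-in for the code above: the interface methods call m_Concrete directly, and the ToConcrete helper (a made-up name) uses a static_cast to the bridge type in place of the sizeof(Resource) offset arithmetic, so the example does not depend on object layout. The D3D11 classes, the debug name member and GetConcrete are hypothetical stand-ins.

```cpp
#include <utility>

// Interface side: code only, no data members.
class Resource
{
public:
    virtual ~Resource() {}
};
class TextureBase : public Resource
{
public:
    virtual int GetWidth() = 0;
    virtual int GetHeight() = 0;
};
class Texture2D : public TextureBase {};

// Concrete side: its own single-inheritance chain, unaware of the interfaces.
class D3D11Resource
{
public:
    explicit D3D11Resource(const char* debugName) : m_DebugName(debugName) {}
    const char* GetDebugName() const { return m_DebugName; }
private:
    const char* m_DebugName;
};
class D3D11Texture2D : public D3D11Resource
{
public:
    D3D11Texture2D(const char* debugName, int width, int height)
        : D3D11Resource(debugName), m_Width(width), m_Height(height) {}
    int GetWidth() const { return m_Width; }
    int GetHeight() const { return m_Height; }
private:
    int m_Width;
    int m_Height;
};

template <class Interface, class Concrete>
class InterfaceWrapper : public Interface
{
public:
    // Forward whatever arguments we are given straight to the concrete object.
    template <typename... Args>
    explicit InterfaceWrapper(Args&&... args) : m_Concrete(std::forward<Args>(args)...) {}
protected:
    Concrete m_Concrete;
};

template <class Implementation>
class Texture2DBridge : public InterfaceWrapper<Texture2D, Implementation>
{
    typedef InterfaceWrapper<Texture2D, Implementation> Base;
public:
    template <typename... Args>
    explicit Texture2DBridge(Args&&... args) : Base(std::forward<Args>(args)...) {}

    int GetWidth() override { return this->m_Concrete.GetWidth(); }
    int GetHeight() override { return this->m_Concrete.GetHeight(); }

    Implementation* GetConcrete() { return &this->m_Concrete; }
};

// In the platform specific code:
typedef Texture2DBridge<D3D11Texture2D> WrappedTexture2D;

// Renderer-side helper: valid only because every Texture2D the factory hands out is a WrappedTexture2D.
inline D3D11Texture2D* ToConcrete(Texture2D* texture)
{
    return static_cast<WrappedTexture2D*>(texture)->GetConcrete();
}
```

With this in place, WrappedTexture2D tex("albedo", 256, 128) builds the concrete texture in one go, the game talks to it as a Texture2D*, and the renderer recovers a D3D11Texture2D* that is also a plain D3D11Resource*.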
Conclusions
As I said before, this isn’t necessarily the most clever way of doing this – nor do I advocate it as the one true solution, but it is a reasonably good way of keeping the interface and the underlying systems separate. Anything that reduces the leaking of abstractions is a good thing in my opinion, and this solution requires minimal extra code, allows you to write sensible, shared methods on both the interface and the concrete side of the divide, and performs well. I also hasten to add that this is not some novel idea – many different libraries use similar constructions, or even more exotic ways around inheritance issues (such as COM), although I haven’t seen this particular solution used frequently in game engines.
Update
Some of the people who read this have suggested that my example is not realistic and that such things could be solved using composition. They can, it’s true. However, I prefer inheritance over composition as it is a more appealing concept to me. I find it easier to leverage it as a methodology than other approaches, and this is a small hack in the grand scheme of engine development – well worth the effort in my opinion. I would also strongly argue against any claim that this is unrealistic. These types may very well not share data, but they may have nontrivial code sharing which would require additional auxiliary types to duplicate. I would also posit that this solution continues to scale well – when you inherit from D3D11Texture2D to build D3D11DynamicTexture2D, you will certainly see immediate benefits from an approach like this.