State Batching and Instanced Models

First of all, everyone, sorry about the late post. Long story short, I’ve been trying to set up a Kickstarter campaign for Species to pay the people working with me, cover the website costs, and fund further expansion of the game. That failed miserably a week or two back due to Amazon Payments not supporting non-US applicants (AT. ALL. Trust me on this: if the name on the account is not a US-based individual, it doesn’t matter how many hoops you jump through, they’ll still turn you down), so my motivation for blogging was somewhat sapped.

I’ve since moved to IndieGoGo, who aren’t as popular but support international projects (and who deserve a bit more love), and am gradually getting my motivation back.

Don’t worry, the campaign shouldn’t interfere with my work on 0.5.0. If anything, it’ll result in some functionality earlier: I’m working on a super secret version of 0.4.1 with fancy new prototype features as an IndieGoGo exclusive. The campaign should be going live sometime in November.

Alright, on to the post.

For all that the Species graphics engine doesn’t look like much, it’s actually quite complex. This is because, even though it’s not performing any fancy tricks like real-time shadows or normal mapping (both things I’ve implemented in other engines), it has to send a lot more information to the graphics card than most games.

For example, consider what a generic FPS might have on screen. The sky and terrain, a few types of trees and environmentals, and maybe three types of enemies underneath the wrapping of browny bloomy violence.

Also a destroyed city at least 60% of the time

Now compare that to Species. The sky and terrain, a few types of trees and environmentals… and then hundreds of unique types and shapes of torsos, limbs, heads, tails, necks and feet.

Note that I’m specifying *types* of object, not quantity. That’s because there are actually quite a few well-known tricks for drawing lots and lots of similar models to the screen. That’s how Crysis (the good one) can render more vegetation per square metre than exists in 99% of western civilisation.

I’ll take this opportunity to discuss two of those tricks, both implemented in Species to some degree. I’ve already discussed billboards, so this will be about 3d models.

Firstly, it’s important to understand how the CPU tells the GPU to render a model. It’s not a matter of just saying “Draw Model”: there’s a lot of ancillary information and overhead that passes between the two processors. If it helps, think of it as painting a picture: you don’t just start painting, you first need to set up the easel, get out your paints, etc.

//Model 1               
[Specify shader]               [Set up easel]
[Send Model 1 Vertex List]     [Decide what to paint]
[Send Model 1 Index List]      [Work out what colours you'll need]
[Send Textures]                [Get out your colours]
[Set Shader Parameters]        [Put canvas on easel]
[Draw]                         [Paint]

//Model 2
[Specify shader]               [Set up easel]
[Send Model 2 Vertex List]     [Decide what to paint]
[Send Model 2 Index List]      [Work out what colours you'll need]
[Send Textures]                [Get out your colours]
[Set Shader Parameters]        [Put canvas on easel]
[Draw]                         [Paint]

//Model 3
[Specify shader]               [Set up easel]
[Send Model 3 Vertex List]     [Decide what to paint]
[Send Model 3 Index List]      [Work out what colours you'll need]
[Send Textures]                [Get out your colours]
[Set Shader Parameters]        [Put canvas on easel]
[Draw]                         [Paint]

As you can see, this is quite an expensive routine. All of these processes add to the amount of time it takes to draw each of these models. But as I said, there are tricks to make it cheaper. The first, and simplest to implement, is State Batching.

State Batching simply involves working out which things you only need to send to the GPU once. Let us suppose that in the above example all of the models are identical. In that case, it would be possible to rearrange things to set the common features (the shader, vertex list, index list and textures) only once:

//Starting pass
[Specify Shader]               [Set up easel]
[Send Model Vertex List]       [Decide what to paint]
[Send Model Index List]        [Work out what colours you'll need]
[Send Textures]                [Get out your colours]
                    
//Model 1
[Set Shader Parameters]        [Put canvas on easel]
[Draw]                         [Paint]

//Model 2
[Set Shader Parameters]        [Change canvas]
[Draw]                         [Paint]

//Model 3
[Set Shader Parameters]        [Change canvas]
[Draw]                         [Paint]

Since the position, rotation and scale can be sent as Shader Parameters, this can be used to render hundreds of similar objects more cheaply.
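To put rough numbers on the saving, here's a toy Python sketch of the two listings above. It is not engine code: `FakeGPU`, `draw_naive` and `draw_batched` are made-up names, and nothing is actually rendered. It just counts how many commands cross from the CPU to the GPU for 100 identical models.

```python
# Toy model of the CPU-to-GPU command stream. FakeGPU and its methods are
# hypothetical; they only count commands, nothing is drawn.
from collections import Counter


class FakeGPU:
    def __init__(self):
        self.commands = Counter()

    def send(self, command):
        self.commands[command] += 1

    def total(self):
        return sum(self.commands.values())


# The state that is identical for every copy of the model.
SHARED_STATE = ["shader", "vertex_list", "index_list", "textures"]


def draw_naive(gpu, transforms):
    # Every model re-sends everything, as in the first listing.
    for _transform in transforms:
        for state in SHARED_STATE:
            gpu.send(state)
        gpu.send("shader_parameters")  # position, rotation, scale
        gpu.send("draw")


def draw_batched(gpu, transforms):
    # Shared state goes across once; only the parameters change per model.
    for state in SHARED_STATE:
        gpu.send(state)
    for _transform in transforms:
        gpu.send("shader_parameters")
        gpu.send("draw")


transforms = [(float(x), 0.0, 0.0) for x in range(100)]  # 100 identical models

naive, batched = FakeGPU(), FakeGPU()
draw_naive(naive, transforms)
draw_batched(batched, transforms)
print(naive.total(), batched.total())
```

In this toy count the naive path issues six commands per model and the batched path two per model plus four up front. Note that the number of draw commands is the same in both; only the setup around them shrinks.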

The torso rendering in Species uses a similar system: sending the torso model only once, but specifying colour, width and height values as shader parameters. It’s not quite as simple as this example (torso models can have a variety of textures, so the torsos need to be sorted by texture to avoid resending the textures every time), but it makes rendering lots of them much cheaper.
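The effect of sorting by texture is easy to see in a small Python sketch. This is an illustration under assumptions, not the actual torso renderer: the tuple layout (texture, colour, width, height), the texture names and `count_texture_binds` are all invented for the example.

```python
from operator import itemgetter


def count_texture_binds(torsos):
    """Count how often the texture must be re-sent when drawing in this order."""
    binds = 0
    bound = None  # texture currently set on the (imaginary) GPU
    for texture, _colour, _width, _height in torsos:
        if texture != bound:
            binds += 1
            bound = texture
        # colour, width and height would go across as shader parameters here
    return binds


# Nine torsos whose textures alternate: the worst case for an unsorted draw.
textures = ["fur", "scales", "skin"]
torsos = [(textures[i % 3], (255, 200, 150), 1.0 + i * 0.1, 2.0) for i in range(9)]

unsorted_binds = count_texture_binds(torsos)
sorted_binds = count_texture_binds(sorted(torsos, key=itemgetter(0)))
print(unsorted_binds, sorted_binds)
```

Drawn in their original order, every torso forces a texture change; sorted by texture, each texture is sent once for its whole group.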

But this system still calls the Draw() method for every single model. What if you want to go cheaper still, for thousands of trees? How can you draw all these models for the cost of a single Draw() method?

//Instanced Models
[Send ALL OF THE THINGS]       [Umm...]
[Draw ALL OF THE THINGS]       [Yeah, I got nothing]

Instanced models defy the analogy for a variety of reasons, but they essentially boil down to sending all the data at once, and drawing all the objects as if they were a single model. Without any of the overhead associated with more than 1 model, they draw much, much quicker.

How they do this depends on the type of instancing you use, which I’m not going to get into: suffice it to say, older machines do instancing differently from newer machines, and consoles are different again. What they all have in common is sending all the data required to draw all the models to the GPU in one hit.
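Setting aside how any particular API does it, the general shape of the idea can be sketched in Python: pack the per-instance data for every tree into one contiguous buffer, then issue one draw. `FakeGPU`, `draw_instanced` and the four-float record layout are invented for illustration; real instancing APIs differ in the details.

```python
import struct

# Assumed per-instance record: x, y, z, scale as little-endian 32-bit floats.
INSTANCE_FORMAT = "<4f"
INSTANCE_STRIDE = struct.calcsize(INSTANCE_FORMAT)  # bytes per instance


class FakeGPU:
    """Hypothetical stand-in that only counts what it is asked to draw."""

    def __init__(self):
        self.draw_calls = 0
        self.instances_drawn = 0

    def draw_instanced(self, mesh, instance_buffer):
        # One call covers every record in the buffer.
        self.draw_calls += 1
        self.instances_drawn += len(instance_buffer) // INSTANCE_STRIDE


def pack_instances(instances):
    # Concatenate every instance's record into a single byte string.
    return b"".join(struct.pack(INSTANCE_FORMAT, *inst) for inst in instances)


# 1000 trees on a 50-wide grid, all sharing the same mesh.
trees = [(float(i % 50), 0.0, float(i // 50), 1.0) for i in range(1000)]
instance_buffer = pack_instances(trees)

gpu = FakeGPU()
gpu.draw_instanced("tree_mesh", instance_buffer)
print(gpu.draw_calls, gpu.instances_drawn, len(instance_buffer))
```

The point of the sketch is the ratio: a thousand trees, one buffer upload, one draw call.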

But the natural restriction on this is that all the models have to be the same. I am able to draw lots of identical trees this way, but not limbs because they’re shaped and animated on an individual basis.

So naturally, given that the trees need to be identical to use this system, I decided to give the 3d trees in Species 0.5.0 unique heritable features. Thankfully I’ll be handling those with colours, which the system can handle with some modification, so it’s still quite viable.
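One plausible way to carry a colour per tree (an assumption; the engine's actual modification isn't described here) is to widen each instance's record so that it holds a colour alongside the position. The mesh stays identical, but every tree gets its own tint. The record layout and function names below are hypothetical.

```python
import struct

# Assumed per-instance record: x, y, z as floats, then r, g, b, a as bytes.
INSTANCE_FORMAT = "<3f4B"


def pack_trees(trees):
    # Each tree is (position, colour); all records go into one buffer.
    return b"".join(
        struct.pack(INSTANCE_FORMAT, *position, *colour)
        for position, colour in trees
    )


def unpack_colours(instance_buffer):
    # Read the colours back out, as a shader would for each instance.
    return [
        record[3:]
        for record in struct.iter_unpack(INSTANCE_FORMAT, instance_buffer)
    ]


trees = [
    ((0.0, 0.0, 0.0), (40, 160, 60, 255)),   # dark green
    ((5.0, 0.0, 2.5), (90, 140, 30, 255)),   # olive
    ((9.0, 0.0, 7.0), (200, 120, 40, 255)),  # autumn orange
]

instance_buffer = pack_trees(trees)
colours = unpack_colours(instance_buffer)
print(colours)
```

Because the colour travels with the instance data rather than as a separate shader parameter, the trees can still be drawn together in one batch.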

I’m still working with placeholder meshes: the final result will be prettier than this.

Of course, with the heritable features working, I decided to push my luck a bit further, by giving the trees consumable foliage. Rather than the 3d trees shrinking when creatures eat them, in 0.5.0 the creatures will eat away at the foliage instead, leaving a stem or trunk behind. I’ve yet to see how that turns out, but in theory at least it should be quite possible.

I don’t really have a snappy conclusion to this post, so here’s a mystery picture:

What.

“What is that I don’t even…”

  1. #1 by Adam Benton on October 10, 2012 - 8:18 pm

    Is that a jab at Crysis 2 I see? There aren’t enough of them, imo.
