I learned something today: Out of Memory errors are not actually caused by being out of memory! What a silly I am for expecting error messages to make sense.
So, for everyone preparing to set your RAM on fire, stomp on it and throw it out the window before burying it in a deep dark hole and pouring concrete on top: don’t bother! You RAM is fine. Assuming you haven’t already set it on fire. If you have, your RAM is not fine. And you might want to put that out.
No if you’re out of RAM you’ll get performance issues caused by ‘thrashing’ (when a computer starts relying on the Hard Drive instead of RAM), but you won’t get Out Of Memory exceptions. So, what causes OoM exceptions?
32-bit Windows causes OoM exceptions.
Screw 32-bit Windows.
See, when Microsoft says “you are out of memory”, they actually mean “you are out of consecutive address space”.
You now have several options (it’s like a game!):
– For a “proper” explanation, go read the article I read that taught me all this: http://blogs.msdn.com/b/ericlippert/archive/2009/06/08/out-of-memory-does-not-refer-to-physical-memory.aspx
– For an “improper” explanation, go ahead and use your imagination (I’m sure it’s way more kinky than anything I can come up with).
– For an “incredibly-simplified and probably-wrong” explanation, continue reading…
When an object like a Creature or a Tree is assigned, it gets put in memory and Windows assigns it an address. And by ‘put in memory’, I don’t mean RAM. Forget RAM. RAM is a glorified performance optimisation. The memory we’re talking about here is your hard drive. And as you’re probably aware, your hard drive is huge. You’re not in any danger of running out of that unless you’ve packed the drive to it’s very limit.
Okay, so you’re not in danger of running out of Hard Drive space, and lack of RAM doesn’t cause exceptions, just performance issues. So the next place to look is at that address the object was assigned by Windows when it was put in memory.
This address is where windows get’s it’s 32/64 bit designation. The absurd oversimplification I’m about to make will likely make any computer-science types reading this swallow their hats, but it’s basically a number stating that object X can be found Y bits into the memory stream.
The cause of Out Of Memory exceptions is that “Y-bits into the memory stream”. See, even though the process has the entire hard drive to work with, a 32-bit address format can only store addresses up to ~4GB (and half of that is reserved by the operating system for other things). This is why 32-bit editions of Windows can’t make use of more than 2GB ram: the 32-bit format doesn’t have enough address space to map more than 2GB.
Okay, but surely that’s basically the same as a 2GB memory limit for the purposes of Out Of Memory exceptions?
Nope. See, to assign and map a new object, there needs to be an empty “hole” in address space for it to go into. If there isn’t a hole large enough, an Out Of Memory exception will be thrown. (finally!)
I’ll explain via metaphor. Imagine putting three baskets into your car, then filling in the area in between them with smaller items. Then you realise that you need a large esky instead, so you take out the baskets and try to put the esky in. Even though the three baskets combined took up more space than the esky, and even though they’ve been removed, the esky still won’t fit because the smaller infill items are blocking it.
That’s the problem with address space: if you fragment it with smaller items in between the big ones, the next time you need to allocate something big it’ll throw an Out Of Memory exception, even if you freed up the big ones and have plenty of cumulative space to use. *That* is what is causing the memory issues in Species.
I’m sure a decent Computer Science course would have covered all this but I’m self taught, so I get to learn by the slightly more direct route of sticking my hand in the fire and finding out what happens.
So, on to the usual question: how do we fix this?
Well, the obvious solution is “compile to 64-bit.” A 64-bit address space can map about 4 billion times as much memory (no, literally), so chances are we wouldn’t be seeing fragmentation errors any time soon. Of course, it also means people using 32-bit windows wouldn’t be able to play the game.
I’d rather solve the error at the source and retain backwards compatibility, but “solve” is a relative term. Memory fragmentation on a 32 bit machine isn’t something that can be stopped. But it can be significantly reduced, and I’d prefer to do that for the sake of memory management *before* dropping the metaphorical world-killer asteroid that is a 64-bit compilation.
So, on to the source. What, ultimately, causes memory errors in Species?
I’d done memory profiling in the past, but I hadn’t properly understood what it was telling me until now. They indicated I had two main memory draws: grass vertices, and tree instances. But these two memory hogs always appeared in different graphs: tree instances would appear in allocation, grass vertices in snapshots.
With what I know now, this makes sense. Grass vertices are created when the terrain is built, and held from then on out. This means that the grasses memory cost at any moment in time is huge: after all, it’s storing the position of every potential billboard vertex on the map. Imagine the map completely covered in the dense savanna grass: that’s what the grass memory cost is.
But relative to the hard drive, or even the RAM, it’s not that big: I think somewhere around the 200MB mark? Can’t remember exactly, but regardless, it’s also generally a static cost. The grass grabs up all that memory when the terrain is built and doesn’t release it until the terrain is destroyed. You might be able to force an OoM exception by repeatedly loading/generating worlds, and fragmenting the memory that way. Fixing it isn’t urgent, but probably worth doing: need to ensure the memory is allocated at startup, rather than on terrain-build.
The second cost is the Tree Instances array. This one confused me because it was barely noticable on the moment-to-moment memory, but was invariably a massive cost in terms of amount allocated. This was because the Tree Instances array was designed to grow organically with the vegetation: if the number of trees grew too large for the array, it would allocate a new array with an extra 500 slots for trees and copy the data to that (well okay it actually uses Array.Resize(), but that’s what Array.Resize() does internally).
Of course, frequently allocating a massive new tree array is a great method for fragmenting your address space, which is why large (>2) worlds were invariably hitting an OoM exception during the vegetation generation at startup. Since I didn’t realise this was a problem, the amount allocated in the memory profiler didn’t mean anything to me.
I’ve already applied a few simple optimisations to this method, which makes generating size-3 worlds possible again, but fixing it will be another matter entirely. Ultimately, the vegetation should probably be capped to a set level, but this is tricky because the array is fed directly into the graphics card when drawing the trees. This means that those 500 “excess” trees still incur a GPU cost, even though they’re invisible. In a size 3 world, this cost is enough to cap FPS to 12, even when there are no creatures.
To fix this one I’m going to need to adapt the InstancedModel DrawInstances() method to take the total number of trees as a parameter, and feed that to an overload of the DynamicVertexBuffer. This should allow me to submit the entire oversized array, but then lower the number of instances that are actually drawn each frame.
(Edit) Well that was surprisingly easy. A heap of address space is now set aside for trees, but only the actual trees themselves are sent to the graphics card. Assuming XNA isn’t doing something horrible with them beneath the hood, that should take care of most of the OoM exceptions in the game. Stability improvements ahoy!