Anatomy of a bug – Loading Death

Hello minions,

Time for another “Anatomy of a bug” post. I’ll try not to get the lymph all over the place, but no promises. Vivisection is a messy business!

A recent bug fix involved a save/load problem: whenever you loaded a game, about 10-20% of the creatures would die, but *only* if the game had been running for a few minutes. This was something I hadn’t previously noticed: I didn’t really pay attention to the number of creatures, and the dead ones just looked like ordinary corpses after the load, so there was nothing to raise my suspicion.

With the new red-mist effect when creatures die, however, it was quite obvious that a large number of creatures were dying at the exact moment the game was loaded. Nothing quite says “you have a problem” like a couple dozen perfectly-synchronised bloodsplosions.

Liberties may have been taken with this rendition of the event.

My first assumption was that this was somehow related to the AI changes. I’d implemented other potential culprits since 0.7.0, things like changes to how growth cost is evaluated and how temperature damage is applied, but since I’d just finished implementing save/load functions for the creatures’ new brain, I assumed the AI was being initialised badly. That would also fit the symptoms: 10-20% is roughly the proportion of creatures doing specific, low-priority actions (the vast majority of them spend their time feeding), and they always start the game by seeking food, which would explain why it took a few minutes to manifest.

But after adding a variety of reset and re-evaluate functions to completely refresh the behaviour tree, creatures were still dying during load. At that point I had to admit that the AI probably wasn’t the cause.

The next thing to check was that the Save/Load function itself wasn’t broken. That seemed unlikely: typos in the save/load routine usually either crash the game or result in *very* noticeable problems. Still, maybe I was loading energy as health or something, so it needed to be tested.

I did that by saving the game with exactly 500 creatures, then loading it. Sure enough, when it comes in, there are 428 creatures.

Except… with the power of code, I can put a breakpoint in and study the game’s state at the *exact* moment it leaves the load routine. And at that moment, there are 500 creatures. The killing blow isn’t actually being delivered until after the game is loaded, in the single update frame between when Loading finishes and the game is actually drawn to the screen.

And that thought, thinking of it as a “killing blow”, caused me to notice something interesting. After the load, a large number of creatures were Fleeing Randomly. They’d noticed something.

Flee Randomly is actually a very specific behaviour: it only occurs when Fear increases with no discernible source. Since fear is only raised under very specific circumstances, we can narrow it down further: they can’t have received a threatening ping (since that would give them a source), so they must have taken a significant amount of damage in a single frame.

They weren’t just fleeing in terror from some nebulous threat. They’d been attacked. Eeeenteresting.

Wait, did I say interesting? I meant terrifying. Something is in my game, causing pain and death in the single frame of darkness between loading and playing, and I don’t know what it is.

After checking that the 8 seals on [REDACTED] were indeed still intact, I started looking for other sources of pain, misery and death in the Species universe. The first was prompted by a strangely beautiful observation: a few particles of snow falling in the desert.

Aha! The temperature map is being initialised one frame too late, hence the snow!

That would mean that for a single frame creatures are subjected to absolute-zero temperature. Or, y’know, -17 degrees Celsius, which is absolute zero in the Species universe (FOR REASONS!). That would do some damage: not much, but enough to knock off some of the near-death creatures and scare the rest. Problem solved, and as a bonus we don’t have some wrathful eldritch abomination loose in the game’s universe.

So I went ahead and fixed the temperature map issue.

It didn’t work. The creatures still died.

Welp, I guess it really is an elder god. Time to bury your computer, hide in your weird survivalist neighbour’s anti-zombie bomb shelter and await death or insanity, whichever comes first.

Or we could keep looking for a rational explanation, but where’s the fun in that?

The current UpdatableAttribute system has a neat feature where it records the cause of every change to an attribute. These recorded changes are called “deltas”. So you can, in theory at least, see what type of health change a given delta was: Attack Damage, Temperature, Healing, Pregnancy: you get the idea.

Of course nothing is ever so simple: there are 20-something deltas split across 4 updatable attributes for 500+ creatures 30 times a second. Getting *specific* information out of it is difficult. Luckily, we know we’re dealing with something unusual here: a large negative spike across many creatures in a single frame. Those are rare, especially now that pregnancy happens over time.
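For a rough idea of what that kind of filter looks like in code, here is a minimal sketch. It is not the game's UpdatableAttribute system: `DeltaCause`, `DeltaRecord`, `DeltaSpikeFinder` and both threshold parameters are names invented for illustration, and it assumes the deltas for a single frame can be gathered into one list. It groups large negative changes by cause and reports any cause that hit many different creatures at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins: none of these types are the game's real classes.
public enum DeltaCause { AttackDamage, Temperature, Healing, Pregnancy, ChildhoodGrowth }

// One recorded change to a single creature's health during one frame.
public sealed class DeltaRecord
{
    public int CreatureId { get; }
    public DeltaCause Cause { get; }
    public float Amount { get; }

    public DeltaRecord(int creatureId, DeltaCause cause, float amount)
    {
        CreatureId = creatureId;
        Cause = cause;
        Amount = amount;
    }
}

public static class DeltaSpikeFinder
{
    // Returns every cause that, within one frame's records, changed the health
    // of at least minCreatures distinct creatures by -minDamage or worse.
    public static List<DeltaCause> FindSpikes(
        IEnumerable<DeltaRecord> frameDeltas, float minDamage, int minCreatures)
    {
        return frameDeltas
            .Where(d => d.Amount <= -minDamage)           // keep only big losses
            .GroupBy(d => d.Cause)                        // bucket them by cause
            .Where(g => g.Select(d => d.CreatureId)
                         .Distinct()
                         .Count() >= minCreatures)        // many victims at once
            .Select(g => g.Key)
            .ToList();
    }
}
```

Fed one frame's worth of records, a call such as `DeltaSpikeFinder.FindSpikes(frameDeltas, 10f, 20)` would return only the causes behind a synchronised spike, which is the same question the conditional breakpoints answer inside the debugger.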

So, some conditional breakpoints and a few false positives later, and we had a culprit:



ChildhoodGrowth! They’re being attacked by the cost they pay as children to grow to adulthood.

But why?

An answer gives way to more questions. Simply knowing it’s ChildhoodGrowth that is damaging them doesn’t really help: we need to know *why* ChildhoodGrowth is damaging them. What’s wrong with its calculation that conflicts with the loading routine?

Here’s the entirety of the code for calculating the amount of growth cost that needs to be paid this frame:

    //Calculate energy cost for growth during childhood
    float energythatShouldHaveBeenSpent = phenotype.childhoodCost.Value * (age / (phenotype.childhoodLength));
    float thisFrameCost = energythatShouldHaveBeenSpent - energySpentOnChildHood;

    energySpentOnChildHood += thisFrameCost;
    health.QueueChange(HealthDeltaType.ChildhoodGrowth, -thisFrameCost);

As you can see, it calculates the total amount that needs to be spent (childhoodCost), multiplies it by the ratio between age and childhoodLength, subtracts any energy that has been spent previously, and then applies the result.

Can you see the bug?

No? I couldn’t either, so I created a number of debug variables, to test what the values of everything were before and after saving. The results were as follows:

savedChildhoodCost = 28                 : childhoodCost = 28
savedAge = 166                          : age = 166
savedChildhoodLength = 72               : childhoodLength = 72
savedEnergythatShouldHaveBeenSpent = 52 : energythatShouldHaveBeenSpent = 28    : BZZZZZT!

Ah! So the difference between what was saved before the load, and what was spent after, lies in energythatShouldHaveBeenSpent, which is…

… calculated from the other three variables. Which are the same before and after the load. So the values should be exactly the same.


Can you see the bug now?

    //Calculate energy cost for growth during childhood
    float energythatShouldHaveBeenSpent = phenotype.childhoodCost.Value * (age / (phenotype.childhoodLength));
    float thisFrameCost = energythatShouldHaveBeenSpent - energySpentOnChildHood;

    savedEnergythatShouldHaveBeenSpent = energythatShouldHaveBeenSpent; 
            //Note: savedChildhoodCost, savedAge, and savedChildhoodLength
            //are all saved from their actual values at the moment of save.
            //energythatShouldHaveBeenSpent is a local variable, so it has 
            //to be saved here.

    energySpentOnChildHood += thisFrameCost;
    health.QueueChange(HealthDeltaType.ChildhoodGrowth, -thisFrameCost);

The same 3 values being combined in the same way can’t result in a different value. It doesn’t work that way. I must be missing something. That saved value must have been generated at a different point in time or something…

… wait a sec.

Back up for a bit. I missed something important back here.

savedChildhoodCost = 28                 : childhoodCost = 28
savedAge = 166                          : age = 166
savedChildhoodLength = 72               : childhoodLength = 72
savedEnergythatShouldHaveBeenSpent = 52 : energythatShouldHaveBeenSpent = 28    : BZZZZZT!

There it is. There’s the bug. Can you see it? Look closely.

age = 166
childhoodLength = 72

This creature is not a child.

Why is an adult creature paying the childhood growth cost?

Suddenly everything begins to fall into place.

Immediately after a load, all creatures are initialised with isChild set to true. Their saved age will cause them to *become* adults next frame, but for that first frame, they will pay the outstanding childhood growth cost.

And that outstanding cost is calculated by multiplying the amount it takes to grow out of childhood by their *age*. So if they’ve lived as an adult for any amount of time, that’s quite a lot of growth cost they haven’t had to pay (by virtue of being fully grown), that they suddenly have to pay all at once in a single frame.

And they would have gotten away with dying horribly too, if it wasn’t for us meddling kids.
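To put some numbers on that debt, here is a minimal sketch of the same formula pulled out into a standalone function. The class and method names are made up for illustration, the figures in the usage note are hypothetical rather than taken from a real save, and it assumes the creature's energySpentOnChildHood had already reached the full childhood cost before saving.

```csharp
using System;

// Hypothetical helper: mirrors the growth-cost formula quoted in the post,
// but takes plain floats instead of reading them from a creature.
public static class GrowthDebtSketch
{
    // Total energy a creature "should" have spent on growth by this age.
    // Note that nothing stops age / childhoodLength from exceeding 1.
    public static float EnergyThatShouldHaveBeenSpent(
        float childhoodCost, float age, float childhoodLength)
    {
        return childhoodCost * (age / childhoodLength);
    }

    // Cost charged in a single frame in which the creature is flagged as a child.
    public static float ThisFrameCost(
        float childhoodCost, float age, float childhoodLength, float energySpentOnChildhood)
    {
        return EnergyThatShouldHaveBeenSpent(childhoodCost, age, childhoodLength)
               - energySpentOnChildhood;
    }
}
```

With a hypothetical childhoodCost of 30, a childhoodLength of 60 and an age of 180, `GrowthDebtSketch.ThisFrameCost(30f, 180f, 60f, 30f)` comes to 60: twice the entire cost of childhood, charged against health in one frame to a creature that finished growing long ago.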

Thankfully, after all that, it’s an easy solution: just make sure “IsChild” is saved and loaded, right?

Well, not quite. Creatures are initialised as children, so they need to call the GrowToAdulthood method or they’ll suffer permanent dwarfism. I could call this method manually, but it’s clumsy and inelegant. Plus it doesn’t really address the GrowthCost debt, which keeps accumulating in the background and only takes one wrong line of code to show up and wreck things.

So, a slightly more sideways solution is to make sure EnergythatShouldHaveBeenSpent maxes out at ChildhoodCost. No more excess childhood growth debt, and I can keep initialising creatures with high age values and letting their normal methods take care of growing them to adulthood.

    //Calculate energy cost for growth during childhood, capped at the full childhood cost
    float energythatShouldHaveBeenSpent = phenotype.childhoodCost.Value * MathHelper.Min(age / phenotype.childhoodLength, 1.0f);
    float thisFrameCost = energythatShouldHaveBeenSpent - energySpentOnChildHood;

    energySpentOnChildHood += thisFrameCost;
    health.QueueChange(HealthDeltaType.ChildhoodGrowth, -thisFrameCost);
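As a quick sanity check on the cap, here is a standalone sketch of the capped formula. It uses `Math.Min` from the base class library in place of `MathHelper.Min` so that it needs nothing beyond a plain C# project; the class and method names, and the numbers in the usage note, are hypothetical.

```csharp
using System;

// Hypothetical helper: the capped growth-cost formula with plain float inputs.
public static class CappedGrowthSketch
{
    // Cost charged this frame, with the growth fraction clamped to 1 so that
    // the total owed can never exceed childhoodCost.
    public static float ThisFrameCost(
        float childhoodCost, float age, float childhoodLength, float energySpentOnChildhood)
    {
        float growthFraction = Math.Min(age / childhoodLength, 1.0f);
        float energyThatShouldHaveBeenSpent = childhoodCost * growthFraction;
        return energyThatShouldHaveBeenSpent - energySpentOnChildhood;
    }
}
```

An adult that has already paid in full, `CappedGrowthSketch.ThisFrameCost(30f, 180f, 60f, 30f)`, is now charged 0 however old it is, while a child halfway through growing, `CappedGrowthSketch.ThisFrameCost(30f, 30f, 60f, 10f)`, still pays the 5 it owes.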

Bugkiller Qu.

  1. #1 by These Lonely Adventures on March 21, 2015 - 4:12 am

    Ahhh, man. That was really entertaining. I’m only a fledgling developer (not even that, really), so a lot of it went over my head, but what I grasped was interesting. You also write very well. Keep it up.

  2. #2 by timothy on March 22, 2015 - 9:47 am

    Man even though this is just some obscure nerds blog on some obscure game I’m interested in sometimes I still read all of these damn blog posts. You’re fucking awesome.

  3. #3 by Xami on March 23, 2015 - 5:33 am

    Hey, 20 here.

    I know this is unrelated, however what happened to the forums? I am unable to get on.

  4. #4 by Alpha on March 23, 2015 - 6:38 am

    This is Alpha, and I also cannot get on to the forums.

  5. #5 by Ewery1 on March 23, 2015 - 7:17 am

    Yah this is Ewery1, same as thing 20 and Alpha.

  6. #6 by Xami on March 23, 2015 - 8:33 am

    I’m really sorry about not being able to talk with you guys, I had been planning do little speech for everyone on the floor rooms if I could get YouTube working for me…I had a mask and everything..

  7. #7 by ququasar on March 23, 2015 - 10:09 am

    We had some strange activity in the phpBB cache/services file. I think it might have been another hack attempt. I’ve already got the site back, and am requesting help on the php support forums to try and work out what happened.
