Game Development Community

Terrain Texture Editor freeze bug

by Tom Vogt · in Torque Game Engine · 11/20/2002 (4:39 am) · 4 replies

I can reliably reproduce a total-freeze bug in our game (based on 1.1.2 with some additional patches). Maybe someone can point me in the right direction given this debug information:

Loaded symbols for /usr/home/tom/BaSam/BadSam/example/tplib/libopenal.so
0x080bd902 in GuiTerrPreviewCtrl::wrap (this=0x40b63848, p=@0xbffff250)
at editor/guiTerrPreviewCtrl.cc:153
153 result.x -= mTerrainSize;
(gdb) backtrace
#0 0x080bd902 in GuiTerrPreviewCtrl::wrap (this=0x40b63848, p=@0xbffff250)
at editor/guiTerrPreviewCtrl.cc:153
#1 0x080bd9e7 in GuiTerrPreviewCtrl::worldToTexture (this=0x40b63848,
p=@0x40b638f8) at editor/guiTerrPreviewCtrl.cc:165
#2 0x080bde65 in GuiTerrPreviewCtrl::onRender (this=0x40b63848, offset=
{x = -1073744720, y = -1073744712}, updateRect=@0xbffff4b8)
at editor/guiTerrPreviewCtrl.cc:219
#3 0x080f55e1 in GuiControl::renderChildControls (this=0x40b63798, offset=
{x = -1073744612, y = -1073744488}, updateRect=@0xbffff598)
at gui/guiControl.cc:426
#4 0x080f54b6 in GuiControl::onRender (this=0x40b63798, offset=
{x = -1073744496, y = -1073744488}, updateRect=@0xbffff598)
at gui/guiControl.cc:403
#5 0x080f55e1 in GuiControl::renderChildControls (this=0x409ade38, offset=
{x = -1073744324, y = -1073744120}, updateRect=@0xbffff708)
at gui/guiControl.cc:426
#6 0x0811adfc in GuiTSCtrl::onRender (this=0x409ade38, offset=
{x = -1073744192, y = -1073744120}, updateRect=@0xbffff708)
at gui/guiTSControl.cc:148
#7 0x080b9d88 in EditTSCtrl::onRender (this=0x409ade38, offset=
{x = -1073744128, y = -1073744120}, updateRect=@0xbffff708)
at editor/editTSCtrl.cc:68
#8 0x080f55e1 in GuiControl::renderChildControls (this=0x409aa058, offset= {x = -1073744020, y = -1073743852}, updateRect=@0xbffff814)
at gui/guiControl.cc:426
#9 0x080f54b6 in GuiControl::onRender (this=0x409aa058, offset=
{x = -1073743836, y = -1073743852}, updateRect=@0xbffff814)
at gui/guiControl.cc:403
#10 0x080ec019 in GuiCanvas::renderFrame (this=0x408139d8, preRenderOnly=false)
at gui/guiCanvas.cc:1166
#11 0x0812b61c in DemoGame::processTimeEvent (this=0x84c7940, event=0xbffff94c)
at game/main.cc:691
#12 0x08210635 in GameInterface::processEvent (this=0x84c7940,
event=0xbffff94c) at platform/gameInterface.cc:72
#13 0x082107aa in GameInterface::postEvent (this=0x84c7940, event=@0xbffff94c)
at platform/gameInterface.cc:153
#14 0x082d90b4 in TimeManager::process ()
at platformX86UNIX/x86UNIXWindow.cc:786
#15 0x0812b142 in DemoGame::main (this=0x84c7940, argc=2, argv=0x403fd3d8)
at game/main.cc:489
#16 0x082d92b4 in main (argc=2, argv=0xbffffa14)
at platformX86UNIX/x86UNIXWindow.cc:840


This happens on a Linux system (Debian sid) and was not reproducible on the one Windows system tested.

This happens when I start the Terrain Texture Editor (F8 in the editor).

Any clues would be much appreciated. Unfortunately, I'm not very good at C++ or at reading gdb output.

#1
11/20/2002 (5:15 am)
Found and fixed, though I still don't understand it.

Through debugging, I found that result.x in the preview wrap loop went out of bounds, at which point the -= stopped working correctly.

I fixed this by rewriting the entire function like this:

Point2F& GuiTerrPreviewCtrl::wrap(const Point2F &p)
{
   static Point2F result;
   result = p;
   F32 div;

   if (result.x < 0.0f || result.x > mTerrainSize) {
      div = floor(result.x/mTerrainSize);
      result.x -= mTerrainSize*div;
   }
   if (result.y < 0.0f || result.y > mTerrainSize) {
      div = floor(result.y/mTerrainSize);
      result.y -= mTerrainSize*div;
   }

   return result;
}


Personally, I find this more elegant than the while(bla) a-=b; stuff that was there before. I'm not sure if it also means a performance gain.

If someone with CVS access wants to use this, it's around line 150 in editor/guiTerrPreviewCtrl.cc
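
For readers comparing the two approaches, here is a minimal sketch of a loop-style wrap next to the floor-based one, reduced to a single coordinate. The original Torque function is not quoted in this thread, so wrapLoop is only a hypothetical stand-in for the "while(bla) a-=b" style described above; both function names and the plain float parameters are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the loop-style wrap ("while(bla) a-=b").
// This is NOT the actual Torque source, only an illustration of the style.
float wrapLoop(float v, float size)
{
    assert(size > 0.0f);
    while (v < 0.0f) v += size;   // step up until non-negative
    while (v > size) v -= size;   // step down until within [0, size]
    return v;
}

// Same arithmetic as the rewritten wrap(): one floor() and one subtraction,
// so the cost does not depend on how far outside the range v starts.
float wrapFloor(float v, float size)
{
    assert(size > 0.0f);
    if (v < 0.0f || v > size)
        v -= size * std::floor(v / size);
    return v;
}
```

For ordinary inputs such as 5000.0f or -100.0f with a size of 2048.0f, both functions return the same value. The difference only shows up for inputs far outside the range, where the loop version needs one iteration per multiple of the size.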
#2
11/20/2002 (11:49 pm)
I don't think it will be a performance gain in the common case, where the value-to-be-wrapped is already inside the boundaries. It's possible it will even be slower in the cases where the value-to-be-wrapped is just outside the boundaries. In either case though I don't think it really matters in practical terms.

It's funny that the original code would cause an infinite while loop (I assume that's what the freeze was). If you're comfortable using the debugger, there are a couple of experiments that I think would be interesting:

1) Restore the original code and find out where it is infinite-looping, and why it is not breaking out of the loop.

2) Do a cvs update to get the fixes I just committed for guiTerrPreviewCtrl.cc and see if the problem (with the original code) goes away. The fixes I committed should prevent some bogus values being fed to the wrap function. Even with bogus values, it shouldn't infinite-loop, but... who knows.
#3
11/21/2002 (12:04 am)
Yes, Joel, it was infinitely looping, namely in the result.x -= mTerrainSize part, according to gdb.

I put some debug Con::printf calls in and found that result.x starts out at what I believe is the max value for F32, and it would not decrease. I have no idea what happened there, which is why I wrote above that I still don't understand it: it was doing the -=, but result.x was the same before and after.
That's why I assume the value was out of bounds, got truncated during some implicit cast, and, since it was so large, the -= was lost to rounding errors in the cast. At least that's the only way I can make sense of the whole thing.

So (without checking), if your changes prevent this out-of-bounds value from showing up in the first place, then I guess it will prevent this freeze.
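
The "same before and after" observation can be reproduced without any cast, assuming F32 is an IEEE 754 single-precision float. Such a float has a 24-bit significand, so near its maximum value the gap between adjacent representable numbers is far larger than 2048, and subtracting 2048 rounds straight back to the starting value. The sketch below is a possible explanation rather than a confirmed diagnosis; the function names and the iteration cap are hypothetical.

```cpp
#include <cassert>
#include <limits>

// True when value - step rounds back to value in single precision.
bool subtractionIsAbsorbed(float value, float step)
{
    // volatile forces the result to be stored as a real 32-bit float,
    // so extra register precision cannot hide the rounding.
    volatile float after = value - step;
    return after == value;
}

// Runs a loop-style wrap with an iteration cap (the cap is an assumption
// added so the demonstration itself cannot hang).
// Returns true if v drops to size or below before the cap is reached.
bool loopWrapTerminates(float v, float size, int maxIterations)
{
    assert(size > 0.0f);
    for (int i = 0; i < maxIterations; ++i) {
        if (v <= size)
            return true;
        volatile float next = v - size;  // may round back to v
        v = next;
    }
    return false;
}
```

With std::numeric_limits<float>::max() as the value and 2048.0f as the step, subtractionIsAbsorbed reports true and loopWrapTerminates gives up at the cap, while a small input such as 5000.0f finishes after a couple of iterations.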
#4
11/21/2002 (12:07 am)
Actually... heh heh. Maybe the bogus numbers were in fact causing the infinite loop, only it wasn't quite infinite.

I think that theoretically you could get a number of around 2^127 crammed into a float. Since mTerrainSize is 2048 = 2^11, that means 2^116 times through the loop. That's a lot of looping! If we ultra-generously assume 4 = 2^2 ops for each loop iteration, that's 2^118 cycles. Assume a pretty fast 2^30 ops per second and you're still left with 2^88 seconds, about ten million million million years. Maybe your app wasn't frozen... you just didn't wait long enough for it to come out of the loop!
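
The back-of-the-envelope estimate above can be checked with a few lines of code. This is only a sketch of the same arithmetic: the function name is hypothetical, all inputs are given as powers of two exactly as in the post, and a year is assumed to be 365.25 days. It also keeps the post's premise that every iteration really does lower the value by 2048.

```cpp
#include <cassert>
#include <cmath>

// Years needed to count down from 2^startExp in steps of 2^stepExp,
// at 2^opsPerIterExp operations per iteration and 2^opsPerSecExp
// operations per second.
double yearsToFinishLoop(int startExp, int stepExp,
                         int opsPerIterExp, int opsPerSecExp)
{
    assert(startExp >= stepExp);
    // iterations = 2^(startExp - stepExp); multiply by ops per iteration,
    // divide by ops per second: exponents simply add and subtract.
    const int secondsExp = (startExp - stepExp) + opsPerIterExp - opsPerSecExp;
    const double secondsPerYear = 365.25 * 24.0 * 3600.0;
    return std::ldexp(1.0, secondsExp) / secondsPerYear;  // 2^secondsExp / year
}
```

Calling yearsToFinishLoop(127, 11, 2, 30) corresponds to 2^88 seconds, which comes out at a little under 10^19 years, in line with the "ten million million million" figure.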