Optimization: Speeding up terrain occlusion check
by Clark Fagot · 01/16/2003 (2:32 pm) · 39 comments
First, I'd like to thank Ben Garney and Nicolas Quijano, who have volunteered to help me get several of the optimizations I mentioned in my .plan file into resources. These enhancements will get to the community much quicker due to their efforts.
On to business...
The trick to optimizing the occlusion check is to take advantage of the fact that if the check indicated that an object was occluded last time, it will probably be occluded this time too (the camera didn't move that much), and if it wasn't occluded last time, it probably won't be this time either. We will keep a counter on each object that constantly runs down. When it gets to zero, we will perform the occlusion check. If the check succeeds (the object is occluded), we keep the counter at zero. If it doesn't, we reset the counter to some preset value (15 by default, but this can be changed by setting $pref::SceneGraph::occlusionCount to some other value). This way, occluded objects are re-checked every frame, while non-occluded objects are only re-checked occasionally.
Note that because the counter is not reset when the object is out of view, once one has done a 360 from a given location, the occluded items will continue to be checked when they are in view again. Similarly, when in a valley, objects outside the valley will be appropriately checked as long as you stay in the valley.
This optimization saved us about 5% in ThinkTanks. If you have around 100 objects on your terrain, it might do the same for you.
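As a rough illustration of the counter scheme described above, here is a standalone sketch. It is not Torque code: the Obj struct, the hidden flag (standing in for the result of the real terrain check) and both function names are made up for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a scene object; only the counter matters here.
struct Obj {
    int  occlusionCount; // frames left before the next terrain check
    bool hidden;         // pretend result of the (expensive) terrain check
};

// Returns true if the object should be rendered this frame.
// 'checks' counts how many expensive checks were actually performed.
bool visitObject(Obj& o, int resetValue, int& checks)
{
    assert(resetValue >= 0);
    if (o.occlusionCount == 0) {
        ++checks;                      // run the expensive check
        if (o.hidden)
            return false;              // occluded: leave counter at 0, re-check next frame
        o.occlusionCount = resetValue; // visible: skip the check for a while
    } else {
        --o.occlusionCount;            // not time to check yet
    }
    return true;
}

// Simulates one object over a number of frames and reports how many
// expensive checks were needed.
int countChecks(bool hidden, int frames, int resetValue)
{
    Obj o{0, hidden};
    int checks = 0;
    for (int f = 0; f < frames; ++f)
        visitObject(o, resetValue, checks);
    return checks;
}
```

With a reset value of 15, countChecks(true, 160, 15) returns 160 while countChecks(false, 160, 15) returns 10: a visible object is only re-tested once every 16 frames, which is where the saving comes from.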
In sceneGraph.h, right after "static F32 smVisibleDistanceMod;" add this line:
static S32 smOcclusionCount;
In sceneObject.h, right after "U32 mContainerSeqKey;" add this line:
S32 mOcclusionCount;
In SceneObject.cc, right after the line "Polyhedron sBoxPolyhedron;" add this line:
S32 gOcclusionStagger = 0;
In SceneObject.cc, inside "SceneObject::SceneObject()" add:
mOcclusionCount = ++gOcclusionStagger & 0x1F;
In SceneObject.cc, inside "void SceneObject::consoleInit()" add:
Con::addVariable("pref::SceneGraph::occlusionCount", TypeS32,&SceneGraph::smOcclusionCount);
In SceneTraversal.cc, after the #include's, add:
S32 SceneGraph::smOcclusionCount = 15;
In SceneTraversal.cc, in the "void SceneGraph::treeTraverseVisit()" method, between the SECOND occurrence of the line:
obj->setTraverseColor(SceneObject::Black);
and the FIRST (only) occurrence of the line:
obj->prepRenderImage(state, stateKey, 0xFFFFFFFF);
replace what is there with:
if (getCurrentTerrain() != NULL && obj->getWorldBox().min.x > -1e5)
{
   if (obj->mOcclusionCount==0)
   {
      bool doTerrCheck = true;
      SceneObjectRef* pRef = obj->mZoneRefHead;
      while (pRef != NULL)
      {
         if (pRef->zone != 0)
         {
            doTerrCheck = false;
            break;
         }
         pRef = pRef->nextInObj;
      }
      if (doTerrCheck == true && terrCheck(getCurrentTerrain(), obj, state->getCameraPosition()) == true)
      {
         // Note: leave obj->mOcclusionCount at 0 so we check next time (it was a good thing this time)
         return;
      }
      // don't check for a number of frames
      obj->mOcclusionCount = smOcclusionCount;
   }
   else
      // don't check this frame, but later...
      --obj->mOcclusionCount;
}
A second optimization of this code is to re-write the terrCheck routine itself. If you look at what is happening there I think you will conclude, like I did, that it is indirect and maybe even a bit odd. I replaced it with the following straightforward code and found that I was able to occlude 15% more items (occluding more items is good, as long as they aren't false occlusions, which they didn't seem to be). It may be that with very large objects my new terrCheck routine will falsely occlude objects that the original won't. It warrants testing. Here is the code:
bool terrCheck(TerrainBlock* pBlock,
               SceneObject* pObj,
               const Point3F camPos)
{
   Point3F localCamPos = camPos;
   pBlock->getWorldTransform().mulP(localCamPos);
   F32 height;
   pBlock->getHeight(Point2F(localCamPos.x, localCamPos.y), &height);
   bool aboveTerrain = (height <= localCamPos.z);
   if (!aboveTerrain)
      // Don't occlude if we're below the terrain. This prevents problems when
      // looking out from underground bases...
      return false;
   const Box3F& oBox = pObj->getObjBox();
   F32 minSide = getMin(oBox.len_x(), oBox.len_y());
   if (minSide > 85.0f)
      // too big to occlude (imagine big interior at the end
      // of a narrow valley).
      return false;
   const Box3F& rBox = pObj->getWorldBox();
   RayInfo rInfo;
   Point3F ul(rBox.min.x, rBox.min.y, rBox.max.z);
   pBlock->getWorldTransform().mulP(ul);
   if (!pBlock->castRay(localCamPos,ul,&rInfo))
      // didn't hit any terrain, we can see this
      return false;
   Point3F ur(rBox.min.x, rBox.max.y, rBox.max.z);
   pBlock->getWorldTransform().mulP(ur);
   if (!pBlock->castRay(localCamPos,ur,&rInfo))
      // didn't hit any terrain, we can see this
      return false;
   Point3F ll(rBox.max.x, rBox.min.y, rBox.max.z);
   pBlock->getWorldTransform().mulP(ll);
   if (!pBlock->castRay(localCamPos,ll,&rInfo))
      // didn't hit any terrain, we can see this
      return false;
   Point3F lr(rBox.max.x, rBox.max.y, rBox.max.z);
   pBlock->getWorldTransform().mulP(lr);
   if (!pBlock->castRay(localCamPos,lr,&rInfo))
      // didn't hit any terrain, we can see this
      return false;
   Point3F center = ul + lr;
   center *= 0.5f;
   if (!pBlock->castRay(localCamPos,center,&rInfo))
      // didn't hit any terrain, we can see this
      return false;
   // all five rays collided...consider this occluded
   return true;
}
Note that if all your objects were small (no interiors) one could
simplify this a bit more by reducing the number of terrain checks,
getting rid of the size restriction (85 meters, copied from the
original terrCheck routine) and even getting rid of the aboveTerrain
restriction. But the above is meant to be the general case.
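For readers who want to play with the five-ray idea outside the engine, here is a minimal standalone sketch. Everything in it is an assumption made for illustration: Vec3, Box, the toy terrainHeight function (a single ridge) and rayHitsTerrain (a crude sampled stand-in for TerrainBlock::castRay) are not Torque types or calls, and the size restriction is left out.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Box  { Vec3 min, max; };

// Hypothetical terrain: flat ground with a ridge 10 units high for 45 < x < 55.
float terrainHeight(float x, float /*y*/)
{
    return (x > 45.0f && x < 55.0f) ? 10.0f : 0.0f;
}

// Crude stand-in for a terrain ray cast: sample points strictly between
// a and b and report a hit if any sample lies below the terrain surface.
bool rayHitsTerrain(const Vec3& a, const Vec3& b, int steps = 256)
{
    for (int i = 1; i < steps; ++i) {
        const float t = float(i) / float(steps);
        const Vec3 p{ a.x + (b.x - a.x) * t,
                      a.y + (b.y - a.y) * t,
                      a.z + (b.z - a.z) * t };
        if (p.z < terrainHeight(p.x, p.y))
            return true;
    }
    return false;
}

// Five-ray test: the four top corners of the box plus the top center.
// The object counts as occluded only if every ray is blocked.
bool occludedByTerrain(const Vec3& cam, const Box& box)
{
    // Never occlude when the camera is below the terrain surface.
    if (cam.z < terrainHeight(cam.x, cam.y))
        return false;

    const float z = box.max.z;
    const Vec3 targets[5] = {
        { box.min.x, box.min.y, z },
        { box.min.x, box.max.y, z },
        { box.max.x, box.min.y, z },
        { box.max.x, box.max.y, z },
        { (box.min.x + box.max.x) * 0.5f, (box.min.y + box.max.y) * 0.5f, z }
    };
    for (const Vec3& target : targets)
        if (!rayHitsTerrain(cam, target))
            return false; // at least one ray is clear, so we can see it
    return true;
}
```

In this toy setup a camera at (0, 0, 2) looking at a 3-unit-tall box behind the ridge gets true (occluded), while a 40-unit-tall box in the same spot gets false because its top pokes above the ridge.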
#2
01/16/2003 (7:28 pm)
Hey Clark, thanks for posting things like this!
#3
01/16/2003 (9:39 pm)
Nice code snippet. Don't expect any gigantic jump in performance (at least I didn't get one), but it was certainly noticeable. Good work!
#4
01/16/2003 (9:46 pm)
From the discussions I've had with BraveTree staff, this resource is more about letting slower computers handle Torque games. Great work!
#5
01/17/2003 (8:48 am)
Great !
#6
01/17/2003 (10:11 am)
Thanks for the comments guys.
@Tim - at some point, I'm all for testing things first. Anyone who has problems with the resource can post them here. Shouldn't be an issue with this one being as it is pretty simple, but I think it's the right process to go through.
@Kevin - No problem. Hope all is going well with you and yours.
@Ben - yes, shouldn't be a huge increase. We saw about 5%, although that didn't include the change to the terrCheck routine, which I know improved performance a little more.
@Anthony - that is our motivation, but I have found that even on a fast machine, faster framerates are noticeable (jumping from 30 to 45 improved the feel quite a bit).
@Frank - thanks.
#7
01/18/2003 (9:29 pm)
Any problems with large polygons or objects yet??
#8
01/19/2003 (11:14 am)
Good job. I'm always glad to see folks adding modifications that improve the performance or feature set of the core. I do hope this makes it into HEAD after sufficient testing.
#10
01/27/2003 (9:25 am)
@Alvaro - That's #include's as in plural #include. At the top of the file you will find a bunch of lines beginning with #include. Place the line after the last of those lines. Re: treeTraverseVisit - it's definitely there. That code is in HEAD and in all previous versions of Torque. Your search program may not like searching for "::" so you might just search for "treeTraverseVisit" until you get to the right spot.
#11
01/28/2003 (2:29 pm)
Thanx a lot for the help! I'm learning tons from you guys.
I did everything as told in the resource but....
torqueDemo_DEBUG.exe - 14 error(s), 0 warning(s)
It looks like a great improvement, hope I can install it properly sometime soon.
#12
01/29/2003 (4:16 pm)
@Alvaro - I think you probably typed something in wrong. Look at the error messages and try to debug it. If you can't figure it out then you can send me the file and I'll have a look.
#13
01/30/2003 (7:26 pm)
@Clark
Thanx a lot for your help. I'll try once more and if it doesn't work I'll send you my file. As soon as I get an answer I'll write back.
#14
02/06/2003 (9:43 am)
I can't make it work...
Everything compiles fine (under VC7), but when I run the app I get this message:
Errore: Already Entered into zone list.
#15
02/06/2003 (10:57 pm)
@Andrea - make sure you do a clean build (in VC select Build->clean, then re-build everything). Also, make sure you followed the instructions correctly. The error you are seeing is most likely due to not doing a clean build.
#16
02/18/2003 (12:39 pm)
I haven't actually tried to produce this bug, I only read the snippet, but:
Isn't rBox.min.z closer to the camera than rBox.max.z? If I understand the perspective we are looking at a bounding box end on, and the closer face is MinX-MaxX, MinY-MaxY, MinZ; whereas the further face is MaxZ.
So I'd think you'd want to check occlusion for the nearer face which is larger to us.
The way to test for this bug is to make a cubic building and stuff one corner of it into a matching divot in the landscape, so the back two faces of the cube are nearly touching the dirt. The landscape needs to be taller than the building. This puts the rear face of the bounding box deep in the dirt. Viewed from the opposite corner only the center test should pass. As you strafe around the building eventually the center test will fail and the building will be occluded.
The point of the center check is to test for things seen out of valleys (camera is in a narrow valley, object is wider than valley). If this is correct, the test in the snippet is wrong but easily improved: change to testing (ul + ur)/2 instead. Even this won't catch the object if the object is not centered on the valley. The ideal test would be to test the terrain for "valleyness" rather than to test the object, because to accurately test the object you have to scan along every pixel of the top front line of the bounding box.
This code depends on terrain being a function of latitude and longitude: i.e. no overhangs. It also depends on castRay not intersecting anything which _could_ overhang, like other buildings.
#17
04/21/2003 (8:29 am)
I'm seeing noticeable fps improvements without visual inconsistencies. Thank you, Clark.
-Nick
#18
01/04/2004 (7:05 am)
Did this make it into HEAD? Or is it still worth adding?
#19
04/08/2004 (8:31 pm)
I was about to ask the same thing, but after comparing the changes to my copy of the SDK I don't see any of these changes, so I'm adding it right now =)
EDIT >>
Hmm, there's no "void SceneObject::consoleInit()" in my SceneObject.cc. I have the latest stable (not HEAD).
And the obj->getTraverseColor()'s are commented out....
HEEEEEELP!
#20
04/09/2004 (3:15 pm)
I remember seeing some comments about getTraverseColor being deprecated - I'll have a closer look tomorrow (UK here) and see if I can find a fix that works with the latest HEAD.
Right, well, it wasn't that hard. Yeah, they changed the traversal color stuff to more meaningful names sometime around TGE 1.2. So do all the changes mentioned at the top, and then where it mentions getTraverseColor, just look for the commented-out line or instead look for
obj->setTraversalState( SceneObject::Done );
which is around line 319 of my sceneTraversal.cc file.

Torque Owner Roger Smith