Game Development Community

Post your profiler dump

by Stephen Lujan · in Torque Game Engine · 04/06/2007 (12:23 pm) · 9 replies

I was surprised a search didn't turn up an existing thread like this. I think it would people me get a feel for how torque runs and how to optimize it if we shared profiler dumps from a few different situations.

For those who aren't very familiar with the profiler it tracks where your time is being used up and what's bringing your frame rates down. To enable the profler enter in the console profilerenable(1) then to do a dump profilerdump()


My situation--- Here is a profiler dump from a few minutes of running around the default stronghold mission and killing kork a couple times.

video options--- I used 800x600 resolution and the opengl renderer.

platform--- I'm running tge 1.5.1 modified and my own scripts on a 1.3 ghz athlon with 1gig of ram and a geforce fx5200. (time for an upgrade i know)

#1
04/06/2007 (12:24 pm)
I think the first column is just each individual module and the second lists the combination of all its child processes.

Profiler Data Dump:
Profiled blocks ordered by non-sub total time:

%NSTime % Time Invoke # Name
16.851 16.880 9834 PlayerRenderShadow
14.839 14.839 751805 ContainerCastRay
10.089 10.092 3243 PlatformProcessMain
7.905 7.905 6521 IRO_RenderARB_FC
6.053 6.736 91346 TSShapeInstanceRender
5.028 5.049 6521 ISAPL_Setup
4.222 15.467 91015 LightManager_sgSetupLights
2.755 2.756 3173 WaterBlock_render
2.571 2.571 6521 ISAPL_Merge
2.027 2.029 3208 TerrainRenderRecurse
2.017 2.017 252939 TerrainRenderBuild
1.659 1.919 3243 BuildSceneTree
1.641 1.641 252731 TerrainRenderXFRender
1.504 1.505 6416 PrecipRenderPrecip
1.313 1.313 3208 SkyRender
1.308 4.864 15526 PrecipProcess
1.135 8.681 3243 AdvanceClientTime
1.068 2.267 26011 Player_PhysicsSection
1.025 1.176 25639 Player_UpdatePos
1.002 76.770 3243 CanvasRenderControls
0.876 0.876 3243 SwapBuffers
0.875 5.750 3208 TerrainRenderEmit
0.788 9.487 3243 ClientProcess
0.671 0.671 91346 TSShapeInstanceMaterials
0.619 8.099 15526 AdvanceObjects
0.616 0.616 12774 PrecipInterpolate
0.516 4.918 55097 TSStatic_renderObject
0.514 0.514 3243 BuildFogTexture
0.510 71.949 129335 SceneStateRenderImage
0.480 34.924 32022 ShapeBaseRenderObject
0.410 0.445 3243 ServerNetProcess
0.387 0.403 21998 UpdateActionThread
0.384 0.384 252731 TerrainRenderBind
0.368 0.376 6521 IRO_RenderDynamicLights
0.359 0.366 12354 ShapeBaseRenderShadow
0.351 0.351 10269 ParticleEmitter_render
0.297 0.297 6521 IRO_UpdateAnimatedLights
0.275 0.275 70742 SceneGraph_rezoneObject
0.264 0.457 1222 Blender
0.257 75.244 3243 GameRenderWorld
0.224 0.241 11139 DrawText
0.209 8.115 6521 IRO_RenderSolids
0.207 1.501 9834 PlayerRenderMounted
0.192 0.192 842368 BlenderInASM
0.191 1.504 3208 Sky_renderObject
0.191 2.505 31032 Player_ProcessTick
0.179 0.179 369020 MemoryAlloc
0.179 0.179 3243 RegisterLights
0.176 74.987 3243 SceneGraphRender
0.170 0.170 24595 TerrainRenderDetails
0.170 72.135 3243 TraverseScene
0.153 0.170 3494 ProcessInputEvent
0.149 3.509 212475 PrecipFindDropCutoff
0.146 0.153 3243 ClientNetProcess
0.144 0.601 1222 TerrainRenderBuildBlendMap
0.136 0.136 363132 MemoryFree
0.099 0.277 91015 GetLightingAmbientColor
0.084 0.166 6692 InteriorPrepRenderImage
0.084 2.592 9834 PlayerRenderPrimary
0.073 89.826 3243 GameProcessEvents
0.073 16.572 6521 InteriorRenderObject
0.071 0.071 6416 PrecipRenderSplash
0.071 89.578 3243 ProcessTimeEvent
0.066 0.067 3243 CanvasPreRender
0.058 0.525 12354 ShapeBaseRenderPrimary
0.052 0.069 13477 convertUTF8toUTF16
0.049 1.625 6416 PrecipRender
0.049 0.049 6521 InteriorTraverseZone
0.043 7.663 6521 InteriorSetupActivePolyList
0.040 0.040 3243 NetProcessMain
0.038 0.038 72321 PrecipSpawnDrop
0.037 0.037 3208 TerrainRenderSetup
0.033 0.033 252731 TerrainRenderSetVertexBuffer
0.030 0.030 6521 IRO_GetZones
0.030 0.079 6521 InteriorZoneTraversal
0.024 0.033 17208 MemoryRealloc
0.020 1.648 3243 AdvanceServerTime
0.020 0.022 3243 SimAdvanceTime
0.019 0.023 3243 TimeManagerProcessMain
0.019 77.747 3243 RenderFrame
0.018 7.681 6521 IRO_ComputeActivePolys
0.018 0.018 323913 NewFontGetCharInfo
0.017 0.017 343480 oneUTF8toUTF32
0.016 0.016 252939 TerrainRenderLock
0.014 0.014 3243 glFinish
0.013 0.013 252939 TerrainRenderUnlock
0.012 0.315 11307 DrawText_UTF8
0.012 100.000 3243 MainLoop
0.012 0.012 91346 TSShapeInstanceRenderBillboards
0.010 7.827 3208 TerrainRender
0.005 0.005 57770 PrecipCreateSplash
0.004 0.019 1661 ConformToGround
0.004 0.004 3243 TelconsoleProcessMain
0.004 0.004 57740 PrecipDestroySplash
0.004 1.652 3243 ServerProcess
0.002 0.002 3243 JournalMain
0.002 0.002 3243 TelDebuggerProcessMain
0.000 0.000 1 convertUTF16toUTF8
0.000 0.000 14 oneUTF16toUTF32
0.000 0.000 14 oneUTF32toUTF8
#2
04/06/2007 (12:26 pm)
Sorry that was so long. I didn't bother posting the root which lists everything by hierarchy.

Well what I can glean from this?
1. Shadow rendering ate up a lot of my time but that's no surprise with all of the lights crowded in strong hold in the area where kork loops around. If you check the editor there's a bunch and they each cast shadows.
2. Raycasting used up a lot of time for some reason? I can't imagine why in this scenario... does stock tge 1.5.1 use ray casting on the precipitation?

Anybody know what PlatformProcessMain and IRO_RenderARB_FC do?

I'll try and post my a dumps from some different scenarios soon.
#3
04/07/2007 (10:28 am)
Make sure to run the profiler in a release build - the time taken by various sections of code is different between release and debug builds.

Disabling multi-shadows or reducing the max number of lights that affect dts objects, can both reduce the shadow processing (see TDN for details).

The ray casts could be related to the new Dynamic Lighting Occlusion feature, which can also be reduced or disabled (check out the TGE 1.5.1 thread, search the thread for "Dynamic Lighting Occlusion").
#4
04/07/2007 (12:23 pm)
Ah thanks for the tips. I was in fact in release build yes. Well I know how to turn down the features to get the most benefit my machine now. That gives me an idea actually.

I wonder if I could reprogram the profiler to help the game make intelligent changes to preferences on the fly to optimize frame rates based on the demands of the moment. Something to try to keep the frame rate above a certain minimum perhaps. I suppose that might be easier than hunting for optimizations in the existing code.

Anyways, if anyone else wants to post some profiler dumps I'm still interested to see where torque invests its time heavily.
#5
04/14/2007 (9:05 pm)
Here's mine (I deleted everything under 1%):

%NSTime % Time Invoke # Name
33.283 33.294 5196 PlayerRenderShadow
20.520 20.935 61920 TSShapeInstanceRender
11.608 11.608 712826 ContainerCastRay
3.600 3.600 4740 IRO_RenderARB_FC
3.371 3.372 2079 PlatformProcessMain
2.893 2.893 4158 PrecipRenderPrecip
2.269 7.188 61920 LightManager_sgSetupLights
2.204 2.204 317923 TerrainRenderXFRender
1.774 8.645 6218 PrecipProcess
1.618 1.618 2079 WaterBlock_render
1.486 1.495 4740 ISAPL_Setup
1.307 1.307 318038 TerrainRenderBuild
1.287 6.770 2079 TerrainRenderEmit

This is with the glFinish fix. TSShapeInstanceRender is probably taking up a lot of time because of multiple 9k poly character models.
#6
04/15/2007 (12:42 am)
The dynamic shadows in TGE are extremely cool,
but the performance hit is pretty significant.
we added a user option for shadow quality:
either high-definition or just circles directly under the player.
the performance savings w/ lo-fi shadows is pretty significant.
#7
04/15/2007 (1:13 am)
Profiler Data Dump:
Profiled blocks ordered by non-sub total time:

%NSTime % Time Invoke # Name
43.865 43.866 601 PlatformProcessMain
21.428 21.911 8414 TSShapeInstanceRender
7.715 7.728 5409 PlayerRenderShadow
2.304 47.091 601 CanvasRenderControls
2.149 2.522 601 AdvanceClientTime
2.075 2.205 601 BuildSceneTree
1.992 2.106 12687 DrawText
1.796 1.933 601 SimAdvanceTime
1.597 37.900 12621 SceneStateRenderImage
1.341 1.341 601 BuildFogTexture
1.316 1.316 601 glFinish
1.172 30.377 7813 ShapeBaseRenderObject
1.116 1.116 601 SkyRender
1.106 1.106 601 SwapBuffers
0.766 0.766 601 WaterBlock_render



Here's mine.

What's this ProcessPlatformMain all about??
#8
04/15/2007 (4:07 am)
Quote:
I wonder if I could reprogram the profiler to help the game make intelligent changes to preferences on the fly to optimize frame rates based on the demands of the moment. Something to try to keep the frame rate above a certain minimum perhaps. I suppose that might be easier than hunting for optimizations in the existing code.

That is a pretty cool feature. We added something similiar that lowers the amount of particles, LOD and update frequency for the skin update function when the framerate is low.

I would post profiler dumps if it were not for the fact that we use TGEA. :/
#9
04/15/2007 (4:12 pm)
48.642  48.642     1023 SwapBuffers

OMG! This is with a Gforce FX 5600