Shouldn't SwapBuffers be fast?
by Andy Schatz · in Torque 3D Professional · 06/25/2009 (5:47 pm) · 28 replies
Am I missing something in how I should be optimizing Torque? I'm currently testing on my mid-range laptop (P3 2.5GHz, GeForce 8400), and in Basic lighting I only get 13.3 fps. But when I run a profile, it spends 90% of its time in SwapBuffers, and no significant time in any other single function. Is the profiler actually missing something? Shouldn't SwapBuffers be the sort of thing that can be performed at like 300 Hz?
#22
04/07/2010 (8:24 am)
Empty room has nothing to do with it. It's deferred lighting. Change your resolution. If the framerate goes up, you're fillrate bound.
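The resolution test above can be turned into a rough number. The following is only a sketch of that heuristic: the function name, the 0.5 threshold, and the sample frame rates are all assumptions made for illustration, not measurements from this thread.

```python
def looks_fillrate_bound(fps_low_res, fps_high_res, low_res, high_res, threshold=0.5):
    """Heuristic check: does fps scale with the number of pixels drawn?

    low_res / high_res are (width, height) tuples. The threshold is an
    arbitrary assumption: the fraction of the ideal speedup that must be
    observed before we call the scene fill-rate bound.
    """
    pixel_ratio = (high_res[0] * high_res[1]) / (low_res[0] * low_res[1])
    if pixel_ratio <= 1.0:
        raise ValueError("high_res must contain more pixels than low_res")
    fps_ratio = fps_low_res / fps_high_res
    # 1.0 means fps rose exactly as much as the pixel count dropped,
    # 0.0 means dropping the resolution changed nothing.
    gain = (fps_ratio - 1.0) / (pixel_ratio - 1.0)
    return gain >= threshold


# Hypothetical numbers: fps unchanged between 800x600 and 640x480.
print(looks_fillrate_bound(50, 50, (640, 480), (800, 600)))
# Hypothetical numbers: fps rises from 50 to 75 at the lower resolution.
print(looks_fillrate_bound(75, 50, (640, 480), (800, 600)))
```

If the frame rate barely moves when the resolution drops, the bottleneck is probably somewhere other than pixel fill.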
#23
04/07/2010 (8:43 am)
I'll also repeat what I said in other threads: for *any* Intel IGPs, or low-end GeForce 7XXX / ATI X1XXX cards or older, force T3D to use 2.0 shaders:
$pref::Video::forcePixVersion = true;
$pref::Video::forcedPixVersion = 2.0;
These are pref values. The best way to force them as defaults for specific video cards is to create vendor/card-specific profile scripts in the profile folder (an Intel.cs script would be a nice place to force stuff off on all Intel cards, as an example).
#24
04/07/2010 (6:18 pm)
Quote:
Empty room has nothing to do with it. It's deferred lighting. Change your resolution. If the framerate goes up, you're fillrate bound.
I'm talking about Basic Lighting. I assumed that was clear from my post. Does Basic Lighting use deferred lighting as well?
On the deathball desert level, in BASIC LIGHTING, in windowed mode, 800x600 resolution, when I run the profiler I'm getting 66% time spent in swapbuffers. When I go to full-screen, it goes down to 10%.
Again, this is BASIC LIGHTING mode, not deferred lighting. I switched AL off long ago and never looked back. Obviously, being in windowed mode is making swapbuffers do more work. This makes sense, but these percentages are still way off what they should be.
For comparison, when I run TGEA 1.8.2 in windowed mode, 800x600 resolution, I get about 1% frametime spent in swapbuffers. When I switch to the highest resolution, in windowed mode, I get about 1% frametime spent in swapbuffers. swapbuffers is taking up much less frame time in TGEA. Even in windowed mode, in TGEA swapbuffers only takes about 1%-2% frame time.
It's easy to say it's just a video card or resolution issue, but when you compare my results in TGEA to my results in T3D BASIC LIGHTING, the numbers just don't make sense to me. I thought TGEA was comparable to T3D basic lighting.
#25
04/08/2010 (11:36 am)
*points to Manoel's post*
Several newer threads have solved their issues using his suggested fix.
Afaik Gideon uses SM 3 shaders so there will be a bit of fixing involved.
#26
04/08/2010 (6:18 pm)
I appreciate your help Jacob, but that change had no effect on my results:
==>echo($pref::Video::forcePixVersion);
1
==>echo($pref::Video::forcedPixVersion);
2
==>profilerdump();
Profiler Data Dump:
Ordered by non-sub total time -
%NSTime % Time  Invoke #  Name
62.534  62.534  2522      SwapBuffers
9.931   11.422  129457    GFXDevice_drawTextN
3.913   33.005  2522      CanvasRenderControls
2.515   4.939   10088     RenderMeshMgr_render
2.371   99.922  2522      MainLoop
2.115   2.147   5044      TSSkinMesh_UpdateSkin
1.309   2.026   390491    GFXDevice_updateStates
1.200   5.895   2522      ProjectedShadow_RenderToTexture
0.979   1.926   2522      RenderTerrainMgr_Render
0.872   0.872   10833150  GFont_getCharInfo
0.716   0.716   89135     GFXD3D9StateBlock_Activate
0.709   0.965   2522      AdvanceClientTime
0.696   0.696   5044      TSMesh_CreateVBIB
0.600   0.621   2522      DecalManager_RenderDecals_RenderBatch
0.561   0.805   17654     RenderObjectMgr_render
0.478   0.478   2330328   GenericConstBufferLayout_set
0.439   0.439   45396     terrCheck
0.433   3.860   47918     TSShapeInstance_Render
0.413   0.413   7566      ContainerCastRay
0.371   0.591   216892    LightManager_Update4LightConsts
0.338   0.693   388388    GFXD3D9ShaderConstBuffer_activate
0.335   0.335   224458    SceneRenderPassManager_addInst
0.311   0.582   68094     TSMesh_InnerRender
0.310   0.465   3652      Player_PhysicsSection
0.298   0.325   2522      RenderGlowMgr_Render
0.254   0.360   171496    MatInstance_setTransforms
0.233   0.234   134937    convertUTF8toUTF16
0.233   2.535   50440     treeTraverseVisit_prepRenderImage
0.214   0.439   141232    ProcessedShaderMaterial_SetupPass
0.206   0.206   35308     Frustum_UpdatePlanes
0.202   0.202   2522      ContainerCastRayRendered
0.179   0.263   2522      TerrainBlock_RenderBlock
0.165   3.557   5044      ShapeBase_PrepRenderImage
0.164   0.396   85748     TerrainCellMaterial_SetupPass
0.153   0.193   42874     TerrainCellMaterial_SetTransformAndEye
0.148   0.154   6174      ContainerFindObjects_Box
0.146   0.234   108446    GFXD3D9ShaderConstBuffer_activate_dirty_check_1
0.142   0.142   2522      GFXEndScene
0.135   8.166   5044      RenderPassManager_Render
0.128   0.174   73138     ProcessedShaderMaterial_SetShaderConstants
0.128   0.639   3652      AdvanceObjects
0.126   0.327   2522      LightFlareData_prepRender
0.122   0.122   368212    GenericConstBuffer_getDirtyBuffer
0.112   0.122   2522      CanvasPreRender
0.109   0.109   5044      RenderPassManager_Sort
0.105   0.797   388388    GFXD3D9Device_setShaderConstBufferInternal
0.100   1.152   2522      ClientProcess
0.091   6.052   2522      SceneGraphRender_PreRenderSignal
0.088   0.088   433784    GenericConstBuffer_isEqual
0.082   0.673   216892    BasicLightManager_SetLightInfo
0.078   0.562   153842    MatInstance_SetupPass
0.076   0.076   4354      SimFindObject
0.071   3.246   2522      BuildSceneTree
0.070   16.690  2522      SceneGraphRender
0.066   0.066   2522      GFXBeginScene
0.060   0.060   2522      SFXXAudioDevice_Update
0.057   0.057   95836     String_char_constructor
0.055   0.155   3652      Player_UpdatePos
...
If I were to take a guess at what the problem may be, I'd say either T3D is creating multiple swap chains in Basic Lighting for some reason, or it's using 1 swap chain but allocating more than 3 buffers degrading performance.
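One way to read a dump like the one in this post is to set the SwapBuffers row aside and look at how the remaining time is split. The sketch below parses a few rows copied from the dump and rescales the %NSTime column. The helper names are made up for this example, and it assumes the %NSTime values of the full dump add up to roughly 100.

```python
import re

# One profiler row: %NSTime, % Time, invoke count, name.
ROW = re.compile(r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\S+)\s*$")


def parse_rows(text):
    """Return (ns_time, total_time, invokes, name) tuples for each data row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((float(m.group(1)), float(m.group(2)),
                         int(m.group(3)), m.group(4)))
    return rows


def shares_excluding(rows, name="SwapBuffers"):
    """Rescale %NSTime so the remaining entries are relative to non-`name` time."""
    excluded = sum(ns for ns, _, _, n in rows if n == name)
    remaining = 100.0 - excluded  # assumes the full dump sums to ~100%
    return {n: ns / remaining * 100.0 for ns, _, _, n in rows if n != name}


# First rows of the dump from this post (the rest is omitted here).
SAMPLE = """
62.534 62.534 2522 SwapBuffers
9.931 11.422 129457 GFXDevice_drawTextN
3.913 33.005 2522 CanvasRenderControls
2.515 4.939 10088 RenderMeshMgr_render
2.371 99.922 2522 MainLoop
"""

for entry, share in shares_excluding(parse_rows(SAMPLE)).items():
    print(f"{entry}: {share:.1f}% of time outside SwapBuffers")
```

Read this way, GFXDevice_drawTextN accounts for roughly a quarter of the time spent outside SwapBuffers, which the raw percentages make easy to overlook.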
#27
04/08/2010 (7:59 pm)
@Logan - turning off shadows doesn't seem to make much of a difference. swapbuffers went from 63% to 59%.
Switching resolution while in windowed mode doesn't seem to do much either. I switched to 640x480 and swapbuffers is still taking about 50% frame time while running at 50 fps in deathball. Is this a false metric from the profiler tool? Is the profiler in TGEA giving me a false metric for swapbuffers?
#28
04/10/2010 (10:50 am)
*update*
I just ran Forge Demo in TGEA yesterday to see what swapbuffers is doing there. swapbuffers was at 75%, which means TGEA is showing this same discrepancy in the profiler, depending on the mission.
This leads me to conclude that T3D has simply inherited this from TGEA, which makes sense.
I'm now inclined to think that this actually is simply a side effect of the CPU stalling while waiting for the GPU to finish up its work, as someone posted earlier.
This should be addressed since, at the very least, this is making the profiler show inaccurate results.
AFAIC, this issue is resolved.
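The stall explanation can be checked with some quick arithmetic. This is a deliberately simplified model, not a description of what the driver actually does: it assumes all the time charged to SwapBuffers is the CPU waiting for the GPU, and that each frame takes as long as the slower of the two. The function names are made up for this sketch.

```python
def present_wait_share(cpu_ms, gpu_ms):
    """Fraction of the frame the CPU spends waiting, under the simple model."""
    frame_ms = max(cpu_ms, gpu_ms)        # the slower side sets the frame time
    wait_ms = max(0.0, gpu_ms - cpu_ms)   # CPU idles only if the GPU is slower
    return wait_ms / frame_ms


def implied_cpu_ms(fps, swap_share):
    """CPU work per frame implied by a frame rate and a SwapBuffers share."""
    frame_ms = 1000.0 / fps
    return frame_ms * (1.0 - swap_share)


# Figures quoted in this thread: 50 fps with about 50% in SwapBuffers,
# and 13.3 fps with about 90% in SwapBuffers.
print(implied_cpu_ms(50, 0.5))     # CPU-side work per frame, in ms
print(implied_cpu_ms(13.3, 0.9))
print(present_wait_share(10.0, 20.0))
```

Under this model, a large SwapBuffers percentage says the GPU is the slower side. It does not say the buffer swap itself is expensive.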
Associate Logan Foster
perPixel Studios