Game Development Community

Mac glDrawElements Performance Issue

by Joe Sulewski · in Torque Game Engine · 02/04/2005 (12:31 pm) · 25 replies

Hello,

This is an extension of another thread. I'm moving it here because I figure I have a better chance of getting it in front of a good GL developer.

I never think to post where the registered developers can go.

I found what I believe to be a serious performance bottleneck in the Mac version of TGE. It appears that the glDrawElements call is unreliable and is taking far too many resources. I added my own profile checks to the source code and, according to the profiler, the bottleneck is the glDrawElements call in the TSMesh.render method.

I know this is old, but I found this thread on the Mac developer lists, in which a game developer found a big bottleneck porting a game from the PC. It sounds like what I'm experiencing.

Here's the short and skinny of what I'm experiencing. I'm getting huge frame rate drops with very simple dts objects. 70 dts objects with 12 detail polys cause a 17fps drop. When I add my real objects of between 70 and 120 polys each I go from 60 fps (without the objects) to 19 (with the objects) and many are hidden from view.

So are there any Mac GL developers that can help me fix this or find a better way to draw the elements? My GL experience is very old and out of date, and I'm so new to Torque that I don't really know where to start. Plus I just don't have a lot of time before I show what I'm doing to some people who will let me know if I'm on the right track. So I'm getting close to panic mode. :)

Any help from developers will be greatly appreciated.
#21
09/11/2005 (4:20 pm)
Did you guys end up resolving this issue? We are seeing the same kind of behavior rendering many small (~200 poly) tubes at once on lower-end systems (like a P3-500 with GeForce2 or an 800 MHz eMac). The profiler reports most of the time spent in that DrawElements function on these systems (like 25%-40% depending on the test case).

Also we found that the frame rate drops off suddenly as you add more objects. Adding 10 tubes at a time, the frame rate eventually dropped from a solid 60 fps to almost 30 once enough tubes were added. It's almost like some kind of fill rate threshold is being reached on the older video card.

Anyway, this is one of the last optimizations we are making to TubeTwist and so we are hoping you all found some kind of resolution (and are still watching this thread :)).
#22
09/11/2005 (5:24 pm)
I never heard back from Josh on this. I created the sample project that he asked me to create, which illustrates the issue, and he never got back to me. So in my case the issue still remains. There is definitely something not right with that call, and the Tiger version of the OS doesn't seem to help. I was hoping that if it was an OpenGL issue, Tiger would fix it.

According to your post it appears to be an issue not local to Apple which is interesting.
Joe
#23
09/12/2005 (11:46 am)
Joe, Justin -

I've never really been sure that there is a problem here at all. Framerate drops as you add more objects are not unusual. It's hard to say what the framerate drop "should" be, but in your case, Justin, I'm not sure that 30 fps with enough tubes is a "bug". You can't use straight number of polys to guide you here, since video cards aren't great at rendering a lot of low poly objects, especially when the setup code is repeated for each one. There are things you can do to improve that situation. E.g., for ThinkTanks we made sure that if we had 10 trees to render we rendered all 10 tree trunks without resetting the materials each time, then we rendered the leaves.
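The trunks-then-leaves idea can be illustrated with a small sketch. This is not the ThinkTanks or Torque code; `RenderItem`, `makeTrees`, and the other names are hypothetical, and the only thing it shows is how grouping the draw list by material reduces the number of material setups.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one mesh piece queued for drawing.
struct RenderItem {
    int materialId;  // which material/texture setup the piece needs
    int objectId;    // which object the piece belongs to
};

// Count how many times a material has to be set up when the items
// are drawn in the given order.
std::size_t countMaterialSwitches(const std::vector<RenderItem>& items) {
    std::size_t switches = 0;
    int current = -1;  // -1 means "nothing set up yet"
    for (const RenderItem& item : items) {
        assert(item.materialId >= 0);
        if (item.materialId != current) {
            current = item.materialId;
            ++switches;
        }
    }
    return switches;
}

// Reorder so that all pieces sharing a material are drawn together.
// stable_sort keeps the original object order within each material.
void sortByMaterial(std::vector<RenderItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const RenderItem& a, const RenderItem& b) {
                         return a.materialId < b.materialId;
                     });
}

// Build a draw list for `count` trees in naive per-object order:
// each tree is a trunk (material 0) followed by leaves (material 1).
std::vector<RenderItem> makeTrees(int count) {
    std::vector<RenderItem> items;
    for (int i = 0; i < count; ++i) {
        items.push_back({0, i});
        items.push_back({1, i});
    }
    return items;
}
```

For 10 trees, the per-object order needs 20 material setups, while the sorted order needs 2. How much frame time that saves depends on the driver and the card, so it is worth measuring rather than assuming.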

Another thing to keep in mind (which makes interpreting profile data difficult) is that different gl implementations will spend time in different gl calls, even though they are suffering from the same problem. So, for example, Josh was implying that gl might stall in glDrawElements because of glFinish. That might happen if gl allowed the cpu to continue after a call to glFinish until you make another gl call. It might also simply block at the call to glFinish (which is what I think most gl implementations do). So where the time is spent doesn't necessarily mean that is where the problem lies. In this case, it might lie with setting up materials over and over, but the price isn't paid till you finally try to draw something (which is what glDrawElements is doing). It's also the case that the portal code in Torque does a lot of weird viewport changes for each object that is rendered (even if there are no portals in the scene). It is possible that this is costing you something but doesn't show up till you start to draw an object. In ThinkTanks, this is one of the things we fixed. Unfortunately, there is no easy way to copy these changes over to a new game, especially since we didn't use interiors and probably broke interiors with our changes.
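A toy model may make the attribution point easier to see. It does not describe any real GL driver; `ToyDriver` and its abstract cost units are assumptions made for illustration. State changes are queued for free and the queued cost is charged to whichever call forces the work to happen, which here is the draw call.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of a driver that queues state changes and only pays for
// them when a draw call forces the queue to be processed. Costs are
// abstract units, not measured times.
class ToyDriver {
public:
    // Queue a material change; nothing is charged to this call.
    void setMaterial(long costUnits) {
        assert(costUnits >= 0);
        pending_ += costUnits;
    }

    // Drawing pays for its own work plus everything queued before it.
    void drawElements(long drawCostUnits) {
        assert(drawCostUnits >= 0);
        charged_["drawElements"] += pending_ + drawCostUnits;
        pending_ = 0;
    }

    // What a call-level profiler would report for the named call.
    long chargedTo(const std::string& call) const {
        auto it = charged_.find(call);
        return it == charged_.end() ? 0 : it->second;
    }

private:
    long pending_ = 0;
    std::map<std::string, long> charged_;
};

// Render `objects` objects, each needing a material setup of
// `setupCost` units followed by a draw of `drawCost` units.
ToyDriver renderObjects(int objects, long setupCost, long drawCost) {
    ToyDriver driver;
    for (int i = 0; i < objects; ++i) {
        driver.setMaterial(setupCost);
        driver.drawElements(drawCost);
    }
    return driver;
}
```

With 70 objects, a setup cost of 9 units, and a draw cost of 1 unit, the model charges all 700 units to `drawElements` and none to `setMaterial`, even though 90% of the work came from the material setup.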

To sum up, from what I'm seeing in the thread, the issue has more to do with a lot of low poly objects on low end machines rather than a general performance issue on the mac (since Justin is seeing similar results on a mac and win system, and Joe's Windows test looked to have similar issues).

Finally, so that we can make apples to apples comparisons, be sure to list the video resolutions you are using. Joe mentioned getting low performance with 1024x768. Well, then the doctor says don't do that. I.e., 1024x768 is hi-res, so try turning the res down to 800x600 or 640x480. Similarly, what are the resolutions of the bitmaps? The chugging Joe described when turning to look at the objects (which disappeared when they were finally on screen) sounds like a texture thrashing issue. Really hi-res textures combined with a high resolution display can result in not having enough video memory. If you are also working windowed, you need room on the video card for the desktop too, so keep that in mind. A good test of whether or not it is texture thrashing would be to use really low res textures (64x64 should be good enough) and run at 640x480 fullscreen. If it's slow there too then it isn't texture thrashing.
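A rough back-of-the-envelope estimate shows why resolution and texture size matter for video memory. The sketch below assumes uncompressed 32-bit buffers, a front buffer, a back buffer, and a depth buffer of the same size, and a full mipmap chain costing about one third extra. Real drivers pad and allocate differently, so treat the numbers as order-of-magnitude only; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Bytes for the screen buffers. Assumes front + back color buffers
// plus one depth buffer, all at `bytesPerPixel` (an assumption; real
// drivers may pad or use a different depth format).
std::uint64_t framebufferBytes(std::uint32_t width, std::uint32_t height,
                               std::uint32_t bytesPerPixel) {
    assert(width > 0 && height > 0 && bytesPerPixel > 0);
    const std::uint64_t oneBuffer =
        static_cast<std::uint64_t>(width) * height * bytesPerPixel;
    return oneBuffer * 3;  // front, back, depth
}

// Bytes for one uncompressed square texture. A full mipmap chain
// adds roughly one third on top of the base level.
std::uint64_t textureBytes(std::uint32_t size, std::uint32_t bytesPerTexel,
                           bool mipmapped) {
    assert(size > 0 && bytesPerTexel > 0);
    const std::uint64_t base =
        static_cast<std::uint64_t>(size) * size * bytesPerTexel;
    return mipmapped ? base + base / 3 : base;
}
```

Under these assumptions, the screen buffers take 9,437,184 bytes (9 MiB) at 1024x768 and 3,686,400 bytes (about 3.5 MiB) at 640x480. A single 1024x1024 texture takes 4 MiB before mipmaps, while a 64x64 texture takes 16 KiB, which is why the low-res test is a good way to rule thrashing in or out.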

Yet another "finally": one thing to look out for in Torque when using terrain and a lot of objects is to make sure you aren't spending too much time on occlusion culling. At its best, the occlusion culling code can cut out objects that would otherwise be taking a lot of gpu time. At its worst, it can do no good but take up 20-40% of the cpu. Find the occlusion culling code in the scene graph and put profile statements around it so that you can track how much time it's taking. I have a resource for optimizing the occlusion culling in Torque so that it only takes up a lot of time when it's being helpful. I believe that is in 1.4 (probably should have been integrated earlier). Obviously, this applies to Joe's case but not Justin's.
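Putting profile statements around a section can be sketched in plain C++. Torque has its own profiler, and this does not reproduce its API; `ScopedTimer`, `SectionStats`, and the placeholder `cullObjects` are hypothetical names that use std::chrono as a stand-in. The point is only the pattern: time the suspect section, time the whole frame, and compare.

```cpp
#include <cassert>
#include <chrono>

// Accumulated timing for one named section of the frame.
struct SectionStats {
    std::chrono::steady_clock::duration total{};
    long calls = 0;
};

// RAII timer: adds the elapsed time to `stats` when it leaves scope.
class ScopedTimer {
public:
    explicit ScopedTimer(SectionStats& stats)
        : stats_(stats), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        stats_.total += std::chrono::steady_clock::now() - start_;
        ++stats_.calls;
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    SectionStats& stats_;
    std::chrono::steady_clock::time_point start_;
};

// Fraction of the frame time spent in the section, from 0 to 1.
double fractionOfFrame(const SectionStats& section,
                       const SectionStats& frame) {
    using seconds = std::chrono::duration<double>;
    const double frameSec = seconds(frame.total).count();
    if (frameSec <= 0.0) return 0.0;
    return seconds(section.total).count() / frameSec;
}

// Hypothetical placeholder for the culling pass being measured.
int cullObjects(int objectCount) {
    assert(objectCount >= 0);
    int visible = 0;
    for (int i = 0; i < objectCount; ++i) {
        if (i % 2 == 0) ++visible;  // pretend half the objects survive
    }
    return visible;
}

// One "frame": the frame timer encloses the culling timer.
int runFrame(SectionStats& frame, SectionStats& culling, int objectCount) {
    ScopedTimer frameTimer(frame);
    int visible = 0;
    {
        ScopedTimer cullTimer(culling);
        visible = cullObjects(objectCount);
    }
    // ... the rest of the frame's work would go here ...
    return visible;
}
```

If `fractionOfFrame(culling, frame)` stays in the 0.2 to 0.4 range while the number of visible objects barely changes, the culling pass is probably costing more than it saves.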

Ok, hope that brain dump helps...
#24
09/13/2005 (12:48 am)
Thanks a lot Clark, it's an excellent explanation!
#25
09/13/2005 (8:14 am)
Thanks for the response, Clark, glad you were still tracking this one. We're caught up in a press release / screenshot flurry, but when that's done I'll respond with some more information about the tests showing the slowdowns.