FxGrass Discussion: Performance
by Stefan Lundmark · in General Discussion · 10/15/2004 (3:17 pm) · 14 replies
Hello everyone
I've been playing around with a lot of stuff lately, but the FxGrass feature has always interested me, and it's a cool feature for a game whose landscape is mainly forest.
Now, I have been doing a lot of testing with this, and there is something odd about the results, so I thought I'd ask others since I can't answer it myself.
Why doesn't better hardware seem to help performance?
We have five machines here for testing. We placed FxGrass objects (4, to be precise) on the map and ran it on all of the computers with identical software.
The resolution is 1024x768, 100hz, 1400 View Distance. Culling set to 64 units.
These are the results:
AMD - XP1700+, 512 RAM, Radeon8500.
15-16ish fps.
Pentium 4 HT - 3400mhz, 1024 RAM, GeForce 4 Ti4800.
15-18ish fps.
Pentium 4 HT - 2200mhz, 2048 RAM, GeForce FX 6900.
19-21ish fps.
AMD Athlon64 3000+, 512 RAM, Radeon 800XT.
Around the same as above in fps.
Pentium 4 - 1400mhz, 256 RAM, GeForce 3 MX.
14ish fps.
---------------------------------------------------------------------------------
Now, don't you all think the difference between these systems is kinda slim?
I was expecting a lot more fps from the Athlon64 and the P4s with HT.
I've heard someone say (could have been Melv, but don't quote me on that) that this has something to do with Vertex Buffers and that it will work much better in TSE.
Is that true?
Is there any logical explanation behind why it works like it does?
To me it looks like the feature doesn't harness the power of the system.
#2
10/15/2004 (3:48 pm)
Culling OFF? Gash.
An average grass count of 1100 is nothing compared to the examples, and you could probably use the foliage replicator for that instead, since you aren't using culling.
Also, 140k texture size? Can you elaborate on that one please? :)
Note though: Sway and Light don't seem to affect performance in our case.
Bendik showed FxGrass off with 200 000 grass objects total at once, I believe.
That looks really great.
--------------------------------------
The point being that increasingly better systems do not seem to make it any better performance-wise.
#3
10/15/2004 (5:21 pm)
140k file size, PNGs 512x512, alpha channel, not images of grass.
What grass counts are you using?
Light doesn't affect things here, but high sway rates can cause
occasional chugging.
>The point being that increasingly better systems do not seem
>to make it any better performance-wise.
But is there a visual quality difference?
I've been impressed with TGE performance on some laptops... didn't
look quite as good but ran smoothly enough.
#4
10/15/2004 (8:24 pm)
Gave it a try. I put a 1M grass count on a 1024x1024 placement area on my low-end machines.
With cullres 16, viewclosest 1, viewdistance 70, I get 10 fps.
If I use viewdistance 1400, hehe, then I can go get some coffee because my machines stand still :)
Edit.
But when I use grass I often tweak my textures so I can't see the difference between the grass and the ground texture.
So even if I have viewdistance 50 on the fxGrass, it looks like it's grass all over.
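A rough back-of-the-envelope sketch of why the view distance matters so much in the numbers above. It assumes the blades are spread uniformly over the placement area and that everything within the view distance of the camera gets drawn (no frustum or cull-region logic, which the real fxGrass code may well apply). The function name is made up for illustration and is not part of Torque.

```python
import math

def estimate_visible_blades(total_blades, area_side, view_distance):
    """Rough upper bound on blades within view_distance of the camera.

    Assumes uniform density over a square area_side x area_side placement
    area and a circular visible region that cannot exceed that area.
    """
    area = area_side * area_side
    density = total_blades / area                      # blades per square unit
    visible_area = min(math.pi * view_distance ** 2, area)
    return density * visible_area
```

With 1M blades on a 1024x1024 area, a view distance of 70 comes out at something on the order of 15,000 blades in range, while a view distance of 1400 covers the whole area, so all 1M blades are candidates for drawing every frame.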
#5
10/16/2004 (6:08 am)
@Sam: FxGrass is not a standard feature with TGE.
I'm asking how the performance could be increased, but obviously I was misinformed about the vertex shaders.
#6
10/19/2004 (1:40 am)
Wouldn't it be possible to use vertex buffers in TGE already? The OGL function has been supported by hardware for some time.
Feeding fxGrass (or even fxReplicators) via VRAM could mean some speed improvement.
It is on my list of things to try out, but that is a long list :)
#7
10/19/2004 (4:52 pm)
I think the current one already uses vertex shaders.. but that there are problems with it. Anyway, seems like I have to dig into the code and look myself.
#8
10/20/2004 (6:03 am)
Stefan,
Please post your findings on this issue. I haven't had a chance to look at this, otherwise I would help. But I would love to see what you come up with!
Thanks,
Joe
#9
10/20/2004 (8:27 am)
I'm not sure what the fxGrass uses (I didn't do it) but the fxFoliageReplicator doesn't use vertex shaders. I mentioned some time ago that there are many improvements that could be made to the replicators, such as the use of vertex buffers and/or shaders to improve performance. The fxFoliageReplicator isn't a finished product in any sense but simply a stepping-stone to better things such as the fxGrass. It's all there and fairly easy to improve upon. :)
- Melv.
#10
12/15/2004 (12:18 pm)
I can't seem to get the fxGrass to work on my machine. It asks for an image filename but doesn't do anything else!! Is the grass replicator worth pursuing and getting to work? How different is it from the foliage replicator?
Does it perform better? Performance seems to be an issue for me at the moment.
Steve
#11
12/20/2004 (2:18 am)
Stevie, I had similar problems at one point with fxGrass. I "fixed" it but can't really remember how (IIRC, it just started to work), but it might have been something wrong in my scripts.
I did a major clean-up later, and one day when I decided to give it another try it worked.
fxGrass is a performance hit waiting to happen if you go overboard, and you will go overboard the first few times, trust me :)
Overall it is a nice addition though. I'm going to give it some more time when I get back to design mode.
#12
12/20/2004 (2:19 am)
Although I have read the info for fxGrass and fxFoliage, can you please explain why they are two separate things? What is the main difference between fxGrass and fxFoliage?
Cheers
#13
12/20/2004 (2:38 am)
As I understand it, fxGrass displays more objects and places them in a random "chaos" way, emulating the way grass actually appears.
fxFoliage spreads fewer objects in a plainer, random fashion. It is better suited for general foliage (plants, flowers etc.) that has an organic yet subtle pattern to its placement.
#14
12/20/2004 (1:59 pm)
Stefan, the results you posted would seem to indicate that the scenes you are rendering are GPU limited and not CPU limited. The speeds were pretty much going according to the quality of the video cards rather than the CPU speeds. This is to be expected as drawing a bunch of grass should not be too taxing on the CPU. I haven't looked at the code myself, but there's a bunch of pretty basic things that can be done to increase performance a lot. The first thing to look at would be the batch size of what is being drawn. Depending on how much is being drawn, all the grass should be done in 1-5 draw calls. Ideally they would all draw from the same vertex buffer, but if there's more than 65536 verts of grass, it should be broken into multiple buffers. Using the same texture for all the grass would improve the performance if it isn't already. For more variety you could put several grass textures into one atlas to save the hit you take from changing textures.
See nVIDIA's and/or ATI's developer sites for more info. You can do everything from optimizing your vertex order for maximum vertex cache performance to drawing all the grass to a smaller render buffer to reduce alpha overdraw.
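To make the batching and atlas suggestions above a bit more concrete, here is a minimal sketch of the bookkeeping only, not of any actual fxGrass or OpenGL code. It assumes each grass blade is a quad with 4 vertices and that the atlas is a square grid of equally sized tiles; both function names and the `verts_per_blade` parameter are hypothetical.

```python
MAX_VERTS_PER_BUFFER = 65536  # limit quoted in the post above

def split_into_buffers(blade_count, verts_per_blade=4):
    """Return the number of blades per buffer.

    No buffer exceeds MAX_VERTS_PER_BUFFER vertices and no blade is
    split across two buffers. verts_per_blade=4 assumes quad blades.
    """
    blades_per_buffer = MAX_VERTS_PER_BUFFER // verts_per_blade
    full, rest = divmod(blade_count, blades_per_buffer)
    return [blades_per_buffer] * full + ([rest] if rest else [])

def atlas_uv_rect(tile_index, tiles_per_row):
    """UV rectangle (u0, v0, u1, v1) of one tile in a square grid atlas."""
    size = 1.0 / tiles_per_row
    row, col = divmod(tile_index, tiles_per_row)
    return (col * size, row * size, (col + 1) * size, (row + 1) * size)
```

Under these assumptions, the 200 000 blades mentioned earlier in the thread would need `len(split_into_buffers(200_000))` buffers, which is more than the 1-5 draw calls suggested, so the visible set would have to be culled down first. With an atlas, each blade picks its texture through `atlas_uv_rect` instead of a texture switch, so the whole buffer can still be drawn with one texture bound.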
Torque Owner Sam3d
I'm using 4 grass replicators & 4 foliage replicators,
average grass count is 1100,
average texture size is 140k
sway on, light on.
P4 2800, 1GB RAM, Radeon 9800 Pro.
1024x768, 80hz, 1200 View Distance. Culling OFF
Full screen, with everything visible, never drops below 30 fps.