Optimizing the particle system
by Guilherme Santos · in Torque Game Engine Advanced · 02/10/2005 (3:46 am) · 5 replies
I was looking at the particle system, and I think it can be optimized with a few modifications.
In the method ParticleEmitter::copyToVB the vertex buffer is created and locked. Once it is locked, a pointer to the vertex data is returned and each vertex is updated in place with the particle data.
This procedure is slow!
I would like to suggest copying the particle data all at once using dMemcpy.
For example: filling a temporary variable with the particle data and then sending it to the vertex buffer results in an incredible gain in fps.
----file particleEmitter.cpp ----
GFXVertexPCT tempVert[4];
if( mDataBlock->orientParticles )
{
   for( U32 i=0; i
   {
      //setupOriented( partPtr, camPos, &verts[i*4] );
      setupOriented( partPtr, camPos, tempVert );
      dMemcpy(&verts[i*4], &tempVert, sizeof(GFXVertexPCT)*4);
   }
}
else
{
   // somewhat odd ordering so that texture coordinates match the oriented
   // particles
   Point3F basePoints[4];
   basePoints[0] = Point3F(-1.0, 0.0,  1.0);
   basePoints[1] = Point3F(-1.0, 0.0, -1.0);
   basePoints[2] = Point3F( 1.0, 0.0, -1.0);
   basePoints[3] = Point3F( 1.0, 0.0,  1.0);
   MatrixF camView = GFX->getWorldMatrix();
   camView.transpose(); // inverse - this gets the particles facing camera
   for( U32 i=0; i
   {
      //setupBillboard( partPtr, basePoints, camView, &verts[i*4] );
      setupBillboard( partPtr, basePoints, camView, tempVert );
      dMemcpy(&verts[i*4], &tempVert, sizeof(GFXVertexPCT)*4);
   }
}
---- ----
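For anyone who wants to try the idea outside the engine, here is a minimal standalone sketch of the same pattern: build each quad in an ordinary stack array, then copy it into the destination in one block. Everything in it is an assumption made for illustration: Vertex stands in for GFXVertexPCT, fillQuad stands in for setupOriented/setupBillboard, writeQuadsStaged is a made-up name, and std::memcpy takes the place of dMemcpy. One common explanation for the speedup is that a locked vertex buffer may live in memory that is slow to touch piecemeal, but that depends on the driver and has not been measured here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for GFXVertexPCT: position, packed color, texcoord.
struct Vertex {
    float x, y, z;
    unsigned int color;
    float u, v;
};

// Hypothetical stand-in for setupOriented/setupBillboard: writes the four
// corners of one quad centered at (cx, cy, cz) into out[0..3].
void fillQuad(float cx, float cy, float cz, float size, Vertex* out) {
    const float dx[4] = {-1.f, -1.f,  1.f, 1.f};
    const float dz[4] = { 1.f, -1.f, -1.f, 1.f};
    const float us[4] = { 0.f,  0.f,  1.f, 1.f};
    const float vs[4] = { 0.f,  1.f,  1.f, 0.f};
    for (int k = 0; k < 4; ++k) {
        out[k] = Vertex{cx + dx[k] * size, cy, cz + dz[k] * size,
                        0xFFFFFFFFu, us[k], vs[k]};
    }
}

// Builds each quad in a stack array, then copies it to dest in one block.
// centersXYZ holds x,y,z triples; dest must have room for 4 vertices each.
void writeQuadsStaged(const std::vector<float>& centersXYZ, float size,
                      Vertex* dest, std::size_t destCapacity) {
    assert(centersXYZ.size() % 3 == 0);
    const std::size_t count = centersXYZ.size() / 3;
    assert(count * 4 <= destCapacity);

    Vertex temp[4];  // scratch quad in ordinary memory
    for (std::size_t i = 0; i < count; ++i) {
        fillQuad(centersXYZ[i * 3], centersXYZ[i * 3 + 1],
                 centersXYZ[i * 3 + 2], size, temp);
        std::memcpy(&dest[i * 4], temp, sizeof(temp));  // one copy per quad
    }
}
```

In the engine, dest would be the pointer returned by locking the vertex buffer; here any array of Vertex with enough room will do.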
From here it's easy to reorganize the code to reflect this optimization.
I have also added particle animation using only one texture. The overhead is small.
What do you think of these suggestions?
I can submit it as a resource once I finish cleaning up the code :)
--- edit ---
Using 17 particle systems with ~35 particles each, I gain more than 20 fps in debug mode (running on a GeForce 6600 GT and a P4 3.4GHz).
#2
02/10/2005 (2:33 pm)
The particle system has only been converted over so far. By the end of TSE development it will be able to take advantage of shaders, so it's best to hold off until they are finished with it.
#3
02/10/2005 (2:53 pm)
Hey Guilherme, I've taken a good look at this and you're right, it's not accessing the vertex buffer optimally at all. I've updated the code with the following improvements:
- Write all the particles to a static vertex array, then copy them all to the vertex buffer at once at the end of the function.
- Removed most indexed access to data ([]).
- Unrolled the loop in setupBillboard
Thanks for pointing this out!
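For readers following along, the batching described in the list above might look roughly like the standalone sketch below. It is not the engine code: Vertex stands in for GFXVertexPCT, Particle, buildStaging and uploadOnce are hypothetical names, a caller-owned reusable std::vector plays the role of the static vertex array, and std::memcpy takes the place of dMemcpy. The quad corners are written out by hand through a moving pointer, which is one way to read "unrolled" and "no indexed access".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for GFXVertexPCT.
struct Vertex {
    float x, y, z;
    unsigned int color;
    float u, v;
};

// Hypothetical particle record: center position and half-size.
struct Particle {
    float x, y, z;
    float size;
};

// Fills staging with four vertices per particle and returns the vertex count.
// staging is meant to be reused across frames so it rarely reallocates.
std::size_t buildStaging(const std::vector<Particle>& parts,
                         std::vector<Vertex>& staging) {
    staging.resize(parts.size() * 4);
    Vertex* out = staging.data();  // walk a pointer instead of indexing
    for (const Particle& p : parts) {
        // Four corners written explicitly (no inner loop).
        *out++ = Vertex{p.x - p.size, p.y, p.z + p.size, 0xFFFFFFFFu, 0.f, 0.f};
        *out++ = Vertex{p.x - p.size, p.y, p.z - p.size, 0xFFFFFFFFu, 0.f, 1.f};
        *out++ = Vertex{p.x + p.size, p.y, p.z - p.size, 0xFFFFFFFFu, 1.f, 1.f};
        *out++ = Vertex{p.x + p.size, p.y, p.z + p.size, 0xFFFFFFFFu, 1.f, 0.f};
    }
    return staging.size();
}

// Copies the whole staging array into the (locked) destination in one call.
void uploadOnce(const std::vector<Vertex>& staging, Vertex* locked,
                std::size_t lockedCapacity) {
    assert(staging.size() <= lockedCapacity);
    if (!staging.empty()) {
        std::memcpy(locked, staging.data(), staging.size() * sizeof(Vertex));
    }
}
```

The point of the split is that the locked buffer is touched exactly once per frame, with a single sequential write, instead of once per particle.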
#4
02/11/2005 (3:16 pm)
@Brian: No problem ;)
Besides the particle animation, I added an optional shader to render the particles, but I am using the Effect architecture implemented by David Rodrigues.
I think I can modify it to read a ShaderData or CustomMaterial :)
#5
02/15/2005 (11:53 am)
Yeah, that's already on my list, gotta have shader driven particles!
Associate Tom Spilman
Sickhead Games