G-Buffer Normals and Trig Lookup Textures
| Name: | Pat Wilson | |
|---|---|---|
| Date Posted: | Aug 28, 2008 | |
| Rating: | 3.0 out of 5 | |
| Public: | YES | |
| Comments: | YES | |
Blog post
(Cross posted from personal blog. Hopefully someone has a rockin' idea about this.) I've switched to using spherical co-ordinates to encode world-space G-buffer normals. This has a lot of advantages to it, especially for 8:8:8:8 G-buffers. You can now store [Theta, Phi, DepthHi, DepthLo] in the 8:8:8:8 target. Bumping the format to 16:16:16:16 not only gives greater precision, but (depending on the depth resolution you need) you can get an extra channel for information storage. (Mmmm virtual texture bitfield?)
There are a few problems with this, though. The first is the atan2 function. On good hardware, this will not be an issue. My GeForce 8800 chews through any shader I throw at it. My Radeon x1300...not so much. Of course it's easy to make things run fast on good hardware. The challenge is making it run decently on lower end hardware. This is how I am encoding/decoding G-buffer normals:
inline float2 cartesianToSpGPU( in float3 normalizedVec )
{
   float atanYX = atan2( normalizedVec.y, normalizedVec.x );
   float2 ret = float2( atanYX / PI, normalizedVec.z );
   return POS_NEG_ENCODE( ret );
}

inline float3 spGPUToCartesian( in float2 spGPUAngles )
{
   float2 expSpGPUAngles = POS_NEG_DECODE( spGPUAngles );
   float2 scTheta;
   sincos( expSpGPUAngles.x * PI, scTheta.x, scTheta.y );
   float2 scPhi = float2( sqrt( 1.0 - expSpGPUAngles.y * expSpGPUAngles.y ), expSpGPUAngles.y );
   // Renormalization not needed
   return float3( scTheta.y * scPhi.x, scTheta.x * scPhi.x, scPhi.y );
}
Storing normal.z instead of acos( normal.z ) saves a decent chunk of encode/decode.
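To sanity-check the encode/decode pair off the GPU, here is a minimal C++ sketch of the same math. It is not the shader itself: `posNegEncode`/`posNegDecode` are my assumption of what `POS_NEG_ENCODE`/`POS_NEG_DECODE` do (map [-1, 1] to [0, 1] and back), and `quantize8` is a hypothetical helper that imitates writing a channel to an 8-bit target with round-to-nearest.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

const float kPi = 3.14159265358979f;

// Assumed behavior of the post's macros: [-1, 1] <-> [0, 1].
inline float posNegEncode(float v) { return v * 0.5f + 0.5f; }
inline float posNegDecode(float v) { return v * 2.0f - 1.0f; }

// CPU mirror of cartesianToSpGPU: store theta / PI and normal.z, both remapped to [0, 1].
inline Vec2 cartesianToSp(const Vec3& n)
{
    // The encoding only makes sense for (roughly) unit-length input.
    assert(std::fabs(n.x * n.x + n.y * n.y + n.z * n.z - 1.0f) < 1e-3f);
    float theta = std::atan2(n.y, n.x);
    return { posNegEncode(theta / kPi), posNegEncode(n.z) };
}

// CPU mirror of spGPUToCartesian: z is cos(phi), so sin(phi) comes from a sqrt, not a trig call.
inline Vec3 spToCartesian(const Vec2& enc)
{
    float theta  = posNegDecode(enc.x) * kPi;
    float cosPhi = posNegDecode(enc.y);
    float sinPhi = std::sqrt(std::fmax(0.0f, 1.0f - cosPhi * cosPhi));
    return { std::cos(theta) * sinPhi, std::sin(theta) * sinPhi, cosPhi };
}

// Hypothetical stand-in for storing a [0, 1] value in an 8-bit channel.
inline float quantize8(float v) { return std::floor(v * 255.0f + 0.5f) / 255.0f; }
```

Feeding a unit vector through `cartesianToSp` and back through `spToCartesian` should return it to within float precision; passing both encoded channels through `quantize8` first gives a rough feel for the error an 8:8:8:8 target introduces.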
So I decided to try to use a lookup texture instead of calling atan2 to encode the normals. I made a 256x256 A8 texture and filled it with atan2 values. The texture can be seen to the right. This is the code for generating the texture:
GFXTexHandle *RenderPrePassMgr::getAtan2Texture()
{
   if( mAtan2Handle.isNull() )
   {
      // Create a lookup texture to output a normalized atan2 result
      const U32 cLookupTexSz = 256;
      mAtan2Handle.set( cLookupTexSz, cLookupTexSz, GFXFormatA8, &GFXLookupTextureProfile, 1 );
      GFXLockedRect *atan2Mem = mAtan2Handle.lock();

      // Unsigned counters avoid a signed/unsigned comparison against cLookupTexSz
      for( U32 y = 0; y < cLookupTexSz; y++ )
      {
         for( U32 x = 0; x < cLookupTexSz; x++ )
         {
            // Map the texel index from [0, size) to [-1, 1)
            F32 xval = ( ( x / F32(cLookupTexSz) ) * 2.0f - 1.0f );
            F32 yval = ( ( y / F32(cLookupTexSz) ) * 2.0f - 1.0f );
            U8 &outU8 = atan2Mem->bits[y * cLookupTexSz + x];

            // Remap atan2 from [-PI, PI] to [0, 1], then to an 8-bit value
            F32 atanRes = ( atan2( yval, xval ) + M_PI_F ) / M_2PI_F;
            U8 u8Res = mFloor( atanRes * 255.0f );
            outU8 = u8Res;
         }
      }
      mAtan2Handle.unlock();
   }
   return &mAtan2Handle;
}
There is a possible discontinuity when V is near 0.5 and U < 0.5. So if normal.y is near 0.0 and normal.x < 0.0, then you get some artifacts that won't occur if you actually call atan2. The A8 format isn't the issue (I don't think), because even if you use the actual function, you are still encoding the result to an 8-bit value. The resolution of the texture could be an issue, but doubling the resolution did not affect the error rate in my tests.
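The location of that discontinuity lines up with atan2's branch cut along the negative x axis, where the angle wraps from +PI to -PI. The sketch below rebuilds the same table on the CPU with plain standard C++ (no Torque types; `buildAtan2Table` and `seamJump` are hypothetical names) and measures how far the stored value jumps between the two rows on either side of V = 0.5.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

const unsigned kLookupTexSz = 256;

// Same fill loop as the texture generation code, writing into a plain byte array.
inline std::vector<std::uint8_t> buildAtan2Table()
{
    const float kPi = 3.14159265358979f;
    std::vector<std::uint8_t> table(kLookupTexSz * kLookupTexSz);
    for (unsigned y = 0; y < kLookupTexSz; ++y)
    {
        for (unsigned x = 0; x < kLookupTexSz; ++x)
        {
            float xval = (x / float(kLookupTexSz)) * 2.0f - 1.0f;
            float yval = (y / float(kLookupTexSz)) * 2.0f - 1.0f;
            float atanRes = (std::atan2(yval, xval) + kPi) / (2.0f * kPi);
            table[y * kLookupTexSz + x] =
                static_cast<std::uint8_t>(std::floor(atanRes * 255.0f));
        }
    }
    return table;
}

// Difference between the row at yval == 0 and the row just below it, for column x.
inline int seamJump(const std::vector<std::uint8_t>& table, unsigned x)
{
    assert(x < kLookupTexSz);
    const unsigned mid = kLookupTexSz / 2;
    return int(table[mid * kLookupTexSz + x]) - int(table[(mid - 1) * kLookupTexSz + x]);
}
```

For columns on the left half (U < 0.5), `seamJump` spans the full 8-bit range, while on the right half neighbouring rows hold almost the same value. If the sampler blends between those two rows at all (an assumption; I have not checked how the lookup profile filters), the result lands somewhere in between and decodes to an unrelated angle, which a direct atan2 call can never produce.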
I haven't quite figured out what to do with this yet. A G-buffer shader is going to be heavy on math, light on texture operations (Well this depends on how you are doing deferred shading. See upcoming ShaderX7!) and doing the atan2 as a texture sample can save a lot of instructions and cycles, depending on the hardware.
| Novack (Aug 28, 2008 at 19:23 GMT) |
| Ross Pawley (Aug 28, 2008 at 21:02 GMT) |
| Pat Wilson (Aug 28, 2008 at 21:42 GMT) |
inline float2 cartesianToSpGPU( in float3 normalizedVec, in sampler2D atan2Sampler )
{
   float atanYXOut = tex2D( atan2Sampler, floor( POS_NEG_ENCODE(normalizedVec.xy ) * 255.0 ) / 255.0 ).a;
   float2 ret = float2( atanYXOut, POS_NEG_ENCODE( normalizedVec.z ) );
   return ret;
}
The critical bit is:
floor( value * 255.0 ) / 255.0
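For anyone who wants to see what that expression does to a coordinate, here is a small C++ sketch (the helper names are hypothetical). It snaps a [0, 1] value down to one of 256 evenly spaced levels, so every lookup coordinate sits on the same grid an 8-bit channel can represent. My reading, which is an assumption, is that this keeps the sample from landing at arbitrary positions between texels.

```cpp
#include <cassert>
#include <cmath>

// Index of the 8-bit level that a [0, 1] value falls into (0..255).
inline int byteLevel(float value)
{
    assert(value >= 0.0f && value <= 1.0f);
    return static_cast<int>(std::floor(value * 255.0f));
}

// C++ equivalent of the shader expression floor( value * 255.0 ) / 255.0.
inline float snapToByteGrid(float value)
{
    return static_cast<float>(byteLevel(value)) / 255.0f;
}
```

Two inputs that fall inside the same level, such as 0.4985 and 0.4999, snap to the same coordinate, while inputs on either side of a level boundary snap to neighbouring ones.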
| Pat Wilson (Aug 28, 2008 at 21:56 GMT) |
I'll take a proper GPU any day of the week over a lookup, but this is a pretty slick hack. This also lets you fold range-reduction logic into the results. I don't think I can do any kind of great range reduction on the result of atan2() because I am taking both the sine and cosine of the result.
| Orion Elenzil (Aug 29, 2008 at 01:24 GMT) |
Quote:
I made a 256x256 A8 texture and filled it with atan2 values.
awesome.
| Pat Wilson (Aug 29, 2008 at 20:45 GMT) |
It's not updated regularly, and it doesn't have any kind of consistent content.
| Novack (Aug 30, 2008 at 04:49 GMT) |
I thought you were angry with me, but was sure you just needed some time to rethink things.
Nice name election, btw.
| Hadoken (Sep 03, 2008 at 13:29 GMT) |
I'm interested in knowing more about this technique you're using, in particular what's the advantage of using spherical coords?
I mean, if you stored screen-space normal and did screen-space lighting instead of world-space, two cartesian coordinates would be enough to store a normal (assuming you're culling back-facing triangles), and that would save you all the encoding/decoding.
Apparently these days world-space deferred lighting is somewhat popular though, and I'd like to know why :)