Removing the 32 bit limit for packUpdate.
by Jerry Segler · 09/02/2005 (5:25 pm) · 13 comments
Download Code File
So you have finally hit the wall and found that the 32 bits provided by packUpdate are just not enough. You probably even looked into changing it, but discovered it's not an easy change to make. Well, this resource will let you break that 32 bit barrier and use as many bits as you want.
However, first a warning. If you go down this road you will be far off the beaten path of the standard code base. You will not be able to easily merge other people's resources, packs, etc. into your code base. If you don't feel like you could make such a wide-ranging change yourself, then you probably shouldn't be doing it.
My major hope is that GG will adopt this (or a similar system) as part of the base so that this artificial limitation can go away. Admittedly it does help enforce the real limitation of limited bandwidth in a networking situation.
Now on to describe the changes.
First I built a specialized BitSet class called PackBitSet. I made it look similar to the STL bitset class.
This class DOES NOT support conversion to/from a U32 bitmask. I highly recommend never allowing that to occur since it will only lead to bugs and unsupportable code down the road for you. It does support a range constructor (U32 bitStart,U32 bitEnd), a bit array constructor (U32 bitlen, U32 bits[]), and a composite constructor (U32 masklen, PackBitSet masks[], U32 bitlen, U32 bits[]). These support construction of constant masks. I also included a helper macro to auto size static arrays.
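As a rough illustration of what such a class might look like (this is not the code from the patch), here is a minimal sketch. PBS_MAX_U32 is the constant name mentioned later in the comments; its meaning as a word count, the inclusive range, the const on bits[], and all member names are assumptions modelled on std::bitset. The composite constructor and the array-sizing macro are left out.

```cpp
#include <cstdint>

typedef std::uint32_t U32;

// Assumed meaning: number of 32-bit words of storage (2 words = 64 bits).
static const U32 PBS_MAX_U32 = 2;

class PackBitSet
{
   U32 mWords[PBS_MAX_U32];

   void clearAll() { for (U32 i = 0; i < PBS_MAX_U32; i++) mWords[i] = 0; }

public:
   PackBitSet() { clearAll(); }

   // Range constructor: sets bits bitStart..bitEnd (assumed inclusive).
   PackBitSet(U32 bitStart, U32 bitEnd)
   {
      clearAll();
      for (U32 b = bitStart; b <= bitEnd && b < size(); b++)
         set(b);
   }

   // Bit array constructor: sets every bit id listed in bits[0..bitlen-1].
   PackBitSet(U32 bitlen, const U32 bits[])
   {
      clearAll();
      for (U32 i = 0; i < bitlen; i++)
         set(bits[i]);
   }

   static U32 size() { return PBS_MAX_U32 * 32; }

   // Out-of-range bit ids are silently ignored in this sketch.
   void set(U32 bitId)
   {
      if (bitId < size()) mWords[bitId >> 5] |= (U32(1) << (bitId & 31));
   }
   void reset(U32 bitId)
   {
      if (bitId < size()) mWords[bitId >> 5] &= ~(U32(1) << (bitId & 31));
   }
   bool test(U32 bitId) const
   {
      return bitId < size() &&
             (mWords[bitId >> 5] & (U32(1) << (bitId & 31))) != 0;
   }
   bool any() const
   {
      for (U32 i = 0; i < PBS_MAX_U32; i++)
         if (mWords[i]) return true;
      return false;
   }
   PackBitSet &operator|=(const PackBitSet &o)
   {
      for (U32 i = 0; i < PBS_MAX_U32; i++) mWords[i] |= o.mWords[i];
      return *this;
   }
};
```

Note that there is deliberately no constructor or conversion operator taking a plain U32 mask, in line with the recommendation above.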
The next thing I did was rename and revalue ALL of the constants that feed into the PackBitSet. I did this to help myself merge in outside resources: if something references InitialUpdateMask, I know it's old code that needs converting to the new way. I highly recommend doing the same, since the type of the constants isn't changing but the values are changing greatly (1 << bitID vs bitID). I appended _BITID to all the simple bit ids.
For those cases where a true mask was required (ThreadMask, SoundMask, ImageMask), I declared them as extern const PackBitSets and appended _BITMASK to them. The actual static const definition had to be put in the respective .cc file, though.
There are a lot of them and I'm not going to list them all here; examine the patch file for the details.
Next, the class members whose types/prototypes we will be changing:
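The renaming scheme can be sketched roughly as follows. This is only an illustration: std::bitset<64> stands in for PackBitSet, the bit positions are made up, and makeRangeMask is a hypothetical helper playing the role of the range constructor.

```cpp
#include <bitset>
#include <cstdint>

typedef std::uint32_t U32;
typedef std::bitset<64> PackBitSet;   // stand-in for the real class

// Old style: every constant is a mask value (1 << n).
enum OldMasks
{
   InitialUpdateMask = 1 << 0
};

// New style: every constant is a bit id (n). Positions are hypothetical.
enum NewBitIds
{
   InitialUpdateMask_BITID = 0,
   ThreadMaskN_BITID       = 4,   // first of several consecutive thread bits
   MaxScriptThreads        = 4
};

// Hypothetical helper: builds a mask with bits bitStart..bitEnd set.
inline PackBitSet makeRangeMask(U32 bitStart, U32 bitEnd)
{
   PackBitSet m;
   for (U32 b = bitStart; b <= bitEnd && b < m.size(); b++)
      m.set(b);
   return m;
}

// In the header: a true multi-bit mask is only declared.
extern const PackBitSet ThreadMask_BITMASK;

// In the .cc file: the one definition.
const PackBitSet ThreadMask_BITMASK =
   makeRangeMask(ThreadMaskN_BITID, ThreadMaskN_BITID + MaxScriptThreads - 1);
```

Because InitialUpdateMask and InitialUpdateMask_BITID are both plain integers, the compiler cannot tell them apart; the suffix is what makes a stale reference stand out during a merge.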
netObject:
U32 mDirtyMaskBits turns into
PackBitSet mDirtyMaskBits_PBS
setMaskBits(U32) turns into 2 functions
void setMaskBits_BITMASK(const PackBitSet &orMask);
void setMaskBit_BITID(const U32 orMask);
clearMaskBits(U32) turns into 2 functions
void clearMaskBits_BITMASK(const PackBitSet &orMask);
void clearMaskBit_BITID(const U32 orMask);
For the set/clear we typically only operate on 1 bit. For those cases that operate on 2, I just call it twice.
If they operate on 3 or more I created a static mask.
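A simplified sketch of how the split set/clear functions might behave is shown below. It assumes std::bitset<64> as a stand-in for PackBitSet, ignores the dirty-list and per-ghost bookkeeping the real netObject does, and uses made-up bit ids and a hypothetical makeMask helper for the static mask.

```cpp
#include <bitset>
#include <cstdint>

typedef std::uint32_t U32;
typedef std::bitset<64> PackBitSet;   // stand-in for the real class

// Hypothetical bit ids for illustration only.
enum { PositionMask_BITID = 1, DamageMask_BITID = 2, ExtraMask_BITID = 40 };

// Hypothetical helper: builds a mask from a list of bit ids.
inline PackBitSet makeMask(const U32 *ids, U32 count)
{
   PackBitSet m;
   for (U32 i = 0; i < count; i++)
      m.set(ids[i]);
   return m;
}

// Three or more bits: build one static mask instead of three calls.
static const U32 sResetIds[] = { PositionMask_BITID, DamageMask_BITID, ExtraMask_BITID };
static const PackBitSet ResetMask_BITMASK = makeMask(sResetIds, 3);

struct NetObjectSketch
{
   PackBitSet mDirtyMaskBits_PBS;

   void setMaskBits_BITMASK(const PackBitSet &orMask) { mDirtyMaskBits_PBS |= orMask; }
   void setMaskBit_BITID(const U32 bitId)             { mDirtyMaskBits_PBS.set(bitId); }
   void clearMaskBits_BITMASK(const PackBitSet &mask) { mDirtyMaskBits_PBS &= ~mask; }
   void clearMaskBit_BITID(const U32 bitId)           { mDirtyMaskBits_PBS.reset(bitId); }
};
```

Setting two bits is then just two setMaskBit_BITID calls, while a larger group goes through setMaskBits_BITMASK(ResetMask_BITMASK).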
virtual F32 getUpdatePriority(CameraScopeQuery *focusObject, U32 updateMask, S32 updateSkips) turns into
virtual F32 getUpdatePriority(CameraScopeQuery *focusObject, const PackBitSet &updateMask, S32 updateSkips);
virtual U32 packUpdate(NetConnection *conn, U32 mask, BitStream *stream) turns into
virtual void packUpdate(NetConnection *conn, const PackBitSet &mask, PackBitSet &retMask, BitStream *stream);
You will notice that instead of returning a PackBitSet, packUpdate carries its retMask along as a reference.
It also takes the incoming mask as a const reference. In theory that shouldn't be a problem; however, a few
places do clear mask bits. That is a network optimization local to the function, so I just made a local copy of
the const mask in those situations.
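To make the new calling convention concrete, here is a small sketch of a packUpdate written against the new signature. Everything except the signature shape is hypothetical: std::bitset<64> stands in for PackBitSet, BitStream and NetConnection are stubs, and the bit ids and the "damage not ready" condition are invented for the example.

```cpp
#include <bitset>
#include <vector>

typedef std::bitset<64> PackBitSet;   // stand-in for the real class

struct NetConnection {};              // stub

struct BitStream                      // stub that just records flags
{
   std::vector<bool> bits;
   bool writeFlag(bool f) { bits.push_back(f); return f; }
};

// Hypothetical bit ids.
enum { InitialUpdateMask_BITID = 0, PositionMask_BITID = 1, DamageMask_BITID = 2 };

struct ExampleObject
{
   bool mDamageReady;   // hypothetical state
   explicit ExampleObject(bool damageReady) : mDamageReady(damageReady) {}

   void packUpdate(NetConnection *conn, const PackBitSet &mask,
                   PackBitSet &retMask, BitStream *stream)
   {
      (void)conn;

      // mask is const, so local tweaks go on a copy.
      PackBitSet localMask = mask;

      // Local optimization: the initial block already carries the position,
      // so drop the position bit for the rest of this function.
      if (stream->writeFlag(localMask.test(InitialUpdateMask_BITID)))
         localMask.reset(PositionMask_BITID);

      stream->writeFlag(localMask.test(PositionMask_BITID));

      // Anything that could not be sent goes back to the caller via retMask
      // instead of a return value.
      bool wantDamage = localMask.test(DamageMask_BITID);
      if (wantDamage && !mDamageReady)
         retMask.set(DamageMask_BITID);
      stream->writeFlag(wantDamage && mDamageReady);
   }
};
```

The caller's mask is untouched after the call; only retMask reports what still needs to go out.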
netConnection:
U32 mask turns into
PackBitSet mask_PBS
U32 updateMask turns into
PackBitSet updateMask_PBS;
lightUpdateGrouper.h
U32 LightUpdateGrouper::BitIterator::getMask() turns into
U32 LightUpdateGrouper::BitIterator::getMask_BITID()
U32 LightUpdateGrouper::getKeyMask(const U32 key) const turns into
U32 LightUpdateGrouper::getKeyMask_BITID(const U32 key) const
LightUpdateGrouper was returning a bitmask; now it returns a bit id. Again, the name change will draw attention to any code that expects the old behavior.
The next step is to search all the .h files for subclasses implementing these methods and fix the declarations.
Then you have to go through all the .cc files and correct the code to work under the new semantics. mask & InitialMask turns into mask.test(InitialMask_BITID), etc.
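A before/after of that conversion might look like this. It is only a sketch: std::bitset<64> stands in for PackBitSet, and ExtraMask_BITID is a made-up id chosen to sit above bit 31.

```cpp
#include <bitset>
#include <cstdint>

typedef std::uint32_t U32;
typedef std::bitset<64> PackBitSet;   // stand-in for the real class

enum { InitialMask = 1 << 0 };                             // old: mask value
enum { InitialMask_BITID = 0, ExtraMask_BITID = 40 };      // new: bit ids

// Old semantics: AND the U32 mask with a (1 << n) constant.
inline bool oldIsInitial(U32 mask)
{
   return (mask & InitialMask) != 0;
}

// New semantics: test a bit id on the PackBitSet.
inline bool newIsInitial(const PackBitSet &mask)
{
   return mask.test(InitialMask_BITID);
}

// Only expressible in the new scheme: a bit beyond the old 32 bit range.
inline bool newHasExtra(const PackBitSet &mask)
{
   return mask.test(ExtraMask_BITID);
}
```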
When you're all done you should have a working Torque that no longer has the 32 bit limitation. I set it up for a 64 bit limit for now, but it's just a simple constant in PackBitSet.h, so you can expand it if you decide you really need all those extra bits.
As a side effect of the switch from mask to bitid the pathcamera.h mask bug gets fixed.
I did find and fix a subtle potential bug when calling clearMaskBits. If you tried to clear multiple bits, not all of which had been set, and the clear emptied out the updateMask, then walk->connection->ghostPushToZero(walk) would not get called. E.g. clearMaskBits(3) when updateMask=1 would not do the push.
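The failure mode can be reproduced in isolation with a small sketch. It assumes the old check only pushed when updateMask was exactly equal to the mask being cleared, which is one way to get the behavior described; GhostRef and the pushes counter are hypothetical stand-ins for the ghost record and the ghostPushToZero call.

```cpp
#include <cstdint>

typedef std::uint32_t U32;

struct GhostRef
{
   U32 updateMask;
   int pushes;      // counts stand-in "ghostPushToZero" calls
};

// Assumed old logic: push only on an exact match with the cleared mask.
inline void clearMaskBitsOld(GhostRef &g, U32 mask)
{
   if (g.updateMask && g.updateMask == mask)
   {
      g.updateMask = 0;
      g.pushes++;
   }
   else
      g.updateMask &= ~mask;
}

// Fixed logic: push whenever a non-empty mask becomes empty.
inline void clearMaskBitsFixed(GhostRef &g, U32 mask)
{
   if (!g.updateMask)
      return;
   g.updateMask &= ~mask;
   if (!g.updateMask)
      g.pushes++;
}
```

With updateMask=1 and a clear of 3, the old version ends with an empty mask and no push, while the fixed version pushes once.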
To apply the patch file go to your code directory and execute:
patch -p1 -u
It doesn't update any makefiles, etc so you will have to add engine/core/PackBitSet.cc manually for your build environment.
This should work for any platform, but I have only tested it on Windows.
#2
09/03/2005 (4:47 pm)
Interesting, indeed.
What about wrapping the new and old code in #ifdef macros, so people can use 32 bits or the provided class by recompiling? It would be more reliable :)
Just my 2 cents...
Well done!
#3
09/03/2005 (8:44 pm)
The code doesn't change the over-the-wire structure at all. The exact same bytes will go out over the wire as in the base code. The networking code never writes the update mask out directly.
I was quite concerned about performance when I built this, which is why it doesn't use STL; it just looks similar to it.
All of the functions are inlined, and if you wanted to you could set PBS_MAX_U32 down to 1. Since everything is inlined the on-stack footprint would be 8 bytes instead of 4. The operations do expand out to a few more instructions, but it is not a noticeable impact. The networking code is tight because it doesn't write out more than it needs to, and this doesn't change that.
As for an #ifdef version, that was actually the first way I tried, and I aborted it.
The major problem is that the data type of the masks (1 << n) and the bits (n) is the same, so it is very hard to debug and get right.
Additionally, some of the method signatures needed to change as well: passing things around by reference instead of returning them, etc.
It can be done with #ifdef, but it is a real pain to maintain and get right. If you forget to #ifdef a constant you may or may not notice.
-Jerry
#4
09/06/2005 (11:33 pm)
I wasn't really concerned about changes over the wire, but about the performance of packing and unpacking the bitstream itself. That code gets called extremely often (for every client, every object scoped to that client gets its pack called every 200 milliseconds, and it doesn't cache, so if an object is scoped to 50 clients, its pack gets called 50 times in that one networking pulse), so its operation needs to be as fast as possible.
#5
11/16/2006 (5:33 pm)
anyone got any performance stats on this bad boy yet ?
#6
12/04/2009 (10:33 am)
The Download is gone. Any chance of reposting it? Wanted to look over what you did :)
Cheers.
#7
12/05/2009 (11:39 am)
I still had the original diff around:
http://rapidshare.com/files/316695612/8576.bitset.diff
#8
12/09/2009 (4:27 am)
Thanks, got it. I'm a bit off the beaten path already, what with ballistic emitters on shapes (my own version) and a completely redone water coverage system... :)
#9
01/09/2012 (3:01 am)
Amazing. I just merged this code into T3D 1.2 :)
Took me all Sunday day/night (no sleep).
It works but has some problems still, mainly with MountedImages.
I need to work out that section in shapebase better ..
But at least I can debug it now. And woot, 64 bit Mask :)
#10
01/21/2012 (11:00 pm)
John:
Was this resource still just a merge, or was there enough code changed that it took a partial rewrite? I believe this is a resource that we will be rather interested in using.
Paul
#11
01/22/2012 (4:29 am)
Mostly just a merge, but I haven't done a full test (dedicated server) yet.
The Mask system has not really changed much in 9 years.
#12
01/22/2012 (7:16 pm)
I'd love to know the results of this when you finally get to test it.
Torque 3D Owner Stephen Zepp
One of the reasons that Torque Networking is so fast is the fact that whenever possible we use the most optimized method for packing and unpacking updates. My big concern here would be that you may have gone from code that compiles into extremely fast assembly (bit checking) to code with additional overhead in accessing the abstracted bitSet implementation.
The networking code is called an amazingly large number of times in a multi-player game, so this needs to be really, really fast!