When is an object's Transform transmitted ?
by Orion Elenzil · in Torque Game Engine · 02/27/2006 (9:44 am) · 19 replies
Hi Folks.
Specifically i'm wondering about stationary bots.
(Not stationery bots, those hardworking but mindless slaves who bring us all the wonderful hallmark products)
If an AIPlayer's position and rotation are not changing on the server,
are those values still being sent to clients ?
Seems the answer should be no,
but i'm not sure and hoped someone might have a quick definitive answer.
We've got several stationary bots on the server who just stand there playing an animation loop and every ten seconds or so switch to a different loop. This sounds like pretty low network overhead to me, is that true ?
tia,
Orion
#2
02/27/2006 (10:36 am)
[arg stupid accidentally hitting the back button and losing my post!]
thanks for the quick reply Badguy.
the moveMask mechanism is still in there,
but i think moveMask is set way too often.
i put a printf() in player.cc::if (stream->writeFlag(mask & MoveMask)) ,
and it's sending the transform info more or less continuously.
which seems bad. it's 13 bytes per bot.
i'll check stock TGE1.3 and then start hunting thru my own code.
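For readers who haven't looked at packUpdate, the flag-then-payload pattern being instrumented here can be sketched roughly as below. This is a toy model, not TGE code: ToyBitStream, packUpdateSketch and kMoveMask are hypothetical names, and the three raw 32-bit words stand in for whatever compressed position/rotation the real Player writes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit stream; only what the sketch needs.
struct ToyBitStream {
    std::vector<bool> bits;

    // Writes one bit and returns it, so it can be used directly in an if().
    bool writeFlag(bool f) { bits.push_back(f); return f; }

    // Writes the low `count` bits of `value`, least significant bit first.
    void writeBits(std::uint32_t value, int count) {
        assert(count >= 0 && count <= 32);
        for (int i = 0; i < count; ++i)
            bits.push_back(((value >> i) & 1u) != 0);
    }
};

constexpr std::uint32_t kMoveMask = 1u << 0;  // assumed bit position, illustration only

// If the move bit is dirty, one flag bit plus the payload is written;
// otherwise the whole transform costs a single bit.
inline void packUpdateSketch(ToyBitStream& s, std::uint32_t mask,
                             const std::uint32_t pos[3]) {
    if (s.writeFlag((mask & kMoveMask) != 0)) {
        for (int i = 0; i < 3; ++i)
            s.writeBits(pos[i], 32);
    }
}
```

The point of the pattern is that a clean mask costs one bit per update, so the question in this thread is really how often the mask is dirty.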
#3
02/27/2006 (11:11 am)
Wow - this is totally happening in stock TGE1.3.
uh... that seems really bad.
to repro:
use the stock mission without Kork.
1. in player.cc, in packUpdate,
right inside the "if (stream->writeFlag(mask & MoveMask))",
just add a con::printf().
2. run a stand-alone server.
3. run a client, connect.
-> note you won't see the printf.
this is because there's only one player in the game,
and it's not going to ghost to itself.
4. run another client, connect.
-> wham! constant printfs() inside if(MoveMask).
that seems wildly unexpected..
consider 30 players mutually in scope.
each client is now getting 29 * 13 bytes = 377 bytes per frame !
i must be mistaken here,
can someone verify/clarify ?
#4
02/27/2006 (11:33 am)
Correct me if i'm wrong,
but it looks to me like these lines get executed every time Player::updatePos() is called,
which i believe is every tick, yes ?
stock TGE1.3 player.cc, ~line 2691
    setPosition(start,mRot);
    setMaskBits(MoveMask);
    updateContainer();
#5
02/27/2006 (11:37 am)
Further talking w/ someone who understands more about UDP packets than i do, which isn't saying much, has clued me in that in the 30 players case, the 377 bytes may not be such a big deal.
but i'm now adding a function like Player::updateMoveMaskOnlyIfReallyMoved()..
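A minimal sketch of what such a "really moved" check might look like, assuming a simple distance threshold against the last position that was flagged. Vec3, BotState and kMoveMask are hypothetical stand-ins rather than the engine's Point3F/Player types, and rotation is ignored to keep it short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; not the real TGE classes.
struct Vec3 { float x, y, z; };

constexpr std::uint32_t kMoveMask = 1u << 0;  // assumed bit, illustration only

struct BotState {
    Vec3 lastFlaggedPos{0.f, 0.f, 0.f};  // position when the move bit was last set
    std::uint32_t dirtyMask = 0;         // bits waiting to be sent
};

inline float distSquared(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Sets the move bit only when newPos is farther than `threshold` from the
// last flagged position. Returns true if the bit was set.
inline bool updateMoveMaskOnlyIfReallyMoved(BotState& s, const Vec3& newPos,
                                            float threshold) {
    assert(threshold >= 0.f);
    if (distSquared(newPos, s.lastFlaggedPos) <= threshold * threshold)
        return false;                    // treated as stationary, nothing to send
    s.lastFlaggedPos = newPos;
    s.dirtyMask |= kMoveMask;
    return true;
}
```

Comparing against the last flagged position (rather than the previous tick's position) means slow drift still accumulates and eventually triggers an update.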
#6
02/27/2006 (12:28 pm)
Orion,
One possible thing to watch for when sending no position updates at all is when the objects go out of scope. They then tend to remain invisible when they should come back into scope as there is no position update information.
I had this happen on a physics simulation I was messing with when I was trying to cut down the packet flow when the objects were at rest. It may be better to say count ticks and just send an update every 10th tick or so when no movement is taking place.
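The "count ticks and send every Nth tick" idea could be sketched like this. IdleHeartbeat and kMoveMask are made-up names for illustration; in an engine the returned bits would be passed to something like setMaskBits().

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kMoveMask = 1u << 0;  // assumed bit, illustration only

// Forces a position update every `interval` ticks while an object is at rest.
struct IdleHeartbeat {
    int interval;       // ticks between forced updates while idle
    int idleTicks = 0;  // ticks since the last update was flagged

    explicit IdleHeartbeat(int n) : interval(n) { assert(n > 0); }

    // Call once per server tick; returns the bits to mark dirty this tick.
    std::uint32_t tick(bool moved) {
        if (moved) {                     // real movement always sends
            idleTicks = 0;
            return kMoveMask;
        }
        if (++idleTicks >= interval) {   // idle long enough, send a refresher
            idleTicks = 0;
            return kMoveMask;
        }
        return 0u;
    }
};
```

With an interval of 10, an object at rest is flagged on every 10th tick instead of every tick, which keeps a trickle of corrections flowing without the constant traffic.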
#7
02/27/2006 (1:45 pm)
David,
thanks for thinking about this.
I think this is a safe change w/r/t scoping.
when an object comes back into scope,
MoveMask should be true and the new position will be resent, i think.
I made a resource for this,
it could probably use some criticism before being approved:
www.garagegames.com/index.php?sec=mg&mod=resource&page=view&qid=9915
edit: malformed url.
#8
02/27/2006 (4:28 pm)
So thinking more,
here's one for a torque scoping master.
suppose we have clientA and clientB.
let's say they start mutually in scope, both stationary.
* clientA moves so that clientB is out of scope.
* while out of scope, clientB moves.
* with my change, this will cause a serverside clientB.setMaskBits(MoveMask),
but that's not ghosted to clientA because clientB is out of scope.
* clientB stops moving.
* clientA moves again so that clientB is back in scope.
my question:
will clientA get the new position of clientB ?
i hope the answer is:
"Yes, because setMaskBits sets the mask for all clients,
the bits are only cleared when the info has been ghosted to the client."
can anyone verify ?
tia,
Orion
#9
02/27/2006 (6:44 pm)
Hey guys,
This is a potentially valuable fix. I'd like to go a little more in-depth on it before putting it into HEAD, though. First off - Torque networking runs at a fixed bandwidth, so what flags are set isn't all that important. What IS important is, what updates are going when? Are important things being starved of updates?
Since updates are sent by priority in the available space, making sure the priority values are important is also a big win.
Basically, what I'd do if I was working on this would be to add printfs to the pack and unpack that trigger when that flag is sent/received. That way I could run a dedicated server and a client and see exactly what's happening with the networking by watching their respective consoles. I might also add some debug code (or check if DEBUG_NET already does this) and see what OTHER updates are getting sent in packets and how big they are.
Since player physics are a little finicky (insofar as a small difference in position can trigger a big change in the prediction, imagine a player taking an extra step towards a cliff due to a precision issue!) having extra updates in-flight is probably not a bad idea in many situations.
Good work so far, I hope we can take the extra steps to quantify how much of a win this is, and what its implications are!
Ben
#10
02/27/2006 (7:19 pm)
Orion,
clientA will get the update as soon as the Server decides clientB is back in clientA's frustum.
I would have to check to see what exactly triggered server to do a client scope query.
#11
02/27/2006 (7:21 pm)
But it's not entirely frustum based - objects not in frustum get a lower priority but are still scoped IIRC.
#12
02/27/2006 (7:47 pm)
Ahh, Ok.
my bad.
I assumed that to be honest, coming from onCameraScopeQuery and stuff :)
#13
02/27/2006 (9:28 pm)
Greetings!
Ben's idea of placing printf's is a good one. This stuff can be quite messy. I'm not even sure that I have the complete picture yet but here's what I understand:
Ghosting originates from NetConnection::ghostWritePacket(). Within each NetConnection is a list of ghosts, and this list is split into ghosts that have an update this tick and those that don't. The separator between these two groups appears to be mGhostZeroUpdateIndex.
For ghosts that don't have an update this tick, nothing is done with them, including any change in scoping. This potentially means that an object is out of scope by whatever rules are set up (ie: frustum checks) but is still being ghosted. However as nothing has changed, there isn't any special information the client has gained by it hanging around (it just continues to add to the total number of ghosts, which is limited).
For objects that do have an update this tick, they are cleared from scope for the connection prior to call to mScopeObject->onCameraScopeQuery(). If a scoped object exists for this connection, its onCameraScopeQuery() is used to determine which objects are in scope. For a ShapeBase scoped (control) object, this attempts to go through the scene manager to determine which objects should be in scope within the scene. When one is found, the connection's objectInScope() is called for it.
NetConnection's objectInScope() first checks to see if the given object has already been ghosted. If so, then the object is again made InScope and we're done. The information that is sent to the client will be based on the object's current update mask.
If the object is not being ghosted (it never was, or it used to be but is no longer) then it is added to the connection's ghost list and given an update mask of 0xFFFFFFFF. This is important as this update mask will be sent to the object's packUpdate(), which should cause the object to send all of its information. This is why there are checks like (mask & InitialUpdateMask) within Player::packUpdate() for example.
There's also a bunch of stuff to prioritise the order of ghost data to be sent and the handling of a full bit stream. But you can trace that within ghostWritePacket() if you're interested.
So in the end, I believe that the answer to Orion's question is that all should be fine and ClientA will indeed get the new position of ClientB.
But that's my theory based on the code in front of me (TGE 1.4). If someone wants to do an actual trace as Ben suggests, I'm sure we'd all benefit! :o)
- LightWave Dave
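To make the two paths in the description above concrete, here is a deliberately simplified model of per-connection ghost bookkeeping. It is not TGE code: Connection, GhostRef, leaveScope, dropGhost and takePendingMask are hypothetical, and the real NetConnection does considerably more (priorities, packet limits, ghost ids).

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

constexpr std::uint32_t kAllBits = 0xFFFFFFFFu;  // "send everything" mask

struct GhostRef {
    std::uint32_t updateMask = 0;  // bits not yet sent to this client
    bool inScope = false;
};

// Toy per-client connection, keyed by a hypothetical integer object id.
struct Connection {
    std::unordered_map<int, GhostRef> ghosts;

    // Scope query reported the object: re-scope it, or start ghosting it.
    void objectInScope(int objId) {
        auto it = ghosts.find(objId);
        if (it != ghosts.end()) {      // already ghosted: keep its pending mask
            it->second.inScope = true;
            return;
        }
        GhostRef g;                    // new ghost: gets a full initial update
        g.updateMask = kAllBits;
        g.inScope = true;
        ghosts[objId] = g;
    }

    void leaveScope(int objId) {       // still ghosted, just not in scope
        auto it = ghosts.find(objId);
        if (it != ghosts.end()) it->second.inScope = false;
    }

    void dropGhost(int objId) { ghosts.erase(objId); }

    // Server-side change: remember the dirty bits for this client.
    void setMaskBits(int objId, std::uint32_t bits) {
        assert(bits != 0u);
        auto it = ghosts.find(objId);
        if (it != ghosts.end()) it->second.updateMask |= bits;
    }

    // Mask that would be handed to packUpdate(); cleared only when sent.
    std::uint32_t takePendingMask(int objId) {
        auto it = ghosts.find(objId);
        if (it == ghosts.end() || !it->second.inScope) return 0u;
        const std::uint32_t m = it->second.updateMask;
        it->second.updateMask = 0u;
        return m;
    }
};
```

In this model clientA gets clientB's new position either way: if the ghost survived, the move bit set while out of scope is still pending; if the ghost was dropped, it comes back with the full mask.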
#14
02/27/2006 (10:12 pm)
David's description is right on and in fact explained something to me that I didn't quite get: an object can go "semi-out of scope" yet not get removed. I always thought (and I think at some threshold it still happens) that once out of scope, the object is basically deleted on the client once confirmed really out of scope.
Regarding the "13 bytes not being that big a deal"--Ben alluded to this, but just to reinforce the central issue here:
What will happen is that those objects, if they have a relatively high updatePriority (based on view frustum, importance to the target--things that can kill you are important!, etc.), will consistently fill up your packet, causing other ghosted objects with lower priority to get skipped. Since your default packet size is pretty low (200 bytes I think is stock), if this is in fact happening like it seems you've proven, it would cause a cascade effect: low priority objects that are still in scope get far fewer updates transmitted over time (every couple of packets, since packets get full quickly), which, if enough of them exist, would cause a general trend of lots of interpolation across few packets, and if it gets bad enough, warping.
I happen to know that in your case Orion, this is a pretty critical catch, and in your shoes I would absolutely spend the time it takes to confirm.
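The starvation effect can be illustrated with a toy packet filler. This is only a sketch under stated assumptions: PendingUpdate and fillPacket are invented names, the 200-byte budget and 13-byte update are the figures quoted in this thread, and the real ghostWritePacket() works in bits, carries other data, and may handle a full stream differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One object's pending update for a given client (hypothetical type).
struct PendingUpdate {
    int objId;
    float priority;     // higher goes first
    std::size_t bytes;  // size of the packed update
};

// Packs updates in priority order until the budget is exhausted and returns
// the ids that made it in; everything else waits for a later packet.
inline std::vector<int> fillPacket(std::vector<PendingUpdate> pending,
                                   std::size_t budgetBytes) {
    assert(budgetBytes > 0);
    std::sort(pending.begin(), pending.end(),
              [](const PendingUpdate& a, const PendingUpdate& b) {
                  return a.priority > b.priority;
              });
    std::vector<int> sent;
    std::size_t used = 0;
    for (const PendingUpdate& u : pending) {
        if (used + u.bytes > budgetBytes) break;  // packet is full
        used += u.bytes;
        sent.push_back(u.objId);
    }
    return sent;
}
```

With 29 pending 13-byte updates and a 200-byte budget, only 15 fit (15 * 13 = 195 bytes); the other 14 are skipped in that packet, and if the high priority objects are dirty again next tick the low priority ones keep losing.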
#15
02/28/2006 (10:40 am)
Along the same lines, when the position is updated, it calls SceneObject::setTransform, which will remove the object from whatever zone it's currently in and reinsert it into the zone manager.
    void SceneObject::setTransform(const MatrixF& mat)
    {
       mObjToWorld = mWorldToObj = mat;
       mWorldToObj.affineInverse();
       AssertFatal(mObjBox.isValidBox(), "Bad object box!");
       resetWorldBox();
       if (mSceneManager != NULL && mNumCurrZones != 0) {
          mSceneManager->zoneRemove(this);
          mSceneManager->zoneInsert(this);
          if (getContainer())
             getContainer()->checkBins(this);
       }
       if (isClientObject())
          mLightingInfo.mDirty = true;
       setRenderTransform(mat);
    }
I vtuned with the same bots orion is testing and it showed scanZoneNew taking about 5% of the app time. This is something that's triggered from the zoneInsert above.
I added an operator== to MatrixF
    bool operator==(const MatrixF &mat) const
    {
       // exact element-by-element compare of all 16 floats, no tolerance
       for (U32 i = 0; i < 16; i++)
       {
          if (mat.m[i] != m[i])
             return false;
       }
       return true;
    }
and then modified the setTransform function to do a sanity check:
    void SceneObject::setTransform(const MatrixF& mat)
    {
       // sanity check: have we really changed transforms?
       // if we haven't then we can skip the rezoning work
       const bool sameAsLast = (mObjToWorld == mat);
       mObjToWorld = mWorldToObj = mat;
       mWorldToObj.affineInverse();
       AssertFatal(mObjBox.isValidBox(), "Bad object box!");
       resetWorldBox();
       if (!sameAsLast && mSceneManager != NULL && mNumCurrZones != 0) {
          mSceneManager->zoneRemove(this);
          mSceneManager->zoneInsert(this);
          if (getContainer())
             getContainer()->checkBins(this);
       }
       if (isClientObject())
          mLightingInfo.mDirty = true;
       setRenderTransform(mat);
    }
Looking over the code this seemed ok to me, since when an object is initially added to the scene it is inserted into the sceneManager, then whenever it really moves it will be rezoned. But most of this section of the code is new to me.
Any reason you guys know that this would be a bad thing to do?
thanks
#16
02/28/2006 (11:10 am)
Clint,
I'm curious as to the performance increase you saw with your changes. How much did your 5% processing time drop by?
My only comment on your code is based on your check of if(mObjToWorld == mat), is it even necessary to process any code within SceneObject::setTransform()? If no part of the object's transform has changed, then you wouldn't need to reassign the world matrices nor rebuild the inverse transform, world box, lighting or render transform.
Perhaps what you were after was a check for just the object's change in position rather than the whole transform. In that case you could rotate the object and it wouldn't cause a rezone. Even in this case, something to potentially watch out for is a very large object that may exist in more than one zone. In that case a rotation could cause a rezone.
This isn't a code path that I've personally been down, but perhaps my gut feelings will still help out. :o)
- LightWave Dave
#17
02/28/2006 (11:48 am)
Quote:I'm curious as to the performance increase you saw with your changes. How much did your 5% processing time drop by?
in my test situation it dropped by 100%, completely off the radar now. Note that my test situation is unique: like orion said, we have a bunch of bots that aren't changing their position. I load up a single player and profile while the player isn't moving. So nothing should be moving or need to rezone, and with the change it doesn't.
In real gameplay when we have lots of players standing around doing stuff but not actually moving, they won't get rezoned.
Quote:My only comment on your code is based on your check of if(mObjToWorld == mat), is it even necessary to process any code within SceneObject::setTransform()? If no part of the object's transform has changed, then you wouldn't need to reassign the world matrices nor rebuild the inverse transform, world box, lighting or render transform.
That was what I tried at first, just an early exit from the function. But that caused some problems with the renderTransform I believe. Something must not be getting set initially and I didn't track it down further. That other code isn't too awfully hefty, but yes, we should be able to bypass everything if we track down the problem.
Quote:
Perhaps what you were after was a check for just the object's change in position rather than the whole transform. In that case you could rotate the object and it wouldn't cause a rezone. Even in this case, something to potentially watch out for is a very large object that may exist in more than one zone. In that case a rotation could cause a rezone.
hey that's a great improvement. but yes rezoning on rotation, I think I'll leave it the way it is now since all those float compares shouldn't be too big a deal. At least for us I think we have bigger fish to fry :)
thanks for the feedback
#18
02/28/2006 (5:54 pm)
Hi All -
many, many thanks to all who've posted here. it's been very illuminating.
i think getting quantitative measures of effectiveness is unfortunately fairly far beyond my own extremely nascent understanding of these finer points of ghosting.
i've got some qualitative stuff tho !
re Ben's concerns about client/server simulation discrepancies,
i *think* he may've been right on.
i started seeing some strange behaviour which may have been caused by this
(EG NPCs falling thru floors only on some clients, people moonwalking)
and changed my "have you really moved?" threshold so that it's true more often.
(changed from 0.02 to 0.00001)
But Then,
this is sort of humorous,
a while ago i wrote custom player-vs-player collisions which would make client-side-only changes to a player's position, relying on the updates from the server and Torque's nifty position interpolating to give a pleasing sort of rubber-band effect.
.. but of course, for non-moving NPCs, the server was no longer sending updates, which meant that a client could push a stationary NPC all over the place.
to fix i have for now followed Ben's advice (how many more times in my life will i say that? many i'm sure) and make sure that each server object sets its MoveMask at least every 20 ticks.
Other than that tho, we haven't seen any issues with this.
thanks again,
Orion
#19
03/02/2006 (3:49 am)
We used to have a dirty flag on the transform matrices we used. That way we could test the update of the matrix to determine if the object needs a network/render update.
I saw similar issues with the performance of rezoning players as Clint saw.
One thing I thought of when thinking through the scoping concept, is that you might want to have a running count of how many times you have considered the object to be static and only stop updating the object after N counts. The reason being that if the object was initially moving and has then stopped, the client might then be forward predicting the object and due to packet loss might be unaware of the stopping. The N count simply helps to ensure that the client actually has the "stopped" state before you essentially disable updates for them.
Having said that, torque really doesn't do forward prediction so much as reverse backwards prediction does it :)
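The "only stop after N static counts" idea might be sketched as follows. StaticGate is a hypothetical helper, not part of Torque; it just decides per tick whether an update should still be flagged.

```cpp
#include <cassert>

// Keeps sending updates for the first N ticks after an object comes to rest,
// so a client that lost a packet still has a chance to learn that it stopped.
struct StaticGate {
    int requiredStaticTicks;  // N: updates to keep sending after stopping
    int staticTicks = 0;      // consecutive ticks seen without movement

    explicit StaticGate(int n) : requiredStaticTicks(n) { assert(n >= 0); }

    // Call once per server tick; returns true if an update should be flagged.
    bool shouldSend(bool moved) {
        if (moved) {                      // movement resets the count
            staticTicks = 0;
            return true;
        }
        if (staticTicks < requiredStaticTicks) {
            ++staticTicks;                // still confirming the stopped state
            return true;
        }
        return false;                     // considered settled, go quiet
    }
};
```

This could be combined with a periodic refresher so a settled object is never silent indefinitely.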
Torque Owner Badguy
where when the position is changed the mask is set
when the mask is set the network update is made.
i'm sure you will find this is still in place.
to find out what occurs in a given netupdate
you can locate the pack/unpack update functions for the desired object.
and there you can view what is involved in the object serialization that occurs.