Pre-Processing Graphics (and other things) using advanced Math.
by Pauliver · in Torque Game Engine Advanced · 03/18/2006 (8:45 pm) · 9 replies
I was sitting in calc the other day (we're doing derivatives), and I couldn't help but think that, just maybe, on a machine with multiple processors this might be a way to help improve performance. I wanted to hear what people thought, and whether anyone knew of anything like it...
Base knowledge [posted so I can be corrected if I'm wrong]:
At the center of any game is a simple loop that runs the game.
A loop can only be run on one processor at a time.
Now, I will admit my graphics knowledge is very limited (I'm mainly an internals guy, not a graphics person; that may change if I get enough feedback on this to try it).
What if you used some kind of structure like a hashmap in which to store and retrieve data (it would have to be specialized)? You could then use calculus (derivatives and custom algorithms specifically) to estimate where a player will be next tick, and then pre-perform whatever calculations/rendering you had time for on the estimated data.
This is a very rough layout, but it's the basic idea. I was wondering if anyone had heard of anything like this? Does something already do this?
Edit: This is not just for graphics; it could be used to predict any calculation your machine has to make. The predicted data would be stored in the hashmap-like structure, and before the main loop performs any calculation it could check whether the result has already been computed and stored in the hashmap.
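To make the idea above concrete, here is a minimal Python sketch of "estimate with derivatives, precompute, then look up". Everything in it is hypothetical and chosen for illustration: the names `estimate_next`, `quantize` and `PredictionCache`, the 0.25-unit cell size, and the use of finite differences over the last three positions as the derivative estimate. It is not taken from any engine.

```python
from typing import Callable, Dict, Tuple

Vec = Tuple[float, ...]
Key = Tuple[int, ...]


def estimate_next(p_prev2: Vec, p_prev: Vec, p_now: Vec, dt: float) -> Vec:
    """Second-order estimate of the position one tick ahead.

    Velocity and acceleration are approximated with backward finite
    differences of the last three sampled positions.
    """
    out = []
    for a, b, c in zip(p_prev2, p_prev, p_now):
        vel = (c - b) / dt                    # first derivative
        acc = (c - 2 * b + a) / (dt * dt)     # second derivative
        out.append(c + vel * dt + 0.5 * acc * dt * dt)
    return tuple(out)


def quantize(p: Vec, cell: float = 0.25) -> Key:
    """Snap a position to a grid cell so nearby positions share a key."""
    return tuple(int(round(c / cell)) for c in p)


class PredictionCache:
    """Hashmap of results precomputed for predicted positions (assumed design)."""

    def __init__(self, compute: Callable[[Key], object]) -> None:
        self._compute = compute
        self._store: Dict[Key, object] = {}
        self.hits = 0
        self.misses = 0

    def precompute(self, predicted: Vec) -> None:
        # Would run on a spare core while the main loop is busy.
        key = quantize(predicted)
        self._store[key] = self._compute(key)

    def lookup(self, actual: Vec) -> object:
        # Called by the main loop: use the cached result if the guess was right.
        key = quantize(actual)
        if key in self._store:
            self.hits += 1
            return self._store.pop(key)
        self.misses += 1
        return self._compute(key)


predicted = estimate_next((0.0, 0.0), (1.0, 0.0), (2.0, 0.0), dt=1.0)
cache = PredictionCache(compute=lambda key: sum(key))  # stand-in for real work
cache.precompute(predicted)
result = cache.lookup((3.01, 0.0))  # actual position landed in the predicted cell
```

The quantization step is the part that matters: an exact floating-point match between predicted and actual position would almost never happen, so the cache only pays off if "close enough" predictions map to the same key.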
#2
03/19/2006 (8:06 am)
Little known fact about Torque networking: We actually always interpolate for every ghosted object ever sent, because when the client receives the first packet for a particular object, it in fact ignores it completely until the second one arrives!
The reason for this is to have two known update states to interpolate between for accurate interpolation (as opposed to extrapolation, which is what you are kind of defining above).
We also use extrapolation for allowing the client side (non-authoritative) simulations to handle high latency/packet loss situations--i.e., clients will extrapolate (predict future movement state based on a single point and a previous history of velocity) to project into the future what they think the server side (authoritative) simulation is going to transmit.
Finally, we do extensive interpolation between physics ticks (which occur every 32 milliseconds), because in fact your video card is operating much faster than the physics update rate--if we were to only accept the physics position of all objects and render that directly, it would cause your player to do "mini-warps" from point to point every 32 milliseconds, instead of extremely smooth movement across the screen. That's what the function interpolateTick() does for many of your objects--it takes the last two known physics tick positions, and smoothly interpolates between them to get a render position to send to the video card.
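As a rough illustration of the render interpolation described above, here is a small Python sketch. It is not Torque's `interpolateTick()` code; the function name `render_position`, the tuple-based positions and the clamping are assumptions made for the example. Only the 32 ms tick length comes from the post.

```python
from typing import Tuple

TICK_MS = 32.0  # physics tick length mentioned in the post

Vec = Tuple[float, ...]


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def render_position(prev_tick: Vec, curr_tick: Vec, ms_since_tick: float) -> Vec:
    """Blend the last two known physics positions into a render position.

    ms_since_tick is how far the renderer is past the older of the two ticks;
    it is clamped so the result never leaves the known segment (that would be
    extrapolation, not interpolation).
    """
    t = min(max(ms_since_tick / TICK_MS, 0.0), 1.0)
    return tuple(lerp(a, b, t) for a, b in zip(prev_tick, curr_tick))


# A frame drawn 8 ms after the older tick sits a quarter of the way along.
quarter = render_position((0.0, 0.0), (32.0, 16.0), 8.0)
```

Rendering several frames per tick with increasing `ms_since_tick` gives the smooth motion between physics states instead of a jump every 32 ms.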
If you'll follow this to the logical conclusion, you'll note a couple of things:
1) Clients are always utilizing data that is "in the past". That makes sense when you take into account that by definition, there is never zero time of transmission of network data, so instead of fighting that transmission time (latency), we design specifically for it.
1a) Too in depth for this discussion really, but in fact this is true for all objects on a client simulation except for one: the control object. Since a player (real human) is never going to be happy with pressing the "forward" key, but having to wait anywhere from 400 to 1000+ milliseconds to receive an update from the server (internet transmission here as well as networking tick cycle phasing), we actually apply a move input directly to the client simulation before we even send it to the server. This means that:
--for all ghosted objects, the clients are always using "in the past" data.
--for the control object, the client is actually applying moves before the server even gets them, which means it's "in the future" for its own control object.
--once the server applies the move, it then sends that information back to all clients (including the controlling client). For non-controlling clients, this is "in the past" data. For the controlling client it's still slightly "in the past", but it has to be integrated into the client's "in the future" state (remember, the client already applied this move themselves!)...which means that the client winds back in time to a previously synchronized state (from the server), and then applies the current updates being sent by the server, then winds forward with any moves that the server has not received yet.
2) Clients and the server are always in a constant "unwind to the past, apply moves/updates, wind back to the future" cycle, and the timing of this cycle is always different for each client.
3) Objects in our client simulation (which is what is rendered, so it's what is important to the player) are always in a constant tug of war between the client's non-authoritative position and authoritative updates from the server. These updates may be from the past, or from the future, so all interpolation, extrapolation, and client side prediction (control objects only) is done in "future" ticks, and we render the most recent complete state so the renders of the objects aren't completely insane.
4) Torque networking and Torque physics are the way they are for a reason--several of them in fact!
5) Stephen always gets verbose after a boot camp!
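The "unwind to the past, apply updates, wind back to the future" cycle from points 1a) and 2) can be sketched in a few lines of Python. This is a generic client-side prediction pattern under simplifying assumptions (one-dimensional position, moves identified by a sequence number, the server acknowledging the last move it applied); the class and field names are hypothetical and this is not Torque's actual code.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


@dataclass
class Move:
    seq: int    # sequence number assigned by the client
    dx: float   # requested displacement for this move


def apply_move(x: float, move: Move) -> float:
    """Shared simulation step, run identically on client and server."""
    return x + move.dx


class PredictedClient:
    def __init__(self) -> None:
        self.x = 0.0                        # predicted ("in the future") state
        self.pending: Deque[Move] = deque() # moves the server has not confirmed
        self.next_seq = 0

    def input(self, dx: float) -> Move:
        """Apply a move locally right away, then hand it off to be sent."""
        move = Move(self.next_seq, dx)
        self.next_seq += 1
        self.pending.append(move)
        self.x = apply_move(self.x, move)
        return move

    def on_server_update(self, server_x: float, last_acked_seq: int) -> None:
        """Rewind to the authoritative state, then replay unconfirmed moves."""
        while self.pending and self.pending[0].seq <= last_acked_seq:
            self.pending.popleft()          # server already applied these
        x = server_x                        # "unwind to the past"
        for move in self.pending:           # "wind back to the future"
            x = apply_move(x, move)
        self.x = x


client = PredictedClient()
for _ in range(3):
    client.input(1.0)                       # client now predicts x == 3.0
# The server applied move 0 but ended up at 0.5 (say, a collision).
client.on_server_update(server_x=0.5, last_acked_seq=0)
```

After the update the client keeps its two unconfirmed moves on top of the server's answer, so the correction shows up as a small adjustment rather than a snap back to the old position.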
#3
03/21/2006 (7:12 am)
Nice one, Stephen! Actually, I'll send this to my physics programmer; it may help explain some of our update issues :)
Phil.
#4
03/21/2006 (7:15 am)
Stephen, it is disturbing how well you know the DarkSide of Torque(tm). Thanks for the explanation!
#5
03/21/2006 (8:12 am)
Thanks Stephen! Great explanation.
#6
04/11/2006 (9:52 am)
Never did thank you for that long post, Stephen. It was very informative. Having read it, I'm left with a single question: "As systems move away from a single faster core toward many cores, what could be done to increase performance on those machines?"
#7
04/11/2006 (11:06 am)
@Pauliver--great question, without much of a great answer (given Torque's networking scheme). Some thoughts:
--push off the networking low level stuff (packet handling level, acknowledgement/synchronization, socket calls, etc.) to a different thread, and only deliver confirmed accurate updates to the main thread. This will save some cycles by using a different cpu, as long as your multi-threaded work is stable and efficient.
--prune off background/non-physics related updates to a different thread. Given a LARGE world, with thousands or tens of thousands of objects, we can put non-time-critical updates into a different update loop on a different thread. This would require a pretty extensive redesign of the update system, but has pretty strong potential.
--prune off physics related updates to a different thread (or the PPU!). Again, pretty extensive redesign, and not network efficient. Useful mainly for client side only "eye candy" effects that require physics (think fluid dynamics--waterfalls, fountains, etc., or possibly things like ragdoll if synched properly server to client to server).
Unfortunately, much of this falls back to the "how do we keep servers and clients all on the same page, and still have awesome physics" challenge that no one in the industry (or even other industries) has solved yet.
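The first suggestion above (low-level networking on its own thread, with only confirmed updates delivered to the main thread) might look roughly like this in Python. It is a sketch under stated assumptions: packets are plain dicts with a made-up `checksum_ok` flag standing in for real validation, and the function names `network_worker` and `drain` are hypothetical. Only `queue.Queue` and `threading.Thread` are real standard-library APIs.

```python
import queue
import threading
from typing import Iterable, List


def network_worker(raw_packets: Iterable[dict], out_q: "queue.Queue[str]") -> None:
    """Runs on its own thread: validate packets, forward only good updates."""
    for pkt in raw_packets:
        if pkt.get("checksum_ok"):      # stand-in for ack/sync/validation work
            out_q.put(pkt["update"])


def drain(q: "queue.Queue[str]") -> List[str]:
    """Called by the main loop once per tick; never blocks the game loop."""
    updates = []
    while True:
        try:
            updates.append(q.get_nowait())
        except queue.Empty:
            break
    return updates


updates_q: "queue.Queue[str]" = queue.Queue()
packets = [
    {"checksum_ok": True, "update": "a"},
    {"checksum_ok": False, "update": "b"},   # dropped by the worker
    {"checksum_ok": True, "update": "c"},
]
worker = threading.Thread(target=network_worker, args=(packets, updates_q))
worker.start()
worker.join()   # joined here only to make the example deterministic
confirmed = drain(updates_q)
```

In a real loop the worker would keep running and the main thread would call `drain()` every tick, picking up whatever has arrived since the last one.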
#8
04/11/2006 (8:23 pm)
Again Stephen, thank you for the response. I did some TGE testing over a year ago (which I have since lost) where I threaded some/all of the networking stuff and used a queue to pass data. At the time it was pretty pointless, but it was a good exercise for me.
So when I asked that question, my thought was that any change to make an engine multi-threaded would pretty much have to use queues (or a similar data structure) extensively, and be a complete rewrite to move away from a central loop and linear code flow to a more abstract, kind of spider-web application. While that isn't the best way to describe it, I think it gives a good mental picture.
It sounds like I wasn't too far off. In a non-centralized system like this, wouldn't you also need some kind of rather advanced synchronization system to make sure that one of your cores doesn't get too far behind? <-- pretty much the same thing as your "how do we keep servers and clients all on the same page, and still have awesome physics"
Thank you very much for your insight. I guess I need to think more on this (hey, if it were an easy thing it would already be out there, right? And we would be gaming on dual-proc dual-core machines which never lag..)
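One simple (and fairly blunt) way to keep any core from getting too far behind is a barrier at the end of every tick. The Python sketch below assumes each worker owns one slice of the per-tick work; `run_lockstep` and its bookkeeping are hypothetical names for illustration, while `threading.Barrier` is the standard-library primitive.

```python
import threading
from typing import List, Tuple


def run_lockstep(num_workers: int, num_ticks: int) -> Tuple[List[int], int]:
    """Run workers tick by tick; nobody starts tick n+1 until all finish tick n.

    Returns the final per-worker tick counts and the largest lead (in ticks)
    any worker ever had over the slowest one.
    """
    barrier = threading.Barrier(num_workers)
    lock = threading.Lock()
    progress = [0] * num_workers
    max_lead = 0

    def worker(idx: int) -> None:
        nonlocal max_lead
        for _ in range(num_ticks):
            with lock:
                progress[idx] += 1          # stand-in for this tick's work
                max_lead = max(max_lead, max(progress) - min(progress))
            barrier.wait()                  # wait for every other worker

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return progress, max_lead


final_progress, lead = run_lockstep(num_workers=3, num_ticks=50)
```

The trade-off is that every tick runs at the speed of the slowest worker, which is the same "keep everyone on the same page" cost described above, just between cores instead of between server and clients.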
#9
04/11/2006 (11:05 pm)
One thing to note: IIRC (and I may not, it's late), TGE 1.4 does allow for multi-threaded Events at a very low level (NetEvent and SimEvent I think), so this is a bit more feasible than it was last year.
Torque Owner Andrew Hull