TorqueScript has suboptimal performance
by smally · in Torque 3D Professional · 09/25/2012 (8:29 pm) · 112 replies
This is a continuation of the conversation from github.com/GarageGames/Torque3D/issues/10.
Feel free to jump in and continue the conversation here.
About the author
Most people say that it is the intellect which makes a great scientist. They are wrong: it is character. -- Albert Einstein
#22
09/26/2012 (3:45 pm)
The issue is with Sim::spawnObject() object creation:
SimObject *spawnObject(String spawnClass, String spawnDataBlock, String spawnName,
                       String spawnProperties, String spawnScript)
{
   if (spawnClass.isEmpty())
   {
      Con::errorf("Unable to spawn an object without a spawnClass");
      return NULL;
   }
   String spawnString;
   spawnString += "$SpawnObject = new " + spawnClass + "(" + spawnName + ") { ";
   if (spawnDataBlock.isNotEmpty() && !spawnDataBlock.equal( "None", String::NoCase ) )
      spawnString += "datablock = " + spawnDataBlock + "; ";
   if (spawnProperties.isNotEmpty())
      spawnString += spawnProperties + " ";
   spawnString += "};";
   // Evaluate our spawn string
   Con::evaluate(spawnString.c_str());
   // Get our spawnObject id
   const char* spawnObjectId = Con::getVariable("$SpawnObject");
   // Get the actual spawnObject
   SimObject* spawnObject = findObject(spawnObjectId);
   // If we have a spawn script go ahead and execute it last
   if (spawnScript.isNotEmpty())
      Con::evaluate(spawnScript.c_str(), true);
   return spawnObject;
}
This relies on the console. I think this is how every object gets created from C++ or script.
#23
09/26/2012 (3:46 pm)
@smally indeed, the way engineAPI is structured currently the memory you construct the objects in has to be owned by the callee. Perhaps the reasoning here is that since you are creating simulation objects from your own code, you might as well manage the memory too. Or maybe it's so you can easily construct pointer-less types in arrays? Maybe it's just incomplete? Or perhaps there is some other reasoning I am unaware of.
From what I can tell at least you can create any SimObject, in this sense the API can *theoretically* do anything you'd be able to do from TorqueScript.
#24
09/26/2012 (3:53 pm)
@frank I actually saw that a while ago; it can easily be rewritten not to use script. It is only used by the spawnObject ConsoleFunction and the SpawnSphere. When creating normal SimObjects in script, I can assure you it doesn't go anywhere near that code. Look at the implementation for the OP_CREATE_OBJECT opcode, which uses ConsoleObject::create.
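To illustrate the difference between the two creation paths, here is a minimal Python sketch. It is not Torque code: `CLASS_REGISTRY`, `spawn_via_script` and `create_direct` are hypothetical names. The first function mimics building a script string and evaluating it, as spawnObject does; the second mimics a direct lookup by class name, in the spirit of ConsoleObject::create.

```python
# Hypothetical stand-ins; none of these names exist in Torque.
class SimObject:
    def __init__(self, name=""):
        self.name = name


class SpawnSphere(SimObject):
    pass


# Class name -> constructor, loosely analogous to a class registry.
CLASS_REGISTRY = {"SimObject": SimObject, "SpawnSphere": SpawnSphere}


def spawn_via_script(class_name, name):
    """Build a source string and evaluate it (the spawnObject style)."""
    source = f"{class_name}({name!r})"
    # The string must be parsed and compiled before anything is created.
    return eval(source, dict(CLASS_REGISTRY))


def create_direct(class_name, name):
    """Look the class up by name and construct it without any parsing."""
    cls = CLASS_REGISTRY.get(class_name)
    if cls is None:
        return None
    return cls(name)


a = spawn_via_script("SpawnSphere", "spawnA")
b = create_direct("SpawnSphere", "spawnB")
print(type(a).__name__, a.name)
print(type(b).__name__, b.name)
```

Both calls produce the same kind of object; only the first pays for generating and parsing a string on every spawn.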
#25
09/26/2012 (4:00 pm)
@jamesu,
Oh, okay. I will have to look at that. Thanks for the clarification.
#26
09/27/2012 (1:45 pm)
@james when you stated "Optimized allocation for globals (none for local variables)", did you mean no optimization for locals, or no allocation for them? I can't imagine you'd mean the latter.
#27
09/27/2012 (10:45 pm)
@All,
Would it be a good idea to start mapping where the console touches the engine? Some obvious places are console commands, callbacks, etc. However, it could be useful to see where those locations are as we experiment. I know a lot of you may have done this or know this to some degree already, so please share any ideas you have. Could this be something we use doxygen for?
#28
09/27/2012 (11:14 pm)
@All,
A question about speed and the console interface.
Right now in my Python extension I call console functions like this:
- Python checks to see if the Python object that houses the Sim interface has an attribute of <name>. I check the console to see if this function exists by calling script_get_namespace_entry in c_scripting.cpp. If the function has a namespace entry it returns true; if not, false.
- If the function-exists check is true, a Python function is returned from the attribute check. This function wraps the exec call, and its parameters are packed into an array of C strings the way exec expects them.
- Python then sees that this attribute is callable and runs the function, returning any results as a string.
I know that if I can avoid the check to see whether the function exists, it will speed this up. I was thinking about building a list of functions that have already been found. On each lookup I would check the list first and return the function wrapper if it is in the list. This would avoid the namespace check on the next run of that console command.
Another avenue is to use the ConsoleXMLExport function to create this list beforehand.
The question I have is: Considering the engine the way it is now, is Con::exec the fastest way to execute a console command?
#29
09/28/2012 (3:45 am)
@smally
When I wrote that I was getting a bit confused with the implementation of the console in T3D vs previous products. Previously variable allocation was much worse. In T3D however there is a chunker (FreeListChunker) which should in theory resolve the overhead of allocating variable entries, unless the memory used by the entries grows past DataChunker::ChunkSize in which case it will keep allocating extra blocks every time the variable list grows in a function.
There also isn't as much overhead in ExprEvalState::pushFrame (when entering the function) as I initially thought as it keeps stack frames around. The only other potential allocation overhead I can see at the moment is when you create a string variable. Perhaps string allocation could be optimized.
In a sense though I am getting the feeling people have gone down this road before but they still haven't managed to optimize it to a point where it rivals the speed of other scripting languages.
I still think though the way local variables are used could be optimized. While normal variable lookup is optimized in the sense that it only looks up the name in the StringTable once (when the Codeblock is loaded), it still has to walk through the hash table each time when looking up a variable. With the exception of arrays (%foo[123], %foo[1,2,3]), do we really need to look up a variable in a hash table each time?
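One way to picture the alternative is to resolve each local variable name to a slot index once, when the function is compiled, so that execution indexes into a flat list instead of hashing a name. The sketch below is a toy illustration in Python, not the TorqueScript VM. The tiny instruction set and the `run_hashed`, `compile_slots` and `run_slots` names are made up for the example.

```python
# Toy illustration only; the instruction set and helper names are made up.

# A tiny "program": each instruction is (op, variable, operand).
PROGRAM = [
    ("set", "%i", 0),
    ("set", "%total", 0),
    ("add", "%total", 5),
    ("add", "%i", 1),
    ("add", "%total", 7),
]


def run_hashed(program):
    """Look every variable up by name in a dict on every instruction."""
    frame = {}
    for op, name, value in program:
        if op == "set":
            frame[name] = value
        elif op == "add":
            frame[name] = frame[name] + value
    return frame


def compile_slots(program):
    """Resolve each variable name to a slot index once, at compile time."""
    slots = {}
    compiled = []
    for op, name, value in program:
        index = slots.setdefault(name, len(slots))
        compiled.append((op, index, value))
    return compiled, slots


def run_slots(compiled, slot_count):
    """Execute using plain list indexing; no name lookups at run time."""
    frame = [0] * slot_count
    for op, index, value in compiled:
        if op == "set":
            frame[index] = value
        elif op == "add":
            frame[index] += value
    return frame


compiled, slots = compile_slots(PROGRAM)
frame = run_slots(compiled, len(slots))
print({name: frame[index] for name, index in slots.items()})
```

Array-style names such as %foo[123] are assembled at run time, so in a scheme like this they would still need the hashed path, which matches the exception noted above.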
...
On another note, I've been experimenting a bit with a refactor of how function arguments are passed. i.e. Instead of converting everything to a string it will keep the variable type (this includes changing the type used in the native function macros), and when returning it will retain the type of the return variable.
My first implementation was actually worse as my console value type was terrible (though it did work). I'm currently rewriting it so hopefully I'll see some real improvements in my test times this time round.
@frank
The fastest way of executing a console function? If your function is defined in script, Con::exec. If your function is a ConsoleFunction/ConsoleMethod/DefineEngineFunction/DefineEngineMethod, the fastest possible way is by directly calling the function, assuming you know its address.
#30
09/28/2012 (3:53 am)
I'd love to see an example of how to call it by address, cause I spent a bit of time trying to solve that myself.
#31
09/28/2012 (6:48 am)
Reading through that discussion, there's quite a bit of hate for TS (optimizations notwithstanding).
I've been around with Torque for ages, and I'm pretty versed in how to use TS to get the most out of it, and if you DO get good with it, you can do some absolutely nuts things.
So I'm kinda curious where everyone's gripes are with it outside it being slower than it could be with some solid optimization.
#32
09/28/2012 (7:08 am)
<3 TS ... apart from the slow thing ... which is why heavy lifting goes in C++ ...
#33
09/28/2012 (7:14 am)
@Steve: +1
TS is cool and works fine for me and others. Tribes 2 ran smoothly on old PCs, and on modern CPUs it is slow? o_O
Of course if you want to find the 1000th digit of Pi, or compare its speed with C++, then TS is slow! It was not made for handling complex stuff.
Move heavy stuff to C++ and enjoy!
#34
09/28/2012 (10:31 am)
@Frank, I agree.. mapping where it touches the engine could help.
I do believe Con::exec is your target there, unless you know the specific address of the function you're wanting to call.
@James, I'm glad you're going down the road of a native type since this will actually make an impact on the speed. We shouldn't really need to look up local variables in a hash table each time they are referenced or used. Globals could be mapped to an identifier, and when a scope doesn't have a reference point to that global, it could be mapped similarly.
@All, I sure hope it's not coming off as hate. I like TS.. and I don't believe any of us would be looking at it with such a critical eye if we didn't see it as worth doing.
I also agree with moving the heavy lifting to C++ since TS wasn't meant to be the all-around, script-the-entire-game-in-it solution.
TS implementations have varied a lot across products from what I've seen, and T3D's looks better than T2D's did as far as performance went. I've also used TS since the original release of the engine.
I guess my point is, if we can squeeze some more out of it what does it hurt? There are some who can do the scripting and others who can do the C++ part, but not everyone can do both, and it only really serves to better the engine as a whole anyway. :)
#35
09/28/2012 (10:39 am)
I guess 'hate' may not have been the right word.
I was mostly referring to when talking about script languages, TS is at the bottom of most everyone's list of 'preferred' ones, and I'm curious why that is.
Is it the syntax? The fact that it's typeless? etc.
Personally, I rank TS quite highly compared to several other languages because of the utterly ridiculous programming judo I can do with it, but I very well understand everyone has different appeals, and I'm curious what, in their eyes, makes TS inferior to, say, Python or Lua (again, performance notwithstanding).
#36
09/28/2012 (10:39 am)
@fyodor @steve @jeff Personally I don't hate TorqueScript, I just find the implementation annoying. Notably I had a lot of problems with the speed in iTorque, where execution speed takes a nosedive. And really it should be much faster, especially when people are tempted to overly use it in a project. Sure you can just implement critical parts in C++, but to me that is just ignoring the issue at hand.
Speaking of Tribes 2, I should probably run some tests against early Torque for comparison ;)
#37
09/28/2012 (10:44 am)
@Jeff,
TS has served its purpose. It was built when relatively few VMs were available. However, just about every VM has surpassed it in features and speed. Some of the gripes:
- Arrays are not arrays and are slow.
- Limited iterator support.
- No JIT.
- No way to import binary libraries dynamically. This limits code reuse.
- Server integration limited. TS is fine for FPS, but has few features for integration with MMO servers.
- Not based upon any standard language.
- Very limited abstraction support.
The questions are: Do we spend the time and effort maintaining a less featured scripting language indefinitely? Or do we spend the time and effort ONCE integrating a more advanced and fully featured scripting language?
In my opinion the scripting language should be fast enough to fully prototype a game. Then you identify your bottlenecks and optimize the bottlenecks.
#38
09/28/2012 (10:50 am)
@Jeff, yes TS is a ninja script if you know what you're doing ;) Which is why I love it.
The thing I believe (my perception here) people are having an issue with is that there is no INHERENT array type. Yes, you can simgroup things, but people kind of expect to see an array type in a language of any sort.
One thing I do like, even though it's a little weird.. is % for local variables and $ for global. It helps people keep in mind the scope they are working in, and for newcomers that's important. Though it can be a little annoying but eh, whatever lol. It's quick to code and gets the job done, therefore I'm happy with it.
Comparing TS to Python is almost self-defeating to me because you either love Python or you hate it. I don't care for it myself, mostly because I prefer C-like syntax.. I never saw the problem with defining a block with { and } instead of whitespace. Lua is just fast. I don't get otherwise why it's so loved so I can't contribute to that part.
@Frank, I agree actually.. which is why I took the route I did.. :) TS++ maybe? lol
#39
09/28/2012 (10:59 am)
For a code example:
    # create a list
    numbers = [2, 3, 7, 8, 10]
    # operate on every member of the list and return the modified list
    # the map function is optimized in C and is really fast
    # (list() is needed on Python 3, where map returns an iterator)
    numbers = list(map(lambda x: x + 2, numbers))
To do the same in TS would require a loop and be very slow regardless. This is only one feature of Python. You could take any modern scripting language and produce similar examples in abstraction, efficiency, speed, or library support.
Personally I would really like to use some of Javascript's features with Torque not because it is my favored VM, but because it is fast and well supported with V8. I prefer Python to JS by far, but I can see the wisdom of a JIT compiled language like JS. There is no reason that the TS compiler cannot be retargeted to another VM either.
#40
09/28/2012 (11:03 am)
The JIT of JavaScript is highly dependent on the engine running the JavaScript. Yes, it's JIT-compiled with V8, but it's not the same elsewhere.
Associate James Urquhart
Torque3D has a callback system which is used in place of a lot of the script callbacks. See the DECLARE_CALLBACK and IMPLEMENT_CALLBACK macros. Not every callback uses this so it's a bit incomplete, but it's a neat idea. I'm not quite sure how you implement a callback using the engineAPI though (I think you might have to set the address?), more documentation is needed in that area I think. You could of course stick whatever you want on IMPLEMENT_CALLBACK to tie it in with another system.
I'd imagine if you wanted to implement a new VM you'd handle the "gee what happens when there is no VM?" scenario in your new code. Currently in TorqueScript when you reference a SimObject you either use an ID or a name which is then looked up with Sim::findObject which resolves everything based on lists owned by Sim::. TorqueScript in this case is more like the glue which binds stuff together.