Game Development Community

My Experience on optimisation on the iPhone

by Scott Wilson-Billing · in iTorque 2D · 02/23/2010 (4:53 pm) · 103 replies

Just thought I'd post some notes on what has really improved performance.

1. PVRTC

2. Atlas sprite sheets, mSourceRect etc.

3. Sprite pools, static and animated. Animations are put back into the pool at the end of the animation.

4. NO particle effects - just using animations generated from TimelineFX.

5. While loops replacing for loops - I kid you not!

6. Turn off updateCallBack - use timers for game loop events, and dynamically change the timer interval depending on the number of objects on screen - this smoothed stuff out.

7. Other stuff like collision groups to cut down on the noise from onCollision callbacks.
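
Point 6 does not say how the interval is chosen, so the following is only a rough sketch of one way it could work. It assumes the interval grows linearly with the object count and is capped; the function name and all three constants are hypothetical and are not taken from the game.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical tuning values; the post does not give the real ones.
const int kMinIntervalMs = 100;  // interval used when the screen is empty
const int kMaxIntervalMs = 400;  // never wait longer than this
const int kMsPerObject   = 2;    // extra delay added per on-screen object

// Returns the game-loop timer interval for the current object count.
// Assumption: more objects means a longer interval, so the per-tick
// work is spread out when the screen is busy.
int timerIntervalMs(int objectsOnScreen) {
    assert(objectsOnScreen >= 0);
    int interval = kMinIntervalMs + kMsPerObject * objectsOnScreen;
    return std::min(interval, kMaxIntervalMs);
}
```

With these made-up numbers, 50 objects would give a 200 ms tick. The game would recompute the value whenever the object count changes and reschedule its timer with it.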

I've still got more PNGs to convert to PVRs but at the moment I'm very pleased:

3GS - 50+ FPS

iTouch 2nd Gen - 50+ FPS

I'm awaiting feedback from our testers on the 3G.

The iPhone 2G is a bit of a no-go and I'm not going to put in any more effort here.

Thanks for all the advice from people on this forum.

Cheers
#21
05/08/2010 (5:54 pm)
Thanks for the feedback. So far I've only tested my game using an iPod Touch 2nd generation. Everything is in script at this point and I've been experimenting to see how complicated I can make my levels, keeping the framerate above 30 fps. I'm actually surprised how many objects I can have on screen before I see any performance issues, so I'm quite impressed with the engine so far.

I'm tempted to keep everything in script, and scale the levels accordingly to keep above 30 fps. This seems feasible on my 2nd gen iPod Touch, but I'm not sure how this model compares to the 1st gen, or the iPhone 2G. I've looked at the specs and they don't seem all that different (400 MHz compared to 533), so I wasn't expecting the game logic to suffer too much. The biggest difference is probably the GPU, so I will probably need to scale back on the graphics effects. I'm thinking add some user-selectable options so that users with older models can turn off certain things.

I've ordered a refurbished iPod Touch 1st gen, so I will be able to try that out in a few days.
#22
05/08/2010 (11:37 pm)
Scott and Marc, I really appreciate your feedback, and it's obvious that you guys do know what you're talking about. But I still don't get this C++ optimisation step, and I'm not convinced that it's necessary. I'm still convinced that the bottleneck is the GPU, and I don't see how it's possible to improve graphics performance by hacking the game loop logic, which is handled by the ARM processor. I think the only way to improve things is to work on the graphics content.

Scott, have you really seen any significant improvements by porting to C++? I would be very interested to know if you have any framerate comparisons. If you do see a big improvement, are you doing a lot of mathematical calculations in your script? My game is a breakout/invaders hybrid and most of my sprites are implemented using tilemaps, which seem to be handled very efficiently by the engine. My scripts contain mostly very simple logical operations, and I'd be very surprised if they are the bottleneck, unless the scripting engine is extremely bad.

Having said all of the above, I personally would have preferred it if iTGB was based on C++ from the beginning as that is my language of choice anyway. The actual engine is great, but the current toolchain and development process of using script to produce an iTorque game (which is what they advertise) is about the worst I've ever seen in a commercial software product. It would have been much better if they'd built iTorque around Xcode from the beginning.

Any thoughts?

#23
05/09/2010 (2:58 am)
Mark, I did see the average FPS on the 3G go from 10-13 to 20-23 when I moved some code to C++ that was being run every 200 ms. This was the code that checked whether the invaders had hit the left/right/down boundaries; every invader had to be checked and moved accordingly.
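
For readers wondering what such a check looks like once it is out of script, here is a minimal standalone sketch. It is not the actual iTorque code: the `Invader` struct, the `Hit` enum and both function names are hypothetical, and it assumes y grows downwards and that the whole formation reverses and drops a row when any invader touches a side edge.

```cpp
#include <cassert>
#include <vector>

struct Invader { float x; float y; };  // hypothetical minimal sprite state

enum class Hit { None, Side, Bottom };

// Check every invader against the play-field edges in one pass.
// Assumes y grows downwards, so "bottom" is the largest allowed y.
Hit checkBounds(const std::vector<Invader>& invaders,
                float left, float right, float bottom) {
    assert(left < right);
    Hit result = Hit::None;
    for (const Invader& inv : invaders) {
        if (inv.y >= bottom) return Hit::Bottom;  // reached the bottom edge
        if (inv.x <= left || inv.x >= right) result = Hit::Side;
    }
    return result;
}

// One timer tick: on a side hit, reverse direction and drop one row,
// then move every invader horizontally. Returns what was hit so the
// caller can handle Hit::Bottom (e.g. game over).
Hit stepFormation(std::vector<Invader>& invaders, float& dx, float dropY,
                  float left, float right, float bottom) {
    Hit hit = checkBounds(invaders, left, right, bottom);
    if (hit == Hit::Side) {
        dx = -dx;
        for (Invader& inv : invaders) inv.y += dropY;
    }
    for (Invader& inv : invaders) inv.x += dx;
    return hit;
}
```

The loop itself is trivial; the saving described above comes from running it as compiled code instead of making one interpreted script call per invader on every tick.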

My understanding from people with more knowledge than me about Torque is that the engine can get CPU bound due to the nature in which the script is being interpreted.

I think the guiding principle should be that if it runs fine in script on the older devices and you are happy with the FPS, then leave it in script - don't go to C++ for the sake of it.
#24
05/09/2010 (4:58 am)
The bottlenecks in all 3 of our iPhone games have been CPU related.

Make sure that you are running the Torque Profiler and see what code is getting hit on... if you see lots of "script talking" (ie. execute, onBlahBlahBlah, etc.), then make sure to eliminate those calls and replace them with C++ functionality.
#25
05/09/2010 (6:38 am)
Thanks Luke. Does the Torque profiler run on the device or does it have to be done on the desktop? Is there a link you could point me at?

Cheers
#26
05/09/2010 (1:34 pm)
Scott, that's very interesting to see that you're getting such a big improvement in FPS. So you're right, it must be CPU limited. I'm surprised, but there you go. I'll see how my game runs on the 1st gen iPod Touch, and if necessary I'll think about porting to C++.

Luke, I didn't know about the Torque profiler, so thanks for that. We definitely need something like that to find where the bottlenecks are.

Scott, I found this reference in the documentation section:

http://tdn.garagegames.com/wiki/Torque/Profiler

#27
05/09/2010 (1:38 pm)
Thanks for the link Mark - it looks like the profiling has to be done on the desktop? But it's a good start for identifying those parts of the code that are running a bit slow.
#28
05/09/2010 (6:00 pm)
Scott, I've been thinking. Have you tried using t2dTrigger for your screen boundaries instead of using a timer? You would then get a callback whenever an invader reaches the scene edge. In my game I've set up a trig area at the bottom of the screen, just below the camera, and whenever an object enters that area I get the following callback:

function trig::onEnter(%this, %obj)
{
    // put stuff here to deal with the %obj that entered trig
}

In my game I use that area for collecting lost balls, and powerups etc.

Just a thought. It might be a little more efficient, not sure.
#29
05/09/2010 (11:59 pm)
Mark, I did consider a trigger but I read on the forum somewhere (early days) that they were not performant. The timer isn't so much of a problem because I do a whole bunch of other stuff now e.g. drop bomb, swoop down screen etc.
#30
05/10/2010 (12:25 am)
Ok, just a thought. If you're doing a lot of things in that 200 ms timer loop, I just thought it would distribute the load if you had a trigger with a smaller callback function. I've not experienced any slowdowns using t2dTrigger. I must admit, I do like using timers because at least then you tend to get very repeatable performance, whereas callbacks are random, which could potentially cause bottlenecks at random times (impossible to debug).

BTW, I don't know if you've experienced this, but I seem to be getting some random slowdowns that don't seem to be triggered by anything specific. I can be sat there with very little going on and then suddenly I get massive lag for a fraction of a second, and it's really obvious. Have you ever seen anything like that? I'm not even sure how to trace where this could be happening, since it only happens occasionally.
#31
05/10/2010 (12:30 am)
Yep, I get them during the first couple of levels (2-3) - I have the FPS showing and I don't see a drop in frame rate. It used to be very bad until I started pre-allocating all my sprites (inc. animated) and putting them in an object pool.

After the first 2/3 levels the game seems to settle down and I see them much less. Perhaps the Torque profiler will help out?

One thing which drove me nuts was that at one point I got a constant frame rate drop of perhaps 10 fps every couple of seconds. Eventually this was fixed with a reboot of the device, after I had spent hours trying to track it down in code :)
#32
05/10/2010 (12:31 am)
@Mark: You can easily measure what is causing the FPS problem.
That's what Instruments is actually for :)
Run it on the device and make sure to include the instruments for CPU usage, GPU usage and memory usage.

You will find that on the 1st gen and the iPhone 3G, your bottleneck is almost always the CPU (100%), while the GPU hardly gets above 30-50% usage. (The SpaceShooter Component was like this too.)


Earlier experiments by various users here clearly indicate that TGB's "physics system" is particularly responsible for this, but it's not the only thing responsible (using too many callbacks, especially update ones, is bad too, for example).


As for the "it is worse at first and then gets better" behavior, that's a common thing with all 3D-related programming (TGB is 2D through 3D) and is caused by the fact that textures must be uploaded to graphics RAM on their first use. This has a significant impact on rendering performance.
To optimize that, you would normally draw all the textures you use on screen at the start, hidden behind a GUI object (a loading screen, for example).
#33
05/10/2010 (12:46 am)
Thanks for the feedback. I have tried using instruments but it didn't show anything obvious, although it's possible that I wasn't using it very effectively. I will try that again.

I've been following some of your older posts (Scott's and Marc's) and I've implemented quite a few of your suggestions, which help a lot. I'm using sprite pools now, which makes a big difference, but I'm still not sure where those random slowdowns are coming from. These seem to be happening when everything is already loaded, and at random times when nothing specific is happening. I'm still not using PVR compression, though, so perhaps there is some memory swapping going on.
#34
05/10/2010 (1:32 am)
Memory swapping is possible, but VRAM overwrites are much more likely.
They happen when you try to load more textures than fit into the 22.x MB of VRAM available on pre-3GS devices. In that case the oldest unused texture(s) in VRAM are overwritten with the texture you want to draw now and, as in the "not rendered before" case, the texture first needs to be uploaded again.

If your design allows, use tricks similar to the level-start one above to trigger such switches.
Often they don't happen "just like that" but at some given event in the world, like moving into a new area after you have talked to an NPC for new directions. In this case, use the NPC talk dialog to pre-render the area behind it, for example, to get its textures into VRAM. During the talk the slight stutter is unnoticeable, as nothing is moving.


If your game does not allow you to exploit such standstill-like phases, it's naturally a whole different story, and little can be done about it aside from going to PVR textures, which take less VRAM and, just as important, load faster as they are already in a GPU-native format (unlike PNG / JPG, which are loaded and then converted into plain byte data).
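
To put rough numbers on the VRAM point, here is a small back-of-the-envelope sketch. It assumes a decoded PNG is kept as 32-bit RGBA (4 bytes per pixel) and that PVRTC uses 4 or 2 bits per pixel; it ignores mipmaps and any per-texture overhead, and it rounds the budget mentioned above to 22 MB. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes for an uncompressed 32-bit RGBA texture (4 bytes per pixel).
std::size_t rgba8888Bytes(std::size_t width, std::size_t height) {
    return width * height * 4;
}

// Bytes for a PVRTC texture at 2 or 4 bits per pixel.
std::size_t pvrtcBytes(std::size_t width, std::size_t height,
                       std::size_t bitsPerPixel) {
    assert(bitsPerPixel == 2 || bitsPerPixel == 4);
    return width * height * bitsPerPixel / 8;
}

// How many whole textures of a given size fit into a VRAM budget.
std::size_t texturesThatFit(std::size_t budgetBytes, std::size_t textureBytes) {
    assert(textureBytes > 0);
    return budgetBytes / textureBytes;
}
```

Under those assumptions a 1024x1024 atlas is 4 MB uncompressed but 512 KB as 4 bpp PVRTC, so a 22 MB budget holds about 5 of the former versus 44 of the latter before textures start being overwritten.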
#35
05/11/2010 (10:49 pm)
Just letting you guys know. I've been able to get rid of that lag problem that I was talking about. It was actually poor design on my part.

I'm pooling all of my sprites, so I set them up and create a few simSets as part of my level initialisation. It seems obvious now, but the mistake I was making was that I was deleting the objects that were no longer needed during the level. This must have been causing excessive amounts of memory management activity that eventually caused some lag. The actual lag didn't show up immediately though, and it was sometimes quite a while before it showed up, which must be something to do with how the memory management works on the device. What I am doing now is recycling those objects, putting them back into their containers when they are no longer in use, and then clearing out the containers at the end of the level.

My average fps has gone down a bit because of the extra control logic, but it runs very smoothly now and I don't see any lag. I've implemented most of the optimisation steps that Scott mentioned in the original post, and I'm happy with the result. I must admit I haven't seen any huge improvements in fps, but I have seen a huge improvement in smoothness. My fps is now very consistent.



#36
05/12/2010 (12:51 am)
Nice one Mark! I have a flag that is associated with all my pooled objects (I have pools for static and animated) and instead of calling safeDelete on an object I pass it off to my object manager when I'm done with it. The manager checks the flag, if present it goes back into the (named) pool, otherwise the manager calls safeDelete.

One other thing - do you have animations and have you pooled these? The bit I struggled with was knowing when an animation had finished and therefore when to put it back into the pool. In the end I figured out onAnimationEnd callback is the place where the animation calls the object manager.
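
The flag-and-manager scheme described above could look roughly like this in plain C++. This is only a sketch under assumptions: `Sprite` and `ObjectManager` are hypothetical stand-ins (not iTorque classes), the "flag" is modelled as a pool name stored on the sprite, and letting the `unique_ptr` go out of scope plays the role of safeDelete.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct Sprite {
    std::string poolName;  // the "flag": empty means the sprite is not pooled
    bool visible = false;
};

class ObjectManager {
public:
    // Pre-allocate `count` sprites for a named pool, e.g. at level start.
    void preallocate(const std::string& name, std::size_t count) {
        assert(!name.empty());
        std::vector<std::unique_ptr<Sprite>>& pool = pools_[name];
        for (std::size_t i = 0; i < count; ++i) {
            std::unique_ptr<Sprite> s(new Sprite());
            s->poolName = name;
            pool.push_back(std::move(s));
        }
    }

    // Take a sprite out of a pool; returns nullptr if none is available.
    std::unique_ptr<Sprite> acquire(const std::string& name) {
        auto it = pools_.find(name);
        if (it == pools_.end() || it->second.empty()) return nullptr;
        std::unique_ptr<Sprite> s = std::move(it->second.back());
        it->second.pop_back();
        s->visible = true;
        return s;
    }

    // Hand a finished sprite to the manager: pooled sprites go back into
    // their named pool, anything else is destroyed when `s` goes out of scope.
    void release(std::unique_ptr<Sprite> s) {
        if (!s || s->poolName.empty()) return;
        s->visible = false;
        std::vector<std::unique_ptr<Sprite>>& pool = pools_[s->poolName];
        pool.push_back(std::move(s));
    }

    std::size_t available(const std::string& name) const {
        auto it = pools_.find(name);
        return it == pools_.end() ? 0 : it->second.size();
    }

private:
    std::unordered_map<std::string, std::vector<std::unique_ptr<Sprite>>> pools_;
};
```

For an animated sprite, `release` would be the call made from the animation-end callback, so the sprite returns to its pool as soon as the animation finishes.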
#37
05/12/2010 (3:25 pm)
Scott, thanks for the feedback. Initially I was always calling safeDelete when an object was no longer needed in that level, and that was the thing that was causing some lag at random times. Now I'm doing something similar to the way you do it. After an object has been used I put it back into the pool (I have about 6 different pools, so each pool stores only one type of sprite), then when the level is complete I run a function that cleans up the containers and deletes all of the objects that aren't needed anymore.

BTW, I also had a bit of a struggle figuring out how to pool animated sprites. In the beginning I didn't do any pooling of the animated sprites, and I was just creating them on demand (for explosions and things like that). Amazingly, this actually works fairly well, but there was some obvious lag, and I wasn't happy with it. In the end I decided to create a pool of animated sprites, and I use t2dAnimatedSprite::onAnimationEnd(), the same as you, for managing the pools.
#38
05/14/2010 (9:19 pm)
Just thought you might be interested to know. I now have all three versions of the iPod Touch available for testing, and here are my approximate results so far:

iPod Touch 3rd Gen: performance is amazing. FPS is right up there in the 50s and hardly changes, and CPU usage is around 60% at most.

iPod Touch 2nd Gen: this is what I've been using all along, so my game so far has been optimised for this. I'm seeing FPS in the 30-40 range, and CPU usage is around 65-70%. Gameplay is smooth, and no obvious bottlenecks.

iPod Touch 1st Gen: this is where things take a nosedive. FPS is down to about 15 or so, and the game is almost unplayable. The 1st gen really sucks bad compared to the others. CPU usage is around 75%, so it appears as though the bottleneck is the GPU, which is what I was thinking originally. I'm not 100% sure of that yet, but initial testing seems to suggest that the GPU is the problem.

My 1st gen is from 2007 and I'm wondering how many people have these devices compared to others. Do you think it is even worth bothering with? I'm not even sure if it's possible to make my game run fast enough on that device, even if I ported everything to C++. I think it's just that the engine is too slow for that device. What about the iPhone, how does that compare?

Any suggestions?

At the moment, I'm working on a minimal game with just a ball bouncing around the screen (very minimal scripting), so I can gauge how much I can get from the 1st gen device, and even that runs slowly, so I'm thinking of giving up on the 1st gen.

Does anyone else have any FPS comparisons, and has anyone been able to get fast enough framerates on the 1st generation devices?

I'm going to d/l some apps and try them on my 1st gen, and I'll let you know what I find.
#39
05/14/2010 (11:17 pm)
Ok, I now have a ball bouncing example that runs smoothly on the 1st gen. I think I am just trying to do too much graphics, and what I am trying to do is not possible, which is frustrating. My game is definitely not CPU limited, which makes sense, as there is very little difference in CPU between 1st and 2nd gen, which is the point I was trying to make before.

I guess I will need to rethink the whole thing, and if the 1st gen iPod is an important platform I think I will need to think about moving on and trying a different engine. Moving to C++ will improve things, obviously, but my scripting is so simple that I don't think it will make the game playable on 1st gen devices. The CPU is only running at around 70%, even on the 1st gen, so that kind of confirms it.

Are there any iTGB games out there that are fast action games that require fast FPS, so I can gauge whether I am doing something wrong somewhere? I've not seen a fast action game out there that shows any torque splash screens, and I d/l a lot of games.

The main problem that I am now seeing in the forums is that the iTorque community has become so small that most of the responses are biased and unreliable, and I'm losing confidence.

I'm going to evaluate Unity....

#40
05/15/2010 (2:00 am)
Mark, it sounds like your game is fine for 2nd gen touches - this is what I would consider the base "gaming" platform. Most avid app store gamers will have moved on from the 1st gen touch. Personally I would spend the effort on completing your game for the 2nd and 3rd gen touches, get it on the app store and then see what the response is.

Incidentally, iWT is now in the "What's hot" section of the app store in 21 countries (not US unfortunately) and "Staff Favourites" in two others, so the poor Torque performance on 1st gen touch obviously isn't a drag ;-)