TGEA 1.0.7 Beta 2 - Client problems during and after connection
by Jason "fireVein" Culwell · in Torque Game Engine Advanced · 04/03/2008 (8:16 am) · 13 replies
On a fresh install of the TGEA 1.0.7 SDK, with nothing modified at all, a client that tries to connect to my server crashes. The logfile reports several invalid datablocks. Having encountered this problem before, I went to the terrain texture editor and removed all references to the textures it listed (wrong path), and the client then connected fine. I believe this is already a known issue, however.
So, after the client is able to connect there is another issue... the scene isn't lit. Terrain is overbright, interiors are black, etc. It turns out that during loading the client completely skips the lighting phase, and if there is a *.ml already there it won't load the file. I haven't looked at any scripts or code yet to find out why a connecting client skips the lighting phase. I will take a look at them later when I have more time; I just wanted to get this bug report up ASAP.
edit: Sorry, forgot to check logfile :P I found this in it:
*** Phase 3: Mission Lighting
Invalid filename 'C:/Documents and Settings/Owner/Desktop/zwitter on Menotyou/scriptsAndAssets/data/missions/stronghold_b29c92csg.ml'. Failed to light mission.
Mission lighting done
Is the server telling the client which *.ml to use? I checked the name of the *.ml on the server, and it is identical to the one above. I tried copying it over to the client PC, which did not have a *.ml at all, and it still fails to light or load the *.ml.
I don't think that system specs or drivers are relevant to the issue, but if you want/need them I will post them. Same goes for the dxdiag data.
About the author
http://www.microdotproductions.com - I am a self taught programmer that has been hacking away at code for a little over 10 years now. I am a very passionate and persistent programmer, and gamer. I love challenges and problem solving.
#2
04/03/2008 (8:22 pm)
Issue resolved. It fixed itself. o.O I did nothing out of the norm. WEIRD No, seriously guys... is this thing alive?
#3
04/08/2008 (1:39 am)
I'm having this exact same issue. It seems to be a problem with datablocks not being sent correctly. I found that some of the SFX ones were referencing filenames with .wav instead of .ogg and fixed those, but now I'm getting crashes with ParticleData datablocks. I added some code that prevents the crashes (an out-of-bounds lookup in particledata->particleDataBlocks[]), but now the particle effects that were failing obviously don't show, and I get the fullbright mission as well. This is with the newly shipped version of 1.7.
It's also important to note that the problem will mysteriously 'go away' and come back again without any code changes.
Here's a log from the crashing, connecting client:
*** Phase 1: Download Datablocks & Targets
ParticleEmitterData((null)) unable to find particle datablock: 48
ParticleEmitterData((null)) unable to find particle datablock: 127
SFXProfile((null))::onAdd: Invalid packet, bad description id: 4
SFXProfile((null))::onAdd: The profile is missing a description!
ParticleEmitterData((null)) unable to find particle datablock: 135
SFXProfile((null))::onAdd: Invalid packet, bad description id: 8
SFXProfile((null))::onAdd: The profile is missing a description!
ParticleEmitterData((null)) unable to find particle datablock: 152
ParticleEmitterData((null)) unable to find particle datablock: 120
ParticleEmitterData((null)) unable to find particle datablock: 146
SFXProfile((null))::onAdd: Invalid packet, bad description id: 3
SFXProfile((null))::onAdd: The profile is missing a description!
SFXProfile((null))::onAdd: Invalid packet, bad description id: 7
SFXProfile((null))::onAdd: The profile is missing a description!
ParticleEmitterData((null)) unable to find particle datablock: 172
SFXProfile((null))::onAdd: Invalid packet, bad description id: 3
SFXProfile((null))::onAdd: The profile is missing a description!
SplashData::onAdd: Invalid packet, bad datablockId(particle emitter): 0x7b
SplashData::onAdd: Invalid packet, bad datablockId(particle emitter): 0x73
ParticleEmitterData((null)) unable to find particle datablock: 175
SFXProfile((null))::onAdd: Invalid packet, bad description id: 12
SFXProfile((null))::onAdd: The profile is missing a description!
ParticleEmitterData((null)) unable to find particle datablock: 62
ParticleEmitterData((null)) unable to find particle datablock: 64
ParticleEmitterData((null)) unable to find particle datablock: 142
ParticleEmitterData((null)) unable to find particle datablock: 143
#4
04/08/2008 (6:07 am)
This needs more research, but these problems appear to be related to a datablock sort that was added recently to the transmitDataBlocks() console method in gameConnection.cpp. If the sort actually rearranges the datablocks, I believe this causes a change in the ids assigned to the datablocks on the connecting client. This effectively breaks many datablock types that reference another datablock. (SFXProfile points to SFXDescription, ParticleEmitterData points to Particle, etc.)
I'm not sure why at this point, but sometimes the datablocks are already in sort-order and when that happens, everything works.
I believe the sort has something to do with the addition of the fast datablock loading on local clients that was also added to transmitDataBlocks(). Just eliminating the sort will probably break something else.
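As a rough illustration of why a rearranged transmit order would produce "unable to find" errors like the ones in the log above, here is a toy model. It is not engine code: WireDataBlock and countUnresolved are hypothetical names, and the model simply assumes a client resolves each reference against the datablocks it has already received.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a datablock on the wire: its own id plus the id
// of one datablock it references (0 means "no reference").
struct WireDataBlock {
    int id;
    int refId;
};

// Simulates a client receiving datablocks one at a time and resolving each
// reference against what has already arrived. Returns how many references
// pointed at a datablock that had not been received yet.
std::size_t countUnresolved(const std::vector<WireDataBlock>& stream) {
    std::set<int> received;
    std::size_t unresolved = 0;
    for (const WireDataBlock& db : stream) {
        if (db.refId != 0 && received.count(db.refId) == 0)
            ++unresolved;  // the referenced datablock does not exist yet
        received.insert(db.id);
    }
    return unresolved;
}
```

Feeding it a description (id 3) followed by a profile that references it (id 4) gives zero unresolved references; swapping the two gives one.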
#5
04/08/2008 (7:11 am)
GameConnection.cpp line 1286.. comment out the sort and please see if the issue persists:
//pGroup->sort();
This was an isolated addition from the local client changes based on some discussion in the forums. I can't find said discussion now due to the site's search problems...
#6
04/08/2008 (7:37 am)
So far, just commenting out the sort works reliably for me. If there's any validity to the sort, I suspect any problems would show up with local client connections.
#7
04/09/2008 (7:10 pm)
Cheers... this was giving me almost random but persistent "no particles" errors while joining a remote server.
#8
04/10/2008 (5:35 am)
Here's a little more info regarding problems caused by the sort() call in transmitDataBlocks().
The real reason the pGroup->sort() call was causing problems is that the sort was using an incorrect qsort comparison function. In short, pointers-to-pointers were being incorrectly cast to pointers, causing arbitrary values to be used for the comparison keys. The end result was that the pGroup->sort() was doing something more like a random shuffle than a sort. When the rearranged datablocks were then sent to a connecting client they were no longer in their original creation order. Therefore, the client might get an SFXProfile before its SFXDescription, or a ParticleEmitterData before its ParticleData. When these datablocks went looking for their references, they didn't exist.
Here is a fix for the comparison function in console/simManager.cpp, near line 406:
S32 QSORT_CALLBACK SimDataBlockGroup::compareModifiedKey(const void* a, const void* b)
{
   const SimDataBlock* dba = *((const SimDataBlock**)a);
   const SimDataBlock* dbb = *((const SimDataBlock**)b);
   return dba->getModifiedKey() - dbb->getModifiedKey();
   /* ORIGINAL CODE
   return (reinterpret_cast<const SimDataBlock* >(a))->getModifiedKey() -
          (reinterpret_cast<const SimDataBlock*>(b))->getModifiedKey();
   */
}
In practice, the pGroup->sort() call can probably be safely skipped in those apps that only call transmitDataBlocks() when a new client connects. The sort matters more in apps that modify their datablocks and then want to resend just the modified datablocks using additional transmitDataBlocks() calls.
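For anyone who wants to see the pointer-to-pointer issue outside the engine, here is a minimal standalone sketch using plain std::qsort. Block, compareByKey and sortBlocks are made-up names for illustration; only the double dereference mirrors the fix above. It uses explicit comparisons instead of subtraction, which is my own choice and not part of the posted fix.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for SimDataBlock; only the sort key matters here.
struct Block {
    int modifiedKey;
};

// qsort hands the comparator pointers to the array *elements*. The elements
// here are Block* values, so each argument really points at a Block*.
int compareByKey(const void* a, const void* b) {
    const Block* ba = *static_cast<const Block* const*>(a);
    const Block* bb = *static_cast<const Block* const*>(b);
    // Explicit comparisons sidestep any overflow from subtracting keys.
    return (ba->modifiedKey > bb->modifiedKey) - (ba->modifiedKey < bb->modifiedKey);
}

// Sorts an array of Block pointers by ascending modifiedKey.
void sortBlocks(std::vector<Block*>& blocks) {
    if (!blocks.empty())
        std::qsort(blocks.data(), blocks.size(), sizeof(Block*), compareByKey);
}
```

Casting the arguments straight to const Block* (as in the original code) would read the bytes of the pointer array itself as if they were a Block, which is where the arbitrary comparison keys come from.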
#9
04/11/2008 (1:54 pm)
I've been doing a fair amount of multi-player testing with the pGroup->sort() call restored and the comparison function fixed. I started seeing some bad datablock errors again and was ready to pull the sort, but upon closer examination, I found that it was happening in a couple of places where I had inadvertently repeated a datablock with the same name. This increased the datablock's modified-key and moved it to a later position in the sort order.
So in the end it was not a problem with the sort after all. The sort just makes the system less forgiving of situations where there are two datablocks with matching names. This is probably fine, since I wanted these to be unique datablocks but never noticed the errors until now.
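Here is a small model of that effect, under the assumption that redeclaring a name reuses the existing entry and stamps it with a fresh, higher modified key. Registry, Entry and the single running counter are hypothetical simplifications, not the engine's actual bookkeeping.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical registry entry: a name and the key assigned when the entry
// was last declared.
struct Entry {
    std::string name;
    int modifiedKey;
};

class Registry {
public:
    // Declaring an existing name reuses the entry and bumps its key
    // (assumed behaviour, modelled on the description in this post).
    void declare(const std::string& name) {
        for (Entry& e : entries_) {
            if (e.name == name) {
                e.modifiedKey = ++nextKey_;
                return;
            }
        }
        entries_.push_back({name, ++nextKey_});
    }

    // Names in ascending modified-key order, i.e. the order a key-based
    // sort would transmit them in.
    std::vector<std::string> sortedNames() const {
        std::vector<Entry> copy = entries_;
        std::stable_sort(copy.begin(), copy.end(),
                         [](const Entry& a, const Entry& b) {
                             return a.modifiedKey < b.modifiedKey;
                         });
        std::vector<std::string> names;
        for (const Entry& e : copy)
            names.push_back(e.name);
        return names;
    }

private:
    std::vector<Entry> entries_;
    int nextKey_ = 0;
};
```

Declaring "Desc", then "Profile", then "Desc" again puts "Profile" ahead of "Desc" in the sorted order, so a profile that depends on the description would arrive first.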
#10
05/01/2008 (2:10 pm)
For the next version of TGEA I have decided to pull out the sort since it can still have issues with duplicate datablock names and I would rather not impose the restriction of unique datablock names just yet.
I did include the fix to the sort function that Jeff posted and I added a console warning for when you have duplicate datablock names since I think those are both useful outside of transmitDataBlocks.
At some point in the future we probably will require unique datablock names but I wanted to get some input from our Associates and community on the matter before adding that restriction.
#11
05/01/2008 (2:39 pm)
Isn't there already a unique datablock name requirement? The critical piece of code is found here in compiledEval.cpp at line 417:
// Are we creating a datablock? If so, deal with case where we override
// an old one.
if(isDataBlock)
{
   // Con::printf(" - is a datablock");
   // Find the old one if any.
   SimObject *db = Sim::getDataBlockGroup()->findObject(callArgv[2]);
   // Make sure we're not changing types on ourselves...
   if(db && dStricmp(db->getClassName(), callArgv[1]))
   {
      Con::errorf(ConsoleLogEntry::General, "Cannot re-declare data block %s with a different class.", callArgv[2]);
      ip = failJump;
      break;
   }
   // If there was one, set the currentNewObject and move on.
   if(db)
      currentNewObject = db;
}
When a new datablock is created, it looks for a datablock with a matching name, and uses it if it exists, but it flags an error if the types don't match. This would seem to enforce a unique datablock naming scheme. I suppose there may be other useful ways to create datablocks though.
#12
05/01/2008 (5:05 pm)
I was thinking along the lines of mods/addons overriding an existing datablock of the same name and type with its own version, which the code above wouldn't stop you from doing (since they would be of the same class).
I'm sure there are plenty of bad things that could happen if you did that and we definitely don't encourage people to use that approach but I am leery to cut it off without a thorough review of the implications.
The sort is only a minor speed increase when you re-transmit changed datablocks to a client (which can be dangerous if not handled correctly) so I am at the point of feeling like it really isn't worth all of the hassle.
I am open to being swayed in either direction, however.
#13
05/02/2008 (7:05 am)
I see what you mean, not so much unique datablocks (one datablock per datablock name) but constant, locked, or read-only datablocks. I agree that enforcing that (preventing datablock overrides or redefinition) could be problematic. I don't think we do any datablock overrides in a multiplayer context, but in standalone, datablock redefinition is fundamental to being able to reload edited scripts while in-game. We even have a customization in our trunk that makes the datablock clone operator, ":" perform a copy when the datablock is an override rather than new.
The sort operation is not particularly important to my needs so you could leave it out or you could add an optional argument to transmitDatablocks() that can be used to explicitly turn it on by users who need it.
Still, I don't think one necessarily needs to enforce locked datablocks in order to support efficient repeat calls to transmitDatablocks(). I *think* the algorithm just needs to work like this:
For each transmitDatablocks() call on a specific client:
-- send all untransmitted datablocks to the client in creation order
-- send any previously transmitted datablocks updated since the last call to the client in any order
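The two steps above could be sketched roughly like this. It is only a planning function over made-up types (Block, ClientState, planTransmit), assuming the input list is already in creation order and that each datablock carries a modified key that grows whenever it is changed. Nothing here is taken from the actual transmitDataBlocks() implementation.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical datablock record: id plus a key that increases on each change.
struct Block {
    int id;
    int modifiedKey;
};

// Hypothetical per-client bookkeeping between transmit calls.
struct ClientState {
    std::set<int> sentIds;   // ids this client has already received
    int lastSeenKey = 0;     // highest modified key seen at the last call
};

// Returns the ids to send on this call. `all` is assumed to be in creation
// order. New datablocks go first, in creation order; previously sent ones
// that changed since the last call follow.
std::vector<int> planTransmit(const std::vector<Block>& all, ClientState& client) {
    std::vector<int> order;
    for (const Block& b : all)
        if (client.sentIds.count(b.id) == 0)
            order.push_back(b.id);
    for (const Block& b : all)
        if (client.sentIds.count(b.id) != 0 && b.modifiedKey > client.lastSeenKey)
            order.push_back(b.id);
    // Record what the client now has, for the next call.
    for (const Block& b : all) {
        client.sentIds.insert(b.id);
        client.lastSeenKey = std::max(client.lastSeenKey, b.modifiedKey);
    }
    return order;
}
```

With this shape, the first call for a client sends everything in creation order, and later calls only send additions plus edits, without ever needing a sort on the full group.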
Torque Owner Jason "fireVein" Culwell
*** Phase 1: Download Datablocks & Targets
SFXProfile((null))::onAdd: Invalid packet, bad description id: 6
Error, unable to load sound profile for precipitation datablock
SFXProfile((null))::onAdd: Invalid packet, bad description id: 4
Error, unable to load sound profile for precipitation datablock
I am using the same install as earlier. Other than removing the references to the textures in the terrain texture editor and re-saving the mission as mentioned above (and of course copying that mission to the other PC), nothing has been touched. Plus, just earlier the client was able to connect and load the mission (minus the lighting) just fine. I am confuzzled. I don't see why these errors would pop up now, all of a sudden.
Also, as far as the client throwing an error when lighting the scene, it appears that the server is sending the client a reference to its own local folder and file (which I should have realized from the console log earlier, because the path shown does not exist on the client PC). No wonder the client can't find the *.ml or won't light the scene. It looks like the mission path and file name are sent to the client via clientCmdMissionStartPhase3 and copied into $Client::MissionFile. In sceneLighting.cpp in SceneLighting::light() we see this:
// remove the '.mis' extension from the mission name
char misName[256];
dSprintf(misName, sizeof(misName), "%s", Con::getVariable("$Client::MissionFile"));
char * dot = dStrstr((const char*)misName, ".mis");
if(dot)
   *dot = '\0';
// get the mission name
getMLName(misName, missionCRC, 1023, mFileName);
if(!ResourceManager->isValidWriteFileName(mFileName))
{
   Con::warnf("Invalid filename '%s'. Failed to light mission.", mFileName);
   return(false);
}
It strips off the .mis on the end but doesn't fix the directory path. So of course isValidWriteFileName is going to return false, because the client and server more than likely do not have identical directory paths (unless an installer is used and the user doesn't change the install directory). This explains why lighting works fine server side but causes errors on connecting clients. I will verify this by copying the game to C:\test on both client and server and seeing what happens. I will have to install a fresh copy of the SDK and fix those texture references first, though.
It seems TGEA only likes to let clients connect for a little while before tossing those invalid datablock errors. Two SDK installs later... still getting the invalid datablock error. :\ Three, still getting the invalid datablock error. On another fresh install I tried using newMission, and now it just crashes with no errors in the log, right after it starts to download datablocks. I don't know what's up with it, and I have to go for right now, so I will mess with it some more in a little bit.
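To make the path theory in this post concrete, here is a standalone sketch of what "fixing the directory path" could look like: keep only the file name the server sent and join it to a directory that exists on the client. localMissionPath and the directory names in the example are hypothetical, and this is an untested illustration of the idea, not a confirmed fix for SceneLighting::light().

```cpp
#include <string>

// Keeps the file-name part of a server-supplied path and joins it to a
// client-side directory. Both '/' and '\\' are treated as separators.
std::string localMissionPath(const std::string& serverPath,
                             const std::string& localMissionDir) {
    std::string::size_type slash = serverPath.find_last_of("/\\");
    std::string fileName = (slash == std::string::npos)
                               ? serverPath
                               : serverPath.substr(slash + 1);
    if (localMissionDir.empty())
        return fileName;
    char last = localMissionDir.back();
    if (last == '/' || last == '\\')
        return localMissionDir + fileName;
    return localMissionDir + "/" + fileName;
}
```

For example, a server path like "C:/games/server/data/missions/stronghold.mis" combined with a client directory of "data/missions" would come out as "data/missions/stronghold.mis", regardless of where the server has the game installed.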