Dedicated Server Crashes on Client Connect
by Charlie Sibbach · in Torque Game Engine · 09/29/2005 (11:58 pm) · 11 replies
Hey all,
I seem to be having more than my share of problems lately; everything has been a hassle... compiler errors, linker errors, on and on.
Anyway, I finally got most everything running, but now I'm having trouble. I'm running a dedicated server on FreeBSD on my LAN on one machine, and I'm trying to connect with a client on OSX. On the server side, I get this:
--------- Starting Dedicated Server ---------
Exporting server prefs...
Starting multiplayer mode
Binding server port to default IP
UDP initialized on port 28000
Engine initialized...
Sending heartbeat to master server [IP:216.116.32.49:28002]
Received info request from a master server [IP:216.116.32.49:28002].
Got Connect challenge Request from IP:192.168.1.251:49160
Got Connect Request
Connect request from: IP:192.168.1.251:49160
%
And then it locks up, needing a force quit. On the client side, I get a time-out error. On the Mac, a local connection works just fine, so it will work, I know that much. Any idea why it would lock up here? I'm exploring the backtrace command and some of the built-in debugging, so hopefully I'll be able to figure this out myself when I'm not so tired, but I'm hoping there are some ideas.
I'm using the RTS starter kit. I've divided it into two separate modules, client and server, and stripped out the code the sides don't need, since we'll always be using Dedicated servers. Everything compiles, so that's not the problem, I don't think. I do get these errors as the server initializes (identical set for all the .dts files, I believe):
Loading compiled script midnightoilserver/server/scripts/items/building.cs.
Validation required for shape: midnightoilserver/data/shapes/building/building.dts
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'root'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'run'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'back'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'side'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'swimroot'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'swim'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'crouchroot'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'crouchforward'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'crawlroot'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'crawlforward'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'fall'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'jump'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'standjump'!
Warning: (game/player.cc @ 304) PlayerData::preload - Unable to find named animation sequence 'land'!
Is this normal for the RTS kit? It makes a certain bit of sense, since the RTS character models aren't as fully featured as, say, the FPS Orc. But it also seems odd that the basically stock kit has so many warnings. Is this preload problem related to the server crash? What does "validation" entail?
If I can get this to work, I'll finally get to test my PeerConnection class. We're trying to do incremental changes so that we don't break anything as we build from the bottom up, but it's been much more difficult than it seems like it should be. Maybe I just have bad luck. But I'm definitely feeling pretty comfortable with Torque now!
Thanks for any help, you guys are great!
#2
09/30/2005 (7:20 am)
Don't worry about the preload warnings for animations--as you originally deduced, the RTS units have a much more limited set of animations than "normal" units.
Your crash/lockup is actually much earlier in the sequence... I'd suggest you take a look at the resource I wrote called "TGE 1.3 Connection Sequence Overview" to get a better feel for how the dedicated server/dedicated client connection works.
FYI, this type of error sometimes happens when your Mac executable and your Windows/Linux executable are not exactly in sync, commonly caused by having slightly differing projects for each of the platforms.
#3
09/30/2005 (5:58 pm)
Thanks. I'll take a look at that resource; I've been going through the code and scripts related to this and I thought I understood them, but there's always more to see.
We're using CVS to manage the code, and being very careful to keep all three platforms in sync, so at least I can cross out-of-sync projects off my list. Or are you talking about the natural differences between the projects, the unavoidable stuff in the Platform layer?
#4
09/30/2005 (7:19 pm)
No, I meant specifically issues above the platform layer--mostly having certain files in one platform's project but not in the other. There were also some bug fixes for the Mac networking layer that should have made it into 1.3, but you may want to search the forums here to confirm.
The other thing to keep in mind is that .dso files are NOT cross-platform in 1.3, so you must not share them between Mac and Windows/Linux.
#5
10/01/2005 (12:55 am)
OK, thanks for clearing that up. We're not keeping any .dso's in the repository, so no cross-platform woes.
I've been going through the NetInterface code with a fine-toothed comb, and I've got a lot of debugging messages in there; it is indeed getting to onConnect, but as soon as it tries to talk to the client, it's locking up; so this does indeed point to problems earlier in the sequence... but according to the debug messages, everything seems to be working right up to that point!
Here's what I see right now on the server side. This is after starting a fresh dedicated server, and on the client side, querying the LAN and join()ing the server. It may or may not make sense, since it's all stuff I added:
--------- Starting Dedicated Server ---------
Exporting server prefs...
Starting multiplayer mode
Binding server port to default IP
UDP initialized on port 28000
Engine initialized...
Sending heartbeat to master server [IP:216.116.32.49:28002]
Received info request from a master server [IP:216.116.32.49:28002].
Got Connect challenge Request from IP:192.168.1.251:49169
// Generated AddressDigest
AddressDigest = -15251893.855967164.1597601954.-708047854
Got Connect Request from IP:192.168.1.251:49169
// Received address digest
AddressDigest = -15251893.855967164.1597601954.-708047854
// Validated
// Connection doesn't currently exist
!connection, making new connection
// Reads from incoming packet this string
Connection class of incoming conn: RTSConnection
// This is actually from the script function GameConnection::onConnectRequest in ClientConnection.cs
// Called during GameConnection::readConnectRequest()
Connect request from: IP:192.168.1.251:49169
// Now, it's back to handleConnectRequest
established connection! connectSequence = 0
after setNetworkConnection(true)
// entering onConnectionEstablished()
After adding to ClientGroup in onConnectionEstablished
mConnectArgv[0] = Case
made it to onConnect, with ^1042^Case
%
And it is here, in onConnect, after trying to write to the client, that it fails and locks up.
The code in question, excerpted from NetInterface::handleConnectRequest:
    Con::printf("!connection, making new connection");
    char connectionClass[255];
    stream->readString(connectionClass);
    Con::printf("Connection class of incoming conn: %s", connectionClass);
    ConsoleObject *co = ConsoleObject::create(connectionClass);
    NetConnection *conn = dynamic_cast<NetConnection *>(co);
    if(!conn || !conn->canRemoteCreate())
    {
        Con::printf("Creation of connection failed!");
        delete co;
        return;
    }
    conn->registerObject();
    conn->setNetAddress(address);
    conn->setNetworkConnection(true);
    conn->setSequence(connectSequence);
    const char *errorString = NULL;
    if(!conn->readConnectRequest(stream, &errorString))
    {
        sendConnectReject(conn, errorString);
        conn->deleteObject();
        return;
    }
    Con::printf("established connection! connectSequence = %d\n", connectSequence);
    conn->setNetworkConnection(true);
    Con::printf("after setNetworkConnection(true)");
    conn->onConnectionEstablished(false);
    Con::printf("after onConnectionEstablished(false)");
    conn->setEstablished();
    Con::printf("after setEstablished()");
I just don't get it; everything seems to be working just as it should. I hate to ask other people to do my debugging, but except for the additional Con::printf() statements, I haven't changed anything in this code.
PS: That resource you linked to is really great; it lets me validate what I'm seeing in the code.
#6
10/01/2005 (4:52 am)
Hmm... which code base are you installing (for what platform)? I apologize, but I kind of skimmed your post originally, and missed the FreeBSD portion... which implies that you used the Linux installer, but of course would have multiple problems as you mentioned with compilation, etc.
As you may know, FreeBSD does sockets a bit differently, and I'm assuming here that the issue is related to the OS, being that the Linux platform layer is exactly that--for Linux, and not tuned for FreeBSD.
Have any *nix gurus out there ported TGE to FreeBSD, and do you have any clues about what the problem may be?
#7
10/01/2005 (5:41 am)
Charlie: When I last downloaded the RTS kit I noticed that the Linux version was not the same as the Windows version. Local connections were fine but connecting to a dedicated server failed/hung/crashed out.
If you download the Windows source and the Linux source and run a diff, you'll see a few files differ. Merge those differences into your current source. That should resolve the network issue.
That is assuming you are having problems due to the Linux version and also that it's the same problem I had :P
#8
10/02/2005 (10:38 am)
Thanks a lot for your help! When you say the Windows/Linux/Mac source differs, you're talking about the installer itself, correct? As in, the GarageGames download itself. That makes much more sense. Stephen, I wasn't sure what you were talking about at first by differences in the projects: "it's exactly the same code on both machines!" is all I could think.
I forget exactly how we did it now, but our codebase is either the Windows or Mac install, imported into CVS and compiled on all the platforms. So, just so I'm clear, there are differences between the installers?
In any case, it took a bit of doing to get it to build on FreeBSD, but not all that much once I deciphered how the makefiles worked (very well, thank Buddha). I didn't imagine that there'd be problems networking on BSD, as sockets pretty much originated on BSD back in prehistory. Oh well. I was going to put out the config files as a resource if I got it working right, since I had to make my own conf.GCC3.FreeBSD.mk file. Of course, the issues with GCC3.4 had to be worked out first as well; what a hassle.
I'll try a few things. First, I'll try running it on Mandriva Linux and see if it's a BSD-Linux issue; if so, then I guess our server cluster will just be Linux based! If that doesn't work, I'll DL all three versions of the installer and pick through them with diff. Gary, do you have any specific memories of how those files differed or what the problem was? Thanks again for your help, guys!
#9
10/02/2005 (9:53 pm)
The Windows installer, due to various minor oversights, has slightly more recent code than the other two. A quick diff pass will highlight the differences - everything including the unix platform layer is more recent.
I cannot stress enough the usefulness of a good diff tool - something visual like Beyond Compare (for Windows; there are definitely unix/X versions out there as well).
BSD sockets are the de facto standard for this stuff, but we never tested on BSD, so some compile errors are likely to crop up. In general, though, there shouldn't be any show-stopper issues here, just a few minor hurdles to overcome.
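For a first pass before opening a visual diff tool, a small program can list which files differ between two unpacked installer trees. A rough sketch, assuming C++17 and two source trees already extracted side by side; differingFiles and slurp are hypothetical helpers, not part of Torque. It compares raw bytes, so a tree with CRLF line endings against one with LF will flag nearly every text file, which a real diff tool handles far better.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Read a whole file into memory; fine for source-sized files.
static std::string slurp(const fs::path& p)
{
    std::ifstream in(p, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: return the relative paths of files under `left`
// that are missing under `right` or whose bytes differ. Files that exist
// only under `right` are not reported; swap the arguments to find those.
std::vector<std::string> differingFiles(const fs::path& left, const fs::path& right)
{
    std::vector<std::string> out;
    if (!fs::is_directory(left))
        return out;

    for (const auto& entry : fs::recursive_directory_iterator(left))
    {
        if (!entry.is_regular_file())
            continue;
        const fs::path rel = entry.path().lexically_relative(left);
        const fs::path other = right / rel;
        if (!fs::is_regular_file(other) || slurp(entry.path()) != slurp(other))
            out.push_back(rel.generic_string());
    }
    std::sort(out.begin(), out.end()); // stable, readable ordering
    return out;
}
```

Running it once in each direction gives the full list of candidates to merge by hand.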
#10
10/03/2005 (4:13 pm)
I went back and figured out what version of the code we started with, and it is the Windows installer version. So, unless it is the new[er][ish] code that is broken, I guess I'm out of luck there.
Tonight I will test the server on Linux, and I'll post my results, good or bad. If it works, maybe somebody more interested can track down the root cause on BSD, as it's an interesting problem. If it doesn't, I'll try a fresh install of the RTS kit on both platforms, and see if that works. Then, I'll try the base Torque install. If that still doesn't work, well, then I'm going to be slightly perturbed.
As to compiling on BSD, no problems. I had to add some extra -l switches to the makefiles mainly, and make some changes so that it will compile using GCC3.4, but other than that, no errors- at least none that the compiler/linker is reporting.
#11
10/03/2005 (5:39 pm)
Without being too self-embarrassing, the Mac/Linux installers have always been less polished than the Windows versions, mostly because until just recently we haven't had an in-house Mac expert. We do now, and I would expect that once his changes to the Mac build for 1.4RC2 are polished, they will be rolled into the RTS-SK once we bring that to the 1.4 baseline.
All promises, no code, I know, but if in fact the issue lies in differences in the installers, that should be cleaned up in the future.
Torque Owner Charlie Sibbach
Bad datablocks can cause connection crashes, yes? And a bad datablock could be created by an invalid .dts file, yes? Or a .dts file missing certain animations?
datablock RTSUnitData(TestBuildingBlock : UnitBaseBlock)
{
    shapeFile = "~/data/shapes/building/building.dts";
    boundingBox = "10.0 10.0 3.0";
};
The question now is how or why these .dts files are not working, as the Mac side didn't have any trouble.
I forgot to mention that I've done the Added Player Positions resource, which added the swimming, crouching, etc stuff. However, the fact that it can't find even the Root animation is a bit strange.
Another edit: Duh, it's a building, it doesn't have any animations. The other characters do have the Root animation, at least. But they are missing "fall", "jump" etc, as well as the inevitable crouch, crawl and swim.
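The preload warnings boil down to a name lookup failing: a list of expected sequence names is checked against whatever the shape actually contains. A minimal sketch of that kind of check, written from the warning text rather than from the engine source; missingSequences is a hypothetical function, and it compares names exactly, which may not match how the engine treats case.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not PlayerData::preload): return every name in
// `required` that does not appear in `available`, preserving the order
// of `required`.
std::vector<std::string> missingSequences(const std::vector<std::string>& available,
                                          const std::vector<std::string>& required)
{
    std::vector<std::string> missing;
    for (const auto& name : required)
    {
        const bool found =
            std::find(available.begin(), available.end(), name) != available.end();
        if (!found)
            missing.push_back(name);
    }
    return missing;
}
```

For a shape with no sequences at all, such as the building, every required name comes back as missing, which matches the full set of warnings in the first post. A unit with only a few animations would report just the ones it lacks.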