Bug in dnet.cc (with fix)
by Tim McClarren · in Torque Game Engine · 08/15/2007 (12:37 pm) · 5 replies
In Torque 1.4 and earlier, there is code in ConnectionProtocol::processRawPacket (at the bottom):
If you look at "handlePacket":
Okay... let's find "connectionError":
See the problem?
(In general, I try to avoid putting code in objects that causes them to delete themselves... but...)
This should fix it, but it's probably not the best fix:
void ConnectionProtocol::processRawPacket(BitStream* pstream)
{
...
if(mLastSeqRecvd != pkSequenceNumber && pkPacketType == DataPacket)
handlePacket(pstream);
mLastSeqRecvd = pkSequenceNumber;
}If you look at "handlePacket":
void NetConnection::handlePacket(BitStream *bstream)
{
...
if(mErrorBuffer[0])
connectionError(mErrorBuffer);
}Okay... let's find "connectionError":
void GameConnection::connectionError(const char *errorString)
{
if(isConnectionToServer())
{
Con::printf("Connection error: %s.", errorString);
Con::executef(this, 2, "onConnectionError", errorString);
}
else
{
Con::printf("Client %d packet error: %s.", getId(), errorString);
setDisconnectReason("Packet Error.");
}
deleteObject();
}See the problem?
(In general, I try to avoid putting code in objects that causes them to delete themselves... but...)
This should fix it, but it's probably not the best fix:
void ConnectionProtocol::processRawPacket(BitStream* pstream)
{
...
if(mLastSeqRecvd != pkSequenceNumber)
{
mLastSeqRecvd = pkSequenceNumber;
if(pkPacketType == DataPacket)
handlePacket(pstream);
}
}About the author
#2
08/15/2007 (3:06 pm)
It's the write to mLastSeqRecvd after the delete. So "this" is invalid and that means that write could be writing somewhere it shouldn't be. I'd say it's a good fix to move the write before the possibility of deletion.
#3
08/15/2007 (3:12 pm)
This could potentially cause a crash if a client were to receive an error. Because after the deleteObject call in GameConnection::connectionError the callstack will unwind and will eventually get back to ConnectionProtocol::processRawPacket where it will then be trying to assign a member in an object that no longer exists.
#4
The right conditions being: thread behavior allows a context switch during "NetConnection::handlePacket", after "this" is deleted, and the thread being switched to is able to execute a memory allocation for which the memory allocator then returns the just-deleted-memory. Now the thread executing "NetConnection::handlePacket" writes into memory that is occupied by a new, valid allocation holding something else, not itself.
It sounds rare, but our crash reporting code reported 25 crashes with an execution stack that matches! (second highest) :/
08/15/2007 (3:53 pm)
It *does* cause a crash in a code-base that is threaded. Eventually. Under the right conditions.The right conditions being: thread behavior allows a context switch during "NetConnection::handlePacket", after "this" is deleted, and the thread being switched to is able to execute a memory allocation for which the memory allocator then returns the just-deleted-memory. Now the thread executing "NetConnection::handlePacket" writes into memory that is occupied by a new, valid allocation holding something else, not itself.
It sounds rare, but our crash reporting code reported 25 crashes with an execution stack that matches! (second highest) :/
#5
08/16/2007 (2:24 pm)
This same issue is in TGE 1.5.2
Torque Owner Stefan Lundmark
Is it that the GameConnection is deleted upon error? I guess not, but I'm curious.