NASM assembly failure, Linux: blender_asm.asm: op size not speci
by David Stewart Zink · in Torque Game Engine · 01/07/2003 (6:10 pm) · 18 replies
Downloaded top of tree a couple days ago, no modifications, OS=LINUX, COMPILER=GCC2, BUILD=DEBUG, I run "make dedicated" or "make", everything fine so far except:
(retyping cause it's on a different crt, 'scuse any typos):
--> Assembling terrain/blender_asm.asm
terrain/blender_asm.asm:1099: operation size not specified
Line in question is:
punpcklbw mm4, [zero]
I am using "nasm version 0.98.28 compiled on Apr 22 2002".
Is this a nasm version issue? Anyone have any clues? Bueller?
#2
01/08/2003 (1:19 am)
Not necessarily a real fix, since starting the dedicated server with "./torqueDemod_DEBUG.exe -dedicated -mission fps/data/missions/waterWorld.mis"
causes a SIGILL as soon as it tries to load the mission, in SSE_MatrixF_x_MatrixF (an assembly language routine) called from MatrixF::mul (as m_matF_x_matF).
Note that I tried 0.98 because that was the nasm that used to work last time I tried to get a project going with v12 a year + back.
#3
01/08/2003 (12:27 pm)
Unfortunately there are some versions of nasm 0.98 that work and some that don't. You could try pulling down some other versions:
sourceforge.net/project/showfiles.php?group_id=6208
This issue came up before in another forum post, but I can't find it (I don't think "nasm" was in the subject line of that post...)
#4
01/09/2003 (2:45 am)
http://www.garagegames.com/index.php?sec=mg&mod=forums&page=result.thread&qt=6813
I'm still having the same SIGILL problem.
#5
01/09/2003 (9:17 am)
Hmm...something fishy here. Will check it out this weekend.
#6
01/09/2003 (8:26 pm)
Please give me the exact versions of nasm that you both are using. For instance, I'm using:
NASM version 0.98.34 compiled on Jul 23 2002
(Redhat 8's nasm)
#7
01/11/2003 (7:15 pm)
As I mentioned before, I am using "nasm version 0.98.28 compiled on Apr 22 2002".
For some reason the nasm people supply RPM this and RPM that, in three different compression formats, but they don't just supply the binaries, which would be useful. So I have downloaded the source to nasm 0.98.34 and installed it as per the instructions in the package, and my nasm -r now prints "nasm version 0.98.34 compiled on Jan 11 2003".
Here's what I did:
I installed nasm 0.98.34 and verified that that was the only version left on the machine.
I ran "make clean ; make dedicated"
I cd to the examples directory.
I execute a script named "run" with the following contents:
===========
#!/bin/sh
./torqueDemod_DEBUG.exe -dedicated -mission fps/data/missions/waterWorld.mis
===========
I get the following output:
================
%
--------- Initializing MOD: Common ---------
% Loading compiled script common/client/canvas.cs.
% Loading compiled script common/client/audio.cs.
%
--------- Initializing MOD: FPS ---------
% Loading compiled script fps/client/init.cs.
% Loading compiled script fps/server/init.cs.
% Loading compiled script fps/data/init.cs.
% Loading compiled script fps/data/terrains/grassland/propertyMap.cs.
% Loading compiled script fps/data/terrains/scorched/propertyMap.cs.
%
--------- Initializing FPS: Server ---------
% Loading compiled script common/server/audio.cs.
% Loading compiled script common/server/server.cs.
% Loading compiled script common/server/message.cs.
% Loading compiled script common/server/commands.cs.
% Loading compiled script common/server/missionInfo.cs.
% Loading compiled script common/server/missionLoad.cs.
% Loading compiled script common/server/missionDownload.cs.
% Loading compiled script common/server/clientConnection.cs.
% Loading compiled script common/server/kickban.cs.
% Loading compiled script common/server/game.cs.
% Loading compiled script fps/server/scripts/commands.cs.
% Loading compiled script fps/server/scripts/centerPrint.cs.
% Loading compiled script fps/server/scripts/game.cs.
%
--------- Starting Dedicated Server ---------
% Exporting server prefs...
% Starting multiplayer mode
% Binding server port to default IP
% UDP initialized on port 28000
% Loading compiled script fps/server/scripts/audioProfiles.cs.
% Loading compiled script fps/server/scripts/camera.cs.
% Loading compiled script fps/server/scripts/markers.cs.
% ... Shape with old version.
% Loading compiled script fps/server/scripts/triggers.cs.
% Loading compiled script fps/server/scripts/inventory.cs.
% Loading compiled script fps/server/scripts/shapeBase.cs.
% Loading compiled script fps/server/scripts/item.cs.
% Loading compiled script fps/server/scripts/staticShape.cs.
% Loading compiled script fps/server/scripts/health.cs.
% ./run: line 2: 5678 Illegal instruction ./torqueDemod_DEBUG.exe -dedicated -mission fps/data/missions/waterWorld.mis
The SIGILL is still in SSE_MatrixF_x_MatrixF.
#8
01/11/2003 (8:09 pm)
What's your CPU? Distribution? Kernel version?
#9
01/13/2003 (12:23 am)
It's a Debian distribution, 2.2.19 kernel. Only 128 MB RAM, but the process never gets over 5 MB before it dies. 1 GHz Pentium III, 256 KB cache.
I dropped copies of nasm and torque behind this page:
http://www.spies.com/~zink/nasmdbg.html
so if you want you can check whether my machine, my executable, or my nasm seems flakier.
#10
01/13/2003 (8:40 am)
More infons:
It appears to be multiplying two identity matrices.
eip 0x820c2fc 0x820c2fc
Dump of assembler code for function SSE_MatrixF_x_MatrixF:
0x820c2f0: mov 0x4(%esp,1),%edx
0x820c2f4: mov 0x8(%esp,1),%ecx
0x820c2f8: mov 0xc(%esp,1),%eax
0x820c2fc: movss (%edx),%xmm0
0x820c300: movups (%ecx),%xmm1
0x820c303: shufps $0x0,%xmm0,%xmm0
0x820c307: movss 0x4(%edx),%xmm2
0x820c30c: mulps %xmm1,%xmm0
which looks like it's possibly crashing at the first %xmm0 reference in the path of execution.
Open questions:
Is "movss (%edx),%xmm0" valid for all P-III processors? (It's obviously bad for P-II, IIRC)
Does Linux need an option turned on to allow SSE or SSE2 access? (FP in general has numerous unpleasant implications for operating systems (relating to save/restore over unplanned context switches), I can believe they'd have an option to forbid it...)
Why doesn't MOVSS have anything to do with MOVing in or out of the SS register? (I know, not your department)
#11
01/13/2003 (9:00 am)
According to the reference, the CPU raises "Undefined Opcode":
If an unmasked SIMD floating-point exception and OSXMMEXCPT in CR4 is 0.
If EM in CR0 is set.
If OSFXSR in CR4 is 0.
If CPUID feature flag SSE is 0.
Unfortunately I don't know how to display the CR registers in gdb; apparently they aren't members of the set "all-registers".
Here's a quote from the manual, pay attention to the note on item 5, which is probably the problem:
Before an application attempts to use the SSE and/or SSE2 extensions, it should check that they
are present on the processor and that the operating system supports them. The application can
make this check by following these steps:
1. Check that the processor supports the CPUID instruction by attempting to execute the
CPUID instruction. If the processor does not support the CPUID instruction, it will
generate an invalid-opcode exception (#UD).
2. Check that the processor supports the SSE and/or SSE2 extensions. Execute the CPUID
instruction with an argument of 1 in the EAX register, and check that bit 25 (SSE) and/or
bit 26 (SSE2) are set to 1.
3. Check that the processor supports the FXSAVE and FXRSTOR instructions. Execute the
CPUID instruction with an argument of 1 in the EAX register, and check that bit 24
(FXSR) is set to 1.
4. Check that the operating system supports the FXSAVE and FXRSTOR instructions.
Execute a MOV instruction to read the contents of control register CR4, and check that bit
9 of CR4 (the OSFXSR bit) is set to 1.
5. Check that the operating system supports the SIMD floating-point exception handling.
Execute a MOV instruction to read the contents control register CR4, and check that bit 10
of CR4 (the OSXMMEXCPT bit) is set to 1.
NOTE
The OSFXSR and OSXMMEXCPT bits in control register CR4 must be set
by the operating system. The processor has no other way of detecting
operating-system support for the FXSAVE and FXRSTOR instructions or for
handling SIMD floating-point exceptions.
6. Check that emulation of the x87 FPU is disabled. Execute a MOV instruction to read the
contents control register CR0, and check that bit 2 of CR0 (the EM bit) is set to 0.
If the processor attempts to execute an unsupported SSE or a SSE2 instruction, the processor
will generate an invalid-opcode exception (#UD).
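Steps 2 and 3 of the quoted procedure can be sketched from user space roughly as follows. This is a minimal illustration, not Torque's actual check: it assumes GCC or Clang (for the `<cpuid.h>` helper `__get_cpuid`), and the names `CpuSimdFlags`, `decodeFeatureEdx`, and `queryCpuSimdFlags` are hypothetical. Steps 4 to 6 are deliberately absent, because reading CR4 or CR0 with MOV is a privileged operation that an ordinary process cannot perform.

```cpp
#include <cassert>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  // GCC/Clang helper header providing __get_cpuid
#endif

// Hypothetical holder for the three CPUID.1:EDX bits named in the manual text.
struct CpuSimdFlags {
    bool fxsr;  // bit 24: FXSAVE/FXRSTOR supported by the processor
    bool sse;   // bit 25: SSE supported by the processor
    bool sse2;  // bit 26: SSE2 supported by the processor
};

// Decode a raw EDX value as returned by CPUID with EAX = 1.
CpuSimdFlags decodeFeatureEdx(unsigned int edx) {
    CpuSimdFlags flags;
    flags.fxsr = ((edx >> 24) & 1u) != 0;
    flags.sse  = ((edx >> 25) & 1u) != 0;
    flags.sse2 = ((edx >> 26) & 1u) != 0;
    return flags;
}

// Query the running processor. Reports "nothing supported" on non-x86
// builds or when CPUID leaf 1 is unavailable.
CpuSimdFlags queryCpuSimdFlags() {
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
#if defined(__i386__) || defined(__x86_64__)
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        edx = 0;  // leaf 1 not available: treat every feature as absent
    }
#endif
    return decodeFeatureEdx(edx);
}
```

A positive result here only says the processor has the feature; it says nothing about whether the operating system has set OSFXSR and OSXMMEXCPT.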
#12
01/13/2003 (9:09 am)
Hmm, well, torque detects whether SSE is present and only installs that function if it is. The relevant code is in x86UNIXMath.cc. Your kernel is older...maybe it's the case that it is somehow "preventing" SSE access?
I guess the operating system support from 4 onward in your above post could be missing in that kernel...
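The "only installs that function if it is" pattern can be pictured as a function pointer that defaults to a portable routine. The sketch below is an illustration under assumptions, not the code in x86UNIXMath.cc: the pointer name `m_matF_x_matF` comes from this thread, but its signature, the row-major layout, and the names `MatMulFn`, `portable_matF_x_matF`, and `installMatrixMultiply` are hypothetical.

```cpp
#include <cassert>

// Assumed signature: two 4x4 input matrices and one output, as float arrays.
typedef void (*MatMulFn)(const float* a, const float* b, float* result);

// Portable fallback: result = a * b for 4x4 row-major matrices.
// result must not alias a or b.
static void portable_matF_x_matF(const float* a, const float* b, float* result) {
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a[row * 4 + k] * b[k * 4 + col];
            }
            result[row * 4 + col] = sum;
        }
    }
}

// The dispatch pointer starts out on the safe routine.
static MatMulFn m_matF_x_matF = portable_matF_x_matF;

// Switch to the SSE routine only when the CPU reports SSE, the OS is known
// to allow it, and a routine was actually supplied; otherwise stay portable.
void installMatrixMultiply(bool cpuHasSse, bool osAllowsSse, MatMulFn sseVersion) {
    if (cpuHasSse && osAllowsSse && sseVersion != nullptr) {
        m_matF_x_matF = sseVersion;
    } else {
        m_matF_x_matF = portable_matF_x_matF;
    }
}
```

With this shape, a kernel that does not enable SSE costs only speed: the identity-times-identity multiply from the crash report would run through the portable loop instead of raising SIGILL.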
#13
01/13/2003 (9:54 am)
If you look again at the quoted page, you'll notice the OS kernel needs to provide support for FX save and restore over context swaps as well as exception handling, which it is my bet the 2.2.19 kernels do not. And that's the core of the problem. I booted a 2.4.3 kernel I had lying around and it worked fine. Now I just have to remember why I'm not running that kernel by default.
In any case the Linux release notes will need to mention Linux kernel version requirements (if I didn't just overlook them) and we're half done.
ALSO, your code to check SSE is incomplete, as the quoted manual page also specifies: you also need to verify that the OS supports it, which you do by checking those two flags in CR4.
#14
01/13/2003 (10:07 am)
I wonder if the 2.2 kernel fundamentally doesn't support these operations or if it is configurable...I seem to recall at one point getting at least the dedicated server working on Debian 2.2 with the default kernel.
In any case, the requirements currently don't specify a kernel version. Maybe we'll have to specify 2.4 as the minimum version.
#15
03/05/2003 (12:46 pm)
I repeat: the problem is not with the kernel code, the problem is with the check within the torque code.
#16
03/05/2003 (1:13 pm)
I think there are two problems. One problem is that the 2.2 kernel apparently doesn't support SSE asm. That is a kernel problem. The other problem is that torque apparently has an incomplete SSE check, so it crashes instead of failing gracefully on a 2.2 kernel. If this is so, it is reasonable to fix it. I don't have time to spend on it now, but if you send me a fix for it, I will integrate it into the head.
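Since a user-mode process cannot read CR4 directly, one common way to fail gracefully is to execute a single SSE instruction under a temporary SIGILL handler and see whether it survives. The following is a sketch of that technique under assumptions (POSIX signals, GCC/Clang inline assembly), not a patch against the Torque tree; `gProbeJump`, `onIllegalInstruction`, and `osAllowsSse` are hypothetical names.

```cpp
#include <cassert>
#include <setjmp.h>
#include <signal.h>

// Jump target used to escape from the SIGILL handler during the probe.
static sigjmp_buf gProbeJump;

static void onIllegalInstruction(int) {
    siglongjmp(gProbeJump, 1);  // abandon the faulting instruction
}

// Returns true if one SSE instruction executes without raising SIGILL.
// Always returns false on non-x86 builds.
bool osAllowsSse() {
#if defined(__i386__) || defined(__x86_64__)
    struct sigaction probe = {}, previous = {};
    probe.sa_handler = onIllegalInstruction;
    sigemptyset(&probe.sa_mask);
    if (sigaction(SIGILL, &probe, &previous) != 0) {
        return false;  // could not install the handler: assume unsafe
    }

    volatile bool ok = false;
    // Saving the signal mask (second argument 1) means SIGILL is not left
    // blocked after jumping out of the handler.
    if (sigsetjmp(gProbeJump, 1) == 0) {
        __asm__ volatile("xorps %%xmm0, %%xmm0" : : : "xmm0");
        ok = true;  // reached only if the instruction did not fault
    }

    sigaction(SIGILL, &previous, nullptr);  // restore the earlier handler
    return ok;
#else
    return false;
#endif
}
```

A caller would run this once at startup, after the CPUID test has passed, and keep the non-SSE math routines if it returns false.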
#17
03/05/2004 (12:29 pm)
Hey, total Torque newbie here, but I'm pretty good with Linux. I'm running Debian Woody with a 2.4.21 kernel and nasm (0.98.28 - the Debian build) and I get the same nasm problem as in the original post. So, I can safely say that the problem is not a kernel issue (at least the compile part anyway). I haven't upgraded nasm yet to see if I can get the engine to build. I'll post more if/when that happens.
#18
03/05/2004 (1:08 pm)
Just a followup: I uninstalled the Debian Woody version of nasm, then compiled the 0.98.34 version from source, copied nasm into ~/bin, re-built, and it ran great. I don't have any textures in the demo app, but I'm guessing that's another problem altogether. :)