Mac OS X Hardware Identification re-write
by Sean OBrien · 07/31/2009 (1:17 pm) · 4 comments
The older versions of Torque used the antiquated Gestalt system calls to determine processor, memory, clock speed, etc. Now that Apple has deprecated the Gestalt calls, this resource updates the engine to use the currently supported method for retrieving that information from the Kernel.
This applies only to Mac builds of Torque 3D and TGEA 1.8.x, and since it only affects two files, it is a rather simple upgrade.
While this doesn't do anything for *you* specifically, it does replace the functions which the engine is using to determine the specs of the Mac it is running on: memory max/avail, processor family, type, core-type, number of cores, speed, etc. With this information, the engine can choose at runtime to execute different implementations of certain routines to maximize performance or to simply take advantage of more complex and/or special purpose hard-coded instructions of the chip. This type of "on the fly" optimization is in addition to the build-time magic that the compiler optimizer routines perform.
For instance, the engine defaults to a "Generic x86 Processor" if it can't determine what's what. However, properly ID'ing the processor in my MacBook as an Intel Core 2 Duo with a 'Penryn' core opens up the possibility of using the SSE, SSE2, SSE3, SSE3_ext, and SSE4_1 extended instruction sets; allows the engine to see that this processor actually has two independent, 2 GHz, 64-bit capable cores for executing instructions; and lets it know that there are just under 2 GB of memory available.
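To make the "choose at runtime" idea concrete, here is a minimal sketch of how a routine could be picked from the capability flags. It is not code from the engine: the two flag values are copied from the ProcessorProperties enum in this resource, while DotProductFn, dotGeneric, dotFourWide, and selectDotProduct are hypothetical names made up for the illustration, and dotFourWide is only a plain C++ stand-in for a real SSE or AltiVec routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Flag values copied from the ProcessorProperties enum in this resource.
enum ProcessorProperties : std::uint32_t
{
   CPU_PROP_SSE     = (1u << 4),
   CPU_PROP_ALTIVEC = (1u << 14)
};

// Hypothetical function-pointer type for a dot product routine.
typedef float (*DotProductFn)(const float* a, const float* b, std::size_t n);

// Plain C++ fallback that works on any processor.
float dotGeneric(const float* a, const float* b, std::size_t n)
{
   assert(a != nullptr && b != nullptr);
   float sum = 0.0f;
   for (std::size_t i = 0; i < n; ++i)
      sum += a[i] * b[i];
   return sum;
}

// Stand-in for a vectorized routine (assumption: a real build would use
// SSE or AltiVec intrinsics here). It keeps four independent accumulators,
// which is the shape a 4-wide SIMD loop would have.
float dotFourWide(const float* a, const float* b, std::size_t n)
{
   assert(a != nullptr && b != nullptr);
   float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
   std::size_t i = 0;
   for (; i + 4 <= n; i += 4)
      for (std::size_t k = 0; k < 4; ++k)
         lane[k] += a[i + k] * b[i + k];
   float sum = (lane[0] + lane[1]) + (lane[2] + lane[3]);
   for (; i < n; ++i)               // leftover elements
      sum += a[i] * b[i];
   return sum;
}

// Pick an implementation once, based on the detected capability flags.
DotProductFn selectDotProduct(std::uint32_t props)
{
   if (props & (CPU_PROP_SSE | CPU_PROP_ALTIVEC))
      return dotFourWide;
   return dotGeneric;
}
```

In the engine the flags would presumably come from Platform::SystemInfo.processor.properties after Processor::init() has run; the selection is done once and the returned pointer is reused.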
Both Torque 3D and TGEA 1.8 users need to do this first part. Only TGEA users need to update the second file at the bottom, however.
Replace the entire "Engine/source/platformMac/macCarbCPUInfo.cpp" file with this:
//-----------------------------------------------------------------------------
// Torque 3D
// Copyright (C) GarageGames.com, Inc.
//-----------------------------------------------------------------------------
#include <sys/types.h>
#include <sys/sysctl.h>
#include <math.h>
// Work around for OSX 10.4 SDK use. This is where the processor ID values
// are defined.
//#include <mach/machine.h>
#include "/Developer/SDKs/MacOSX10.5.sdk/usr/include/mach/machine.h"
#include "platformMac/platformMacCarb.h"
#include "platform/platformAssert.h"
#include "console/console.h"
#include "core/stringTable.h"
// Original code by Sean O'Brien ( http://www.garagegames.com/community/community/resources/view/17985 ).
// Reads sysctl() string value into buffer at DEST with maximum length MAXLEN
// Return: 0 on success, non-zero is error in accordance with stdlib and <errno.h>
int _getSysCTLstring(const char key[], char * dest, size_t maxlen) {
size_t len = 0;
int err;
// Call with NULL for 'dest' to have the required size stored in 'len'. If the 'key'
// doesn't exist, 'err' will be -1 and if all goes well, it will be 0.
err = sysctlbyname(key, NULL, &len, NULL, 0);
if (err == 0) {
AssertWarn((len <= maxlen), ("Insufficient buffer length for SYSCTL() read. Truncating.\n"));
if (len > maxlen)
len = maxlen;
// Call with actual pointers to 'dest' and clamped 'len' fields to perform the read.
err = sysctlbyname(key, dest, &len, NULL, 0);
}
return err;
}
// TEMPLATED Reads sysctl() integer value into variable DEST of type T
// The two predominant types used are unsigned longs and unsigned long longs
// and the size of the argument is on a case-by-case value. As a "guide" the
// resources at Apple claim that any "byte count" or "frequency" values will
// be returned as ULL's and most everything else will be UL's.
// Return: 0 on success, non-zero is error in accordance with stdlib and <errno.h>
template <typename T>
int _getSysCTLvalue(const char key[], T * dest) {
size_t len = 0;
int err;
// Call with NULL for 'dest' to get the size. If the 'key' doesn't exist, the
// 'err' returned will be -1, so 0 indicates success.
err = sysctlbyname(key, NULL, &len, NULL, 0);
if (err == 0) {
AssertFatal((len == sizeof(T)), "Mis-matched destination type for SYSCTL() read.\n");
// We're just double-checking that we're being called with the correct type of
// pointer for 'dest' so we don't clobber anything nearby when writing back.
err = sysctlbyname(key, dest, &len, NULL, 0);
}
return err;
}
// Short-hand routine to check specific CPU options and return a simple TRUE/FALSE value since
// we end up calling these same routines in the same pattern over and over again at the end.
int _supportsCPUOption(const char key[]) {
int err;
unsigned long lraw;
err = _getSysCTLvalue<unsigned long>( key, &lraw);
return ((err==0)&&(lraw==1));
}
Platform::SystemInfo_struct Platform::SystemInfo;
#define BASE_MHZ_SPEED 500000000
void Processor::init()
{
U32 procflags;
int err, cpufam, cputype, cpusub;
char buf[20];
unsigned long lraw;
unsigned long long llraw;
Con::printf("System Information:");
err = _getSysCTLstring("kern.ostype", buf, sizeof(buf));
if (err)
Con::printf(" Unable to determine OS type");
else
Con::printf(" Mac OS Kernel name: %s", buf);
err = _getSysCTLstring("kern.osrelease", buf, sizeof(buf));
if (err)
Con::printf(" Unable to determine OS release number");
else
Con::printf(" Mac OS Kernel version: %s", buf);
err = _getSysCTLvalue<unsigned long long>("hw.memsize", &llraw);
if (err)
Con::printf(" Unable to determine amount of installed RAM");
else
Con::printf(" Physical memory installed: %u MB", (U32)(llraw >> 20));
err = _getSysCTLvalue<unsigned long>("hw.usermem", &lraw);
if (err)
Con::printf(" Unable to determine available user address space");
else
Con::printf(" Addressable user memory: %d MB", (lraw >> 20));
////////////////////////////////
// Values for the Family Type, CPU Type and CPU Subtype are defined in the
// SDK files for the Mach Kernel ==> mach/machine.h
////////////////////////////////
Con::printf(" ");
Con::printf("Processor Information:");
// Determine CPU Family, Type, and Subtype
cpufam = 0;
cputype = 0;
cpusub = 0;
err = _getSysCTLvalue<unsigned long>("hw.cpufamily", &lraw);
if (err)
Con::printf(" Unable to determine 'family' of CPU");
else {
cpufam = (int) lraw;
err = _getSysCTLvalue<unsigned long>("hw.cputype", &lraw);
if (err)
Con::printf(" Unable to determine CPU type");
else {
cputype = (int) lraw;
err = _getSysCTLvalue<unsigned long>("hw.cpusubtype", &lraw);
if (err)
Con::printf(" Unable to determine CPU subtype");
else
cpusub = (int) lraw;
// If we've made it this far, report what we found.
Con::printf(" Installed processor ID: Family 0x%08x Type %d Subtype %d",cpufam, cputype,cpusub);
}
}
// CPU Frequency is returned as integer Hertz so we need to divide it down.
err = _getSysCTLvalue<unsigned long long>("hw.cpufrequency", &llraw);
if (err) {
// BASE_MHZ_SPEED is expressed in Hertz, so scale it down like the real value.
llraw = BASE_MHZ_SPEED / 1000000;
Con::printf(" Unable to determine CPU Frequency. Defaulting to %u MHz", (U32)llraw);
} else {
llraw /= 1000000;
Con::printf(" Installed processor clock frequency: %u MHz", (U32)llraw);
}
Platform::SystemInfo.processor.mhz = (unsigned int)llraw;
// Here's one that the original version of this routine couldn't do -- number
// of processor cores. Sending "hw.packages" to SYSCTL() will return the number
// of physical processor packages in the machine but we're after the number of
// cores or "logical" processors.
err = _getSysCTLvalue<unsigned long>("hw.ncpu", &lraw);
if (err)
Con::printf(" Unable to determine number of processor cores");
else
Con::printf(" Installed/available processor cores: %d", lraw);
// Now use CPUFAM to determine and then store the processor type
// and 'friendly name' in GG-accessible structure. Note that since
// we have access to the Family code, the Type and Subtypes are useless.
//
// NOTE: Even this level of detail is almost assuredly not needed anymore
// and the Optional Capability flags (further down) should be more than enough.
switch(cpufam)
{
case CPUFAMILY_POWERPC_G3:
Platform::SystemInfo.processor.type = CPU_PowerPC_G3;
Platform::SystemInfo.processor.name = StringTable->insert("PowerPC G3");
break;
case CPUFAMILY_POWERPC_G4:
Platform::SystemInfo.processor.type = CPU_PowerPC_G4;
Platform::SystemInfo.processor.name = StringTable->insert("PowerPC G4");
break;
case CPUFAMILY_POWERPC_G5:
Platform::SystemInfo.processor.type = CPU_PowerPC_G5;
Platform::SystemInfo.processor.name = StringTable->insert("PowerPC G5");
break;
case CPUFAMILY_INTEL_6_14:
Platform::SystemInfo.processor.type = CPU_Intel_Core;
Platform::SystemInfo.processor.name = StringTable->insert("Intel 'Yonah' Core Processor");
break;
case CPUFAMILY_INTEL_6_15:
Platform::SystemInfo.processor.type = CPU_Intel_Core2;
Platform::SystemInfo.processor.name = StringTable->insert("Intel 'Merom' Core Processor");
break;
case CPUFAMILY_INTEL_6_23:
Platform::SystemInfo.processor.type = CPU_Intel_Core2;
Platform::SystemInfo.processor.name = StringTable->insert("Intel 'Penryn' Core Processor");
break;
case CPUFAMILY_INTEL_6_26:
Platform::SystemInfo.processor.type = CPU_Intel_Core2;
Platform::SystemInfo.processor.name = StringTable->insert("Intel 'Nehalem' Core Processor");
break;
default:
// explain why we can't get the processor type.
Con::warnf(" Unknown Processor (family, type, subtype): 0x%x\t%d %d", cpufam, cputype, cpusub);
// for now, identify it as an x86 processor, because Apple is moving to Intel chips...
Platform::SystemInfo.processor.type = CPU_X86Compatible;
Platform::SystemInfo.processor.name = StringTable->insert("Unknown x86 Processor");
break;
}
// Now we can directly query the system about a litany of "Optional" processor capabilities
// to set the appropriate flags and throw info into the Log for the user.
procflags = 0;
// Seriously this one should be an Assert()
if (_supportsCPUOption("hw.optional.floatingpoint")) {
procflags |= CPU_PROP_FPU;
Con::printf(" Has hardware FPU");
}
// List of chip-specific features
if (_supportsCPUOption("hw.optional.mmx")) {
procflags |= CPU_PROP_MMX;
Con::printf(" Supports MMX");
}
if (_supportsCPUOption("hw.optional.sse")) {
procflags |= CPU_PROP_SSE;
Con::printf(" Supports SSE");
}
if (_supportsCPUOption("hw.optional.sse2")) {
procflags |= CPU_PROP_SSE2;
Con::printf(" Supports SSE 2");
}
if (_supportsCPUOption("hw.optional.sse3")) {
procflags |= CPU_PROP_SSE3;
Con::printf(" Supports SSE 3");
}
if (_supportsCPUOption("hw.optional.supplementalsse3")) {
procflags |= CPU_PROP_SSE3xt;
Con::printf(" Supports SSE 3 extensions");
}
if (_supportsCPUOption("hw.optional.sse4_1")) {
procflags |= CPU_PROP_SSE4_1;
Con::printf(" Supports SSE 4_1");
}
if (_supportsCPUOption("hw.optional.sse4_2")) {
procflags |= CPU_PROP_SSE4_2;
Con::printf(" Supports SSE 4_2");
}
if (_supportsCPUOption("hw.optional.altivec")) {
procflags |= CPU_PROP_ALTIVEC;
Con::printf(" Has AltiVec engine");
}
if (_supportsCPUOption("hw.cpu64bit_capable")) {
procflags |= CPU_PROP_64bit;
Con::printf(" Supports 64-bit operations");
}
Con::printf(" ");
// Finally some architecture-wide settings
err = _getSysCTLvalue<unsigned long>("hw.ncpu", &lraw);
if ((err==0)&&(lraw>1)) procflags |= CPU_PROP_MP;
err = _getSysCTLvalue<unsigned long>("hw.byteorder", &lraw);
if ((err==0)&&(lraw==1234)) procflags |= CPU_PROP_LE;
Platform::SystemInfo.processor.properties = procflags;
Con::printf("Detected %s running at %2.1f GHz.", Platform::SystemInfo.processor.name, ((float)Platform::SystemInfo.processor.mhz)/1000);
Con::printf(" ");
}
Then, for those using this with TGEA 1.8.x, change "Engine/source/platform/platform.h" as follows:
(1) Comment out the following sections:
/* DEPRECATED OLD ENUMS
/// Properties for x86 architecture chips.
enum x86Properties
{
CPU_PROP_C = (1<<0), ///< We should use C fallback math functions.
CPU_PROP_FPU = (1<<1), ///< Has an FPU. (It better!)
CPU_PROP_MMX = (1<<2), ///< Supports MMX instruction set extension.
CPU_PROP_3DNOW = (1<<3), ///< Supports AMD 3dNow! instruction set extension.
CPU_PROP_SSE = (1<<4), ///< Supports SSE instruction set extension.
CPU_PROP_RDTSC = (1<<5), ///< Supports Read Time Stamp Counter op.
CPU_PROP_SSE2 = (1<<6), ///< Supports SSE2 instruction set extension.
CPU_PROP_MP = (1<<7), ///< This is a multi-processor system.
};
/// Properties for PowerPC architecture chips.
enum PPCProperties
{
CPU_PROP_PPCMIN = (1<<0),
CPU_PROP_ALTIVEC = (1<<1), ///< Supports AltiVec instruction set extension.
CPU_PROP_PPCMP = (1<<7) ///< Multi-processor system
};
*/
(2) Insert the following code directly underneath the sections you just commented out:
/// Properties for CPU.
enum ProcessorProperties
{
CPU_PROP_C = (1<<0), ///< We should use C fallback math functions.
CPU_PROP_FPU = (1<<1), ///< Has an FPU. (It better!)
CPU_PROP_MMX = (1<<2), ///< Supports MMX instruction set extension.
CPU_PROP_3DNOW = (1<<3), ///< Supports AMD 3dNow! instruction set extension.
CPU_PROP_SSE = (1<<4), ///< Supports SSE instruction set extension.
CPU_PROP_RDTSC = (1<<5), ///< Supports Read Time Stamp Counter op.
CPU_PROP_SSE2 = (1<<6), ///< Supports SSE2 instruction set extension.
CPU_PROP_SSE3 = (1<<7), ///< Supports SSE3 instruction set extension.
CPU_PROP_SSE3xt = (1<<8), ///< Supports extended SSE3 instruction set
CPU_PROP_SSE4_1 = (1<<9), ///< Supports SSE4_1 instruction set extension.
CPU_PROP_SSE4_2 = (1<<10), ///< Supports SSE4_2 instruction set extension.
CPU_PROP_MP = (1<<11), ///< This is a multi-processor system.
CPU_PROP_LE = (1<<12), ///< This processor is LITTLE ENDIAN.
CPU_PROP_64bit = (1<<13), ///< This processor is 64-bit capable
CPU_PROP_ALTIVEC = (1<<14), ///< Supports AltiVec instruction set extension (PPC only).
};
That should be it. Compile and check your log at the very tippy-top to see the updated information.
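One caveat on the integer reads in the listing: the size check in _getSysCTLvalue assumes that `unsigned long` matches the width the kernel returns. That holds on a 32-bit build, but on a 64-bit build `unsigned long` is 8 bytes while keys such as hw.ncpu still return 4, so the AssertFatal would likely fire. The sketch below shows one way to tolerate both widths. It is only an illustration under stated assumptions: decodeSysctlInt and readSysctlInt are hypothetical names, not part of Torque, and outside of Mac OS X readSysctlInt simply reports failure so the file still compiles elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical helper: widen a raw 4- or 8-byte sysctl integer to 64 bits.
// Returns false for any other length (for example a string value).
bool decodeSysctlInt(const void* raw, std::size_t len, std::uint64_t& out)
{
   assert(raw != nullptr);
   if (len == sizeof(std::uint32_t))
   {
      std::uint32_t v = 0;
      std::memcpy(&v, raw, sizeof(v));
      out = v;
      return true;
   }
   if (len == sizeof(std::uint64_t))
   {
      std::uint64_t v = 0;
      std::memcpy(&v, raw, sizeof(v));
      out = v;
      return true;
   }
   return false;
}

// Hypothetical helper: read an integer sysctl key of either width.
// Returns true on success. On non-Apple platforms it always returns false.
bool readSysctlInt(const char* key, std::uint64_t& out)
{
#if defined(__APPLE__)
   unsigned char raw[sizeof(std::uint64_t)] = { 0 };
   std::size_t len = sizeof(raw);
   // The kernel writes the value and updates 'len' to the actual size.
   // A missing key, or a value larger than 8 bytes, makes the call fail.
   if (sysctlbyname(key, raw, &len, nullptr, 0) != 0)
      return false;
   return decodeSysctlInt(raw, len, out);
#else
   (void)key;
   (void)out;
   return false;
#endif
}
```

With something like this, a call such as readSysctlInt("hw.memsize", value) and readSysctlInt("hw.ncpu", value) could share one code path, at the cost of losing the strict type check that the templated version performs.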
#2
08/01/2009 (8:57 am)
Hey, Novack. I realized that I was a bit skimpy on the description so I went and filled it out a bit more as I should have originally. There are really two audiences for this: the TGEA 1.8 crowd I originally wrote this for and the T3D folks who are already using this because my code got accepted into the official code base.
For TGEA, this is an "ok, cool, my computer is recognized now, neat-o" thing. For T3D, on the other hand, this is the actual routine that is already being used. I'm simply trying to get a little feedback, because the method used (i.e. sysctlbyname() and its various key values) is a little bit wonky in the values it returns, so I need to be sure we're asking for and getting the correct information.
Now, nobody should think that simply because this routine is in place and the information is correct, Torque should suddenly run 2x faster! Instead, its purpose is two-fold: (1) the engine should be able to tell the specs of the machine because... well, why the heck not? And (2) given that information, the devs can make as many changes and tweaks to the internal routines as possible, which may not be much but is probably better than nothing.
Try it out! =)
#3
08/01/2009 (11:28 am)
Excellent Sean, thank you very much!
#4
09/12/2009 (3:59 am)
I'm using this with Torque 1.5, and I had to add CPU_Intel_Core and CPU_Intel_Core2 to the ProcessorType enum. 
Torque 3D Owner Novack
CyberianSoftware
Could you elaborate a bit on the benefits of this enhancement, for those of us ignorants? :)