When enabling NUMA under RHEL4 Update 1 or CentOS 4.1, the kernel crashes during boot

Problem:

When NUMA is enabled under RHEL4 Update 1 or CentOS 4.1, the kernel crashes during boot.

Details:

The BIOS on this Tyan Quad Opteron platform does not set up NUMA data correctly under ACPI. When the kernel attempts to retrieve the information, it crashes. A similar platform such as the Omega 8430 does not exhibit this symptom.

As a workaround, a customized 2.6.9-11.ELsmp kernel is provided. It is built with the build parameter CONFIG_ACPI_NUMA turned off, so it does not attempt to retrieve NUMA data through the ACPI interface.
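To tell whether a given kernel was built with CONFIG_ACPI_NUMA, its build configuration can be inspected. The sketch below assumes the kernel package installed its configuration as /boot/config-<release>, which is the usual location on RHEL and CentOS. The function name acpi_numa_state is hypothetical and only used for illustration.

```shell
# Report whether CONFIG_ACPI_NUMA is enabled in a kernel build config file.
# acpi_numa_state is an illustrative helper, not part of any distribution.
acpi_numa_state() {
    config_file="$1"
    if grep -q '^CONFIG_ACPI_NUMA=y' "$config_file" 2>/dev/null; then
        echo "enabled"
    elif grep -q '^# CONFIG_ACPI_NUMA is not set' "$config_file" 2>/dev/null; then
        echo "disabled"
    else
        # File missing, unreadable, or the option is not mentioned at all
        echo "unknown"
    fi
}

# Check the running kernel (assumes its config is at /boot/config-<release>)
acpi_numa_state "/boot/config-$(uname -r)"
```

On the customized kernel this would be expected to report "disabled"; a stock kernel that reports "enabled" is one that may still query ACPI for NUMA data.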

Solution:

First, disable NUMA temporarily by passing the kernel parameter 'numa=off' at the GRUB boot menu. Once the kernel boots successfully, mount the ASL Driver CD and install the customized 2.6.9-11.ELsmp kernel.

mount /mnt/cdrom
cd /mnt/cdrom/CentOS/4.1/NUMA
rpm -Uvh --force kernel*2.6.9-11.EL*.rpm

Afterward, reboot the system.
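
After the reboot, it can be useful to confirm which 'numa=' setting, if any, the running kernel was booted with. A parameter typed at the GRUB menu applies to that boot only, so the temporary 'numa=off' should no longer appear. The following is a minimal sketch; the helper name numa_boot_setting is hypothetical.

```shell
# Print the value of the numa= parameter found in a kernel command line,
# or "unset" if no such parameter is present.
# numa_boot_setting is an illustrative helper name.
numa_boot_setting() {
    # $1 is left unquoted on purpose so it splits into separate words
    for word in $1; do
        case "$word" in
            numa=*) echo "${word#numa=}"; return 0 ;;
        esac
    done
    echo "unset"
}

# Inspect the command line of the running kernel
numa_boot_setting "$(cat /proc/cmdline 2>/dev/null)"
```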

Note: Since NUMA is not enabled by default, use the following steps when reinstalling the operating system:

  1. Install RHEL4 Update 1 or CentOS 4.1
  2. Install the customized 2.6.9-11.ELsmp kernel from the ASL Driver CD
  3. Edit /etc/grub.conf and add the kernel parameter 'numa=on'
  4. Reboot the system
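
Step 3 can be done in any text editor. As a rough sketch of the same edit, the function below prints a copy of a GRUB configuration file with 'numa=on' appended to every kernel line that does not already carry a 'numa=' parameter. It writes to standard output and leaves the original file untouched. The function name add_numa_on and the temporary file path in the usage comment are hypothetical.

```shell
# Print a grub.conf with ' numa=on' appended to each 'kernel' line that has
# no numa= parameter yet. All other lines pass through unchanged.
# add_numa_on is an illustrative helper name.
add_numa_on() {
    sed -e '/^[[:space:]]*kernel[[:space:]]/{' \
        -e '/[[:space:]]numa=/!s/[[:space:]]*$/ numa=on/' \
        -e '}' "$1"
}

# Example usage (review the result before replacing the real file):
#   add_numa_on /etc/grub.conf > /tmp/grub.conf.new
#   diff /etc/grub.conf /tmp/grub.conf.new
```

Writing to a separate file first keeps a mistake in the boot configuration from leaving the system unbootable; the original /etc/grub.conf is only replaced after the diff has been checked.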