Linux kernel bugs

Brice Goglin edited this page Oct 25, 2022 · 24 revisions

The following hwloc error messages are caused by the Linux kernel reporting invalid topology information. Recent errors are listed first.

Single package with 2 modules exposed as 2 packages on Alder Lake-N

Fixed in Linux 6.1 in this commit:

commit 2b12a7a126d62bdbd81f4923c21bf6e9a7fbd069
Author: Zhang Rui <[email protected]>
Date:   Fri Oct 14 17:01:46 2022 +0800

    x86/topology: Fix multiple packages shown on a single-package system
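
To see how many packages the kernel itself reports, the `physical_package_id` files in sysfs can be counted. The following is a minimal sketch, assuming the standard `/sys/devices/system/cpu` layout; the helper name `count_packages` is hypothetical and not part of hwloc.

```python
import glob
import os


def count_packages(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Count distinct physical_package_id values reported by the kernel."""
    package_ids = set()
    pattern = os.path.join(
        sysfs_cpu_dir, "cpu[0-9]*", "topology", "physical_package_id"
    )
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                package_ids.add(f.read().strip())
        except OSError:
            continue  # skip CPUs whose topology file cannot be read
    return len(package_ids)


# 0 means that no topology files were found under the given directory.
print("packages reported by the kernel:", count_packages())
```

On an affected kernel, a single-package Alder Lake-N machine would be expected to report 2 here.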

Single core instead of 4 on HiFive Unmatched (riscv64)

See #536

Fixed in Linux 6.1 in this commit:

commit bf6cd1c01c959a31002dfa6784c0d8caffed4cf1
Author: Conor Dooley <[email protected]>
Date:   Tue Jul 5 20:04:34 2022 +0100

    riscv: dts: sifive: Add fu740 topology information

L3 shared by the entire package on Intel Skylake in Sub-NUMA Cluster mode

The L3 is reported as shared by the entire package instead of by a single NUMA node (half a package).

This is not a bug but rather imprecise information:

  • The L3 is shared by the entire package when the cores access memory attached to another package (topology is correct there).
  • But the L3 is only shared by half the cores when accessing the local NUMA nodes of the package (it uses the L3 in front of the target NUMA node).

Intel decided to keep the exposed topology as is, even though the information is incorrect in some cases. Details are available in this Linux commit (which fixes some scheduler issues but doesn't change the exposed topology):

commit 1340ccfa9a9afefdbab90d7935d4ed19817e37c2
Author: Alison Schofield <[email protected]>
Date:   Fri Apr 6 17:21:30 2018 -0700

    x86,sched: Allow topologies where NUMA nodes share an LLC
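
The situation can be observed directly in sysfs by comparing the CPUs sharing the L3 with the CPUs of a NUMA node. This is only a sketch: it assumes that cache `index3` of CPU 0 is its L3 and that `node0` is its local NUMA node, which may not hold on every machine. The helper names are hypothetical.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a Linux cpulist string such as "0-3,8-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def read_cpulist(path):
    """Return the CPUs listed in a sysfs file, or None if it is unreadable."""
    try:
        return parse_cpulist(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    # Assumption: index3 is the L3 of CPU 0 and node0 is its NUMA node.
    l3 = read_cpulist("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list")
    node = read_cpulist("/sys/devices/system/node/node0/cpulist")
    if l3 is None or node is None:
        print("sysfs topology files not available on this system")
    elif l3 > node:
        print("L3 spans more CPUs than NUMA node 0:", sorted(l3 - node))
    else:
        print("L3 does not extend beyond NUMA node 0")
```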

Incorrect PCI affinity on Xeon 9200 with hwloc 1.11.x

****************************************************************************
* hwloc 1.11.13 has encountered an incorrect PCI locality information.
* PCI bus 0000:40 is supposedly close to 2nd NUMA node of 1st package,
* however hwloc believes this is impossible on this architecture.
* Therefore the PCI bus will be moved to 1st NUMA node of 2nd package.
*
* If you feel this fixup is wrong, disable it by setting in your environment
* HWLOC_PCI_0000_40_LOCALCPUS= (empty value), and report the problem
* to the hwloc's user mailing list together with the XML output of lstopo.
*
* You may silence this message by setting HWLOC_HIDE_ERRORS=1 in your environment.
****************************************************************************

This warning is wrong and the fixup must be disabled. It appears on Xeon Skylake 9200, although the fixup was only designed for earlier platforms. Either upgrade to hwloc 2+ or set the environment variable named in the warning (here HWLOC_PCI_0000_40_LOCALCPUS=) to an empty value. If several warnings appear, set one variable per warning.

For the record, the fixup was designed for Haswell and Broadwell Xeon running in Cluster-on-Die mode. Early BIOS releases did not report PCI affinity correctly there. It looks like most platforms were fixed later. The fixup was broken by mistake in hwloc 2.0 and is therefore always disabled, which is why upgrading to hwloc 2+ avoids the warning.
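
When several such warnings appear, the variables can be derived from the warning text. The following Python sketch assumes the naming pattern shown in the warning itself (HWLOC_PCI_<domain>_<bus>_LOCALCPUS); the helper name `localcpus_overrides` and the sample text are hypothetical.

```python
import re

# Matches the bus identifier in lines such as
# "* PCI bus 0000:40 is supposedly close to ..."
PCI_BUS_RE = re.compile(r"PCI bus ([0-9a-fA-F]{4}):([0-9a-fA-F]{2})")


def localcpus_overrides(warning_text):
    """Return {variable: ""} for every PCI bus named in hwloc warnings.

    The variable name is built the way the warning spells it; an empty
    value disables the fixup for that bus.
    """
    overrides = {}
    for domain, bus in PCI_BUS_RE.findall(warning_text):
        overrides[f"HWLOC_PCI_{domain}_{bus}_LOCALCPUS"] = ""
    return overrides


sample = (
    "* PCI bus 0000:40 is supposedly close to 2nd NUMA node of 1st package,\n"
    "* PCI bus 0000:80 is supposedly close to 2nd NUMA node of 1st package,\n"
)
for name in localcpus_overrides(sample):
    # Shell-style assignments that can be pasted into an environment.
    print(f"export {name}=")
```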

Invalid L3 cpuset on 24-core AMD EPYC processor

****************************************************************************
* hwloc 1.11.8 has encountered what looks like an error from the operating system.
*
* L3 (cpuset 0x60000060) intersects with NUMANode (P#0 cpuset 0x3f00003f nodeset 0x00000001) without inclusion!
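
"Intersects without inclusion" means that the two cpusets overlap but neither contains the other. A small sketch of that check on the two masks from the message, treating each cpuset as a plain integer bitmask (the function name is hypothetical, not an hwloc API):

```python
def intersects_without_inclusion(a, b):
    """True if bitmasks a and b overlap but neither contains the other."""
    common = a & b
    return common != 0 and common != a and common != b


l3 = 0x60000060    # L3 cpuset from the message
node = 0x3F00003F  # NUMANode P#0 cpuset from the message

print(hex(l3 & node))                          # 0x20000020: PUs in both objects
print(hex(l3 & ~node))                         # 0x40000040: PUs in the L3 only
print(intersects_without_inclusion(l3, node))  # True
```

PUs 5 and 29 belong to both objects while PUs 6 and 30 belong to the L3 only, so the L3 straddles the NUMA node boundary.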

Fixed in Linux 4.14 in this commit (and backported to 4.13.16):

commit 2b83809a5e6d619a780876fcaf68cdc42b50d28c
Author: Suravee Suthikulpanit <[email protected]>
Date:   Mon Jul 31 10:51:59 2017 +0200

    x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask
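
Because of the backport, a plain "4.14 or later" comparison is not enough to tell whether a kernel has the fix. A minimal sketch of the check, using only the two versions quoted above (the function names are hypothetical). Distribution kernels may carry the fix under other version numbers, so the result is only indicative.

```python
import platform
import re


def parse_release(release):
    """Turn a release string like "4.13.16-generic" into (4, 13, 16)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def has_epyc_l3_fix(release):
    """Check a release against the versions quoted above: 4.14+, or 4.13.16+."""
    version = parse_release(release)
    if version >= (4, 14, 0):
        return True
    return version[:2] == (4, 13) and version[2] >= 16


if platform.system() == "Linux":
    release = platform.release()
    print(release, "has the fix:", has_epyc_l3_fix(release))
```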

Packages cut in halves on Intel Xeon E5 v3/v4 with Cluster-on-Die

Each dual-NUMA package is reported as two single-NUMA packages.

Fixed in Linux 3.18 in this commit:

commit cebf15eb09a2fd2fa73ee4faa9c4d2f813cf0f09
Author: Dave Hansen <[email protected]>
Date:   Thu Sep 18 12:33:34 2014 -0700

    x86, sched: Add new topology for multi-NUMA-node CPUs

Invalid PCI locality on Intel Xeon E5 v3/v4 with Cluster-on-Die

****************************************************************************
* hwloc 1.11.2 has encountered an incorrect PCI locality information.
* PCI bus 0000:80 is supposedly close to 2nd NUMA node of 1st package,
* however hwloc believes this is impossible on this architecture.
* Therefore the PCI bus will be moved to 1st NUMA node of 2nd package.
*
* If you feel this fixup is wrong, disable it by setting in your environment
* HWLOC_PCI_0000_80_LOCALCPUS= (empty value), and report the problem
* to the hwloc's user mailing list together with the XML output of lstopo.
*
* You may silence this message by setting HWLOC_HIDE_ERRORS=1 in your environment.

This problem may look similar to the previous one, but it is very different: it is a BIOS bug, so there is nothing to fix in the kernel. hwloc detects the issue and fixes it automatically.

Invalid L3 cpuset on AMD 12-core Opteron 6200/6300 (Bulldozer and Piledriver)

****************************************************************************
* Hwloc has encountered what looks like an error from the operating system.
*
* object (L3 cpuset 0x000003f0) intersection without inclusion!

The fix was NEVER pushed to Linux.

Use hwloc >=1.11.2 and set HWLOC_COMPONENTS=x86 in your environment to work around the issue.
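
If the variable should only apply to one program rather than to the whole session, it can be set for a single child process. A sketch under the assumption that the program (for instance `lstopo`) is installed and in the PATH; the helper names are hypothetical.

```python
import os
import subprocess


def x86_backend_env(base_env=None):
    """Copy an environment and add HWLOC_COMPONENTS=x86 as suggested above."""
    env = dict(os.environ if base_env is None else base_env)
    env["HWLOC_COMPONENTS"] = "x86"
    return env


def run_with_x86_backend(command):
    """Run a command (e.g. ["lstopo"]) with the workaround applied."""
    return subprocess.run(command, env=x86_backend_env(), check=False)
```

For example, `run_with_x86_backend(["lstopo"])` would launch lstopo with the workaround while leaving the calling environment untouched.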

Invalid NUMA cpuset on AMD Opteron 6200/6300 (Bulldozer and Piledriver)

****************************************************************************
* Hwloc has encountered what looks like an error from the operating system.
*
* Socket (P#2 cpuset 0x0000ffff,0x0) intersects with NUMANode (P#3 cpuset 0x0000ff00,0xff000000) without inclusion!

This is likely not a kernel bug but rather a BIOS reporting invalid SRAT information.

Upgrading the BIOS is the only chance to get a proper fix. Otherwise try hwloc >=1.11.2 and set HWLOC_COMPONENTS=x86 in your environment to work around the issue.
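
The cpusets in this message span more than 32 PUs and are therefore printed as several comma-separated words. The sketch below decodes them into PU indexes, assuming each substring is a 32-bit word with the most significant word first; `parse_cpuset` is a hypothetical helper, not an hwloc function.

```python
def parse_cpuset(text):
    """Convert a cpuset string such as "0x0000ffff,0x0" into a set of PU indexes.

    Assumption: substrings are 32-bit words, most significant first.
    """
    value = 0
    for word in text.split(","):
        value = (value << 32) | int(word, 16)
    return {bit for bit in range(value.bit_length()) if value >> bit & 1}


socket = parse_cpuset("0x0000ffff,0x0")
node = parse_cpuset("0x0000ff00,0xff000000")

print("in both:       ", sorted(socket & node))
print("socket only:   ", sorted(socket - node))
print("NUMA node only:", sorted(node - socket))
```

Under that assumption, PUs 40-47 belong to both objects, PUs 32-39 to the socket only and PUs 24-31 to the NUMA node only, which is the overlap without inclusion that hwloc complains about.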

L1 not shared between hyperthreads on early Pentium 4 SMT

L1 appears private to each PU instead of being shared per core.

Fixed in Linux 3.16 in this commit:

commit 2a2261553dd1472ca574acadbd93e12f44c4e6d5
Author: Peter Zijlstra <[email protected]>
Date:   Tue Jul 22 15:35:14 2014 +0200

    x86, cpu: Fix cache topology for early P4-SMT

Not fixed in the hwloc x86 backend (because we don't have hardware for testing).