(kernel-rolling) Add support for Hygon model 4h~7h and model 10h processors #213

MingcongBai
Contributor

@MingcongBai MingcongBai commented May 27, 2024

Picked and rebased from Gitee deepin-kernelsig/kernel #1.

From original pull request:

Update the CPU topology, microcode loading and QoS, NB, MCE, EDAC,
k10temp, i2c-piix4, audio drivers for Hygon family 18h model 4h~7h
and model 10h processors.

Reference:
https://gitee.com/OpenCloudOS/OpenCloudOS-Kernel/pulls/54
https://gitee.com/anolis/anck-next/pulls/11
https://gitee.com/OpenCloudOS/OpenCloudOS-Kernel-Stream/pulls/8

Builds tested

  • amd64
  • arm64
  • loong64

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zeno-sole for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@MingcongBai
Contributor Author

MingcongBai commented May 27, 2024

Skipped the following patches due to major changes in arch/x86:

* [cf2cdfd](https://github.com/deepin-community/kernel/commit/cf2cdfdf408e1bad37db9321679c42093eeabe58)

* [f80de0e](https://github.com/deepin-community/kernel/commit/f80de0ede9419e2c665caa743a2df9d27bec6263)

* [3d50b01](https://github.com/deepin-community/kernel/commit/3d50b01828f8ea14acf55469a22fc7bf061c6763)

For microcode loading support, please cross reference the changes in the following commits - I have no clue how to adapt these:

Liao Xuan added 25 commits May 29, 2024 08:35
Starting with model 4h, Hygon processors use CPUID leaf 0xB to derive the
core ID, socket ID and APIC ID with the SMT and CORE level types.
However, __max_die_per_package is still set to nodes_per_socket because
the DIE level type is not available.

Signed-off-by: Liao Xuan <[email protected]>
Add the PCI device IDs for Hygon family 18h model 4h processors.

Signed-off-by: Liao Xuan <[email protected]>
Add dedicated functions to initialize the northbridge for Hygon family
18h model 4h processors.

Signed-off-by: Liao Xuan <[email protected]>
The SB IOAPIC is on the device 0xb from Hygon family 18h model 4h.

Signed-off-by: Liao Xuan <[email protected]>
On Hygon family 18h platforms, we look at the 6th nibble (bits 20~23)
of the instance_id to derive the channel number.

Signed-off-by: Liao Xuan <[email protected]>
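
A rough illustration of the nibble extraction described in the commit message above. This is a standalone user-space sketch, not the kernel patch itself; the helper name `hygon_umc_channel` is hypothetical, and it assumes nibbles are counted from 1 starting at the least significant one, so the 6th nibble is bits 20~23.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): return bits 20..23 of
 * instance_id, i.e. the sixth nibble counting from the low end.
 */
static inline unsigned int hygon_umc_channel(uint32_t instance_id)
{
	return (instance_id >> 20) & 0xf;
}
```

For example, an `instance_id` of `0x00350000` has `0x3` in bits 20~23, so the sketch reports channel 3.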
Add support for Hygon family 18h model 4h to get UMC base, instance
number and determine DDR memory types.

Signed-off-by: Liao Xuan <[email protected]>
Add Hygon family 18h model 4h processor support for DramOffset and
HiAddrOffset, and get the socket interleaving number from DramBase-
Address(D18F0x110).

Update intlv_num_chan and num_intlv_bits support for Hygon family 18h
model 4h processor.

Signed-off-by: Liao Xuan <[email protected]>
The cpuinfo_x86.cpu_die_id is obtained from CPUID or an MSR since commit
0028c221ed19 ("x86/CPU/AMD: Save AMD NodeId as cpu_die_id"). However, the
value may not be contiguous for Hygon model 4h~6h processors.

cpuinfo_x86.logical_die_id always yields contiguous die (or node) IDs,
because the physical die ID is converted to a logical die ID.

So use topology_logical_die_id() instead of topology_die_id() to
decode UMC ECC errors for Hygon processors.

Signed-off-by: Liao Xuan <[email protected]>
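
To make the physical-to-logical conversion mentioned in the commit message above concrete, here is a minimal user-space sketch. It assumes logical IDs are handed out in first-seen order; the `die_map` structure, the `logical_die_id()` helper and the `MAX_DIES` bound are hypothetical and are not the kernel's implementation.

```c
#include <stddef.h>

#define MAX_DIES 16	/* arbitrary bound chosen for this sketch */

struct die_map {
	unsigned int phys[MAX_DIES];	/* physical die IDs seen so far */
	size_t count;			/* number of distinct IDs recorded */
};

/*
 * Return the logical ID for a physical die ID, assigning the next free
 * logical ID the first time a physical ID is seen.
 * Returns -1 if the table is full.
 */
static int logical_die_id(struct die_map *map, unsigned int phys_id)
{
	for (size_t i = 0; i < map->count; i++)
		if (map->phys[i] == phys_id)
			return (int)i;

	if (map->count == MAX_DIES)
		return -1;

	map->phys[map->count] = phys_id;
	return (int)map->count++;
}
```

With physical IDs 0, 4 and 6 fed in that order, the sketch returns 0, 1 and 2, which is the kind of gap-free numbering a decoder can use to index per-node data.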
The DF F3 device ID used to get the temperature for Hygon family 18h
model 4h processor is the same as 17H_M30H, but with different offsets,
which may span two distributed ranges. The second offset range can be
considered as private for Hygon, so use struct hygon_private to describe
it.

Add a pointer priv in k10temp_data to point to the private data.
Add functions k10temp_get_ccd_support_2nd() and hygon_read_temp()
to support reading the second offset range.

Signed-off-by: Liao Xuan <[email protected]>
Remove IMC detecting path for Hygon processors.

Signed-off-by: Liao Xuan <[email protected]>
Add support to calculate LLC ID from the number of threads sharing
the cache for Hygon family 18h model 5h processor.

Signed-off-by: Liao Xuan <[email protected]>
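
The build log later in this thread shows the patch computing the LLC ID as `c->apicid >> bits`. A minimal sketch of that idea follows, assuming `bits` is the rounded-up base-2 logarithm of the number of threads sharing the cache (similar in spirit to the kernel's `get_count_order()`). The helper names `count_order` and `llc_id_from_apicid` are hypothetical.

```c
#include <stdint.h>

/*
 * Smallest n such that (1 << n) >= count. The loop is capped at 31 so
 * the shift stays well defined for any 32-bit input.
 */
static unsigned int count_order(unsigned int count)
{
	unsigned int order = 0;

	while (order < 31 && (1u << order) < count)
		order++;
	return order;
}

/*
 * Assumption for this sketch: all threads sharing one last-level cache
 * have APIC IDs that differ only in the low count_order() bits, so
 * dropping those bits yields an ID common to the whole group.
 */
static uint32_t llc_id_from_apicid(uint32_t apicid, unsigned int threads_sharing)
{
	return apicid >> count_order(threads_sharing);
}
```

With 8 threads sharing the cache the shift is 3, so APIC IDs 24 through 31 all map to LLC ID 3.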
Add root and DF F1/F3/F4 device IDs for Hygon family 18h model
5h processors. Some model 5h processors have the legacy (M04H)
DF devices, so add an if condition to read the df1 register.

Signed-off-by: Liao Xuan <[email protected]>
Add Hygon family 18h model 5h processor support for amd64_edac.

Signed-off-by: Liao Xuan <[email protected]>
Add 18H_M05H DF F3 device ID to get the temperature for Hygon
family 18h model 5h processor.

Signed-off-by: Liao Xuan <[email protected]>
Hygon family 18h model 6h processor has the same DF F1 device ID
as M05H_DF_F1, but should get DF ID from DF F5 device.

Signed-off-by: Liao Xuan <[email protected]>
Add Hygon family 18h model 6h processor support for amd64_edac.

Signed-off-by: Liao Xuan <[email protected]>
Hygon family 18h model 6h has 2 cs mapped to 1 umc, so adjust for it.

Signed-off-by: Liao Xuan <[email protected]>
Add support for Hygon QoS feature.

Signed-off-by: Liao Xuan <[email protected]>
Add the new PCI ID 0x1d94 0x14a9 for Hygon family 18h model 5h
HDA controller.

Signed-off-by: Liao Xuan <[email protected]>
On the Hygon family 18h model 5h controller, some registers such as
GCTL, SD_CTL and SD_CTL_3B must be accessed as dwords (32 bits),
otherwise the write fails.

Signed-off-by: Liao Xuan <[email protected]>
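
One way to read the dword-access requirement above is that a narrower field has to be updated with a 32-bit read-modify-write. The sketch below models that in user space with a plain array standing in for the register file. All names here (`fake_regs`, `reg_read32`, `reg_write32`, `reg_write8_as_dword`) are hypothetical; a real driver would go through `readl()`/`writel()` on a mapped MMIO base and is not shown here.

```c
#include <stdint.h>

/* Stand-in for a small memory-mapped register file (16 bytes). */
static uint32_t fake_regs[4];

static uint32_t reg_read32(unsigned int off)
{
	return fake_regs[off / 4];
}

static void reg_write32(unsigned int off, uint32_t val)
{
	fake_regs[off / 4] = val;
}

/*
 * Update the single byte at byte offset `off` (must be < 16 in this
 * sketch) using only aligned 32-bit accesses.
 */
static void reg_write8_as_dword(unsigned int off, uint8_t val)
{
	unsigned int aligned = off & ~3u;
	unsigned int shift = (off & 3u) * 8;	/* little-endian byte lane */
	uint32_t v = reg_read32(aligned);

	v &= ~((uint32_t)0xff << shift);	/* clear the target byte */
	v |= (uint32_t)val << shift;		/* insert the new value */
	reg_write32(aligned, v);
}
```

A general caveat with this pattern is that the other bytes of the dword are written back as well, so write-1-to-clear status bits sharing the dword need extra care in real hardware code.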
Add Hygon family 18h model 7h processor support for amd_nb.

Signed-off-by: Liao Xuan <[email protected]>
Add Hygon family 18h model 7h processor support for amd64_edac.

Signed-off-by: Liao Xuan <[email protected]>
Get LLC ID from ApicId[3].

Signed-off-by: Liao Xuan <[email protected]>
Add root and DF F1/F3/F4 device IDs for Hygon family 18h model
10h processors.

Signed-off-by: Liao Xuan <[email protected]>
Add Hygon family 18h model 10h processor support for amd64_edac.

Signed-off-by: Liao Xuan <[email protected]>
Liao Xuan added 2 commits May 29, 2024 08:35
Add 18H_M10H DF F3 device ID to get the temperature for Hygon
family 18h model 10h processor.

Signed-off-by: Liao Xuan <[email protected]>
Add the new PCI ID 0x1d94 0x14c9 for Hygon family 18h model 10h
HDA controller.

Signed-off-by: Liao Xuan <[email protected]>
@MingcongBai MingcongBai force-pushed the bai/kernel-rolling/hygon-support branch from b931cb6 to 05817e3 Compare May 29, 2024 00:35
@deepin-ci-robot

deepin pr auto review

Intel HDA: Fix Hygon Dword Access

For Intel HDA, when using the Hygon driver, we need to disable the
access_sdnctl_in_dword flag, as it creates issues when accessing
the CCD registers.

For the Hygon 18h series, we also need to disable the
hygon_dword_access flag, as it creates issues when accessing
the CCD registers.

@MingcongBai
Contributor Author

Failed to build on amd64:

arch/x86/kernel/cpu/cacheinfo.c: In function ‘cacheinfo_hygon_init_llc_id’:
arch/x86/kernel/cpu/cacheinfo.c:717:25: error: ‘cpu_llc_id’ undeclared (first use in this function); did you mean ‘per_cpu_llc_id’?
  717 |                 per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
      |                         ^~~~~~~~~~
./include/linux/percpu-defs.h:219:54: note: in definition of macro ‘__verify_pcpu_ptr’
  219 |         const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;    \
      |                                                      ^~~
./include/linux/percpu-defs.h:269:35: note: in expansion of macro ‘per_cpu_ptr’
  269 | #define per_cpu(var, cpu)       (*per_cpu_ptr(&(var), cpu))
      |                                   ^~~~~~~~~~~
arch/x86/kernel/cpu/cacheinfo.c:717:17: note: in expansion of macro ‘per_cpu’
  717 |                 per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
      |                 ^~~~~~~
arch/x86/kernel/cpu/cacheinfo.c:717:25: note: each undeclared identifier is reported only once for each function it appears in
  717 |                 per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
      |                         ^~~~~~~~~~
./include/linux/percpu-defs.h:219:54: note: in definition of macro ‘__verify_pcpu_ptr’
  219 |         const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;    \
      |                                                      ^~~
./include/linux/percpu-defs.h:269:35: note: in expansion of macro ‘per_cpu_ptr’
  269 | #define per_cpu(var, cpu)       (*per_cpu_ptr(&(var), cpu))
      |                                   ^~~~~~~~~~~
arch/x86/kernel/cpu/cacheinfo.c:717:17: note: in expansion of macro ‘per_cpu’
  717 |                 per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
      |                 ^~~~~~~
In file included from ././include/linux/compiler_types.h:151,
                 from <command-line>:
arch/x86/kernel/cpu/cacheinfo.c:717:37: error: ‘cpu’ undeclared (first use in this function)
  717 |                 per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
      |                                     ^~~
./include/linux/compiler-gcc.h:35:33: note: in definition of macro ‘RELOC_HIDE’
   35 |         (typeof(ptr)) (__ptr + (off));                                  \
      |                                 ^~~
./include/linux/percpu-defs.h:236:9: note: in expansion of macro ‘SHIFT_PERCPU_PTR’
  236 |         SHIFT_PERCPU_PTR((ptr), per_cpu_offset((cpu)));                 \
      |         ^~~~~~~~~~~~~~~~
./include/linux/percpu-defs.h:236:33: note: in expansion of macro ‘per_cpu_offset’
  236 |         SHIFT_PERCPU_PTR((ptr), per_cpu_offset((cpu)));                 \
      |                                 ^~~~~~~~~~~~~~
./include/linux/percpu-defs.h:269:35: note: in expansion of macro ‘per_cpu_ptr’
  269 | #define per_cpu(var, cpu)       (*per_cpu_ptr(&(var), cpu))
      |                                   ^~~~~~~~~~~
arch/x86/kernel/cpu/cacheinfo.c:717:17: note: in expansion of macro ‘per_cpu’
  717 |                 per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
      |                 ^~~~~~~
arch/x86/kernel/cpu/cacheinfo.c:717:45: error: ‘struct cpuinfo_x86’ has no member named ‘apicid’
  717 |                 per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
      |                                             ^~
arch/x86/kernel/cpu/cacheinfo.c:733:53: error: ‘struct cpuinfo_x86’ has no member named ‘apicid’
  733 |                         per_cpu(cpu_llc_id, cpu) = c->apicid >> bits;
      |                                                     ^~

@Avenger-285714 @opsiff

matrix-wsk pushed a commit to matrix-wsk/kernel-6.6 that referenced this pull request Sep 4, 2024
[ Upstream commit 0f022d3 ]

When the mirred action is used on a classful egress qdisc and a packet is
mirrored or redirected to self we hit a qdisc lock deadlock.
See trace below.

[..... other info removed for brevity....]
[   82.890906]
[   82.890906] ============================================
[   82.890906] WARNING: possible recursive locking detected
[   82.890906] 6.8.0-05205-g77fadd89fe2d-dirty #213 Tainted: G        W
[   82.890906] --------------------------------------------
[   82.890906] ping/418 is trying to acquire lock:
[   82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[   82.890906]
[   82.890906] but task is already holding lock:
[   82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[   82.890906]
[   82.890906] other info that might help us debug this:
[   82.890906]  Possible unsafe locking scenario:
[   82.890906]
[   82.890906]        CPU0
[   82.890906]        ----
[   82.890906]   lock(&sch->q.lock);
[   82.890906]   lock(&sch->q.lock);
[   82.890906]
[   82.890906]  *** DEADLOCK ***
[   82.890906]
[..... other info removed for brevity....]

Example setup (eth0->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth0

Another example(eth0->eth1->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth1

tc qdisc add dev eth1 root handle 1: htb default 30
tc filter add dev eth1 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth0

We fix this by adding an owner field (CPU id) to struct Qdisc set after
root qdisc is entered. When the softirq enters it a second time, if the
qdisc owner is the same CPU, the packet is dropped to break the loop.

Reported-by: Mingshuai Ren <[email protected]>
Closes: https://lore.kernel.org/netdev/[email protected]/
Fixes: 3bcb846 ("net: get rid of spin_trylock() in net_tx_action()")
Fixes: e578d9c ("net: sched: use counter to break reclassify loops")
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Victor Nogueira <[email protected]>
Reviewed-by: Pedro Tammela <[email protected]>
Tested-by: Jamal Hadi Salim <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
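
The owner-field idea from the commit message above can be modelled in a few lines of user-space C. This is only a toy under stated assumptions: `toy_qdisc` and `toy_xmit` are hypothetical names, a recursive call stands in for the mirred redirect, and no real locking is involved. It is not the upstream patch.

```c
#include <stdbool.h>

struct toy_qdisc {
	int owner;	/* CPU currently inside this qdisc, or -1 if none */
	int sent;	/* packets that made it through */
	int dropped;	/* packets dropped to break a loop */
};

/*
 * Transmit one packet through q on `cpu`. redirects_left > 0 models a
 * filter that redirects the packet back to the same device.
 * Returns false if the packet was dropped.
 */
static bool toy_xmit(struct toy_qdisc *q, int cpu, int redirects_left)
{
	bool ok = true;

	if (q->owner == cpu) {		/* re-entered on the same CPU: a loop */
		q->dropped++;
		return false;
	}

	q->owner = cpu;			/* stands in for taking the qdisc lock */
	if (redirects_left > 0)
		ok = toy_xmit(q, cpu, redirects_left - 1);
	else
		q->sent++;
	q->owner = -1;			/* stands in for releasing the lock */
	return ok;
}
```

A plain transmit succeeds, while a packet redirected back to the same qdisc on the same CPU is dropped on re-entry, and in the real code that re-entry is where the lock would otherwise be taken a second time.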