Skip to content

Commit

Permalink
up to 6.1.12
Browse files Browse the repository at this point in the history
  • Loading branch information
unifreq committed Feb 15, 2023
1 parent df384fe commit 29c263d
Show file tree
Hide file tree
Showing 132 changed files with 1,211 additions and 615 deletions.
92 changes: 92 additions & 0 deletions Documentation/admin-guide/hw-vuln/cross-thread-rsb.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@

.. SPDX-License-Identifier: GPL-2.0
Cross-Thread Return Address Predictions
=======================================

Certain AMD and Hygon processors are subject to a cross-thread return address
predictions vulnerability. When running in SMT mode and one sibling thread
transitions out of C0 state, the other sibling thread could use return target
predictions from the sibling thread that transitioned out of C0.

The Spectre v2 mitigations protect the Linux kernel, as it fills the return
address prediction entries with safe targets when context switching to the idle
thread. However, KVM does allow a VMM to prevent exiting guest mode when
transitioning out of C0. This could result in a guest-controlled return target
being consumed by the sibling thread.

Affected processors
-------------------

The following CPUs are vulnerable:

- AMD Family 17h processors
- Hygon Family 18h processors

Related CVEs
------------

The following CVE entry is related to this issue:

============== =======================================
CVE-2022-27672 Cross-Thread Return Address Predictions
============== =======================================

Problem
-------

Affected SMT-capable processors support 1T and 2T modes of execution when SMT
is enabled. In 2T mode, both threads in a core are executing code. For the
processor core to enter 1T mode, it is required that one of the threads
requests to transition out of the C0 state. This can be communicated with the
HLT instruction or with an MWAIT instruction that requests non-C0.
When the thread re-enters the C0 state, the processor transitions back
to 2T mode, assuming the other thread is also still in C0 state.

In affected processors, the return address predictor (RAP) is partitioned
depending on the SMT mode. For instance, in 2T mode each thread uses a private
16-entry RAP, but in 1T mode, the active thread uses a 32-entry RAP. Upon
transition between 1T/2T mode, the RAP contents are not modified but the RAP
pointers (which control the next return target to use for predictions) may
change. This behavior may result in return targets from one SMT thread being
used by RET predictions in the sibling thread following a 1T/2T switch. In
particular, a RET instruction executed immediately after a transition to 1T may
use a return target from the thread that just became idle. In theory, this
could lead to information disclosure if the return targets used do not come
from trustworthy code.

Attack scenarios
----------------

An attack can be mounted on affected processors by performing a series of CALL
instructions with targeted return locations and then transitioning out of C0
state.

Mitigation mechanism
--------------------

Before entering idle state, the kernel context switches to the idle thread. The
context switch fills the RAP entries (referred to as the RSB in Linux) with safe
targets by performing a sequence of CALL instructions.

Prevent a guest VM from directly putting the processor into an idle state by
intercepting HLT and MWAIT instructions.

Both mitigations are required to fully address this issue.

Mitigation control on the kernel command line
---------------------------------------------

Use existing Spectre v2 mitigations that will fill the RSB on context switch.

Mitigation control for KVM - module parameter
---------------------------------------------

By default, the KVM hypervisor mitigates this issue by intercepting guest
attempts to transition out of C0. A VMM can use the KVM_CAP_X86_DISABLE_EXITS
capability to override those interceptions, but since this is not common, the
mitigation that covers this path is not enabled by default.

The mitigation for the KVM_CAP_X86_DISABLE_EXITS capability can be turned on
using the boolean module parameter mitigate_smt_rsb, e.g.:
kvm.mitigate_smt_rsb=1
1 change: 1 addition & 0 deletions Documentation/admin-guide/hw-vuln/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -18,3 +18,4 @@ are configurable at compile, boot or run time.
core-scheduling.rst
l1d_flush.rst
processor_mmio_stale_data.rst
cross-thread-rsb.rst
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
VERSION = 6
PATCHLEVEL = 1
SUBLEVEL = 11
SUBLEVEL = 12
EXTRAVERSION =
NAME = Hurr durr I'ma ninja sloth

Expand Down
4 changes: 2 additions & 2 deletions arch/arm64/boot/dts/amlogic/meson-axg.dtsi
Original file line number Diff line number Diff line change
Expand Up @@ -1885,7 +1885,7 @@
sd_emmc_b: sd@5000 {
compatible = "amlogic,meson-axg-mmc";
reg = <0x0 0x5000 0x0 0x800>;
interrupts = <GIC_SPI 217 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 217 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
clocks = <&clkc CLKID_SD_EMMC_B>,
<&clkc CLKID_SD_EMMC_B_CLK0>,
Expand All @@ -1897,7 +1897,7 @@
sd_emmc_c: mmc@7000 {
compatible = "amlogic,meson-axg-mmc";
reg = <0x0 0x7000 0x0 0x800>;
interrupts = <GIC_SPI 218 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 218 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
clocks = <&clkc CLKID_SD_EMMC_C>,
<&clkc CLKID_SD_EMMC_C_CLK0>,
Expand Down
6 changes: 3 additions & 3 deletions arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi
Original file line number Diff line number Diff line change
Expand Up @@ -2318,7 +2318,7 @@
sd_emmc_a: sd@ffe03000 {
compatible = "amlogic,meson-axg-mmc";
reg = <0x0 0xffe03000 0x0 0x800>;
interrupts = <GIC_SPI 189 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 189 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
clocks = <&clkc CLKID_SD_EMMC_A>,
<&clkc CLKID_SD_EMMC_A_CLK0>,
Expand All @@ -2330,7 +2330,7 @@
sd_emmc_b: sd@ffe05000 {
compatible = "amlogic,meson-axg-mmc";
reg = <0x0 0xffe05000 0x0 0x800>;
interrupts = <GIC_SPI 190 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 190 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
clocks = <&clkc CLKID_SD_EMMC_B>,
<&clkc CLKID_SD_EMMC_B_CLK0>,
Expand All @@ -2342,7 +2342,7 @@
sd_emmc_c: mmc@ffe07000 {
compatible = "amlogic,meson-axg-mmc";
reg = <0x0 0xffe07000 0x0 0x800>;
interrupts = <GIC_SPI 191 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 191 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
clocks = <&clkc CLKID_SD_EMMC_C>,
<&clkc CLKID_SD_EMMC_C_CLK0>,
Expand Down
6 changes: 3 additions & 3 deletions arch/arm64/boot/dts/amlogic/meson-gx.dtsi
Original file line number Diff line number Diff line change
Expand Up @@ -611,21 +611,21 @@
sd_emmc_a: mmc@70000 {
compatible = "amlogic,meson-gx-mmc", "amlogic,meson-gxbb-mmc";
reg = <0x0 0x70000 0x0 0x800>;
interrupts = <GIC_SPI 216 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 216 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
};

sd_emmc_b: mmc@72000 {
compatible = "amlogic,meson-gx-mmc", "amlogic,meson-gxbb-mmc";
reg = <0x0 0x72000 0x0 0x800>;
interrupts = <GIC_SPI 217 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 217 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
};

sd_emmc_c: mmc@74000 {
compatible = "amlogic,meson-gx-mmc", "amlogic,meson-gxbb-mmc";
reg = <0x0 0x74000 0x0 0x800>;
interrupts = <GIC_SPI 218 IRQ_TYPE_EDGE_RISING>;
interrupts = <GIC_SPI 218 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
};
};
Expand Down
4 changes: 2 additions & 2 deletions arch/arm64/boot/dts/mediatek/mt8195.dtsi
Original file line number Diff line number Diff line change
Expand Up @@ -1966,7 +1966,7 @@
};

vdosys0: syscon@1c01a000 {
compatible = "mediatek,mt8195-mmsys", "syscon";
compatible = "mediatek,mt8195-vdosys0", "mediatek,mt8195-mmsys", "syscon";
reg = <0 0x1c01a000 0 0x1000>;
mboxes = <&gce0 0 CMDQ_THR_PRIO_4>;
#clock-cells = <1>;
Expand Down Expand Up @@ -2101,7 +2101,7 @@
};

vdosys1: syscon@1c100000 {
compatible = "mediatek,mt8195-mmsys", "syscon";
compatible = "mediatek,mt8195-vdosys1", "syscon";
reg = <0 0x1c100000 0 0x1000>;
#clock-cells = <1>;
};
Expand Down
2 changes: 0 additions & 2 deletions arch/arm64/boot/dts/rockchip/rk3399.dtsi
Original file line number Diff line number Diff line change
Expand Up @@ -2251,13 +2251,11 @@
pcfg_input_pull_up: pcfg-input-pull-up {
input-enable;
bias-pull-up;
drive-strength = <2>;
};

pcfg_input_pull_down: pcfg-input-pull-down {
input-enable;
bias-pull-down;
drive-strength = <2>;
};

clock {
Expand Down
2 changes: 1 addition & 1 deletion arch/arm64/boot/dts/rockchip/rk3568-rock-3a.dts
Original file line number Diff line number Diff line change
Expand Up @@ -642,7 +642,7 @@
disable-wp;
pinctrl-names = "default";
pinctrl-0 = <&sdmmc0_bus4 &sdmmc0_clk &sdmmc0_cmd &sdmmc0_det>;
sd-uhs-sdr104;
sd-uhs-sdr50;
vmmc-supply = <&vcc3v3_sd>;
vqmmc-supply = <&vccio_sd>;
status = "okay";
Expand Down
6 changes: 4 additions & 2 deletions arch/powerpc/kernel/interrupt.c
Original file line number Diff line number Diff line change
Expand Up @@ -50,16 +50,18 @@ static inline bool exit_must_hard_disable(void)
*/
static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
{
bool must_hard_disable = (exit_must_hard_disable() || !restartable);

/* This must be done with RI=1 because tracing may touch vmaps */
trace_hardirqs_on();

if (exit_must_hard_disable() || !restartable)
if (must_hard_disable)
__hard_EE_RI_disable();

#ifdef CONFIG_PPC64
/* This pattern matches prep_irq_for_idle */
if (unlikely(lazy_irq_pending_nocheck())) {
if (exit_must_hard_disable() || !restartable) {
if (must_hard_disable) {
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
__hard_RI_enable();
}
Expand Down
8 changes: 5 additions & 3 deletions arch/riscv/kernel/probes/kprobes.c
Original file line number Diff line number Diff line change
Expand Up @@ -65,16 +65,18 @@ static bool __kprobes arch_check_kprobe(struct kprobe *p)

int __kprobes arch_prepare_kprobe(struct kprobe *p)
{
unsigned long probe_addr = (unsigned long)p->addr;
u16 *insn = (u16 *)p->addr;

if (probe_addr & 0x1)
if ((unsigned long)insn & 0x1)
return -EILSEQ;

if (!arch_check_kprobe(p))
return -EILSEQ;

/* copy instruction */
p->opcode = *p->addr;
p->opcode = (kprobe_opcode_t)(*insn++);
if (GET_INSN_LENGTH(p->opcode) == 4)
p->opcode |= (kprobe_opcode_t)(*insn) << 16;

/* decode instruction */
switch (riscv_probe_decode_insn(p->addr, &p->ainsn.api)) {
Expand Down
3 changes: 2 additions & 1 deletion arch/riscv/kernel/stacktrace.c
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
fp = (unsigned long)__builtin_frame_address(0);
sp = current_stack_pointer;
pc = (unsigned long)walk_stackframe;
level = -1;
} else {
/* task blocked in __switch_to */
fp = task->thread.s[0];
Expand All @@ -41,7 +42,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
unsigned long low, high;
struct stackframe *frame;

if (unlikely(!__kernel_text_address(pc) || (level++ >= 1 && !fn(arg, pc))))
if (unlikely(!__kernel_text_address(pc) || (level++ >= 0 && !fn(arg, pc))))
break;

/* Validate frame pointer */
Expand Down
4 changes: 3 additions & 1 deletion arch/riscv/mm/cacheflush.c
Original file line number Diff line number Diff line change
Expand Up @@ -83,8 +83,10 @@ void flush_icache_pte(pte_t pte)
{
struct page *page = pte_page(pte);

if (!test_and_set_bit(PG_dcache_clean, &page->flags))
if (!test_bit(PG_dcache_clean, &page->flags)) {
flush_icache_all();
set_bit(PG_dcache_clean, &page->flags);
}
}
#endif /* CONFIG_MMU */

Expand Down
1 change: 1 addition & 0 deletions arch/x86/include/asm/cpufeatures.h
Original file line number Diff line number Diff line change
Expand Up @@ -463,5 +463,6 @@
#define X86_BUG_MMIO_UNKNOWN X86_BUG(26) /* CPU is too old and its MMIO Stale Data status is unknown */
#define X86_BUG_RETBLEED X86_BUG(27) /* CPU is affected by RETBleed */
#define X86_BUG_EIBRS_PBRSB X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
#define X86_BUG_SMT_RSB X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */

#endif /* _ASM_X86_CPUFEATURES_H */
9 changes: 7 additions & 2 deletions arch/x86/kernel/cpu/common.c
Original file line number Diff line number Diff line change
Expand Up @@ -1235,6 +1235,8 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
#define MMIO_SBDS BIT(2)
/* CPU is affected by RETbleed, speculating where you would not expect it */
#define RETBLEED BIT(3)
/* CPU is affected by SMT (cross-thread) return predictions */
#define SMT_RSB BIT(4)

static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
Expand Down Expand Up @@ -1266,8 +1268,8 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {

VULNBL_AMD(0x15, RETBLEED),
VULNBL_AMD(0x16, RETBLEED),
VULNBL_AMD(0x17, RETBLEED),
VULNBL_HYGON(0x18, RETBLEED),
VULNBL_AMD(0x17, RETBLEED | SMT_RSB),
VULNBL_HYGON(0x18, RETBLEED | SMT_RSB),
{}
};

Expand Down Expand Up @@ -1385,6 +1387,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
!(ia32_cap & ARCH_CAP_PBRSB_NO))
setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);

if (cpu_matches(cpu_vuln_blacklist, SMT_RSB))
setup_force_cpu_bug(X86_BUG_SMT_RSB);

if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;

Expand Down
43 changes: 32 additions & 11 deletions arch/x86/kvm/x86.c
Original file line number Diff line number Diff line change
Expand Up @@ -192,6 +192,10 @@ module_param(enable_pmu, bool, 0444);
bool __read_mostly eager_page_split = true;
module_param(eager_page_split, bool, 0644);

/* Enable/disable SMT_RSB bug mitigation */
bool __read_mostly mitigate_smt_rsb;
module_param(mitigate_smt_rsb, bool, 0444);

/*
* Restoring the host value for MSRs that are only consumed when running in
* usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU
Expand Down Expand Up @@ -4435,10 +4439,15 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = KVM_CLOCK_VALID_FLAGS;
break;
case KVM_CAP_X86_DISABLE_EXITS:
r |= KVM_X86_DISABLE_EXITS_HLT | KVM_X86_DISABLE_EXITS_PAUSE |
KVM_X86_DISABLE_EXITS_CSTATE;
if(kvm_can_mwait_in_guest())
r |= KVM_X86_DISABLE_EXITS_MWAIT;
r = KVM_X86_DISABLE_EXITS_PAUSE;

if (!mitigate_smt_rsb) {
r |= KVM_X86_DISABLE_EXITS_HLT |
KVM_X86_DISABLE_EXITS_CSTATE;

if (kvm_can_mwait_in_guest())
r |= KVM_X86_DISABLE_EXITS_MWAIT;
}
break;
case KVM_CAP_X86_SMM:
/* SMBASE is usually relocated above 1M on modern chipsets,
Expand Down Expand Up @@ -6214,15 +6223,26 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
break;

if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
kvm_can_mwait_in_guest())
kvm->arch.mwait_in_guest = true;
if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
kvm->arch.hlt_in_guest = true;
if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
kvm->arch.pause_in_guest = true;
if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
kvm->arch.cstate_in_guest = true;

#define SMT_RSB_MSG "This processor is affected by the Cross-Thread Return Predictions vulnerability. " \
"KVM_CAP_X86_DISABLE_EXITS should only be used with SMT disabled or trusted guests."

if (!mitigate_smt_rsb) {
if (boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible() &&
(cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
pr_warn_once(SMT_RSB_MSG);

if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
kvm_can_mwait_in_guest())
kvm->arch.mwait_in_guest = true;
if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
kvm->arch.hlt_in_guest = true;
if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
kvm->arch.cstate_in_guest = true;
}

r = 0;
break;
case KVM_CAP_MSR_PLATFORM_INFO:
Expand Down Expand Up @@ -13730,6 +13750,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_msr_protocol_exit);
static int __init kvm_x86_init(void)
{
kvm_mmu_x86_module_init();
mitigate_smt_rsb &= boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible();
return 0;
}
module_init(kvm_x86_init);
Expand Down
Loading

0 comments on commit 29c263d

Please sign in to comment.