arm64: Fix bt command show wrong stacktrace on ramdump source #183
+37
−0
When crash is used to parse a ramdump (Qcom phone device) rather than a vmcore, the start command looks like this:
crash vmlinux --kaslr=xxx DDRCS0_0.BIN@0x0000000080000000,... --machdep vabits_actual=39
The bt command then shows the misleading backtrace below:
crash> bt 16930
PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
#0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
#1 [ffffffc034c43850] _kvm_nvhe$d.2314 at 6be732e004cf05a0
#2 [ffffffc034c438b0] _kvm_nvhe$d.2314 at 86c54c6004ceff80
#3 [ffffffc034c43950] _kvm_nvhe$d.2314 at 55d6f96003a7b120
#4 [ffffffc034c439f0] _kvm_nvhe$d.2314 at 9ccec46003a80a64
#5 [ffffffc034c43ac0] _kvm_nvhe$d.2314 at 8cf41e6003a945c4
#6 [ffffffc034c43b10] _kvm_nvhe$d.2314 at a8f181e00372c818
#7 [ffffffc034c43b40] _kvm_nvhe$d.2314 at 6dedde600372c0d0
#8 [ffffffc034c43b90] _kvm_nvhe$d.2314 at 62cc07e00373d0ac
#9 [ffffffc034c43c00] _kvm_nvhe$d.2314 at 72fb1de00373bedc
...
PC: 00000073f5294840 LR: 00000070d8f39ba4 SP: 00000070d4afd5d0
X29: 00000070d4afd600 X28: b4000071efcda7f0 X27: 00000070d4afe000
X26: 0000000000000000 X25: 00000070d9616000 X24: 0000000000000000
X23: 0000000000000000 X22: 0000000000000000 X21: 0000000000000000
X20: b40000728fd27520 X19: b40000728fd27550 X18: 000000702daba000
X17: 00000073f5294820 X16: 00000070d940f9d8 X15: 00000000000000bf
X14: 0000000000000000 X13: 00000070d8ad2fac X12: b40000718fce5040
X11: 0000000000000000 X10: 0000000000000070 X9: 0000000000000001
X8: 0000000000000062 X7: 0000000000000020 X6: 0000000000000000
X5: 0000000000000000 X4: 0000000000000000 X3: 0000000000000000
X2: 0000000000000002 X1: 0000000000000080 X0: b40000728fd27550
ORIG_X0: b40000728fd27550 SYSCALLNO: ffffffff PSTATE: 40001000
The raw stack data below shows that the saved lr (at fp+8) holds a pointer whose upper bits have been replaced by a PAC prefix:
crash> bt -f
PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
#0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
ffffffc034c437f0: ffffffc034c43850 6be732e004cf05a4
ffffffc034c43800: ffffffe006186108 a0ed07e004cf09c4
ffffffc034c43810: ffffff8a1a340000 ffffff8a8d343c00
ffffffc034c43820: ffffff89b3eada00 ffffff8b780db540
ffffffc034c43830: ffffff89b3eada00 0000000000000000
ffffffc034c43840: 0000000000000004 712b828118484a00
#1 [ffffffc034c43850] _kvm_nvhe$d.2314 at 6be732e004cf05a0
ffffffc034c43850: ffffffc034c438b0 86c54c6004ceff84
ffffffc034c43860: 000000708070f000 ffffffc034c43938
ffffffc034c43870: ffffff88bd822878 ffffff89b3eada00
...
So we check CONFIG_ARM64_PTR_AUTH and CONFIG_ARM64_PTR_AUTH_KERNEL to confirm whether the PAC mechanism is enabled on this ramdump, and then use vabits to work out which pointer bits carry the PAC.
With the fix, bt shows the correct backtrace:
crash> bt 16930
PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
#0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
#1 [ffffffc034c43850] __schedule at ffffffe004cf05a0
#2 [ffffffc034c438b0] preempt_schedule_common at ffffffe004ceff80
#3 [ffffffc034c43950] unmap_page_range at ffffffe003a7b120
#4 [ffffffc034c439f0] unmap_vmas at ffffffe003a80a64
#5 [ffffffc034c43ac0] exit_mmap at ffffffe003a945c4
#6 [ffffffc034c43b10] __mmput at ffffffe00372c818
#7 [ffffffc034c43b40] mmput at ffffffe00372c0d0
#8 [ffffffc034c43b90] exit_mm at ffffffe00373d0ac
#9 [ffffffc034c43c00] do_exit at ffffffe00373bedc
PC: 00000073f5294840 LR: 00000070d8f39ba4 SP: 00000070d4afd5d0
X29: 00000070d4afd600 X28: b4000071efcda7f0 X27: 00000070d4afe000
X26: 0000000000000000 X25: 00000070d9616000 X24: 0000000000000000
X23: 0000000000000000 X22: 0000000000000000 X21: 0000000000000000
X20: b40000728fd27520 X19: b40000728fd27550 X18: 000000702daba000
X17: 00000073f5294820 X16: 00000070d940f9d8 X15: 00000000000000bf
X14: 0000000000000000 X13: 00000070d8ad2fac X12: b40000718fce5040
X11: 0000000000000000 X10: 0000000000000070 X9: 0000000000000001
X8: 0000000000000062 X7: 0000000000000020 X6: 0000000000000000
X5: 0000000000000000 X4: 0000000000000000 X3: 0000000000000000
X2: 0000000000000002 X1: 0000000000000080 X0: b40000728fd27550
ORIG_X0: b40000728fd27550 SYSCALLNO: ffffffff PSTATE: 40001000
Use GENMASK to build the mask that replaces the PAC bits in the pointer. The related GKI commit is here:
https://lore.kernel.org/all/[email protected]/
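The masking step can be sketched in isolation. This is a minimal illustration and not the code from this patch: `strip_kernel_pac` is a hypothetical helper name, `GENMASK_ULL` is redefined locally in the same shape as the kernel macro, and treating bit 55 as the kernel/user selector follows the arm64 pointer authentication convention rather than anything stated in this description.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the kernel-style macro: bits h..l set, inclusive. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/*
 * Hypothetical helper: restore a PAC-signed kernel return address.
 * vabits is the value passed as --machdep vabits_actual (39 here).
 * Kernel addresses have every bit from vabits up to 63 set, so
 * OR-ing the mask back in overwrites the PAC bits.
 */
static uint64_t strip_kernel_pac(uint64_t lr, unsigned int vabits)
{
	assert(vabits < 64);	/* shifting by 64 or more is undefined */

	uint64_t pac_mask = GENMASK_ULL(63, vabits);

	/* Assumption: bit 55 set marks a kernel pointer. */
	if (lr & (1ULL << 55))
		return lr | pac_mask;

	/* User pointers are left untouched in this sketch. */
	return lr;
}
```

With vabits 39 the mask is ffffff8000000000, so the saved lr 6be732e004cf05a4 from the bt -f output becomes ffffffe004cf05a4, which falls in the __schedule frame that the fixed backtrace reports at ffffffe004cf05a0.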