-
Notifications
You must be signed in to change notification settings - Fork 12
/
Copy pathHow-to-fix-a-bug-in-srcinv.txt
194 lines (169 loc) · 6.76 KB
/
How-to-fix-a-bug-in-srcinv.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
How to fix a bug in srcinv?
This is a working journal for testing centos kernel 4.18.0-193.
GCC version is 8.4.0(analysis side), 8.3.1(collect side).
1) parse phase 1: adjust pointers.
1.1)
Crash report message:
0x7fffd6ffe700 net/core/dev.c (99%)
0x7fffd6ffe700 net/core/dev.c (100%)
0x7fff92ffe700 net/ipv4/icmp.c (0%)
0x7fff93fff700 lib/stackdepot.c (0%)
0x7fffdeddd700 kernel/time/timekeeping.c (0%)
0x7fffddddc700 lib/asn1_decoder.c (0%)
0x7fffd7fff700 lib/oid_registry.c (0%)
0x7fffd6ffe700 lib/ucs2_string.c (0%)
0x7fffd5ffd700 kernel/time/ntp.c (0%)
0x7fffc3fff700 lib/ubsan.c (0%)
0x7fffc2ffe700 kernel/time/clocksource.c (0%)
0x7fffc1ffd700 drivers/base/property.c (0%)
0x7fffb3fff700 kernel/time/jiffies.c (0%)
0x7fffb2ffe700 lib/sbitmap.c (0%)
0x7fffb1ffd700 net/netlink/genetlink.c (0%)
0x7fffa3fff700 lib/argv_split.c (0%)
0x7fffa2ffe700 kernel/time/timer_list.c (0%)
0x7fffa1ffd700 lib/bug.c (0%)
PHASE1 processed(1461) total(2277)
PHASE1 processed(1461) total(2277)
0x7fff92ffe700 net/ipv4/icmp.c (16%)
0x7fff93fff700 lib/stackdepot.c (0%)
0x7fffdeddd700 kernel/time/timekeeping.c (21%)
0x7fffddddc700 lib/asn1_decoder.c (44%)
PHASE1 processed(1461) total(2277)
0x7fffd7fff700 lib/oid_registry.c (57%)
0x7fffd6ffe700 lib/ucs2_string.c (62%)
0x7fffd5ffd700 kernel/time/ntp.c (29%)
0x7fffc3fff700 lib/ubsan.c (55%)
0x7fffc2ffe700 kernel/time/clocksource.c (32%)
0x7fffc1ffd700 drivers/base/property.c (15%)
0x7fffb3fff700 kernel/time/jiffies.c (42%)
0x7fffb2ffe700 lib/sbitmap.c (26%)
0x7fffb1ffd700 net/netlink/genetlink.c (13%)
0x7fffa3fff700 lib/argv_split.c (40%)
0x7fffa2ffe700 kernel/time/timer_list.c (25%)
0x7fffa1ffd700 lib/bug.c (35%)
*** stack smashing detected ***: terminated
Aborted (core dumped)
Handle the crash:
Stack smashing. We may have exceed the recursion limit.
*) Find the resources we need to reproduce the crash.
Use getoffs command in analysis to get the 1460th and 1492th file offsets.
Use cp-file to copy the 32 files' contents, save to subresfile.0/subresfile.
Analysis the subresfile:
<1> analysis> parse subresfile 1 1
[done] take a preview of the resfile
PHASE1 processed(0) total(32)
0x7fffdeddd700 net/netlink/af_netlink.c (0%)
0x7fffddddc700 net/ipv4/icmp.c (0%)
0x7fffd7fff700 lib/stackdepot.c (0%)
0x7fffd6ffe700 kernel/time/timekeeping.c (0%)
0x7fffcbfff700 lib/oid_registry.c (0%)
0x7fffc3fff700 lib/ubsan.c (0%)
0x7fffcaffe700 lib/ucs2_string.c (0%)
0x7fffd5ffd700 lib/asn1_decoder.c (0%)
0x7fffc9ffd700 kernel/time/ntp.c (0%)
0x7fffc2ffe700 kernel/time/clocksource.c (0%)
0x7fffbaffe700 lib/sbitmap.c (0%)
0x7fffbbfff700 kernel/time/jiffies.c (0%)
0x7fffb2ffe700 kernel/time/timer_list.c (0%)
0x7fffb9ffd700 net/netlink/genetlink.c (0%)
0x7fffc1ffd700 drivers/base/property.c (0%)
0x7fffb3fff700 lib/argv_split.c (0%)
PHASE1 processed(0) total(32)
PHASE1 processed(0) total(32)
PHASE1 processed(0) total(32)
0x7fffdeddd700 net/netlink/af_netlink.c (8%)
0x7fffddddc700 net/ipv4/icmp.c (11%)
0x7fffd7fff700 lib/stackdepot.c (0%)
0x7fffd6ffe700 kernel/time/timekeeping.c (16%)
0x7fffcbfff700 lib/oid_registry.c (39%)
0x7fffc3fff700 lib/ubsan.c (45%)
0x7fffcaffe700 lib/ucs2_string.c (41%)
0x7fffd5ffd700 lib/asn1_decoder.c (38%)
0x7fffc9ffd700 kernel/time/ntp.c (21%)
0x7fffc2ffe700 kernel/time/clocksource.c (26%)
0x7fffbaffe700 lib/sbitmap.c (23%)
0x7fffbbfff700 kernel/time/jiffies.c (39%)
0x7fffb2ffe700 kernel/time/timer_list.c (24%)
0x7fffb9ffd700 net/netlink/genetlink.c (11%)
0x7fffc1ffd700 drivers/base/property.c (11%)
0x7fffb3fff700 lib/argv_split.c (40%)
*** stack smashing detected ***: terminated
Aborted (core dumped)
Okay, we found the resources.
*) Rebuild srcinv, modify SELF_CFLAGS_N to 0 and ANALYSIS_THREAD_N to 1.
*) Restore subresfile, then gdb ./si_core, enjoy a cup of coffee, we may
get this:
0x7fffdeda2700 lib/stackdepot.c (0%)
[Switching to Thread 0x7fffdeda2700 (LWP 132748)]
Thread 4 "si_core" hit Catchpoint 1 (signal SIGSEGV), 0x00007fffdf9e7274 in c_parse (buf=0x100000280, parse_mode=1) at c.cc:8298
8298 void *faddrs[obj_cnt];
(gdb) p faddrs
value requires 9150384 bytes, which is more than max-value-size
(gdb) p obj_cnt
$1 = 1143798
The obj_cnt seems to be too large(18M). The thread stack size we set is
16M or CONFIG_THREAD_STACKSZ.
*) Fix the crash.
Reset CONFIG_THREAD_STACKSZ, increasing the THREAD_STACKSZ should work.
Beside, add an extra check against the obj_cnt.
Recompile srcinv, analysis subresfile again.
1.2)
A new bug:
BUG_ON(obj_adjusted != obj_cnt);
There are some objs not adjusted yet, in lib/stackdepot.c.
How to fix it?
We know the bug is during parsing lib/stackdepot.c.
Modify analysis/gcc/c.cc or gdb ./si_core, find out the objs not adjusted.
Still have no idea.
Use cmdline in analysis to get the compile command.
Recompile lib/stackdepot.c, gdb /usr/bin/gcc, set follow-fork-mode child,
b ast_collect. Check the backtrace when the first not-adjusted obj
writen into buffer.
Looks like the tree_int_cst does have a type node. However, the analysis
set it NULL.
So, the issue is in __get_real_addr. We need to check the real_addr field
if found a match.
2) parse phase 4: mark data
A BUG_ON triggered in __do_constructor_init(). We could find the target var_list.
*) Locate the source file.
Please, enable HAVE_CLIB_DBG_FUNC. While a BUG_ON being triggered, it
shall show which thread this bug in, which could lead us to the source
file.
*) Use one_sibuf command in analysis mode.
Recompile srcinv with SELF_CFLAGS_N=0, gdb ./si_core, analysis.
This could save us one hour to wait for triggering the BUG.
3) dec phase issues
3.1) infinite loop case 0
Usually, the output message will show you the function cause the
infinite loop.
*) break at do_dec, memorize both the gs and the gimple_code
Use these gdb commands:
list do_dec
break line_num
commands The_breakpoint_number
p gs
p *gs
end
*) break at each dec_gimple_* function we found in the previous step.
*) find out the data we mishandle.
3.2) loop for UINT_MAX times
In linux kernel __fb_pad_aligned_buffer, there is a loop:
for (i = height; i--; ) {
for (j = 0; j < s_pitch; j++)
*dst++ = *src++;
dst += d_pitch;
}
The i-- statement has two gimples: GIMPLE_ASSIGN, GIMPLE_COND.
GIMPLE_ASSIGN:
op[0]: SSA_NAME, reference to VAR_DECL i.
op[1]: SSA_NAME, reference to VAR_DECL i.
op[2]: 0xffffffff.
GIMPLE_COND:
op[0]: the op[1] of previous GIMPLE_ASSIGN.
op[1]: 0.
When handle GIMPLE_ASSIGN, both the op[0] and op[1] will get the
data_state_rw of i. And then the GIMPLE_ASSIGN statement will
set 0xffffffff to the i. This leads the following GIMPLE_COND statement
mischecked.
So, we need to handle SSA_NAME properly.
This issue seems to be resolved in e50e21e244786872588af4e054a580d4c3883ee0.