-
Couldn't load subscription status.
- Fork 84
Description
UpdateCil updates IDs of statements (sid) and variables (vid), but not compinfo keys (ckey). I already wondered this in #598 but now here's concrete evidence that not updating them causes issues for incremental updating as well.
Reproduction
Analyze the following program incrementally (before vs after patching): https://github.com/goblint/bench/tree/b79d0997d2bf3444f6296301e0bfc7bdebc8788c/cve-bench/CVE-2019-19537.
In order to see the problem, 5cfb7e6 needs to be reverted, because it simply hides the problematic accesses. Nevertheless, similar structs and unions could be used in actual user code, which the access analysis shouldn't ignore.
Before
On initial analysis, there's the following memory location with four accesses:
[Warning][Race] Memory location (struct usb_interface).dev(119532124,usb_interface).mutex(1002340388,device).wait_lock(655119850,mutex).__annonCompField18(448810155,spinlock).rlock(344401721,__anonunion____missing_field_name_256509364).raw_lock(706607406,raw_spinlock).__annonCompField17(637443086,arch_spinlock).tickets(201271018,__anonunion____missing_field_name_45265226).head(313990837,__raw_tickets)(119532124) (race with conf. 80):
read with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)
read with mhp:{tid=[], {usb_register_dev}} (conf. 80) (file.c:200:2-200:114)
write with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)
write with mhp:{tid=[], {usb_register_dev}} (conf. 80) (file.c:200:2-200:114)
Here the offset fields have debug printout of the ckey and cname of the compinfo they belong to.
After
After patching, the result differs when analyzed in server mode vs without it.
In server mode
Surprisingly, the number of safe and total memory locations goes up, because the previous four accesses now are split between two slightly different memory locations:
[Success][Race] Memory location (struct usb_interface).dev(119532124,usb_interface).mutex(1002340388,device).wait_lock(655119850,mutex).__annonCompField18(448810155,spinlock).rlock(344401721,__anonunion____missing_field_name_256509364).raw_lock(706607406,raw_spinlock).__annonCompField17(637443086,arch_spinlock).tickets(757927913,__anonunion____missing_field_name_45265226).head(313990837,__raw_tickets)(119532124) (safe):
read with [mhp:{tid=[], {usb_register_dev}}, lock:{minor_rwsem}] (conf. 80) (file.c:201:2-201:114)
write with [mhp:{tid=[], {usb_register_dev}}, lock:{minor_rwsem}] (conf. 80) (file.c:201:2-201:114)
[Warning][Race] Memory location (struct usb_interface).dev(119532124,usb_interface).mutex(1002340388,device).wait_lock(655119850,mutex).__annonCompField18(448810155,spinlock).rlock(344401721,__anonunion____missing_field_name_256509364).raw_lock(706607406,raw_spinlock).__annonCompField17(637443086,arch_spinlock).tickets(201271018,__anonunion____missing_field_name_45265226).head(313990837,__raw_tickets)(119532124) (race with conf. 80):
read with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)
write with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)
The difference in the locations is that the second-to-last field offset tickets belongs to different structs: for reused accesses it's with key 201271018 and for new accesses it's 757927913. This ckey difference in one offset component is enough to consider them completely separate memory locations.
This result is unsound because the old racing accesses also race with the new accesses that have the extra lock.
Without server mode
Outside of server mode, all four accesses remain together under one memory location, as expected.
Explanation
The problem is that compinfo keys are assigned from nextCompinfoKey in CIL. In server mode, both before and after programs are parsed during a single Goblint execution, therefore the counter doesn't reset in between. Without server mode, the before and after programs are parsed during separate Goblint execution, thus both start with the same counter value.
Even outside of server mode it only works by coincidence that the number and order of defined structs/unions does not change, such that they get identical keys. I haven't tested this but I suspect if that were to change, this could also cause issues outside of server mode.
So clearly UpdateCil also needs to fix up compinfo keys.
Additional notes
The problematic struct-union-emalgamation in Linux kernel headers is this:
typedef struct arch_spinlock {
union {
__ticketpair_t head_tail;
struct __raw_tickets {
__ticket_t head, tail;
} tickets;
};
} arch_spinlock_t;In CIL:
- The keys are assigned incrementally here: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/cil.ml#L1498-L1499.
- This is also documented in the interface: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/cil.mli#L365-L369.
- Contradictorily, the
ckeyfield documentation in the implementation says they are based on a name hash: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/cil.ml#L353-L357. - Turns out that's what
Mergecildoes later: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/mergecil.ml#L1510.
This hashing explains why all the ckeys in the above output are so large. But it also poses issues for the incremental updating, because unlike sids and vids they aren't assigned sequentially (at least in the merged files that get compared).
And it also creates further confusion, because in both cases the corresponding cname is identical: __anonunion____missing_field_name_45265226. Nevertheless, somehow they end up with different keys. I'm not sure why, but might be because the anonymous union with a missing field name initially gets a different name and during merging the names are somehow unified but the keys are not?