Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent classification of syscalls with respect to side effects #400

Open
RyanGlScott opened this issue Jul 19, 2024 · 2 comments
Open
Labels

Comments

@RyanGlScott
Copy link
Contributor

archFnHasSideEffects classifies whether a primitive function has implicit side effects or not, with each architecture having different implementations. I was surprised to learn that each architecture classifies system calls differently vis-à-vis side effects. macaw-x86 classifies syscalls as having side effects:

X86Syscall {} -> True

But neither AArch32 nor RISC-V do:

-- no side effects... yet
armPrimFnHasSideEffects :: ARMPrimFn f tp -> Bool
armPrimFnHasSideEffects = const False

-- | The existing architecture specific functions have no side effects
riscvPrimFnHasSideEffects :: RISCVPrimFn rv f tp -> Bool
riscvPrimFnHasSideEffects = const False

This feels inconsistent. I would think that system calls should be treated as having side effects, although I don't have a concrete example where classifying them as side-effect–free would cause them to go wrong.

@sauclovian-g
Copy link
Contributor

It depends to some extent on which things you consider side effects, which is sort of a fraught question (e.g. is the effect of kill(getpid(), SIGSEGV) a side effect or just a program step?) but in general yes they do, and while going wrong is probably not that likely in an environment that isn't trying to model the system/kernel state, consider unlinking a file (which can cause a subsequent repeated syscall to behave differently) or, more extreme, forking a subprocess that updates a shared memory region.

@RyanGlScott
Copy link
Contributor Author

From Macaw's perspective, "side effectful" means "should always be considered as being demanded when performing CFG dependency analysis" (see this code). To add a further wrinkle, Macaw doesn't encode the semantics of any system calls directly, but instead provides a generic system call hook that lets users attach whatever semantics they want to any system call. As such, I think Macaw must be conservative here and assume that the side effects must be extreme enough to warrant returning True here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants