Is it possible to define a mutation operator that takes arguments? #528

Open
AndrewC19 opened this issue Aug 22, 2022 · 2 comments

@AndrewC19

I would like to define a custom mutation operator that looks for particular variables in the program-under-test and replaces them with random constants. For example, I might want to replace all uses of a variable named x1 with some randomly sampled int.

Is there a straightforward way to achieve this using cosmic-ray? I have tried defining my own Operator sub-class but I need a way to pass the target variable to the mutation operator.

Thanks in advance.

@abingham
Contributor

There is not currently a way to do that. I guess broadly speaking you'd want to be able to do some per-operator configuration in the configuration file and have it take effect when you exec your session. This seems totally reasonable. A few things come to mind which we'd need to think about.

  1. How will this argument information get communicated through distributors? Right now, the WorkItems that we send over the distributor don't have configuration information in them, so the workers - which could be running anywhere - don't know about it. We could send the complete configuration over to each worker somehow, or - more directly addressing your issue, and I think my preference - each WorkItem could also include some kind of "arguments" struct that the operator could interpret as it sees fit. I don't think this would be too hard, really.

  2. Do we need to support multiple sets of arguments for any given operator? That is, suppose you wanted to replace all variables named x1 and also all variables named x2? Do operators now become templates (in the C++ sense, sorta) where each set of arguments actually generates a new independent operator? Something like this could possibly be handled by our OperatorProvider system, but the providers would probably need to be handed the configuration. (And perhaps I'm just overthinking this.)
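
To make the "arguments struct" idea from point 1 concrete, here is a minimal sketch of a work item that carries per-operator arguments and survives a round trip through a serialised form, as it would need to when sent over a distributor. Everything here is an assumption for illustration: `WorkItemSketch`, its field names, and the JSON encoding are hypothetical and are not cosmic-ray's actual WorkItem.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class WorkItemSketch:
    """Hypothetical stand-in for a work item; not cosmic-ray's real WorkItem."""

    job_id: str
    module_path: str
    operator_name: str
    occurrence: int
    # Free-form arguments that the operator interprets as it sees fit.
    operator_args: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialise the item so it can be sent to a remote worker."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "WorkItemSketch":
        """Rebuild the item on the worker side."""
        return cls(**json.loads(payload))


item = WorkItemSketch(
    job_id="job-1",
    module_path="calculator.py",
    operator_name="variable_replacer",
    occurrence=0,
    operator_args={"cause_variable": "x", "effect_variable": "y"},
)

# Simulate sending the item through a distributor to a worker.
received = WorkItemSketch.from_json(item.to_json())
```

Because the arguments travel with each item, a worker would not need access to the configuration file. Under this sketch, point 2 falls out naturally: two argument sets for the same operator are simply two items that differ only in `operator_args`.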

There are probably other angles to this that I'm not seeing right now. Does any of this sound reasonable to you, or along the lines of what you had in mind? I don't know when I'd be able to work on this, but it feels like a good "airplane project"...I just need a long trip somewhere!

@AndrewC19
Author

AndrewC19 commented Aug 24, 2022

An example might help with this decision.

I have implemented a VariableReplacer operator that can carry out two types of mutation:
(1) Replace all usages of a named variable with a constant (e.g. replace x in y=2*x+1 --> y=2*10+1).
(2) Replace usages of a named variable only in the definition of a second named variable (e.g. replacing x in statements that define y, such that y=2*x+1 --> y=2*10+1 but j=2*x+1 does not change).

Here's the implementation:

"""Implementation of the variable-replacement operator."""
from .operator import Operator
from parso.python.tree import Name, Number
from random import randint


class VariableReplacer(Operator):
    """An operator that replaces usages of named variables."""

    def __init__(self, cause_variable, effect_variable=None):
        self.cause_variable = cause_variable
        self.effect_variable = effect_variable

    def mutation_positions(self, node):
        """Mutate usages of the specified cause variable. If an effect variable is also
        specified, then only mutate usages of the cause variable in definitions of the
        effect variable."""

        if isinstance(node, Name) and node.value == self.cause_variable:

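            # Confirm that the name node lies within the right-hand side of the
            # statement. Comparing positions (rather than checking direct children)
            # also covers nested usages such as 2*x and a bare-name RHS such as y = x.
            expr_node = node.search_ancestor('expr_stmt')
            if expr_node:
                rhs = expr_node.get_rhs()
                if rhs.start_pos <= node.start_pos and node.end_pos <= rhs.end_pos: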
                    mutation_position = (node.start_pos, node.end_pos)

                    # If an effect variable is specified, confirm that it appears on left hand
                    # side of the expression
                    if self.effect_variable:
                        effect_variable_names = [v.value for v in expr_node.get_defined_names()]
                        if self.effect_variable in effect_variable_names:
                            yield mutation_position

                    # If no effect variable is specified, any occurrence of the cause variable
                    # on the right hand side of an expression can be mutated
                    else:
                        yield mutation_position

    def mutate(self, node, index):
        """Replace cause variable with random constant."""
        assert isinstance(node, Name)

        return Number(start_pos=node.start_pos, value=str(randint(-100, 100)))

    @classmethod
    def examples(cls):
        return (
            # for cause_variable='x'
            ('y = x + z', 'y = 10 + z'),
            # for cause_variable='x' and effect_variable='y'
            ('j = x + z\ny = x + z', 'j = x + z\ny = -2 + z'),
            # for cause_variable='x' and effect_variable='j',
            ('j = x + z\ny = x + z', 'j = 1 + z\ny = x + z'),
            # for cause_variable='x'
            ('y = 2*x + 10 + j + x**2', 'y = 2*10 + 10 + j + -4**2'),
        )

The class works if I manually modify src/cosmic_ray/commands/init.py and src/cosmic_ray/mutating.py to instantiate the operator with arguments, and creates mutations such as:

# cause_variable='x'
# effect_variable='y'
--- mutation diff ---
--- acalculator.py
+++ bcalculator.py
@@ -1,5 +1,5 @@
 def mul(x, z):
     j = x * z
-    y = x * j
+    y =28 * j
     return y
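
The manual change mentioned above amounts to instantiating the operator class with keyword arguments instead of calling it with none. A rough sketch of that step follows. The helper `build_operator` and the stand-in class are hypothetical names used only for illustration; they are not part of cosmic-ray.

```python
class VariableReplacerStub:
    """Stand-in with the same constructor signature as VariableReplacer above."""

    def __init__(self, cause_variable, effect_variable=None):
        self.cause_variable = cause_variable
        self.effect_variable = effect_variable


def build_operator(operator_cls, operator_args=None):
    """Instantiate an operator class, passing configured arguments as keywords."""
    operator_args = operator_args or {}
    try:
        return operator_cls(**operator_args)
    except TypeError as exc:
        # A misspelled or missing argument name surfaces here as a TypeError.
        raise ValueError(
            f"Bad arguments {operator_args!r} for {operator_cls.__name__}: {exc}"
        ) from exc


operator = build_operator(
    VariableReplacerStub, {"cause_variable": "x", "effect_variable": "y"}
)
```

Operators that take no arguments would go through the same path with an empty dict. One caveat of this approach is that a TypeError raised inside the constructor for an unrelated reason would be reported the same way as a misspelled argument.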

Now we need a way to pass these two variables to instances of VariableReplacer. From a user perspective, I would suggest that this should be handled in the TOML config file under an array of tables [[cosmic-ray.operators]]. This should enable me to list the operators that I want to apply and, for those that require arguments, to specify them. I imagine something like the following:

[[cosmic-ray.operators]]
name = "variable_replacer"
args = [{ cause_variable = "x", effect_variable = "y"},
        { cause_variable = "x", effect_variable = "j"}]

[[cosmic-ray.operators]]
name = "number_replacer"

Then, for every unique set of arguments defined for an operator, a WorkItem should be created that initialises and applies the mutation. If this table isn't specified, it would make sense to keep the current behaviour (run all mutation operators that can be applied to the program).
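
As a rough sketch of that expansion, the function below turns the parsed config into one (operator name, arguments) pair per unique argument set and falls back to every operator when the table is absent. The function name `expand_operator_config` is hypothetical, and the dict is written out by hand to mirror what a TOML parser would produce for the config above.

```python
def expand_operator_config(config, all_operator_names):
    """Return (operator_name, operator_args) pairs, one per unique argument set."""
    entries = config.get("cosmic-ray", {}).get("operators")
    if not entries:
        # No table given: keep the current behaviour and use every operator.
        return [(name, {}) for name in all_operator_names]

    pairs = []
    for entry in entries:
        # An entry without "args" is applied once, with no arguments.
        arg_sets = entry.get("args") or [{}]
        seen = []
        for args in arg_sets:
            if args not in seen:  # skip duplicate argument sets
                seen.append(args)
                pairs.append((entry["name"], dict(args)))
    return pairs


# Hand-written equivalent of the parsed TOML shown above.
config = {
    "cosmic-ray": {
        "operators": [
            {
                "name": "variable_replacer",
                "args": [
                    {"cause_variable": "x", "effect_variable": "y"},
                    {"cause_variable": "x", "effect_variable": "j"},
                ],
            },
            {"name": "number_replacer"},
        ]
    }
}

pairs = expand_operator_config(config, ["number_replacer", "variable_replacer"])
```

Each pair could then seed the work items for that operator and argument set during initialisation.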

What are your thoughts?

AndrewC19 pushed a commit to AndrewC19/cosmic-ray that referenced this issue Aug 26, 2022
abingham pushed a commit that referenced this issue Sep 3, 2022
* Created a variable replacer mutation operator

* Fixed typo in variable replacer operator examples

* Work item now stores parameter information

* Work DB now stores operator args and applies them

* Initialisation only tries mutation operators without args if none are specified

* Mutate and test now robust to lack of operator_args

* Added a variable inserter mutation operator

* Updated VariableInserter documentation

* Removed unnecessary comment from variable inserter

* VariableReplacer now replaces all usages of variable in statement

* Example dataclass added and tests updated

* Added exception to TypeError to catch operator args typos

* Refactored getting operator args in init

* Config is now loaded with a configuration or None
AndrewC19 pushed a commit to AndrewC19/cosmic-ray that referenced this issue Sep 8, 2022
AndrewC19 pushed a commit to AndrewC19/cosmic-ray that referenced this issue Sep 16, 2022