Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C# 12: Primary constructors. #15474

Merged
merged 16 commits into from
Feb 14, 2024

Conversation

michaelnebel
Copy link
Contributor

@michaelnebel michaelnebel commented Jan 30, 2024

In this PR we introduce library and data flow support for primary constructors.
Below we see an example of class C1 with a primary constructor defined on line 1.

public class C1(object o, string s)
{
    public object Obj => o;
    public string Str => s;

    public C1(object o) : this(o, "default") { }

    public void SetStr(string s1)
    {
        s = s1;
    }
}

The QL library now contains a class called PrimaryConstructor which contains exactly the primary constructors defined in source code. If it turns our that this implementation is insufficient we can extend the extractor to specifically extract whether a constructor is a primary constructor by inspecting the declaring syntax references using the the same "hack" as roslyn.

Before discussing the details of the data flow implementation for primary constructors is is worth noting that the example class declaration of C1 is roughly equivalent to (this is also what you will approximately get if you compile and de-compile the code):

public class C1
{
    private object vObj;
    public object Obj => vObj;

    private string vStr;
    public string Str => vStr;

    public C1(object o, string s)
    {
        vObj = o;
        vStr = s;
    }

    public C1(object o) : this(o, "default") { }

    public void SetStr(string s1)
    {
        vStr = s1;
    }
}

That is, we can think of class with a primary constructor as having a backing field for each primary constructor parameter and that the primary constructor as having a (synthetic) body that assigns the provided arguments to the backing fields.
Furthermore, each primary constructor parameter access (read/write) corresponds to reading or writing the corresponding backing field.
We use these observations to model data flow for primary constructors by introducing the following new content and dataflow node types

Content

  • PrimaryConstructorParameterContent: As we don't have a "real" backing field for each primary constructor parameter we introduce a new content type that enables us to create "synthetic" backing fields for each primary constructor parameter.

The data-flow implementation introduces two new types of data-flow nodes:

  • PrimaryConstructorThisAccessNode: Used for this access and synthesises the body of the primary constructor (that is, dataflow from the primary constructor parameters into their synthetic backing fields).
  • InstanceParameterAccessNode: Used to synthesise reads and writes of the parameter synthetic backing fields. That is, for all primary constructor parameter accesses (reads and writes) within a class declaration we need to do the corresponding update to the synthetic backing fields.

The dataflow implementation has some limitations and doesn't cover e.g.

  • Subtyping.

Huge thanks to @hvitved for outlining and helping with the implementation.

@github-actions github-actions bot added the C# label Jan 30, 2024
@michaelnebel michaelnebel force-pushed the csharp/primaryconstructors branch 5 times, most recently from bbb6a78 to 0698879 Compare February 6, 2024 13:27
@michaelnebel michaelnebel force-pushed the csharp/primaryconstructors branch from 0698879 to b1adbf6 Compare February 9, 2024 14:57
@michaelnebel michaelnebel force-pushed the csharp/primaryconstructors branch from 8c08baa to aed5080 Compare February 12, 2024 12:27
@michaelnebel
Copy link
Contributor Author

DCA was uneventful; Even though it is a bit discouraging that we didn't find more results, doing improvements like this should help us avoid our dataflow analysis to deteriorate over time when the new language feature becomes more widely adopted.

@michaelnebel michaelnebel marked this pull request as ready for review February 13, 2024 08:04
@michaelnebel michaelnebel requested a review from a team as a code owner February 13, 2024 08:04
@michaelnebel michaelnebel requested a review from hvitved February 13, 2024 08:04
Copy link
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, great work! Some (mostly cosmetic) suggestions.

/**
* Holds if this a primary constructor in source code.
*/
predicate isPrimary() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just inline this into the charpred of PrimaryConstructor.

Comment on lines 48 to 49
pa instanceof ParameterRead and
result = pa.getAControlFlowNode()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively

Suggested change
pa instanceof ParameterRead and
result = pa.getAControlFlowNode()
result = pa.(ParameterRead).getAControlFlowNode()

* Gets the control flow node used for data flow purposes for the primary constructor
* parameter access `pa`.
*/
private ControlFlow::Node getPrimaryConstructorParameterCfn(ParameterAccess pa) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to getAPrimaryConstructorParameterCfn.

@@ -37,6 +38,21 @@ predicate isArgumentNode(ArgumentNode arg, DataFlowCall c, ArgumentPosition pos)
arg.argumentOf(c, pos)
}

/**
* Gets the control flow node used for data flow purposes for the primary constructor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gets a

Comment on lines 51 to 52
pa instanceof ParameterWrite and
exists(AssignExpr ae | pa = ae.getLValue() and result = ae.getAControlFlowNode())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only takes direct assignments into account, and e.g. not assignments via out arguments (perhaps we should add a test for that). AssignableDefinitions are meant to cover all cases:

Suggested change
pa instanceof ParameterWrite and
exists(AssignExpr ae | pa = ae.getLValue() and result = ae.getAControlFlowNode())
pa = any(AssignableDefinition def | result = def.getAControlFlowNode()).getTargetAccess()

Comment on lines 2150 to 2153
exists(InstanceParameterAccessNode n | n = node1 |
n.getUnderlyingControlFlowNode() = node2.(ExprNode).getControlFlowNode() and
n.getParameter() = c.(PrimaryConstructorParameterContent).getParameter()
) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
exists(InstanceParameterAccessNode n | n = node1 |
n.getUnderlyingControlFlowNode() = node2.(ExprNode).getControlFlowNode() and
n.getParameter() = c.(PrimaryConstructorParameterContent).getParameter()
) and
node1 = any(InstanceParameterAccessNode n |
n.getUnderlyingControlFlowNode() = node2.(ExprNode).getControlFlowNode() and
n.getParameter() = c.(PrimaryConstructorParameterContent).getParameter() and
n.isPreUpdate()
) and

Comment on lines 2219 to 2221
exists(Node n1 |
primaryConstructorParameterStore(_, c, n1) and n = n1.(PostUpdateNode).getPreUpdateNode()
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
exists(Node n1 |
primaryConstructorParameterStore(_, c, n1) and n = n1.(PostUpdateNode).getPreUpdateNode()
)
n = any(PostUpdateNode n1 | primaryConstructorParameterStore(_, c, n1)).getPreUpdateNode()

private class InstanceParameterAccessPostUpdateNode extends PostUpdateNode,
InstanceParameterAccessNode
{
private ControlFlow::Node cfg;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cfn

/** Gets the underlying parameter. */
Parameter getParameter() { result = p }

override string toString() { result = "parameter field " + p.getName() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just say result = "parameter " + p.getName()

Comment on lines 1 to 3
/**
* @name Test for primary constructors
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add this.

Comment on lines 1806 to 1807
* the first access to o (1) is modeled as this.o_backing_field and
* the second access to o (2) is modeled as this.o_backing_field = value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing backticks.

* the first access to o (1) is modeled as this.o_backing_field and
* the second access to o (2) is modeled as this.o_backing_field = value.
* Both models need a pre-update this node, and the latter need an additional post-update this access,
* all of which are represented by an InstanceParameterAccessNode node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing backticks.

* ```csharp
* public class C(object o) { }
* ```
* we synthesize the primary constructor for C as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing backticks.

* ```csharp
* public C(object o) => this.o_backing_field = o;
* ```
* The synthesized (pre/post-update) this access is represented an PrimaryConstructorThisAccessNode node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing backticks.

Copy link
Contributor Author

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvitved : Your changes look plausible to me.

@michaelnebel michaelnebel requested a review from hvitved February 13, 2024 14:42
@michaelnebel
Copy link
Contributor Author

DCA still looks good.

@michaelnebel michaelnebel merged commit bafea91 into github:main Feb 14, 2024
21 checks passed
@michaelnebel michaelnebel deleted the csharp/primaryconstructors branch February 14, 2024 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants