Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes for for loops #4562

Merged
merged 36 commits into from
May 17, 2024
Merged

Changes for for loops #4562

merged 36 commits into from
May 17, 2024

Conversation

ChrisDodd
Copy link
Contributor

Pulling together changes from #4120 and #4558 for for loops

// FIXME -- should we try to directly convert this to a ForStatement in the parser?
class ForInStatement : Statement, ISimpleNamespace {
NullOK Declaration_Variable decl;
// the declaration will be moved to the enclosing block by MoveDeclarations, but
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not recall all the details, but iirc the change in MoveDeclarations was necessary for nested loops like + some shadowing:

action a0(bit<8> x, bit<8> y) {
        // Reduce
        bit<64> idx = -1;
        foreach (bit<8> i in 0 .. x) {
            foreach (bit<8> j in i .. y) {
                idx = foo(j);
                if (idx == -1)
                    break;
            }
        }
    }

@fruffy fruffy added core Topics concerning the core segments of the compiler (frontend, midend, parser) p4-spec Topics related to the P4 specification (https://github.com/p4lang/p4-spec/). labels Mar 27, 2024
@asl
Copy link
Contributor

asl commented Mar 29, 2024

@ChrisDodd Consider the following P4 code:

bit<64> _popcount(in bit<64> val) {
    bit<64> n = 0;
    bit<64> v = val;
    for (bit<64> popcnti in 1 .. 64w63) {
        if (v == 0)
            return n;
        n = n + 1;
        v = v & (v - 1);
    }
    return n;
}

There are two issues with existing PR:

  1. Warning about popcnti being uninitialized.
  2. v = v & (v - 1); is removed by simplifyUseDef (n = n + 1 is not removed as return n is considered as a valid use of that def).

My solution for 1. was factoring out loop index variable and loop container into a separate ForEachHeader node. So we can explicitly make it read and write loop index variable and has read index variable defined after header node.

For 2. I marked all defs in the loop body to be used by loop itself.

But maybe you might have some different solution for this issue as you know the underlying code much better than myself :)

@ChrisDodd
Copy link
Contributor Author

I've added support for loops for frontend DefUse, midend DefUse, midend localCopyprop and general ControlFlowVisitor use. Midend Defuse uses that generic ControlFlowVisitor, but it turned out localCopyprop didn't work properly for that, so it needed an additional stuff -- basically due ti the fact that it did analysis and transform in one pass, it could not use an iterative loop closure, so instead needed one analysis pass of the loop body to remove potential copies that are dependent on the loop body, followed by the normal analysis+rewrite pass to to copyprop into and within the loop body.

At this point, the simple test cases I've looked at look good, though it is still giving the spurious "uninitialized variable" for the loop var in for..in loops. There are a number of hacky things in a coupld of places that probably could be done better too.

@ChrisDodd
Copy link
Contributor Author

There are two issues with existing PR:

  1. Warning about popcnti being uninitialized.
  2. v = v & (v - 1); is removed by simplifyUseDef (n = n + 1 is not removed as return n is considered as a valid use of that def).

I think issue 2 was likely coming from local copyprop -- I've fixed that so it doesn't incorrectly propagate into the loop the first time through. The warning is still an issue I need to look into, though.

@asl
Copy link
Contributor

asl commented Apr 1, 2024

I think issue 2 was likely coming from local copyprop -- I've fixed that so it doesn't incorrectly propagate into the loop the first time through. The warning is still an issue I need to look into, though.

@ChrisDodd Both issues are from simplifyUseDef. The warning appears because there is use of loop index variable before real def (it's in beforeStart state). As for the second – there is no use of v after the loop and therefore the last def (assignment) is considered redundant and therefore is removed.

@ChrisDodd
Copy link
Contributor Author

I think issue 2 was likely coming from local copyprop -- I've fixed that so it doesn't incorrectly propagate into the loop the first time through. The warning is still an issue I need to look into, though.

@ChrisDodd Both issues are from simplifyUseDef. The warning appears because there is use of loop index variable before real def (it's in beforeStart state). As for the second – there is no use of v after the loop and therefore the last def (assignment) is considered redundant and therefore is removed.

The warning is coming from FindUninitialized, which does its own flow analysis independent of DefUse (and likewise not using the builting visitor infrastructure for flow analysis), so needs to be updated for loops. But I don't see a problem with the v =... being removed -- it is still there after SimplifyDefUse, and also still there after LocalCopyprop (now). Updating frontend DefUse for loops (which SimplifyDefUse uses for removals) liekly fixed that.

@asl
Copy link
Contributor

asl commented Apr 2, 2024

But I don't see a problem with the v =... being removed -- it is still there after SimplifyDefUse, and also still there after LocalCopyprop (now). Updating frontend DefUse for loops (which SimplifyDefUse uses for removals) liekly fixed that.

Oh, indeed, you are right. I was looking into an old set of dumps. With your latest changes to frontend SimplifyDefUse indeed it is not removed anymore. So, this is a viable solution that could subsume what we are having downstream.

The warning is coming from FindUninitialized, which does its own flow analysis independent of DefUse (

I believe it does use DefUse, e.g. in registerUses: https://github.com/p4lang/p4c/blob/main/frontends/p4/simplifyDefUse.cpp#L1011:

        LOG3("LocationSet for '" << expression << "' is <<" << read << ">>");

        auto points = currentDefinitions->getPoints(read);

        if (reportUninitialized && !lhs && points->containsBeforeStart() &&
            hasUninitializedHeaderUnion(expression, currentDefinitions, read)) {
...

Here the points for loop index variable is BeforeStart (while recording the uses for loop index variable PathExpression):

  FU Registering uses for 'popcnti_0/popcnti;'
  FU Current point is (after) <AssignmentStatement>(600624)[[v_0/v = val;]] definitions are
    56 val/val=>{<Function>(614279) _popcount/2455 }
    57 n_0/n=>{<AssignmentStatement>(600618)[[n_0/n = 0;]] }
    58 v_0/v=>{<AssignmentStatement>(600624)[[v_0/v = val;]] }
    59 popcnti_0/popcnti=>{<BeforeStart> }
  LocationSet for 'popcnti_0/popcnti;' is <<59 popcnti_0/popcnti >>

hence the warning.

My solution was to introduce ForEachHeader node that does both def and use of loop index variable. So after this node we have proper def for loop index variable (and have to "silence" a warning only only in that single case).

@ChrisDodd ChrisDodd force-pushed the cdodd-issue-1261 branch 2 times, most recently from 3e0d5e2 to f1e234b Compare April 3, 2024 01:09
@asl
Copy link
Contributor

asl commented Apr 3, 2024

@ChrisDodd I am seeing null checks failed at https://github.com/p4lang/p4c/pull/4562/files#diff-6eaf37d31585aaaa76257fcc3f96a3d523ad39cd3adb88a225389df8b60814c8R959:

Compiler Bug: p4c/frontends/p4/def_use.h:204: Null location

(line 204 downstream is LocationSet::add).

So, apparently, storageMap contains empty location for the loop index declaration. The code in question is:

    action a0(bit<8> x, bit<8> y) {
        bit<64> idx = -1;
        for (bit<8> i1 in 0 .. x) {
            for (bit<8> j in i1 .. y) {
                idx = foo(j);
                if (idx == -1)
                    break;
            }
        }
    }

@ChrisDodd ChrisDodd force-pushed the cdodd-issue-1261 branch 2 times, most recently from f003fa6 to 101e00d Compare April 4, 2024 00:41
@ChrisDodd
Copy link
Contributor Author

@ChrisDodd I am seeing null checks failed at https://github.com/p4lang/p4c/pull/4562/files#diff-6eaf37d31585aaaa76257fcc3f96a3d523ad39cd3adb88a225389df8b60814c8R959:

Compiler Bug: p4c/frontends/p4/def_use.h:204: Null location

I don't see any Compiler Bugs with this code but I do see spurious "maybe uninitialized" warnings for i1 and j -- turns out there was a small bug in def_use for ForInStatements setting reachable defs for the body of the loop (was unioning in the state from before the loop, which should only be done for after the loop, not the loop body itself), and I'd missed a place in FindUninitialized setting the current point after the condition in ForIn loops. Things would be much simpler if these passes used the ControlFlowVisitor infrastructure which handles all this in one place, rather than having to reimplement it for every pass.

I've added your two testcase here, but more testcases would be nice

@asl
Copy link
Contributor

asl commented Apr 4, 2024

I've added your two testcase here, but more testcases would be nice

Let me see what I'm having downstream. I cannot submit them directly as they use private packages, etc. So, I will extract one-by-one

@asl
Copy link
Contributor

asl commented Apr 4, 2024

Here are the tests:

bit<8> foo(in bit<8> aa, in bit<8> bb) {
   // Do some magic to ensure aa / bb will not be simplified

    return aa - bb;
}

    action a0(bit<8> x, bit<8> y) {
        // Simple loops: with and w/o use of index variable.
        foreach (bit<8> i in 1 .. (x+y)) {
            foo(x, y);
        }

        foreach (bit<8> i in 1 .. (x+y)) {
            foo(x, y + i);
        }

        // And with constant range as well
        foreach (bit<8> i in 8w1 .. 42) {
            foo(x, y + i);
        }
    }

    action a1(bit<8> x, bit<8> y) {
        // Nested loops
        foreach (bit<8> i in 1 .. x) {
          foreach (bit<8> j in i .. y)
            foo(x, y + i);
        }
    }

     // Scope test    
    action a2(bit<8> x, bit<8> y) {
        bit<8> i = 10;
        foreach (bit<8> i in 1 .. x) {
            foo(x, y + i);
        }

        foreach (bit<8> k in i .. (x+y)) {
          foo(i, 2*i+x);
        }
    }

Same as break testcase above but with continue:

    action a1(bit<8> x, bit<8> y) {
        bit<64> idx = -1;
        foreach (bit<8> i in 0 .. x) {
            foreach (bit<8> j in 0 .. y) {
                idx = foo(j);
                if (idx == -1) {
                    continue;
                }
                idx = foo(i);
            }
        }
    }

I am afraid all other testcases are much tighter integrated downstream

@asl
Copy link
Contributor

asl commented Apr 4, 2024

turns out there was a small bug in def_use for ForInStatements setting reachable defs for the body of the loop (was unioning in the

Have you pushed these fixes? I do see only tests were added...

@ChrisDodd
Copy link
Contributor Author

Ok, there were some more issues with the flow analysis over loops that wer breaking the forloop2 test, so I've fixed that too.

Flow analysis over loops is non-trivial with a bunch of corner cases, so it is particularly good to have it in one place in the infrastructure rather than having to reimplement it for every visitor.

@ChrisDodd ChrisDodd force-pushed the cdodd-issue-1261 branch 2 times, most recently from 325675f to 7be3fc9 Compare April 8, 2024 02:02
@ChrisDodd ChrisDodd marked this pull request as ready for review April 8, 2024 02:02
@ChrisDodd ChrisDodd requested a review from fruffy April 8, 2024 02:02
@ChrisDodd
Copy link
Contributor Author

I'm still working on code to unroll loops to support targets that don't support loops natively, but this is already a pretty large change and has all the necessary support for targets that do suppoert loops.

@ChrisDodd
Copy link
Contributor Author

So loops in an action like this:

action a1(bit<8> x, bit<8> y) {
    // Nested loops
    foreach (bit<8> i in 1 .. x) {
      foreach (bit<8> j in i .. y)
        foo(x, y + i);
    }
}

where the loop count depends on an action data parameter (rather than a constant) can't be easily bounded (other than by the possible range of the type, so 255 here), so don't seem to fit in with the intent of p4lang/p4-spec#1261 or #4120. However, they do seem fine with general loops, which this PR supports. Unrolling this loop seems impractical, and would really require a different action for each possible loop count, deciding on which one to use based on the actual parameter value of any particular entry.

- allow more patterns of tests and increments
- deal properly with updates in the presence of break&continue
- single test to skip rest of loop after break rather than rechecking
  the flag every time.
- remove redundant inits of flags
@ChrisDodd ChrisDodd force-pushed the cdodd-issue-1261 branch 2 times, most recently from bcedfe3 to fa3380d Compare May 17, 2024 05:32
@ChrisDodd ChrisDodd added this pull request to the merge queue May 17, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 17, 2024
@ChrisDodd ChrisDodd added this pull request to the merge queue May 17, 2024
Merged via the queue into p4lang:main with commit b51b341 May 17, 2024
17 checks passed
@ChrisDodd ChrisDodd deleted the cdodd-issue-1261 branch May 17, 2024 08:52
Copy link
Contributor

@zzmic zzmic May 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @ChrisDodd. I was wondering what compiler you've used to compile the examples you provided (e.g., ./testdata/p4_16_samples/forloop1.p4) since it's apparent that ”BMV2 does not support loops” (so I assume that p4c-bm2-psa and p4c-bm2-ss won’t work). Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The p4test compiler compiles these loops

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Topics concerning the core segments of the compiler (frontend, midend, parser) p4-spec Topics related to the P4 specification (https://github.com/p4lang/p4-spec/).
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants