Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: Add guards to prevent OOB when streaming buffers with ConditionalDimension #2150

Merged
merged 5 commits into from
Jul 7, 2023

Conversation

ccuetom
Copy link
Contributor

@ccuetom ccuetom commented Jun 22, 2023

Using the time derivative of a streamed buffer with a ConditionalDimension results in OOB within init_to_host. This was not captured in existing tests.

A new test has been added test_gpu_common.py::test_streaming_multi_input_conddim_foward that fails with this OOB; and a new guard is now introduced in Buffering to prevent it.

for i in range(u.save):
u.data[i, :] = i

expr = u.dt2 + 3.*u.dt(x0=time_sub - time_sub.spacing)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this is just u.dtl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good to know - I adapted the code from the next test there

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, is there some standard cond dim cases that were broken as well non related to your issue?

@@ -589,7 +589,7 @@ def test_conddim_w_shifting():
op1 = Operator(eqns, opt='buffering')

# Check generated code
assert len(retrieve_iteration_tree(op1)) == 3
assert len(retrieve_iteration_tree(op1)) == 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what changed in the code since it seems to change existing tests not related to init_to_host

Copy link
Contributor Author

@ccuetom ccuetom Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests also generate initialisation code that is affected by the addition of the guard. The loop:

  if (time_m <= time_M)
  {
    /* Begin section0 */
    START_TIMER(section0)
    for (int db0 = 0; db0 <= 0; db0 += 1)
    {
      for (int x = x_m; x <= x_M; x += 1)
      {
        for (int y = y_m; y <= y_M; y += 1)
        {
          ub0[(db0 + (time_M / factor))%(1)][x][y] = u[db0 + (time_M / factor)][x][y];
        }
      }
    }
    STOP_TIMER(section0,timers)
    /* End section0 */
  }

becomes:

  if (time_m <= time_M)
  {
    /* Begin section0 */
    START_TIMER(section0)
    for (int db0 = 0; db0 <= 0; db0 += 1)
    {
      if (0 <= db0 + (time_M / factor))
      {
        for (int x = x_m; x <= x_M; x += 1)
        {
          for (int y = y_m; y <= y_M; y += 1)
          {
            ub0[(db0 + (time_M / factor))%(1)][x][y] = u[db0 + (time_M / factor)][x][y];
          }
        }
      }
    }
    STOP_TIMER(section0,timers)
    /* End section0 */
  }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that what causes the bump from 3 to 4 iteration nests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the iteration nest before adding the guard:

IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>, <WithProperties[affine,parallel]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel]::Iteration y; (y_m, y_M, 1)>)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>,)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration y; (y_m, y_M, 1)>)

and after adding the guard:

IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>,)
IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>, <WithProperties[affine,parallel]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel]::Iteration y; (y_m, y_M, 1)>)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>,)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration y; (y_m, y_M, 1)>)

Iteration tree IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>,) appears as a result. Not sure if this is expected or if you want me to probe into it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it's OK!

@ccuetom
Copy link
Contributor Author

ccuetom commented Jun 23, 2023

Looks good to me, is there some standard cond dim cases that were broken as well non related to your issue?

@mloubout Issue popped up with personal code that does something similar to this, so I haven't seen anything else that broke

@codecov
Copy link

codecov bot commented Jun 23, 2023

Codecov Report

Merging #2150 (746afeb) into master (3353683) will decrease coverage by 0.04%.
The diff coverage is 54.76%.

@@            Coverage Diff             @@
##           master    #2150      +/-   ##
==========================================
- Coverage   87.10%   87.07%   -0.04%     
==========================================
  Files         223      223              
  Lines       39893    39916      +23     
  Branches     5172     5173       +1     
==========================================
+ Hits        34750    34758       +8     
- Misses       4567     4580      +13     
- Partials      576      578       +2     
Impacted Files Coverage Δ
tests/test_gpu_common.py 1.44% <0.00%> (-0.04%) ⬇️
devito/passes/clusters/buffering.py 92.15% <100.00%> (+0.08%) ⬆️
tests/test_buffering.py 98.70% <100.00%> (ø)

... and 3 files with indirect coverage changes

@@ -177,6 +179,9 @@ def callback(self, clusters, prefix, cache=None):
lhs = b.indexed[dims]._subs(dim, b.firstidx.b)
rhs = b.function[dims]._subs(dim, b.firstidx.f)

if dim.is_Conditional:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about we place the guard regardless of whether it's a ConditionalDimension or a plain Dimension?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

second question.

Since db0 starts, by construction, at 0 and can only take positive values, how about this as a guard instead:

guards[pd] = GuardBound(0, b.firstidx.f - b.xd)  # aka: time_M / factor >= 0

then later on, you would have to use andg to chain this guard with the other time_M >= time_m guard

this would generalise to the no-ConditionalDimension case as well

this way, we would generate neater code in my opinion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something, but I think the guard does need to be on db0 - we need to check that db0 + (time_m / factor) - 1 >= 0 or db0 + (time_M / factor) - 1 >= 0 depending on whether it's forward or backward.

Using guards[pd] = GuardBound(0, b.firstidx.f - b.xd) # aka: time_M / factor >= 0 tests fail as expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right I just got confused

However, I still think this generalisation is feasible:

what about we place the guard regardless of whether it's a ConditionalDimension or a plain Dimension?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that should not be a problem. It will make more tests trip on the number of iteration trees, but happy to make the changes

@@ -226,8 +231,13 @@ def callback(self, clusters, prefix, cache=None):
ispace = b.readfrom
properties = c.properties.sequentialize(d)

guards = c.guards
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking: guards are typically computed before properties

Copy link
Contributor

@FabioLuporini FabioLuporini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all these changes. Appreciate that. Approved!

@mloubout
Copy link
Contributor

mloubout commented Jul 5, 2023

Rebase and GTG

@ccuetom ccuetom force-pushed the add-streaming-guards branch from 5903ad1 to 746afeb Compare July 5, 2023 14:49
@FabioLuporini
Copy link
Contributor

Many thanks! Merging...

@FabioLuporini FabioLuporini merged commit 474b10c into devitocodes:master Jul 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants