Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: Add skewing pass towards Temporal Blocking #1620

Merged
merged 17 commits into from
Apr 7, 2021
Merged

Conversation

georgebisbas
Copy link
Contributor

@georgebisbas georgebisbas commented Mar 5, 2021

This is the first step towards Temporal Blocking automation.

This PR aims to add skewing as a pass to:

  • change iteration bounds
  • skew accesses

A new Queue subclass is implemented.

  • skewinner option is added to skew or not the innermost loop

No changes are happening in the computation order. Nothing is optimized.
Codegen tests have been added, covering simple time-stepping computations, autotuning, and subdomains.

@georgebisbas georgebisbas added WIP Still work in progress RFC compiler labels Mar 5, 2021
@georgebisbas georgebisbas self-assigned this Mar 5, 2021
@georgebisbas georgebisbas force-pushed the tilable branch 2 times, most recently from 0750bc9 to 299d305 Compare March 5, 2021 17:34
Copy link
Contributor

@FabioLuporini FabioLuporini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, have you considered using an implicit tree visitors for Clusters (ie a subclass of Queue) as for example done here https://github.com/devitocodes/devito/blob/master/devito/passes/clusters/blocking.py#L12

?

A few questions:

  • Did you understand how a Queue works as for example in Blocking?
  • Did you think it's not necessary here? As in, do you think that working on individual clusters in isolation is the way to go? (Haven't really thought about it, honestly, for this specific pass)

The other key comment I have is the following: when you implement a pass (and, in general, any algorithm in your life), you need to think not once, not ten times, but better one hundred times about the complexity of your implementation. Useful questions to ask yourself to drive the implementation "are all these lists/dict/... really necessary?" , "can I get away with fewer data structures and/or loops?" , etc. For example here you have both skew_dims and skewable -- that doesn't make much sense to me. Also remove from skewable is hardly necessary... I'm fairly sure there's a neater way of achieving this. You also have two nested ifs... first one is if i.dim not in skew_dims while the nested one is if index < .... and i.dim in skewable . I hardly believe that comparing indices and checking again for the presence of a dimension in skewable is the simplest way to go

.github/workflows/examples.yml Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/core/cpu.py Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
tests/test_skewing.py Outdated Show resolved Hide resolved
tests/test_skewing.py Outdated Show resolved Hide resolved
tests/test_skewing.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/core/cpu.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
skewable.append(i.dim)

if len(skew_dims) > 1:
raise warning("More than 1 dimensions that can be skewed.\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No nested skewing? :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning or error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No nested skewing? :D

why not?

devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
tests/test_skewing.py Show resolved Hide resolved
tests/test_skewing.py Outdated Show resolved Hide resolved
tests/test_skewing.py Outdated Show resolved Hide resolved
devito/core/cpu.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
skewable.append(i.dim)

if len(skew_dims) > 1:
raise warning("More than 1 dimensions that can be skewed.\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No nested skewing? :D

why not?

devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
devito/passes/clusters/temporal.py Outdated Show resolved Hide resolved
tests/test_ir.py Outdated Show resolved Hide resolved
tests/test_skewing.py Outdated Show resolved Hide resolved
@georgebisbas georgebisbas force-pushed the tilable branch 5 times, most recently from d0f6636 to b611b90 Compare March 17, 2021 07:55
@devitocodes devitocodes deleted a comment from codecov bot Mar 17, 2021
@codecov
Copy link

codecov bot commented Mar 17, 2021

Codecov Report

Merging #1620 (25d76cc) into master (48069da) will increase coverage by 0.04%.
The diff coverage is 86.45%.

❗ Current head 25d76cc differs from pull request most recent head 6436702. Consider uploading reports for the commit 6436702 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1620      +/-   ##
==========================================
+ Coverage   86.50%   86.55%   +0.04%     
==========================================
  Files         216      216              
  Lines       32479    32598     +119     
  Branches     4279     4296      +17     
==========================================
+ Hits        28097    28215     +118     
+ Misses       3901     3899       -2     
- Partials      481      484       +3     
Impacted Files Coverage Δ
benchmarks/user/advisor/roofline.py 0.00% <0.00%> (ø)
benchmarks/user/advisor/run_advisor.py 0.00% <ø> (ø)
devito/ir/clusters/cluster.py 97.74% <ø> (ø)
examples/misc/linalg.py 0.00% <0.00%> (ø)
devito/passes/iet/misc.py 91.56% <50.00%> (-3.31%) ⬇️
devito/passes/clusters/blocking.py 98.49% <96.36%> (-1.51%) ⬇️
tests/test_skewing.py 98.91% <98.91%> (ø)
devito/core/cpu.py 100.00% <100.00%> (ø)
devito/core/gpu.py 94.70% <100.00%> (+0.17%) ⬆️
devito/ir/clusters/analysis.py 95.78% <100.00%> (-2.02%) ⬇️
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48069da...6436702. Read the comment docs.

@georgebisbas georgebisbas requested a review from mloubout March 17, 2021 08:25
@georgebisbas georgebisbas changed the title [RFC] Add skewing pass towards Temporal Blocking Add skewing pass towards Temporal Blocking Mar 17, 2021
@georgebisbas georgebisbas removed the WIP Still work in progress label Mar 17, 2021
@georgebisbas georgebisbas marked this pull request as ready for review March 17, 2021 10:19
@georgebisbas georgebisbas removed the RFC label Mar 17, 2021
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
tests/test_autotuner.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/core/cpu.py Outdated Show resolved Hide resolved
devito/core/gpu.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
intervals.append(i)
# Since we are here, prefix is skewable and nested under a
# SEQUENTIAL loop. Do not skew innermost loop
# TODO: In case of subdomains or perfect loops nests with more
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether this should rather be fixed in this PR

devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/core/gpu.py Outdated Show resolved Hide resolved
devito/core/gpu.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Outdated Show resolved Hide resolved
devito/passes/clusters/blocking.py Show resolved Hide resolved
------
In case of skewing, if 'blockinner' is enabled, the innermost loop is also skewed.
"""
processed = preprocess(clusters, options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was pointed out a few times, "processed" is only at the end, when it needs no more processing... anyway, nitpicking

@@ -88,9 +98,14 @@ def callback(self, clusters, prefix):
exprs = [uxreplace(e, {d: bd}) for e in c.exprs]

# The new Cluster properties
# TILABLE property is dropped after the blocking.
Copy link
Contributor

@FabioLuporini FabioLuporini Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# comments don't take full stop at end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after the blocking -> after blocking

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds like a useless comment though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -103,6 +118,28 @@ def callback(self, clusters, prefix):
return processed


def preprocess(clusters, options):
# Preprocess: heuristic: drop TILABLE from innermost Dimensions to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could maybe be turned into a docstring now that preprocess has its own function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor

@FabioLuporini FabioLuporini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some nitpicks, addressable in the next PR

This now looks great ! Thanks for the patience :)

@FabioLuporini FabioLuporini changed the title Add skewing pass towards Temporal Blocking compiler: Add skewing pass towards Temporal Blocking Apr 7, 2021
@FabioLuporini FabioLuporini merged commit cc15576 into master Apr 7, 2021
@FabioLuporini FabioLuporini deleted the tilable branch April 7, 2021 08:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants