compiler: Add blockrelax tests and refresh advisor profiling #1929

Merged
merged 2 commits into from
Jun 7, 2022

Conversation

georgebisbas
Contributor

  • Avoid autotuning when children's block shapes are the same as the parent's block shapes
  • Slightly refresh advisor

@georgebisbas georgebisbas self-assigned this May 31, 2022
- # We must be able to do thread pinning, otherwise any results would be
- # meaningless. Currently, we only support doing that via numactl
+ # Thread pinning is strongly recommended for reliable results.
+ # We support thread pinning via numactl
Contributor

or KMP_AFFINITY?

Contributor

"We support" also doesn't make much sense. We don't support anything; thread/process placement is up to the user (though we might one day devise an option to do that automatically, at least for OpenMP, since there are special for-loop clauses that allow you to specify the thread affinity for that loop. But not today).

KMP_AFFINITY is also Intel-specific.

The OpenMP standard env vars for thread pinning are OMP_PLACES and OMP_PROC_BIND, IIRC.
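
For readers who want to try the standard variables mentioned above, here is a minimal sketch of preparing a pinned environment before launching a benchmark. The helper name `pinned_env` and the chosen values (`cores`, `close`) are assumptions for illustration and are not part of this PR; the best settings depend on the machine.

```python
import os


def pinned_env(num_threads, base_env=None):
    """Return a copy of the environment with standard OpenMP pinning variables set.

    Hypothetical helper: nothing in this PR defines it.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["OMP_PLACES"] = "cores"      # one place per physical core
    env["OMP_PROC_BIND"] = "close"   # bind threads to places near the primary thread
    return env


env = pinned_env(4, base_env={})
# Hypothetical usage: subprocess.run(["python", "benchmark.py"], env=env)
```

Passing the dictionary to a child process keeps the pinning choice with the user, in line with the point that placement is not something the tool itself controls.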

Contributor Author

I rephrased it a bit. Hope it's better now.

- stepper = None
- timesteps = 1
- elif len(steppers) == 1:
+ if len(steppers) == 1:
Contributor

why remove that 0 case?

Contributor

@FabioLuporini FabioLuporini Jun 1, 2022

If CI doesn't fail, either the test suite is flawed or we somehow changed things over time such that it's never the case.

Now, the second case seems unlikely to me... one could definitely try auto-tuning without a time loop. Hence, the test suite requires an update!

Contributor Author

The 0 case is removed since Devito does not apply loop blocking to non-time-iterative computations, per this heuristic:

# Heuristic: TILABLE not worth it if not within a SEQUENTIAL Dimension

CI passes.

Idea: blockrelax should allow blocking non-time-iterative loops, right? It skips the heuristics. Then we need to add some tests for this.

E.g. blockrelax on the matrix multiplication example?
I need to check.
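
To make concrete what blocking a loop nest with no time loop looks like, here is a plain-Python sketch of a tiled matrix multiplication checked against an untiled one. This is not Devito code and not what blockrelax generates; the function names and the block size `bs` are assumptions chosen for illustration.

```python
def matmul_naive(a, b):
    """Reference triple loop, no blocking."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_blocked(a, b, bs=2):
    """Same computation with the i and j loops split into blocks of size bs."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, bs):          # loop over row blocks
        for j0 in range(0, m, bs):      # loop over column blocks
            # min() handles remainder blocks when bs does not divide n or m
            for i in range(i0, min(i0 + bs, n)):
                for j in range(j0, min(j0 + bs, m)):
                    for p in range(k):
                        c[i][j] += a[i][p] * b[p][j]
    return c


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
assert matmul_blocked(a, b, bs=2) == matmul_naive(a, b)
```

A structural test of the kind mentioned in this thread would presumably inspect the generated loop nest rather than only compare results, which this sketch does not attempt.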

Contributor

sounds good to me yeah

Contributor Author

Tests updated with blockrelax in linear algebra.
Still need to add some tests that check the structure too.

devito/core/autotuning.py (resolved)
@codecov

codecov bot commented May 31, 2022

Codecov Report

Merging #1929 (5e0f6d0) into master (5e0f6d0) will not change coverage.
The diff coverage is n/a.

❗ Current head 5e0f6d0 differs from pull request most recent head 6b45b07. Consider uploading reports for the commit 6b45b07 to get more accurate results

@@           Coverage Diff           @@
##           master    #1929   +/-   ##
=======================================
  Coverage   89.60%   89.60%           
=======================================
  Files         211      211           
  Lines       35941    35941           
  Branches     5414     5414           
=======================================
  Hits        32205    32205           
  Misses       3232     3232           
  Partials      504      504           


@FabioLuporini FabioLuporini changed the title from "compiler: reduce autotuning" to "compiler: Reduce autotuning" Jun 1, 2022
@FabioLuporini
Contributor

if all(v <= i and i % v == 0 for _, i in bs):
# To be a valid block size, it must be smaller than
# and divide evenly the parent's block size.
# Blocksizes equal to the parent's block size are not included
Contributor

Also, why not? It would be sort of like defaulting to the non-hierarchical case, no?

Contributor Author

I would expect that someone who asks for >1 levels would not be interested in that...

Contributor

why not?

Contributor

as in, why would you remove a bunch of points from the exploration space?

Contributor Author

Well, in order to reduce the time needed. But I agree that for typical space blocking this is cheap, so I am happy to bring them back.
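
The trade-off in this thread can be sketched as follows: a candidate child block size `v` is kept if it divides every parent block size, and the only question is whether sizes equal to the parent's are kept too. The helper `is_valid_child`, its `include_equal` flag, and the dimension names are hypothetical; only the `all(...)` condition mirrors the snippet under review.

```python
def is_valid_child(v, parent_sizes, include_equal=True):
    """Check a candidate child block size against all parent block sizes.

    parent_sizes mimics `bs` in the reviewed snippet: (name, size) pairs.
    Hypothetical helper, not the Devito implementation.
    """
    if include_equal:
        # Condition as in the reviewed line: no larger than, and evenly dividing
        return all(v <= i and i % v == 0 for _, i in parent_sizes)
    # Stricter variant: sizes equal to the parent's are excluded
    return all(v < i and i % v == 0 for _, i in parent_sizes)


parent_sizes = [("x0_blk0", 16), ("y0_blk0", 32)]  # made-up names and sizes
candidates = [2, 3, 4, 8, 16, 32]

with_equal = [v for v in candidates if is_valid_child(v, parent_sizes)]
strict = [v for v in candidates
          if is_valid_child(v, parent_sizes, include_equal=False)]
print(with_equal)  # [2, 4, 8, 16]
print(strict)      # [2, 4, 8]
```

In this toy case, excluding equal sizes removes a single point from the search space, which is consistent with the observation that keeping them is cheap for typical space blocking.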

@georgebisbas georgebisbas changed the title from "compiler: Reduce autotuning" to "compiler: Add blockrelax tests and refresh advisor profiling" Jun 6, 2022
Contributor

@FabioLuporini FabioLuporini left a comment

All good here

@mloubout mloubout merged commit e93792a into master Jun 7, 2022
@mloubout mloubout deleted the reduce_autotuning branch June 7, 2022 13:56
3 participants