XLA Profiler integration #8014

Merged: 10 commits merged on Jun 28, 2021

Conversation

kaushikb11
Contributor

@kaushikb11 kaushikb11 commented Jun 17, 2021

What does this PR do?

The XLA Profiler can be approached in two ways:

  1. Programmatic Capture (on-demand profiling session)
  2. Manual capture via TensorBoard: open TensorBoard with the profile endpoint and click the “CAPTURE PROFILE” button in the upper left. Enter “localhost:9012” (the default) as the profile service URL (this is the address of the profiler server you started in the previous step), enter the number of milliseconds you’d like to profile for, and click “CAPTURE”. The page refreshes with insights once the capture has finished.

Programmatic capture would be much more fun, but with the current XLA API I am running into issues with xp.trace, and I am not really convinced it is worth hacking around them. That can be a second iteration.
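For reference, a rough sketch of what the two approaches look like against `torch_xla.debug.profiler`. This is not the code in this PR: the helper names (`profile_service_url`, `start_profiler_server`, `capture_trace`) are made up for illustration, and the sketch assumes `xp.start_server` and `xp.trace` behave as documented in torch_xla.

```python
import importlib.util

DEFAULT_PROFILER_PORT = 9012  # default port named in the description above


def profile_service_url(host: str = "localhost", port: int = DEFAULT_PROFILER_PORT) -> str:
    """Return the address to enter in TensorBoard's CAPTURE PROFILE dialog.

    Hypothetical helper, not part of torch_xla or Lightning.
    """
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"{host}:{port}"


def start_profiler_server(port: int = DEFAULT_PROFILER_PORT):
    """Start the torch_xla profiler server, or return None if torch_xla is missing.

    Hypothetical helper. The caller should keep the returned object alive,
    since the server is tied to its lifetime.
    """
    if importlib.util.find_spec("torch_xla") is None:
        return None  # not an XLA/TPU environment, nothing to start
    import torch_xla.debug.profiler as xp

    return xp.start_server(port)


def capture_trace(logdir: str, duration_ms: int = 2000,
                  port: int = DEFAULT_PROFILER_PORT) -> None:
    """Programmatic capture (approach 1), assuming a server is already running.

    Hypothetical helper. xp.trace blocks for the capture duration, so it
    would normally run in a separate thread or process while training goes on.
    """
    import torch_xla.debug.profiler as xp

    xp.trace(profile_service_url(port=port), logdir, duration_ms=duration_ms)
```

With manual capture (approach 2), only `start_profiler_server()` is needed in the training process; the value of `profile_service_url()` is what goes into the TensorBoard dialog.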

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@codecov

codecov bot commented Jun 17, 2021

Codecov Report

Merging #8014 (5c07863) into master (2a372e3) will decrease coverage by 5%.
The diff coverage is 41%.

@@           Coverage Diff           @@
##           master   #8014    +/-   ##
=======================================
- Coverage      93%     88%    -5%     
=======================================
  Files         211     212     +1     
  Lines       13440   13491    +51     
=======================================
- Hits        12474   11854   -620     
- Misses        966    1637   +671     

@kaushikb11 kaushikb11 self-assigned this Jun 17, 2021
@kaushikb11 kaushikb11 added the feature (Is an improvement or enhancement) label Jun 17, 2021
@kaushikb11 kaushikb11 marked this pull request as ready for review June 17, 2021 18:56
Contributor

@awaelchli awaelchli left a comment

If you update from master, it should pick up the fix for the failing docs build.

@awaelchli awaelchli added the profiler and accelerator: tpu (Tensor Processing Unit) labels Jun 18, 2021
@awaelchli awaelchli added this to the v1.4 milestone Jun 18, 2021
@mergify mergify bot removed the has conflicts label Jun 19, 2021
Contributor

@awaelchli awaelchli left a comment

Codecov probably won't reach these lines because the test is skipped. Not sure, but you may have to exclude them from coverage.

@kaushikb11 kaushikb11 added the ready (PRs ready to be merged) label Jun 28, 2021
Contributor

@tchaton tchaton left a comment

LGTM!

Member

@ethanwharris ethanwharris left a comment

Looks good 😃 There are some issues with the docs, though.

Member

@ethanwharris ethanwharris left a comment

LGTM 😃

@kaushikb11 kaushikb11 merged commit 2f3c65e into Lightning-AI:master Jun 28, 2021
@kaushikb11 kaushikb11 deleted the xla/profiler branch June 28, 2021 19:28
Labels
accelerator: tpu (Tensor Processing Unit), feature (Is an improvement or enhancement), profiler, ready (PRs ready to be merged)
6 participants