Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-94759: Collect C-level coverage using llvm-cov #94760

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

mdboom
Copy link
Contributor

@mdboom mdboom commented Jul 11, 2022

This adds a Github Action to collect C-level coverage and publish it to a Github Pages repository.

You can see an example coverage output.

There are two admin things that would need to happen for this to work:

  • Create a new repository in the python organization for the results, e.g. cpython-c-coverage
  • Create an organization-scoped token to publish to this repository and use it from this action

@mdboom mdboom requested a review from gpshead as a code owner July 11, 2022 19:31
@mdboom mdboom marked this pull request as draft July 11, 2022 19:32
Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be great to have! Publishing to github pages is interesting and is as good a place to start as any.

It is unfortunate that GitHub doesn't support rendering coverage metadata on top of their existing source tree view and within PR code views.

run: |
llvm-profdata-12 merge -sparse python*.profraw -o python.profdata
llvm-cov-12 show -format=html -output-dir=cpython-coverage -instr-profile=python.profdata -show-branches=count -show-regions python .
- name: Publish coverage-report
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does every run clobber the previous run's report? What about CI runs on PRs and CI runs from release branches? Can this be configured to commit the coverage results into a branch in the repo matching the reponame+branchname? Or do branches not render in gh-pages? in which case it'd need to be subdirectories in the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this clobbers the previous run. That is easily changed with a flag, but then we could run into Github repository size limits (each result is around 100MB of HTML). It's probably possible to keep the last N commits, but the tool I'm using to publish doesn't support that directly.

This is currently configured to just run on the main branch once a day. We could do multiple branches that publish to subdirectories if you think there's a good use case for that. (Github pages only publishes a single branch, but subdirectories would work).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we would get some funding to run a persistent VM with public hostname from ... let's say Azure, then it would be trivial to create a buildbot worker and serve the LCOV results from HTTP server. They are static HTTP and JS files on the file system after all. Just saying :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind main branch only and once a day. Anyone working on coverage is presumably doing their own local coverage runs while creating PRs.

Regarding a buildbot configured to host the results, while I could simply set one up it'd probably make more sense for mdboom to do that and be an admin given who's driving this work. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd be happy to admin that if it comes to it. I think it's fine to go with the Github Action here as an MVP, and if the amount of history or frequency of runs isn't good enough, we can revisit migrating to our own VM down the road.

@arhadthedev
Copy link
Member

What is the benefit to run the suit daily? If the report is used as a guide for additions into the tests, a weekly run seems to be enough.

@brettcannon
Copy link
Member

What is the benefit to run the suite daily? If the report is used as a guide for additions into the tests, a weekly run seems to be enough.

If it's daily then it's up-to-date more than the weekly. More up-to-date is better than less up-to-date 😉. There's also the benefit that people will have a better idea of what to work on in terms of coverage than not as someone may put in a lot of work for something that's already been fixed earlier in the week.

Not to say that a weekly coverage run isn't good as well, but daily is just better if we can make it happen.

@tiran
Copy link
Member

tiran commented Jul 12, 2022

Is there any particular reasons why you are using clang and custom instructions for LCOV instead of make coverage-report? The latter uses gcc and lcov tool.

@gpshead
Copy link
Member

gpshead commented Jul 12, 2022

Is there any particular reasons why you are using clang and custom instructions for LCOV instead of make coverage-report? The latter uses gcc and lcov tool.

Which one produces more useful output? (I really don't care which we use even though everything I do at work is clang/llvm based - I just want line and branch coverage data)

@brettcannon
Copy link
Member

@tiran the idea of using clang comes from faster-cpython/ideas#426 where I believe branch coverage was considered the big win.

Maybe coverage-report should get updated to use LLVM? Or make it configurable for either clang or gcc?

@tiran
Copy link
Member

tiran commented Jul 12, 2022

GCC and LCOV can do branch coverge, too. I'll have a branch ready by tomorrow.

I would welcome to have LLVM coverage support in Makefile, too.

@mdboom
Copy link
Contributor Author

mdboom commented Jul 12, 2022

Is there any particular reasons why you are using clang and custom instructions for LCOV instead of make coverage-report? The latter uses gcc and lcov tool.

It was an attempt to get branch coverage, not realizing lcov could do it. I still think llvm-cov displays branches within subexpressions in a more readable way, however. I don't know if using gcov for collection + llvm-cov for output would give us the best of both worlds. (Using gcc for the collection would be preferable, probably, since that's what we test with on Linux).

@mdboom mdboom closed this Jul 12, 2022
@mdboom mdboom reopened this Jul 12, 2022
@mdboom
Copy link
Contributor Author

mdboom commented Jul 13, 2022

I don't know if using gcov for collection + llvm-cov for output would give us the best of both worlds.

It doesn't look like this is possible. There is an llvm-cov gcov command that takes the gcc coverage data as input and outputs human-readable text file reports. But I can't find a way to use gcc coverage data and output the nice llvm-cov html reports.

@brettcannon wrote:

Maybe coverage-report should get updated to use LLVM? Or make it configurable for either clang or gcc?

@tiran wrote:

I would welcome to have LLVM coverage support in Makefile, too.

I think that's a good idea, but a fair chunk of fiddly work. The Makefile doesn't have any direct support for building with clang on Linux. It works if you set CC, obviously, but you have to do that prior to re-running configure. So I don't think it's possible to just have make coverage-report-llvm work (since the Makefile can't call configure). We could, probably, make a task that errors out if clang isn't being used.

@brettcannon
Copy link
Member

It works if you set CC, obviously, but you have to do that prior to re-running configure

Can you have the coverage-report target do the right thing based on what CC was set to?

@mdboom
Copy link
Contributor Author

mdboom commented Jul 13, 2022

I just refactored this so there is a new Makefile target coverage-report-llvm that uses llvm-cov to generate a coverage report -- and then the Github Action just uses that.

Can you have the coverage-report target do the right thing based on what CC was set to?

I hadn't thought of this -- it's probably doable. Given that you have to completely reconfigure to switch between each "mode" that probably makes sense.

@mdboom
Copy link
Contributor Author

mdboom commented Jul 13, 2022

I've updated this to automatically choose lcov or llvm-cov depending on the compiler in use.

@mdboom mdboom requested review from gpshead and tiran July 13, 2022 18:57
@mdboom mdboom force-pushed the c-level-coverage branch 2 times, most recently from eb15335 to 58e38af Compare July 18, 2022 14:33
configure.ac Outdated Show resolved Hide resolved
@mdboom
Copy link
Contributor Author

mdboom commented Aug 24, 2022

@erlend-aasland wrote:

Also: please don't force push :) The GitHub UI don't play well with force-pushes. Using git merge --no-ff main makes for a better review experience.

I tried this and unfortunately it created a bunch of co-authored commits for everything that had happened in the meantime and then requested review from ~20 codeowners. I went back to rebasing/force pushing for a cleaner review of the actually relevant commits. Going to read the relevant parts of the devguide for next time...

@mdboom
Copy link
Contributor Author

mdboom commented Oct 31, 2022

Are there any additional comments on this?

@brettcannon brettcannon removed their request for review November 25, 2022 22:20
@encukou
Copy link
Member

encukou commented Mar 19, 2024

Create a new repository in the python organization for the results, e.g. cpython-c-coverage

There's a process for this now.
I don't think this repo needs to start under a personal GitHub account -- moving the config and creating a new token would be more trouble than it's worth. That leaves asking the SC as the next step, see here for example: python/steering-council#236

Adding LLVM support to coverage-report would work better as a separate PR, but I won't block on that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants