File total tokens added to report output #1036
That is a good idea; I will have a look at it. @jazerix could you run
@tsaglam Certainly! I've also renamed the two new variables to align with the current naming scheme.
One minor suggestion
core/src/main/java/de/jplag/reporting/jsonfactory/ComparisonReportWriter.java
@TwoOfTwelve could you take a look at this PR? Please resolve the conflict and, if possible, test whether it affects performance when run on a large dataset.
The code looks good to me.
I measured the runtime of the entire application on a set of test data. Without the changes it took 0.91012 seconds on average; with the changes it took 0.91279 seconds, a difference of about 2.7 ms (roughly 0.3%).
I don't think this will have a noticeable impact on performance.
Kudos, SonarCloud Quality Gate passed!
This PR adds the total token count of each file to the report output, so that you can see how much of each file is covered by a match. This is especially useful if you are not using the report viewer but are ingesting the report files in some other way.
Alternatively, this metric can be converted to a similarity using the same formula found in JPlagComparison.java.
I'm aware that it doesn't give you the complete picture, but having a sense of perspective on each file is nice.
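For anyone consuming the report files directly, here is a minimal sketch of how the new token totals could be used. The class and method names are hypothetical and not part of JPlag. The similarity formula is an assumption: it uses the symmetric form 2 * matched / (tokensFirst + tokensSecond), so please check JPlagComparison.java for the exact formula in your version.

```java
// Hypothetical helper, not part of JPlag: shows how per-file token totals
// from the report could be turned into coverage and similarity values.
final class TokenCoverage {
    private TokenCoverage() {
    }

    /** Fraction of a single file's tokens that are covered by matches (0.0 to 1.0). */
    static double coverage(int matchedTokens, int totalTokens) {
        if (totalTokens <= 0) {
            return 0.0; // an empty file has nothing to cover
        }
        return (double) matchedTokens / totalTokens;
    }

    /**
     * Symmetric similarity of two files.
     * ASSUMPTION: 2 * matched / (totalFirst + totalSecond); the formula in
     * JPlagComparison.java may differ between JPlag versions.
     */
    static double similarity(int matchedTokens, int totalFirst, int totalSecond) {
        int divisor = totalFirst + totalSecond;
        if (divisor == 0) {
            return 0.0; // avoid division by zero for two empty files
        }
        return 2.0 * matchedTokens / divisor;
    }

    public static void main(String[] args) {
        // Illustrative numbers only, not taken from a real report.
        int matched = 120;
        int totalFirst = 400;
        int totalSecond = 200;
        System.out.println("coverage of first file:  " + coverage(matched, totalFirst));
        System.out.println("coverage of second file: " + coverage(matched, totalSecond));
        System.out.println("similarity:              " + similarity(matched, totalFirst, totalSecond));
    }
}
```

The per-file coverage is the part that needs the new totals: the same number of matched tokens can cover a small share of a long file and a large share of a short one.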
Let me know what you think 😃