File total tokens added to report output #1036

jazerix · 2023-04-20T15:59:40Z

This PR adds the total token count of each file to the report output, so that you get an idea of how much of each file is covered by a match. This is especially useful if you are not using the report viewer but ingesting the files in other ways.

The token counts can also be converted to a similarity using the same formula as in JPlagComparison.java.
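As a rough illustration of that conversion, here is a minimal sketch. It assumes the average similarity is computed as twice the matched tokens divided by the sum of both files' token counts; the PR only points to JPlagComparison.java and does not spell the formula out, so check that file before relying on this. The class and method names are hypothetical and not part of JPlag.

```java
/** Hypothetical helper for illustration only; not part of JPlag. */
final class TokenCoverageSketch {
    private TokenCoverageSketch() {
    }

    /** Fraction of a single file's tokens that are covered by matches. */
    static double coverage(int matchedTokens, int fileTotalTokens) {
        if (fileTotalTokens <= 0) {
            return 0.0; // avoid division by zero for empty files
        }
        return (double) matchedTokens / fileTotalTokens;
    }

    /** Assumed average similarity: 2 * matched / (tokensA + tokensB). */
    static double averageSimilarity(int matchedTokens, int totalTokensA, int totalTokensB) {
        int divisor = totalTokensA + totalTokensB;
        if (divisor <= 0) {
            return 0.0; // both files empty
        }
        return (2.0 * matchedTokens) / divisor;
    }
}
```

With 30 matched tokens and files of 100 and 50 tokens, the per-file coverage would be 0.3 and 0.6, and the assumed average similarity 0.4.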

I'm aware that it doesn't give you the complete picture, but having a sense of perspective on each file is nice.

Let me know what you think 😃

tsaglam · 2023-04-21T07:40:48Z

That is a good idea; I will have a look at it.

@jazerix could you run mvn spotless:apply to fix your formatting?

jazerix · 2023-04-21T09:39:04Z

@tsaglam Certainly! I've also renamed the two new variables to align with the current naming scheme.

dfuchss

One minor suggestion

core/src/main/java/de/jplag/reporting/jsonfactory/ComparisonReportWriter.java

tsaglam · 2023-05-10T11:54:13Z

@TwoOfTwelve could you take a look at this PR? Please resolve the conflict and, if possible, test whether it affects performance when run on a large dataset.

TwoOfTwelve

The code looks good to me.

I measured the runtime of the entire application on a set of data. Without the changes it took 0.91012 seconds on average; with the changes, 0.91279 seconds (a difference of about 0.3%).
I don't think this will have a noticeable impact on performance.

sonarcloud · 2023-05-10T15:03:35Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
No Duplication information

File total tokens added to report output

9ebf6f5

tsaglam added the enhancement and minor labels Apr 21, 2023

Renamed variables and ran mvn:spotless

f2aae56

dfuchss reviewed May 1, 2023

core/src/main/java/de/jplag/reporting/jsonfactory/ComparisonReportWriter.java

Switched to Objects.equals

4622601

dfuchss approved these changes May 1, 2023

tsaglam requested a review from TwoOfTwelve May 10, 2023 11:52

TwoOfTwelve and others added 2 commits May 10, 2023 14:37

Merge branch 'develop' into develop

9f22784

Fixed formatting for spotless.

e060624

TwoOfTwelve approved these changes May 10, 2023

tsaglam merged commit 53253e5 into jplag:develop May 10, 2023

jazerix deleted the develop branch May 14, 2023 19:40

Kr0nox mentioned this pull request Feb 13, 2024

remove duplicate info about token count #1555

Merged
