Merge pull request #2191 from jplag/develop
Merge develop into main
Showing 328 changed files with 9,438 additions and 6,060 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
import os | ||
import xml.etree.ElementTree as ET | ||
|
||
def get_all_pom_files(): | ||
    pom_files = [] | ||
    for root, dirs, files in os.walk("../../.."): | ||
        for file in files: | ||
            if file == "pom.xml": | ||
                pom_files.append(os.path.join(root, file)) | ||
    return pom_files | ||
|
||
# get content from a file as a string | ||
def get_file_content(file): | ||
    with open(file, "r") as f: | ||
        return f.read() | ||
|
||
# extract xml field artifact id from string | ||
def extract_artifact_id(xml): | ||
    root = ET.fromstring(xml) | ||
    return root.find("{http://maven.apache.org/POM/4.0.0}artifactId").text | ||
|
||
excluded_artifacts = ["coverage-report", "aggregator", "languages"] | ||
artifact_ids = [extract_artifact_id(get_file_content(file)) for file in get_all_pom_files()] | ||
print("All artifacts: " + str(artifact_ids)) | ||
filtered_artifact_ids = [artifact_id for artifact_id in artifact_ids if artifact_id not in excluded_artifacts] | ||
|
||
coverage_report_pom = "" | ||
with open("../../../coverage-report/pom.xml", "r") as f: | ||
    coverage_report_pom = f.read() | ||
xml = ET.fromstring(coverage_report_pom) | ||
coverage_report_artifacts = [dependency.find("{http://maven.apache.org/POM/4.0.0}artifactId").text for dependency in xml.find("{http://maven.apache.org/POM/4.0.0}dependencies").findall("{http://maven.apache.org/POM/4.0.0}dependency")] | ||
print("Coverage report artifacts: " + str(coverage_report_artifacts)) | ||
|
||
only_in_coverage_report = [artifact_id for artifact_id in coverage_report_artifacts if artifact_id not in filtered_artifact_ids] | ||
print("Only in coverage report: " + str(only_in_coverage_report)) | ||
not_in_coverage_report = [artifact_id for artifact_id in filtered_artifact_ids if artifact_id not in coverage_report_artifacts] | ||
print("Not in coverage report: " + str(not_in_coverage_report)) | ||
|
||
if len(not_in_coverage_report) > 0: | ||
    raise Exception("Some artifacts are not in the coverage report: " + str(not_in_coverage_report)) | ||
if len(only_in_coverage_report) > 0: | ||
    raise Exception("Some artifacts are only in the coverage report: " + str(only_in_coverage_report)) |
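The check performed by the script above boils down to comparing two sets of artifact IDs: the modules found in the repository and the dependencies listed in the coverage-report POM. The following is a minimal sketch of that idea, not the committed script. It assumes in-memory POM strings (`MODULE_POM`, `REPORT_POM`) and module names such as `stale-module` that are invented purely for illustration, and it uses a namespace map instead of repeating the full namespace URI.

```python
import xml.etree.ElementTree as ET

# Maven POM namespace, the same URI the script above spells out in full.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def artifact_id(pom_xml: str) -> str:
    """Return the project's own artifactId (a direct child of <project>)."""
    return ET.fromstring(pom_xml).find("m:artifactId", POM_NS).text


def dependency_ids(pom_xml: str) -> set[str]:
    """Return the artifactIds listed under <dependencies>."""
    root = ET.fromstring(pom_xml)
    return {dep.find("m:artifactId", POM_NS).text
            for dep in root.findall("m:dependencies/m:dependency", POM_NS)}


def compare(modules: set[str], reported: set[str]) -> tuple[set[str], set[str]]:
    """Return (modules missing from the report, entries only in the report)."""
    return modules - reported, reported - modules


# Hypothetical POM contents, for illustration only.
MODULE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <parent><artifactId>aggregator</artifactId></parent>
  <artifactId>cli</artifactId>
</project>"""

REPORT_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>coverage-report</artifactId>
  <dependencies>
    <dependency><artifactId>cli</artifactId></dependency>
    <dependency><artifactId>stale-module</artifactId></dependency>
  </dependencies>
</project>"""

# "core" stands in for a second module that the report forgot to list.
missing, extra = compare({artifact_id(MODULE_POM), "core"},
                         dependency_ids(REPORT_POM))
print(sorted(missing), sorted(extra))
```

Because `find` with a plain tag path only inspects direct children, the `artifactId` nested inside `<parent>` is not mistaken for the module's own ID. Using sets also makes both directions of the check a single difference operation.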
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
name: Check that all dependencies are in coverage report | ||
|
||
on: | ||
  workflow_dispatch: | ||
  push: | ||
    paths: | ||
      - ".github/workflows/verify-coverage-report.yml" | ||
      - "./scripts/checkCoverage.py" | ||
      - "**/pom.xml" | ||
  pull_request: | ||
    types: [opened, synchronize, reopened] | ||
    paths: | ||
      - ".github/workflows/verify-coverage-report.yml" | ||
      - "./scripts/checkCoverage.py" | ||
      - "**/pom.xml" | ||
|
||
jobs: | ||
  check_coverage: | ||
    runs-on: ubuntu-latest | ||
|
||
    steps: | ||
      - name: Checkout 🛎️ | ||
        uses: actions/checkout@v4 | ||
|
||
      - name: Run script | ||
        working-directory: .github/workflows/scripts | ||
        run: | | ||
          python checkCoverage.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,18 +2,17 @@ | |
<img alt="JPlag logo" src="core/src/main/resources/de/jplag/logo-dark.png" width="350"> | ||
</p> | ||
|
||
# JPlag - Detecting Software Plagiarism | ||
# JPlag - Detecting Source Code Plagiarism | ||
[](https://github.com/jplag/jplag/actions/workflows/maven.yml) | ||
[](https://github.com/jplag/jplag/releases/latest) | ||
[](https://maven-badges.herokuapp.com/maven-central/de.jplag/jplag) | ||
[](https://github.com/jplag/jplag/blob/main/LICENSE) | ||
[](https://github.com/jplag/JPlag/pulse) | ||
[](https://sonarcloud.io/component_measures?metric=Coverage&view=list&id=jplag_JPlag) | ||
[](https://jplag.github.io/JPlag/) | ||
[](#download-and-installation) | ||
|
||
|
||
JPlag finds pairwise similarities among a set of multiple programs. It can reliably detect software plagiarism and collusion in software development, even when obfuscated. All similarities are calculated locally, and no source code or plagiarism results are ever uploaded to the internet. JPlag supports a large number of programming and modeling languages. | ||
JPlag finds pairwise similarities among a set of multiple programs. It can reliably detect software plagiarism and collusion in software development, even when obfuscated. All similarities are calculated locally; no source code or plagiarism results are ever uploaded online. JPlag supports a large number of programming and modeling languages. | ||
|
||
* 📈 [JPlag Demo](https://jplag.github.io/Demo/) | ||
|
||
|
@@ -46,14 +45,14 @@ All supported languages and their supported versions are listed below. | |
| [EMF Metamodel](https://www.eclipse.org/modeling/emf/) | 2.25.0 | emf | beta | EMF | | ||
| [EMF Model](https://www.eclipse.org/modeling/emf/) | 2.25.0 | emf-model | alpha | EMF | | ||
| [SCXML](https://www.w3.org/TR/scxml/) | 1.0 | scxml | alpha | XML | | ||
| Text (naive) | - | text | legacy | CoreNLP | | ||
| Text (naive, use with caution) | - | text | legacy | CoreNLP | | ||
|
||
## Download and Installation | ||
You need Java SE 21 to run or build JPlag. | ||
|
||
### Downloading a release | ||
* Download a [released version](https://github.com/jplag/jplag/releases). | ||
* In case you depend on the legacy version of JPlag we refer to the [legacy release v2.12.1](https://github.com/jplag/jplag/releases/tag/v2.12.1-SNAPSHOT) and the [legacy branch](https://github.com/jplag/jplag/tree/legacy). | ||
* In case you depend on the legacy version of JPlag, we refer to the [legacy release v2.12.1](https://github.com/jplag/jplag/releases/tag/v2.12.1-SNAPSHOT) and the [legacy branch](https://github.com/jplag/jplag/tree/legacy). | ||
|
||
### Via Maven | ||
JPlag is released on [Maven Central](https://search.maven.org/search?q=de.jplag), it can be included as follows: | ||
|
@@ -73,64 +72,98 @@ JPlag is released on [Maven Central](https://search.maven.org/search?q=de.jplag) | |
3. You will find the generated JARs in the subdirectory `cli/target`. | ||
|
||
## Usage | ||
JPlag can either be used via the CLI or directly via its Java API. For more information, see the [usage information in the wiki](https://github.com/jplag/JPlag/wiki/1.-How-to-Use-JPlag). If you are using the CLI, you can display your results via [jplag.github.io](https://jplag.github.io/JPlag/). No data will leave your computer! | ||
JPlag can either be used via the CLI or directly via its Java API. For more information, see the [usage information in the wiki](https://github.com/jplag/JPlag/wiki/1.-How-to-Use-JPlag). If you are using the CLI, the report viewer UI will launch automatically. No data will leave your computer! | ||
|
||
### CLI | ||
*Note that the [legacy CLI](https://github.com/jplag/jplag/blob/legacy/README.md) is varying slightly.* | ||
The language can either be set with the -l parameter or as a subcommand (`jplag [jplag options] <language name> [language options]`). A subcommand takes priority over the -l option. | ||
When using the subcommand, language-specific arguments can be set. A list of language-specific options can be obtained by requesting the help page of a subcommand (e.g. `jplag java -h`). | ||
Language-specific arguments can be set when using the subcommand. A list of language-specific options can be obtained by requesting the help page of a subcommand (e.g., `jplag java -h`). | ||
|
||
``` | ||
Parameter descriptions: | ||
[root-dirs[,root-dirs...]...] | ||
Root-directory with submissions to check for plagiarism. | ||
Root-directory with submissions to check for | ||
plagiarism. If mode is set to VIEW, this parameter | ||
can be used to specify a report file to open. In that | ||
case only a single file may be specified. | ||
-bc, --bc, --base-code=<baseCode> | ||
Path to the base code directory (common framework used in all submissions). | ||
-l, --language=<language> | ||
Select the language of the submissions (default: java). See subcommands below. | ||
-M, --mode=<{RUN, VIEW, RUN_AND_VIEW}> | ||
The mode of JPlag: either only run analysis, only open the viewer, or do both (default: null) | ||
-n, --shown-comparisons=<shownComparisons> | ||
The maximum number of comparisons that will be shown in the generated report, if set to -1 all comparisons will be shown (default: 500) | ||
Path to the base code directory (common framework used | ||
in all submissions). | ||
-l, --language=<language> | ||
Select the language of the submissions (default: java). | ||
See subcommands below. | ||
-M, --mode=<{RUN, VIEW, RUN_AND_VIEW, AUTO}> | ||
The mode of JPlag. One of: RUN, VIEW, RUN_AND_VIEW, | ||
AUTO (default: null). If VIEW is chosen, you can | ||
optionally specify a path to an existing report. | ||
-n, --shown-comparisons=<shownComparisons> | ||
The maximum number of comparisons that will be shown in | ||
the generated report, if set to -1 all comparisons | ||
will be shown (default: 2500) | ||
-new, --new=<newDirectories>[,<newDirectories>...] | ||
Root-directories with submissions to check for plagiarism (same as root). | ||
--normalize Activate the normalization of tokens. Supported for languages: Java, C++. | ||
Root-directories with submissions to check for | ||
plagiarism (same as root). | ||
--normalize Activate the normalization of tokens. Supported for | ||
languages: Java, C++. | ||
-old, --old=<oldDirectories>[,<oldDirectories>...] | ||
Root-directories with prior submissions to compare against. | ||
-r, --result-file=<resultFile> | ||
Name of the file in which the comparison results will be stored (default: results). Missing .zip endings will be automatically added. | ||
-t, --min-tokens=<minTokenMatch> | ||
Tunes the comparison sensitivity by adjusting the minimum token required to be counted as a matching section. A smaller value increases the sensitivity but might lead to more | ||
false-positives. | ||
Root-directories with prior submissions to compare | ||
against. | ||
-r, --result-file=<resultFile> | ||
Name of the file in which the comparison results will | ||
be stored (default: results). Missing .zip endings | ||
will be automatically added. | ||
-t, --min-tokens=<minTokenMatch> | ||
Tunes the comparison sensitivity by adjusting the | ||
minimum token required to be counted as a matching | ||
section. A smaller value increases the sensitivity | ||
but might lead to more false-positives. | ||
Advanced | ||
--csv-export Export pairwise similarity values as a CSV file. | ||
-d, --debug Store on-parsable files in error folder. | ||
-m, --similarity-threshold=<similarityThreshold> | ||
Comparison similarity threshold [0.0-1.0]: All comparisons above this threshold will be saved (default: 0.0). | ||
-p, --suffixes=<suffixes>[,<suffixes>...] | ||
comma-separated list of all filename suffixes that are included. | ||
-P, --port=<port> The port used for the internal report viewer (default: 1996). | ||
-s, --subdirectory=<subdirectory> | ||
-d, --debug Store on-parsable files in error folder. | ||
--log-level=<{ERROR, WARN, INFO, DEBUG, TRACE}> | ||
Set the log level for the cli. | ||
-m, --similarity-threshold=<similarityThreshold> | ||
Comparison similarity threshold [0.0-1.0]: All | ||
comparisons above this threshold will be saved | ||
(default: 0.0). | ||
--overwrite Existing result files will be overwritten. | ||
-p, --suffixes=<suffixes>[,<suffixes>...] | ||
comma-separated list of all filename suffixes that are | ||
included. | ||
-P, --port=<port> The port used for the internal report viewer (default: | ||
1996). | ||
-s, --subdirectory=<subdirectory> | ||
Look in directories <root-dir>/*/<dir> for programs. | ||
-x, --exclusion-file=<exclusionFileName> | ||
All files named in this file will be ignored in the comparison (line-separated list). | ||
-x, --exclusion-file=<exclusionFileName> | ||
All files named in this file will be ignored in the | ||
comparison (line-separated list). | ||
Clustering | ||
--cluster-alg, --cluster-algorithm=<{AGGLOMERATIVE, SPECTRAL}> | ||
Specifies the clustering algorithm (default: spectral). | ||
Specifies the clustering algorithm. Available | ||
algorithms: agglomerative, spectral (default: | ||
spectral). | ||
--cluster-metric=<{AVG, MIN, MAX, INTERSECTION}> | ||
The similarity metric used for clustering (default: average similarity). | ||
The similarity metric used for clustering. Available | ||
metrics: average similarity, minimum similarity, | ||
maximal similarity, matched tokens (default: average | ||
similarity). | ||
--cluster-skip Skips the cluster calculation. | ||
Subsequence Match Merging | ||
--gap-size=<maximumGapSize> | ||
Maximal gap between neighboring matches to be merged (between 1 and minTokenMatch, default: 6). | ||
--match-merging Enables merging of neighboring matches to counteract obfuscation attempts. | ||
Maximal gap between neighboring matches to be merged | ||
(between 1 and minTokenMatch, default: 6). | ||
--match-merging Enables merging of neighboring matches to counteract | ||
obfuscation attempts. | ||
--neighbor-length=<minimumNeighborLength> | ||
Minimal length of neighboring matches to be merged (between 1 and minTokenMatch, default: 2). | ||
Subcommands (supported languages): | ||
Minimal length of neighboring matches to be merged | ||
(between 1 and minTokenMatch, default: 2). | ||
--required-merges=<minimumRequiredMerges> | ||
Minimal required merges for the merging to be applied | ||
(between 1 and 50, default: 6). | ||
Languages: | ||
c | ||
cpp | ||
csharp | ||
|
@@ -141,10 +174,10 @@ Subcommands (supported languages): | |
javascript | ||
kotlin | ||
llvmir | ||
multi | ||
python3 | ||
rlang | ||
rust | ||
scala | ||
scheme | ||
scxml | ||
swift | ||
|
@@ -183,7 +216,7 @@ Please consider our [guidelines for contributions](https://github.com/jplag/JPla | |
|
||
## Contact | ||
If you encounter bugs or other issues, please report them [here](https://github.com/jplag/jplag/issues). | ||
For other purposes, you can contact us at [email protected] . | ||
If you are doing research related to JPlag, we would love to know what you are doing. Feel free to contact us! | ||
For other purposes, you can contact us at [email protected]. | ||
We would love to hear about your research related to JPlag. Feel free to contact us! | ||
|
||
### More information can be found in our [Wiki](https://github.com/jplag/JPlag/wiki)! |
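The CLI options documented in the README diff above can also be assembled programmatically. The following is a rough sketch, assuming a helper named `build_jplag_command` and a JAR file called `jplag.jar` (both hypothetical, not part of JPlag). It only uses the short options listed in the help text (`-l`, `-r`, `-t`, `-bc`) and builds the argument list without running anything.

```python
import shlex


def build_jplag_command(jar, root_dirs, language=None, result_file=None,
                        min_tokens=None, base_code=None):
    """Assemble a JPlag CLI invocation as an argument list.

    `jar` is the path to a JPlag CLI JAR (the name is an assumption here);
    the remaining parameters map to options from the README help text.
    """
    cmd = ["java", "-jar", jar]
    if language is not None:
        cmd += ["-l", language]          # language of the submissions
    if result_file is not None:
        cmd += ["-r", result_file]       # name of the result file
    if min_tokens is not None:
        cmd += ["-t", str(min_tokens)]   # minimum token match
    if base_code is not None:
        cmd += ["-bc", base_code]        # base code directory
    cmd += list(root_dirs)               # root directories with submissions
    return cmd


command = build_jplag_command("jplag.jar", ["submissions"],
                              language="java", result_file="results",
                              min_tokens=9)
# Print a shell-quoted form for inspection; pass `command` to
# subprocess.run(command, check=True) to actually launch JPlag.
print(shlex.join(command))
```

Keeping the command as a list rather than a single string avoids shell quoting problems when submission directories contain spaces.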