ANTLR grammar based C++ language module #937

Merged
merged 32 commits into from
Mar 14, 2023
Changes from 25 commits
6ee744c
Initial work on new C++ module
SirYwell Dec 6, 2022
215202d
Mark new C++ module as Experimental
SirYwell Dec 6, 2022
170b39a
Current LM state
SirYwell Dec 19, 2022
4297d37
Extract more tokens
SirYwell Dec 26, 2022
e6f1d34
Extract assignments for ++ and --
SirYwell Dec 27, 2022
8102700
Extract assign in initialization contexts
SirYwell Jan 4, 2023
9d9f718
Improve extraction of VARDEF and APPLY
SirYwell Jan 7, 2023
eb9a76f
Extract VARDEF for function parameters
SirYwell Jan 16, 2023
ccde4d4
Fix APPLY extraction
SirYwell Jan 22, 2023
df0234a
Clean up extraction code and tests
SirYwell Jan 22, 2023
415dc3c
Add more APPLY/VARDEF tests and fixes
SirYwell Jan 23, 2023
ceb98c6
Apply spotless
SirYwell Jan 23, 2023
d61af59
Remove unneeded code
SirYwell Jan 23, 2023
5f31cd3
Update comment
SirYwell Jan 23, 2023
950913b
Initial README design based on other language modules
SirYwell Jan 24, 2023
08472c1
Cleanup imports
SirYwell Jan 24, 2023
58cb140
Address review comments
SirYwell Feb 4, 2023
8a88b1d
Apply spotless
SirYwell Feb 4, 2023
9c84d42
Extract union type declarations
SirYwell Feb 5, 2023
b5714fb
Extract braced inits and fix token positions
SirYwell Feb 6, 2023
be976f3
Do not extract block start/end tokens
SirYwell Feb 8, 2023
faf0833
Increase CLI language counter
SirYwell Feb 8, 2023
26f3df9
Clean up no longer needed workaround
SirYwell Feb 9, 2023
fa6fe71
Remove unneeded special case for column selection
SirYwell Feb 12, 2023
c3a56c4
Clean up tests
SirYwell Feb 15, 2023
0e25055
Import de.jplag.Language
SirYwell Feb 15, 2023
a258c59
Add more Javadocs and import token types statically
SirYwell Feb 20, 2023
5d25116
Import ANTLR rule context classes statically
SirYwell Feb 20, 2023
f1e6a47
Rework extraction design to decrease duplicated code
SirYwell Mar 6, 2023
979e4ff
Rename shadowing variable
SirYwell Mar 14, 2023
8c57c80
Extract else at the end of the if block
SirYwell Mar 14, 2023
315e492
Merge nested ifs
SirYwell Mar 14, 2023
5 changes: 5 additions & 0 deletions cli/pom.xml
@@ -47,6 +47,11 @@
             <artifactId>cpp</artifactId>
             <version>${revision}</version>
         </dependency>
+        <dependency>
+            <groupId>de.jplag</groupId>
+            <artifactId>cpp2</artifactId>
+            <version>${revision}</version>
+        </dependency>
         <dependency>
             <groupId>de.jplag</groupId>
             <artifactId>golang</artifactId>
3 changes: 1 addition & 2 deletions cli/src/test/java/de/jplag/cli/LanguageTest.java
@@ -6,7 +6,6 @@
 import java.util.Arrays;
 import java.util.List;
 
-import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.Test;
 
 import de.jplag.Language;
@@ -29,7 +28,7 @@ void testInvalidLanguage() throws Exception {
     @Test
     void testLoading() {
         var languages = LanguageLoader.getAllAvailableLanguages();
-        Assertions.assertEquals(13, languages.size(), "Loaded Languages: " + languages.keySet());
+        assertEquals(14, languages.size(), "Loaded Languages: " + languages.keySet());
     }
 
     @Test
33 changes: 33 additions & 0 deletions languages/cpp2/README.md
@@ -0,0 +1,33 @@
# JPlag C++ language module

**Note**: This language module is meant to replace the existing C++ language module in the future.
While the old language module is based on lexer tokens, this language module uses a parse tree for token extraction.
The base package name of this language module and its identifier are currently `cpp2`, but this might change if the old
language module is replaced.

The JPlag C++ frontend allows the use of JPlag with submissions in C/C++. <br>
It is based on the [C++ ANTLR4 grammar](https://github.com/antlr/grammars-v4/tree/master/cpp), licensed under MIT.

### C++ specification compatibility

The grammar definition targets C++14.

If the grammar is updated to a more recent<a href="#footnote-1"><sup>1</sup></a> syntax definition, this module should be updated as well.

### Token Extraction

The choice of tokens is intended to be similar to the Java language module.
While the Java language module is based on an AST, this language module uses a parse tree only.
There are differences, including:
- `import` is extracted in Java, while `using` is not extracted because it can be placed freely in the code.

More syntactic elements of C/C++ may turn out to be helpful to include in the future, especially those that are newly introduced.
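
As a rough illustration of parse-tree-based extraction, the sketch below walks a toy tree and emits a token whenever it enters a rule of interest, while ignoring `using` nodes. It is not the module's actual code: the `Node` record, the rule names, and the `TokenKind` values are hypothetical stand-ins (the token names only echo those mentioned in the commit messages), and no ANTLR classes are involved.

```java
import java.util.ArrayList;
import java.util.List;

public class ParseTreeTokenSketch {
    // Hypothetical token kinds; the names only echo the commit messages.
    enum TokenKind { VARDEF, ASSIGN, APPLY }

    // Hypothetical, simplified parse-tree node: a rule name plus children.
    record Node(String rule, List<Node> children) {
        static Node of(String rule, Node... children) {
            return new Node(rule, List.of(children));
        }
    }

    // Depth-first walk that emits a token when a rule of interest is entered.
    static List<TokenKind> extract(Node root) {
        List<TokenKind> tokens = new ArrayList<>();
        walk(root, tokens);
        return tokens;
    }

    private static void walk(Node node, List<TokenKind> out) {
        // Rule names are assumptions for this sketch, not the grammar's real rules.
        switch (node.rule()) {
            case "simpleDeclaration" -> out.add(TokenKind.VARDEF);
            case "assignmentExpression" -> out.add(TokenKind.ASSIGN);
            case "callExpression" -> out.add(TokenKind.APPLY);
            case "usingDirective" -> { /* deliberately not extracted */ }
            default -> { /* structural node without a token of its own */ }
        }
        for (Node child : node.children()) {
            walk(child, out);
        }
    }

    public static void main(String[] args) {
        // Roughly corresponds to: using namespace std; int x; x = f();
        Node tree = Node.of("translationUnit",
                Node.of("usingDirective"),
                Node.of("simpleDeclaration"),
                Node.of("assignmentExpression", Node.of("callExpression")));
        System.out.println(extract(tree)); // prints [VARDEF, ASSIGN, APPLY]
    }
}
```

Because the `using` node yields no token, moving it elsewhere in the tree leaves the extracted sequence unchanged, which is the motivation given in the list above.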

### Usage

To use the C++ frontend, pass the `-l cpp2` flag to the CLI, or use a `JPlagOptions` object with `new de.jplag.cpp2.CPPLanguage()` as `Language` in the Java API, as described in the usage information in the [readme of the main project](https://github.com/jplag/JPlag#usage) and [in the wiki](https://github.com/jplag/JPlag/wiki/1.-How-to-Use-JPlag).

<br>

#### Footnotes
<section id="footnote-1"><sup>1 </sup>The grammar files are taken from grammars-v4, with the most recent modifications in <a href="https://github.com/antlr/grammars-v4/tree/fa4aff92b58e40bd337ab9f27217dc3feafbc32e/cpp">commit 6e10f7e</a> from May 2021.</section>
33 changes: 33 additions & 0 deletions languages/cpp2/pom.xml
@@ -0,0 +1,33 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>de.jplag</groupId>
<artifactId>languages</artifactId>
<version>${revision}</version>
</parent>
<artifactId>cpp2</artifactId>

<dependencies>
<dependency>
<groupId>org.antlr</groupId>
<artifactId>antlr4-runtime</artifactId>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.antlr</groupId>
<artifactId>antlr4-maven-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>antlr4</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>