Skip to content

Commit

Permalink
Refined String-escaping behaviour (#22)
Browse files Browse the repository at this point in the history
## What is the goal of this PR?

During the recent refactor of Graql grammar, we accidentally the lexer for `STRING_` to be too strict: we disallowed `/` character. The reason for this was that the unescape-and-escape behaviour over `/` was not symmetrical. `unescape('/') -> '/'` and `escape('/') -> '\/'`, which means the user reads back characters that they did not write, and that's not acceptable. 

However, we need to bring back `/` characters in strings (issue #21) because it is widely used.

During the investigation, we discovered that there is no need for Graql to be escaping/unescaping strings that are provided by the user in the first place. Given that the String that is parsed by ANTLR is already a valid String, we can store it "as is" (with the exception for Regular Expression, that still requires a special escape operation). I.e. what the user writes, is what the user reads, for any Unicode character accepted by ANTLR.

Fixing the above also resulted in fixing our tests to expect the correct behaviour from Graql.

## What are the changes implemented in this PR?

1) Relaxed Graql grammar to accept `/` (fixes #21)
2) Remove `escapeString()` and `unescapeString()` from `StringUtil`, and all it's usage.
3) Introduced `escapeRegex()` and `escapeRegex()` in `StringUtil`, and used them in parsing and printing regex.
4) Fixed tests to comply with the newly refined behaviour of Graql string and regex parsing/printing.
5) Additionally, we reimplement `StringUtils.repeat()` with a simple `spaces(int len)` in `SyntaxError`, so we can remove the dependency to `org.apache.commons.lang`
  • Loading branch information
haikalpribadi authored Feb 27, 2019
1 parent 2dbf9b5 commit 8b21baf
Show file tree
Hide file tree
Showing 14 changed files with 80 additions and 61 deletions.
11 changes: 0 additions & 11 deletions dependencies/maven/artifacts/commons-lang/BUILD

This file was deleted.

1 change: 0 additions & 1 deletion dependencies/maven/dependencies.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,6 @@ def list_dependencies():
{"artifact": "com.google.errorprone:error_prone_annotations:2.0.18", "lang": "java", "sha1": "5f65affce1684999e2f4024983835efc3504012e", "sha256": "cb4cfad870bf563a07199f3ebea5763f0dec440fcda0b318640b1feaa788656b", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/com/google/errorprone/error_prone_annotations/2.0.18/error_prone_annotations-2.0.18.jar", "source": {"sha1": "220c1232fa6d13b427c10ccc1243a87f5f501d31", "sha256": "dbe7b49dd0584704d5c306b4ac7273556353ea3c0c6c3e50adeeca8df47047be", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/com/google/errorprone/error_prone_annotations/2.0.18/error_prone_annotations-2.0.18-sources.jar"} , "name": "com-google-errorprone-error_prone_annotations", "actual": "@com-google-errorprone-error_prone_annotations//jar", "bind": "jar/com/google/errorprone/error-prone-annotations"},
{"artifact": "com.google.guava:guava:23.0", "lang": "java", "sha1": "c947004bb13d18182be60077ade044099e4f26f1", "sha256": "7baa80df284117e5b945b19b98d367a85ea7b7801bd358ff657946c3bd1b6596", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/com/google/guava/guava/23.0/guava-23.0.jar", "source": {"sha1": "ed233607c5c11e1a13a3fd760033ed5d9fe525c2", "sha256": "37fe8ba804fb3898c3c8f0cbac319cc9daa58400e5f0226a380ac94fb2c3ca14", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/com/google/guava/guava/23.0/guava-23.0-sources.jar"} , "name": "com-google-guava-guava", "actual": "@com-google-guava-guava//jar", "bind": "jar/com/google/guava/guava"},
{"artifact": "com.google.j2objc:j2objc-annotations:1.1", "lang": "java", "sha1": "ed28ded51a8b1c6b112568def5f4b455e6809019", "sha256": "2994a7eb78f2710bd3d3bfb639b2c94e219cedac0d4d084d516e78c16dddecf6", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1.jar", "source": {"sha1": "1efdf5b737b02f9b72ebdec4f72c37ec411302ff", "sha256": "2cd9022a77151d0b574887635cdfcdf3b78155b602abc89d7f8e62aba55cfb4f", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1-sources.jar"} , "name": "com-google-j2objc-j2objc-annotations", "actual": "@com-google-j2objc-j2objc-annotations//jar", "bind": "jar/com/google/j2objc/j2objc-annotations"},
{"artifact": "commons-lang:commons-lang:2.6", "lang": "java", "sha1": "0ce1edb914c94ebc388f086c6827e8bdeec71ac2", "sha256": "50f11b09f877c294d56f24463f47d28f929cf5044f648661c0f0cfbae9a2f49c", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.6/commons-lang-2.6.jar", "source": {"sha1": "67313d715fbf0ea4fd0bdb69217fb77f807a8ce5", "sha256": "66c2760945cec226f26286ddf3f6ffe38544c4a69aade89700a9a689c9b92380", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.6/commons-lang-2.6-sources.jar"} , "name": "commons-lang-commons-lang", "actual": "@commons-lang-commons-lang//jar", "bind": "jar/commons-lang/commons-lang"},
{"artifact": "org.antlr:antlr4-runtime:4.7.1", "lang": "java", "sha1": "946f8aa9daa917dd81a8b818111bec7e288f821a", "sha256": "43516d19beae35909e04d06af6c0c58c17bc94e0070c85e8dc9929ca640dc91d", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/org/antlr/antlr4-runtime/4.7.1/antlr4-runtime-4.7.1.jar", "source": {"sha1": "1e68e18aa14f3229b95820d354a594846134af38", "sha256": "a33d52d0d64e68c60d5e3ae2c1098fe7200d57cff59032c19930fd9d487fc7d4", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/org/antlr/antlr4-runtime/4.7.1/antlr4-runtime-4.7.1-sources.jar"} , "name": "org-antlr-antlr4-runtime", "actual": "@org-antlr-antlr4-runtime//jar", "bind": "jar/org/antlr/antlr4-runtime"},
{"artifact": "org.codehaus.mojo:animal-sniffer-annotations:1.14", "lang": "java", "sha1": "775b7e22fb10026eed3f86e8dc556dfafe35f2d5", "sha256": "2068320bd6bad744c3673ab048f67e30bef8f518996fa380033556600669905d", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/org/codehaus/mojo/animal-sniffer-annotations/1.14/animal-sniffer-annotations-1.14.jar", "source": {"sha1": "886474da3f761d39fcbb723d97ecc5089e731f42", "sha256": "d821ae1f706db2c1b9c88d4b7b0746b01039dac63762745ef3fe5579967dd16b", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/org/codehaus/mojo/animal-sniffer-annotations/1.14/animal-sniffer-annotations-1.14-sources.jar"} , "name": "org-codehaus-mojo-animal-sniffer-annotations", "actual": "@org-codehaus-mojo-animal-sniffer-annotations//jar", "bind": "jar/org/codehaus/mojo/animal-sniffer-annotations"},
{"artifact": "org.hamcrest:hamcrest-core:1.3", "lang": "java", "sha1": "42a25dc3219429f0e5d060061f71acb49bf010a0", "sha256": "66fdef91e9739348df7a096aa384a5685f4e875584cce89386a7a47251c4d8e9", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar", "source": {"sha1": "1dc37250fbc78e23a65a67fbbaf71d2e9cbc3c0b", "sha256": "e223d2d8fbafd66057a8848cc94222d63c3cedd652cc48eddc0ab5c39c0f84df", "repository": "https://repo.maven.apache.org/maven2/", "url": "https://repo.maven.apache.org/maven2/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3-sources.jar"} , "name": "org-hamcrest-hamcrest-core", "actual": "@org-hamcrest-hamcrest-core//jar", "bind": "jar/org/hamcrest/hamcrest-core"},
Expand Down
5 changes: 0 additions & 5 deletions dependencies/maven/dependencies.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -43,11 +43,6 @@ dependencies:
version: "23.0"
lang: java

commons-lang:
commons-lang:
version: "2.6"
lang: java

org.antlr:
antlr4-runtime:
version: "4.7.1" # sync version with @antlr4_runtime//jar
Expand Down
4 changes: 2 additions & 2 deletions grammar/Graql.g4
Original file line number Diff line number Diff line change
Expand Up @@ -302,8 +302,8 @@ DATE : 'date' ;
BOOLEAN_ : TRUE | FALSE ; // order of lexer declaration matters
TRUE : 'true' ;
FALSE : 'false' ;
STRING_ : '"' (~["\\/] | ESCAPE_SEQ_ )* '"'
| '\'' (~['\\/] | ESCAPE_SEQ_ )* '\'' ;
STRING_ : '"' (~["\\] | ESCAPE_SEQ_ )* '"'
| '\'' (~['\\] | ESCAPE_SEQ_ )* '\'' ;
INTEGER_ : ('+' | '-')? [0-9]+ ;
REAL_ : ('+' | '-')? [0-9]+ '.' [0-9]+ ;
DATE_ : DATE_FRAGMENT_ ;
Expand Down
1 change: 0 additions & 1 deletion java/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,6 @@ java_library(
srcs = ["//grammar:graql-java"] + glob(["**/*.java"], exclude = ["**/test/**"]),
deps = [
# External dependencies
"//dependencies/maven/artifacts/commons-lang:commons-lang",
"//dependencies/maven/artifacts/com/google/guava:guava",
"//dependencies/maven/artifacts/com/google/code/findbugs:jsr305",
"//dependencies/maven/artifacts/org/antlr:antlr4-runtime", # sync version with @antlr4_runtime//jar
Expand Down
8 changes: 3 additions & 5 deletions java/parser/Parser.java
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@
import static graql.lang.Graql.not;
import static graql.lang.Graql.type;
import static graql.lang.util.Collections.triple;
import static graql.lang.util.StringUtil.unescapeRegex;
import static java.util.stream.Collectors.toList;

/**
Expand Down Expand Up @@ -973,9 +974,7 @@ public ValueProperty.Operation<?> visitComparison(GraqlParser.ComparisonContext

@Override
public String visitRegex(GraqlParser.RegexContext ctx) {
// Remove surrounding /.../
String unquoted = unquoteString(ctx.STRING_());
return unquoted.replaceAll("\\\\/", "/");
return unescapeRegex(unquoteString(ctx.STRING_()));
}

@Override
Expand Down Expand Up @@ -1022,8 +1021,7 @@ public Object visitLiteral(GraqlParser.LiteralContext ctx) {

private String getString(TerminalNode string) {
// Remove surrounding quotes
String unquoted = unquoteString(string);
return StringUtil.unescapeString(unquoted);
return unquoteString(string);
}

private String unquoteString(TerminalNode string) {
Expand Down
12 changes: 10 additions & 2 deletions java/parser/SyntaxError.java
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@
package graql.lang.parser;

import graql.lang.exception.ErrorMessage;
import org.apache.commons.lang.StringUtils;

public class SyntaxError {

Expand All @@ -38,6 +37,15 @@ public SyntaxError(String queryLine, int line, int charPositionInLine, String ms
this.msg = msg;
}

private String spaces(int len) {
char ch = ' ';
char[] output = new char[len];
for (int i = len - 1; i >= 0; i--) {
output[i] = ch;
}
return new String(output);
}

@Override
public String toString() {
if (queryLine == null) {
Expand All @@ -49,7 +57,7 @@ public String toString() {
// match $
// ^
// blah blah antlr blah
String pointer = StringUtils.repeat(" ", charPositionInLine) + "^";
String pointer = spaces(charPositionInLine) + "^";
return ErrorMessage.SYNTAX_ERROR.getMessage(line, queryLine, pointer, msg);
}
}
Expand Down
51 changes: 42 additions & 9 deletions java/parser/test/ParserTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,15 @@ public void testSimpleQuery() {
assertQueryEquals(expected, parsed, query);
}

@Test
public void testParseStringWithSlash() {
String query = "match $x isa person, has name 'alice/bob'; get;";
GraqlGet parsed = Graql.parse(query).asGet();
GraqlGet expected = match(var("x").isa("person").has("name", "alice/bob")).get();

assertQueryEquals(expected, parsed, query.replace("'", "\""));
}

@Test
public void testRelationQuery() {
String query = "match\n" +
Expand Down Expand Up @@ -664,12 +673,13 @@ public void testDefineDataTypeQuery() {

@Test
public void testEscapeString() {
String unescaped = "This has \"double quotes\" and a single-quoted backslash: '\\'";
String escaped = "This has \\\"double quotes\\\" and a single-quoted backslash: \\'\\\\\\'";
// ANTLR will see this as a string that looks like:
// "This has \"double quotes\" and a single-quoted backslash: '\\'"
String input = "This has \\\"double quotes\\\" and a single-quoted backslash: \\'\\\\\\'";

String query = "insert $_ isa movie, has title \"" + escaped + "\";";
String query = "insert $_ isa movie, has title \"" + input + "\";";
GraqlInsert parsed = Graql.parse(query).asInsert();
GraqlInsert expected = insert(var().isa("movie").has("title", unescaped));
GraqlInsert expected = insert(var().isa("movie").has("title", input));

assertQueryEquals(expected, parsed, query);
}
Expand Down Expand Up @@ -1096,29 +1106,52 @@ public void whenParsingAggregateWithWrongName_Throw() {
parse("match $x isa name; get; hello $x;");
}

@Test
public void regexAttributeProperty() {
String query = "define digit sub attribute, regex '\\d';";
GraqlDefine parsed = parse(query);
GraqlDefine expected = define(type("digit").sub("attribute").regex("\\d"));
assertQueryEquals(expected, parsed, query.replace("'", "\""));
}

@Test
public void regexPredicateParsesCharacterClassesCorrectly() {
assertEquals(match(var("x").like("\\d")).get(), parse("match $x like '\\d'; get;"));
String query = "match $x like '\\d'; get;";
GraqlGet parsed = parse(query);
GraqlGet expected = match(var("x").like("\\d")).get();
assertQueryEquals(expected, parsed, query.replace("'", "\""));
}

@Test
public void regexPredicateParsesQuotesCorrectly() {
assertEquals(match(var("x").like("\"")).get(), parse("match $x like '\"'; get;"));
String query = "match $x like '\\\"'; get;";
GraqlGet parsed = parse(query);
GraqlGet expected = match(var("x").like("\\\"")).get();
assertQueryEquals(expected, parsed, query.replace("'", "\""));
}

@Test
public void regexPredicateParsesBackslashesCorrectly() {
assertEquals(match(var("x").like("\\\\")).get(), parse("match $x like '\\\\'; get;"));
String query = "match $x like '\\\\'; get;";
GraqlGet parsed = parse(query);
GraqlGet expected = match(var("x").like("\\\\")).get();
assertQueryEquals(expected, parsed, query.replace("'", "\""));
}

@Test
public void regexPredicateParsesNewlineCorrectly() {
assertEquals(match(var("x").like("\\n")).get(), parse("match $x like '\\n'; get;"));
String query = "match $x like '\\n'; get;";
GraqlGet parsed = parse(query);
GraqlGet expected = match(var("x").like("\\n")).get();
assertQueryEquals(expected, parsed, query.replace("'", "\""));
}

@Test
public void regexPredicateParsesForwardSlashesCorrectly() {
assertEquals(match(var("x").like("/")).get(), parse("match $x like '\\/'; get;"));
String query = "match $x like '\\/'; get;";
GraqlGet parsed = parse(query);
GraqlGet expected = match(var("x").like("/")).get();
assertQueryEquals(expected, parsed, query.replace("'", "\""));
}

@Test
Expand Down
4 changes: 2 additions & 2 deletions java/pattern/Disjunction.java
Original file line number Diff line number Diff line change
Expand Up @@ -18,11 +18,11 @@

package graql.lang.pattern;

import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;
import graql.lang.Graql;
import graql.lang.statement.Statement;
import graql.lang.statement.Variable;
import graql.lang.util.Collections;

import javax.annotation.CheckReturnValue;
import java.util.Iterator;
Expand Down Expand Up @@ -78,7 +78,7 @@ public Disjunction<Conjunction<Pattern>> getNegationDNF() {

@Override
public Set<Variable> variables() {
return getPatterns().stream().map(Pattern::variables).reduce(Sets::intersection).orElse(ImmutableSet.of());
return getPatterns().stream().map(Pattern::variables).reduce(Sets::intersection).orElse(Collections.set());
}

@Override
Expand Down
3 changes: 3 additions & 0 deletions java/pattern/Pattern.java
Original file line number Diff line number Diff line change
Expand Up @@ -79,4 +79,7 @@ default Set<Statement> statements() {
*/
@CheckReturnValue
default Negation<?> asNegation(){ throw new UnsupportedOperationException(); }

@Override
String toString();
}
6 changes: 4 additions & 2 deletions java/property/RegexProperty.java
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,9 @@

import graql.lang.Graql;
import graql.lang.statement.StatementType;
import graql.lang.util.StringUtil;

import static graql.lang.util.StringUtil.escapeRegex;
import static graql.lang.util.StringUtil.quoteString;

/**
* Represents the {@code regex} property on a AttributeType. This property can be queried and inserted.
Expand Down Expand Up @@ -49,7 +51,7 @@ public String keyword() {

@Override
public String property() {
return "\"" + StringUtil.escapeString(regex()) + "\"";
return quoteString(escapeRegex(regex()));
}

@Override
Expand Down
7 changes: 5 additions & 2 deletions java/property/ValueProperty.java
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,9 @@
import java.time.LocalDateTime;
import java.util.stream.Stream;

import static graql.lang.util.StringUtil.escapeRegex;
import static graql.lang.util.StringUtil.quoteString;

/**
* Represents the {@code value} property on an attribute.
* This property can be queried or inserted.
Expand Down Expand Up @@ -243,9 +246,9 @@ public java.lang.String toString() {

operation.append(comparator()).append(Graql.Token.Char.SPACE);
if (comparator().equals(Graql.Token.Comparator.LIKE)) {
operation.append("\"").append(value().replaceAll("/", "\\\\/")).append("\"");
operation.append(quoteString(escapeRegex(value())));
} else {
operation.append(StringUtil.quoteString(value()));
operation.append(quoteString(value()));
}

return operation.toString();
Expand Down
9 changes: 4 additions & 5 deletions java/query/test/GraqlQueryTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -116,15 +116,14 @@ public void testInsertQueryToString() {

@Test
public void testEscapeStrings() {
assertEquals("insert $x \"hello\\nworld\";", Graql.insert(var("x").val("hello\nworld")).toString());
assertEquals("insert $x \"hello\nworld\";", Graql.insert(var("x").val("hello\nworld")).toString());
assertEquals("insert $x \"hello\\nworld\";", Graql.insert(var("x").val("hello\\nworld")).toString());
}

@Test
public void testQuoteIds() {
assertEquals(
"match $a (\"hello\\tworld\");",
match(var("a").rel(type("hello\tworld"))).toString()
);
assertEquals("match $a (\"hello\tworld\");", match(var("a").rel(type("hello\tworld"))).toString());
assertEquals("match $a (\"hello\\tworld\");", match(var("a").rel(type("hello\\tworld"))).toString());
}

@Test
Expand Down
19 changes: 5 additions & 14 deletions java/util/StringUtil.java
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,6 @@

import graql.grammar.GraqlLexer;
import graql.lang.Graql;
import org.apache.commons.lang.StringEscapeUtils;

import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
Expand All @@ -39,28 +38,20 @@ public class StringUtil {
Arrays.asList("min", "max", "median", "mean", "std", "sum", "count", "path", "cluster", "degrees", "members", "persist")
);

/**
* @param string the string to unescape
* @return the unescaped string, replacing any backslash escapes with the real characters
*/
public static String unescapeString(String string) {
return StringEscapeUtils.unescapeJavaScript(string);
public static String unescapeRegex(String regex) {
return regex.replaceAll("\\\\/", "/");
}

/**
* @param string the string to escape
* @return the escaped string, replacing any escapable characters with backslashes
*/
public static String escapeString(String string) {
return StringEscapeUtils.escapeJavaScript(string);
public static String escapeRegex(String regex) {
return regex.replaceAll("/", "\\\\/");
}

/**
* @param string a string to quote and escape
* @return a string, surrounded with double quotes and escaped
*/
public static String quoteString(String string) {
return "\"" + escapeString(string) + "\"";
return "\"" + string + "\"";
}

public static String escapeLabelOrId(String value) {
Expand Down

0 comments on commit 8b21baf

Please sign in to comment.