Fix errorIfExists keyword not matching in DSLSQLLexer #1908

Open
wants to merge 1 commit into base: master

@hellozepp (Contributor) commented Feb 28, 2023

Issue description

  1. In the Byzer project, the DSLSQL grammar defines both a subrule errorIfExists: 'errorIfExists' and a keyword token ERRORIfExists:'errorIfExists'. Byzer's CaseChangingStream converts all characters to lowercase, so the mixed-case keyword ERRORIfExists cannot be recognized properly in some contexts.
  2. Because both the subrule and the token exist, the longest-match principle and the rule priority order mean that the statement rule save (overwrite | append | errorIfExists | ignore)* might match errorIfExists through the subrule instead of the token in some contexts, leading to incorrect parsing.
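The mismatch in item 1 can be illustrated with a small sketch. This is not Byzer's or ANTLR's code: `lookahead_lower` and `literal_matches` are hypothetical helpers that only mimic a stream that lowercases every character before a grammar literal is compared against it.

```python
def lookahead_lower(text: str) -> str:
    """Mimic a case-changing stream: the lexer only ever sees lowercase characters."""
    return text.lower()


def literal_matches(literal: str, source: str) -> bool:
    """Return True if the grammar literal matches the start of the lowercased input."""
    return lookahead_lower(source).startswith(literal)


statement_tail = "errorIfExists data as json.`/tmp/jack`"

# A mixed-case literal can never equal lowercased input.
print(literal_matches("errorIfExists", statement_tail))  # False

# An all-lowercase literal matches regardless of how the user cased the keyword.
print(literal_matches("errorifexists", statement_tail))  # True
```

Under this assumption, any literal containing an uppercase character is unreachable, which is consistent with the behaviour described above.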

Steps to reproduce

  1. Define a SQL statement that uses the save rule, such as SAVE errorIfExists table_name as.

Proposed solution

To resolve this issue, this PR unifies the usage of the errorIfExists keyword: the token definition becomes ERRORIFEXISTS:'errorifexists', and the lowercase string literal is used consistently throughout the DSLSQL grammar.
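One way to guard against the same problem recurring is to scan the grammar for string literals that contain uppercase characters. The following is a rough sketch only: `unmatchable_literals` is a hypothetical helper, the regular expression is a simplification of ANTLR's literal syntax, and the OVERWRITE line in the sample is assumed for illustration rather than copied from the real grammar file.

```python
import re

# Simplified pattern for single-quoted grammar literals (assumption: no nested quotes).
LITERAL = re.compile(r"'((?:[^'\\]|\\.)*)'")


def unmatchable_literals(grammar_text: str) -> list:
    """Return literals that differ from their lowercase form.

    With a lowercasing character stream, such literals can never match input.
    """
    return [lit for lit in LITERAL.findall(grammar_text) if lit != lit.lower()]


before = """
OVERWRITE:'overwrite';
ERRORIfExists:'errorIfExists';
"""

after = """
OVERWRITE:'overwrite';
ERRORIFEXISTS:'errorifexists';
"""

print(unmatchable_literals(before))  # ['errorIfExists']
print(unmatchable_literals(after))   # []
```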

@hellozepp hellozepp force-pushed the fixErrorIfExistsNotMatch branch 2 times, most recently from 7253712 to eee6ce9 Compare February 28, 2023 15:43
@hellozepp (Contributor, Author) commented

overwrite Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.4,2.9,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.7,3.2,1.3,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
''';

load jsonStr.`jsonStr1` as data;
save overwrite data as json.`/tmp/jack` where fileNum="10";


ignore Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0}

''';

load jsonStr.`jsonStr1` as data;
save ignore data as json.`/tmp/jack` where fileNum="10";
load json.`/tmp/jack` as jackData;
select count(1) from jackData as output;


errorIfExists Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.4,2.9,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[4.7,3.2,1.3,0.2],"label":1.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
{"features":[5.1,3.5,1.4,0.2],"label":0.0}
''';

load jsonStr.`jsonStr1` as data;
save errorIfExists data as json.`/tmp/jack` where fileNum="10";


append Test Case

set jsonStr1='''
{"features":[5.1,3.5,1.4,0.2],"label":0.0}

''';

load jsonStr.`jsonStr1` as data;
save append data as json.`/tmp/jack`;
load json.`/tmp/jack` as jackData;
select count(1) from jackData as output;


@hellozepp hellozepp force-pushed the fixErrorIfExistsNotMatch branch from eee6ce9 to 1acb783 Compare February 28, 2023 16:04