Bug with multiple java pragmas in one line #652

Open
tkilias opened this issue Jun 20, 2022 · 4 comments

@tkilias
Collaborator

tkilias commented Jun 20, 2022

Background

This one works everywhere (any JAR can be used, since we never call its methods):

CREATE OR REPLACE JAVA SCALAR SCRIPT test.java_udf_1(col1 VARCHAR(2000000)) EMITS(col1 VARCHAR(2000000)) AS 
%jar /buckets/bucketfs1/jars/exajdbc.jar;
class JAVA_UDF_1 {
 static void run(ExaMetadata exa, ExaIterator ctx) throws Exception {
 	String host_name = ctx.getString("col1");
 }
}
/
;

and this one, which puts both options and the class on a single line,

CREATE OR REPLACE JAVA SCALAR SCRIPT test.java_udf_3(col1 VARCHAR(2000000)) EMITS(col1 VARCHAR(2000000)) AS %jar /buckets/bucketfs1/jars/exajdbc.jar; %jvmoption -Xms4m; class JAVA_UDF_3 {static void run(ExaMetadata exa, ExaIterator ctx) throws Exception {String host_name = ctx.getString("col1");}}
/
;

fails in 7.1.11 (the compiler complains about the percent sign at the beginning of %jvmoption):

SQL Error [22002]: VM error: F-UDF-CL-LIB-1125: F-UDF-CL-SL-JAVA-1000: F-UDF-CL-SL-JAVA-1037: 
com.exasol.ExaCompilationException: F-UDF-CL-SL-JAVA-1158: /JAVA_UDF_3.java:2: error: class, interface, or enum expected
 %jvmoption -Xms4m; class JAVA_UDF_3 {static void run(ExaMetadata exa, ExaIterator ctx) throws Exception {String host_name = ctx.getString("host_name");}}
 ^
1 error
 (Session: 1735922023625195520)
tkilias added the "bug" (Unwanted / harmful behavior) label Jun 20, 2022
@tkilias
Collaborator Author

tkilias commented Jun 20, 2022

It seems the following example used to work but no longer does. However, it should never have worked in the first place, because of this comment:

CREATE JAVA SCALAR SCRIPT "java_udf4" (col1 VARCHAR(200000)) RETURNS VARCHAR(200000) AS
%jar /buckets/bfsdefault/jars/jar1.jar;  %jar /buckets/bfsdefault/jars/jar2.jar;  %jvmoption -Xms4m;  import test.test;  import ...

@tomuben
Collaborator

tomuben commented Jun 21, 2022

RCA

This change altered the order in which the options are parsed. If the options are on one line and not in the exact parse order, the parser does not remove the options before forwarding the code to the Java compiler, which causes the observed issue.
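
To illustrate the failure mode described above, here is a toy model in Java. It is not the actual script-languages parser; the class name, the fixed `PARSE_ORDER`, and the "only at the start of a line" rule are all assumptions made purely for illustration. It shows how one pass per option, in a fixed order, can leave an option behind when several options share a line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT the real parser. All names and rules here are assumptions.
class OrderDependentStripper {
    // Hypothetical fixed order in which the option names are processed.
    static final List<String> PARSE_ORDER = Arrays.asList("%jvmoption", "%jar");

    // One pass per option name; an option is recognised only at the start of a line.
    static String strip(String script) {
        String result = script;
        for (String key : PARSE_ORDER) {
            List<String> lines = new ArrayList<>();
            for (String line : result.split("\n", -1)) {
                String trimmed = line.trim();
                int end = trimmed.indexOf(';');
                if (trimmed.startsWith(key) && end >= 0) {
                    // Drop the option up to and including its terminating ';'.
                    line = trimmed.substring(end + 1);
                }
                lines.add(line);
            }
            result = String.join("\n", lines);
        }
        return result;
    }
}
```

In this model, `%jar a.jar; %jvmoption -Xms4m; class X {}` still contains `%jvmoption` after stripping, because the `%jvmoption` pass ran while `%jar` was still at the start of the line. With the two options swapped, or with one option per line, both are removed.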

Next steps

The plan is to make the parser more robust and independent of the order of the language-specific options.

Rules for the new parser implementation

- An option starts with % and a name, is followed by its values, and ends with a terminating character (e.g. ';').
- Options must be placed at the beginning of the code; only empty lines or whitespace are allowed in front of them.
- Multiple options may follow one another, optionally with whitespace or line breaks in between:
	- %...;   \n   %...;%...;
- Language-specific code comes **after** the options. This includes comments.
- Limitations
	- The character ';' must not be part of a value if ';' is the terminating character.
	- Comments must not be located in front of or between options.

Implementation details

Implement the new parser as a simple state machine (likely character-based) that checks for the pattern above.
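
A minimal sketch of such a character-based state machine, written in Java for illustration only. The real implementation may look quite different and live in another language; `OptionParser`, `Option`, `Result`, and the three state names are hypothetical. The sketch collects options in whatever order they appear and treats the first non-whitespace character that is not `%` as the start of the language code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: all class, field, and state names are hypothetical.
class OptionParser {
    static final char TERMINATOR = ';';

    static final class Option {
        final String name;
        final String value;
        Option(String name, String value) { this.name = name; this.value = value; }
    }

    static final class Result {
        final List<Option> options = new ArrayList<>();
        String code = "";
    }

    private enum State { BETWEEN_OPTIONS, NAME, VALUE }

    static Result parse(String script) {
        Result result = new Result();
        State state = State.BETWEEN_OPTIONS;
        StringBuilder name = new StringBuilder();
        StringBuilder value = new StringBuilder();

        for (int i = 0; i < script.length(); i++) {
            char c = script.charAt(i);
            switch (state) {
                case BETWEEN_OPTIONS:
                    if (Character.isWhitespace(c)) {
                        break; // skip blanks and line breaks between options
                    }
                    if (c != '%') {
                        // First non-option character: the rest is language code.
                        result.code = script.substring(i);
                        return result;
                    }
                    state = State.NAME;
                    break;
                case NAME:
                    if (c == TERMINATOR) {
                        emit(result, name, value);
                        state = State.BETWEEN_OPTIONS;
                    } else if (Character.isWhitespace(c)) {
                        state = State.VALUE;
                    } else {
                        name.append(c);
                    }
                    break;
                case VALUE:
                    if (c == TERMINATOR) {
                        emit(result, name, value);
                        state = State.BETWEEN_OPTIONS;
                    } else {
                        value.append(c);
                    }
                    break;
            }
        }
        if (state != State.BETWEEN_OPTIONS) {
            throw new IllegalArgumentException("unterminated option: %" + name);
        }
        return result;
    }

    // Stores the finished option and resets both buffers.
    private static void emit(Result result, StringBuilder name, StringBuilder value) {
        if (name.length() == 0) {
            throw new IllegalArgumentException("option without a name");
        }
        result.options.add(new Option(name.toString(), value.toString().trim()));
        name.setLength(0);
        value.setLength(0);
    }
}
```

For the failing single-line script from the report, `OptionParser.parse` would yield the options `jar` and `jvmoption` and hand only the part starting at `class JAVA_UDF_3` to the compiler, regardless of the order of the two options. A comment placed in front of the options makes the sketch treat everything as code, which matches the limitation listed above.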

tkilias added the "blocked:yes" (Currently blocked by another ticket) label Jul 29, 2022
@tkilias
Collaborator Author

tkilias commented Jul 29, 2022

Blocked because this also requires changes in the DB.

@ckunki
Contributor

ckunki commented Oct 5, 2022

The currently accepted solution is only a workaround.
In general a more robust strategy for parsing UDF options is still required.
As some UDF options are evaluated in DB core and others in the script-languages-release, changes to both of these components need to be coordinated.
The current ticket is marked as "blocked" to signal its dependency on changes in the DB.
