Release v0.5.0 (#37)
* Update ZetaSQL to version 2023.10.1

* Rewrite queries to fully quote all name paths before analysis

* Avoid no longer necessary nesting of tables in catalogs

* Update query rewriting to only re-quote name paths that refer to resources (e.g. tables and functions)

* Avoid no longer necessary nesting of functions in catalogs

* Avoid no longer necessary nesting of TVFs in catalogs

* Remove unnecessary slf4j dependency

Closes #24

* Reduce code duplication in BigQueryCatalog

* Reduce the amount of nesting for procedures in catalogs

* Avoid duplicate code when creating different types of resources in catalogs

* Use the DOUBLE type kind for BigQuery FLOAT64 columns

Closes #25

* Added extractColumnLevelLineage for ResolvedQueryStmt
Closes #28

* Changed the access modifier of ParentColumnFinder to public
Closes #28

* Make ColumnLineageExtractor accept only concrete statement types

Previously, the API for ColumnLineageExtractor exposed the method
::extractColumnLevelLineage(ResolvedStatement). ResolvedQueryStmts are
now supported, but they need an output table to be specified
separately. Keeping the generic ResolvedStatement API would have made
it confusing, since it would need to optionally accept an output
table.

Teams building lineage applications now need to explicitly determine
the statements they support and call the corresponding
::extractColumnLevelLineage() method, such as
::extractColumnLevelLineage(ResolvedInsertStmt) or
::extractColumnLevelLineage(ResolvedQueryStmt, String).
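
The shape of that API change can be sketched as follows. This is a minimal illustration, not the toolkit's code: `Stmt`, `InsertStmt`, `QueryStmt` and `LineageSketch` are hypothetical stand-ins for ZetaSQL's resolved statement classes, and the method bodies only qualify output column names instead of computing real lineage.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-ins for ZetaSQL's resolved statement classes.
interface Stmt {}

final class InsertStmt implements Stmt {
  final String targetTable;
  final List<String> columns;

  InsertStmt(String targetTable, List<String> columns) {
    this.targetTable = targetTable;
    this.columns = columns;
  }
}

final class QueryStmt implements Stmt {
  final List<String> outputColumns;

  QueryStmt(List<String> outputColumns) {
    this.outputColumns = outputColumns;
  }
}

final class LineageSketch {
  // Placeholder for real lineage extraction: prefixes each column with its table.
  private static List<String> qualify(String table, List<String> columns) {
    return columns.stream().map(c -> table + "." + c).collect(Collectors.toList());
  }

  // An INSERT names its own target table, so no extra argument is needed.
  static List<String> extractColumnLevelLineage(InsertStmt stmt) {
    return qualify(stmt.targetTable, stmt.columns);
  }

  // A bare query has no target table, so the caller must supply one.
  static List<String> extractColumnLevelLineage(QueryStmt stmt, String outputTable) {
    return qualify(outputTable, stmt.outputColumns);
  }

  // Callers decide which statement types they support and dispatch explicitly.
  static List<String> lineageFor(Stmt stmt, String outputTableForQueries) {
    if (stmt instanceof InsertStmt) {
      return extractColumnLevelLineage((InsertStmt) stmt);
    }
    if (stmt instanceof QueryStmt) {
      return extractColumnLevelLineage((QueryStmt) stmt, outputTableForQueries);
    }
    throw new IllegalArgumentException("Unsupported statement: " + stmt.getClass());
  }
}
```

With separate overloads, the output table parameter only appears where it is meaningful, and unsupported statement types fail in the caller's own dispatch code.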

* Bump development version to 0.5.0-SNAPSHOT

* vuln-fix: Use HTTPS instead of HTTP to resolve deps CVE-2021-26291 (#30)

This fixes a security vulnerability in this project where the `pom.xml`
files were configuring Maven to resolve dependencies over HTTP instead of
HTTPS.

Weakness: CWE-829: Inclusion of Functionality from Untrusted Control Sphere
Severity: High
CVSS: 8.1
Detection: CodeQL (https://codeql.github.com/codeql-query-help/java/java-maven-non-https-url/) & OpenRewrite (https://app.moderne.io/recipes/org.openrewrite.maven.security.UseHttpsForRepositories)

Reported-by: Jonathan Leitschuh <[email protected]>

Bug-tracker: JLLeitschuh/security-research#8

Use this link to re-run the recipe: https://app.moderne.io/recipes/org.openrewrite.maven.security.UseHttpsForRepositories?organizationId=R29vZ2xl

Co-authored-by: Moderne <[email protected]>

* Make the type parser case insensitive (#35)

The type parser was previously case sensitive, while SQL types are case insensitive. This went unnoticed for a while since types are almost always written in upper case, but it is fundamentally incorrect.

Fixes #32
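
A minimal sketch of case-insensitive type name matching, assuming a tiny hypothetical `Kind` enum rather than the toolkit's actual parser, which also has to handle parameterized and nested types:

```java
import java.util.Locale;

final class TypeNameSketch {
  // A deliberately small, hypothetical set of type kinds.
  enum Kind { INT64, STRING, FLOAT64, BOOL }

  // Normalizes case with a fixed locale before matching, so "int64",
  // "Int64" and "INT64" all resolve to the same kind.
  static Kind parse(String typeName) {
    String normalized = typeName.trim().toUpperCase(Locale.ROOT);
    try {
      return Kind.valueOf(normalized);
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException("Unknown type: " + typeName, e);
    }
  }
}
```

`Locale.ROOT` is used so that the result does not depend on the JVM's default locale (for example, the Turkish dotless i).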

* Add reflection-based patching of gRPC's default max nesting depth (#36)

ZetaSQL's Java API uses a gRPC service to call into the actual C++ implementation of ZetaSQL. By default, the serialization logic of that communication allows for a nesting depth in protobuf messages of up to 100. However, long queries can exceed that level of nesting and, as a result, cannot be analyzed by default.

This implements a reflection-based patch that allows users to raise
that limit. This is brittle by design and should
be used with caution.

Fixes #31
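
The general technique can be sketched as below. `FakeCodec` and its `defaultRecursionLimit` field are hypothetical stand-ins, not the real gRPC or protobuf internals, and `NestingDepthPatchSketch` is not the toolkit's patcher; the sketch only shows how a private static limit can be overwritten through reflection and why that is fragile.

```java
import java.lang.reflect.Field;

// Hypothetical library class that keeps its limit in a private static field.
final class FakeCodec {
  private static int defaultRecursionLimit = 100;

  static int currentLimit() {
    return defaultRecursionLimit;
  }
}

final class NestingDepthPatchSketch {
  // Overwrites the private static field by name. This breaks if the field is
  // renamed, made final, or hidden behind stronger module encapsulation.
  static void patchLimit(int newLimit) {
    try {
      Field field = FakeCodec.class.getDeclaredField("defaultRecursionLimit");
      field.setAccessible(true);
      field.setInt(null, newLimit); // null target because the field is static
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not patch the nesting limit", e);
    }
  }
}
```

Because the field is looked up by its string name, a library upgrade can silently invalidate the patch, which is why failures are surfaced as an exception instead of being ignored.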

* Upgrade to zetasql-2024-03-01 and bump deps (#33)

* Upgrade to zetasql-2024-03-01 and bump deps

* Enable all features

* Rollback Mockito to version 4.11.0

* Remove some v1.4 language options not supported by BigQuery

---------

Co-authored-by: Pablo Paglilla <[email protected]>

* Add missing @return in catalog Javadocs

* Update version to v0.5.0

---------

Co-authored-by: Dion Ricky Saputra <[email protected]>
Co-authored-by: Jonathan Leitschuh <[email protected]>
Co-authored-by: Moderne <[email protected]>
Co-authored-by: Erlend Hamnaberg <[email protected]>
5 people committed May 6, 2024
1 parent 6df74bc commit c8c24d9
Showing 28 changed files with 1,606 additions and 1,021 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -30,7 +30,7 @@ When analyzing queries using BigQuery semantics, you need to:
 <dependency>
 <groupId>com.google.zetasql.toolkit</groupId>
 <artifactId>zetasql-toolkit-bigquery</artifactId>
-<version>0.4.1</version>
+<version>0.5.0</version>
 </dependency>
 ```

@@ -113,7 +113,7 @@ Similarly, when analyzing queries using Spanner semantics, you need to:
 <dependency>
 <groupId>com.google.zetasql.toolkit</groupId>
 <artifactId>zetasql-toolkit-spanner</artifactId>
-<version>0.4.1</version>
+<version>0.5.0</version>
 </dependency>
 ```

21 changes: 5 additions & 16 deletions pom.xml
@@ -21,7 +21,7 @@

 <groupId>com.google.zetasql.toolkit</groupId>
 <artifactId>zetasql-toolkit</artifactId>
-<version>0.4.1</version>
+<version>0.5.0</version>
 <packaging>pom</packaging>

 <name>${project.groupId}:${project.artifactId}</name>
@@ -63,11 +63,10 @@
 <maven.deploy.skip>false</maven.deploy.skip>
 <maven.test.skip>true</maven.test.skip>
 <!-- Dependency versions -->
-<zetasql.version>2023.04.1</zetasql.version>
-<google.cloud.libraries.version>26.15.0</google.cloud.libraries.version>
-<slf4j.version>1.7.25</slf4j.version>
+<zetasql.version>2024.03.1</zetasql.version>
+<google.cloud.libraries.version>26.37.0</google.cloud.libraries.version>
 <!-- Testing dependency versions -->
-<junit.version>5.9.3</junit.version>
+<junit.version>5.10.2</junit.version>
 <mockito.version>4.11.0</mockito.version>
 <!-- Plugin versions -->
 <maven.source.plugin.version>3.3.0</maven.source.plugin.version>
@@ -114,16 +113,6 @@
 <artifactId>zetasql-jni-channel</artifactId>
 <version>${zetasql.version}</version>
 </dependency>
-<dependency>
-<groupId>org.slf4j</groupId>
-<artifactId>slf4j-api</artifactId>
-<version>${slf4j.version}</version>
-</dependency>
-<dependency>
-<groupId>org.slf4j</groupId>
-<artifactId>slf4j-simple</artifactId>
-<version>${slf4j.version}</version>
-</dependency>
 <dependency>
 <groupId>org.mockito</groupId>
 <artifactId>mockito-core</artifactId>
@@ -205,7 +194,7 @@
 <snapshotRepository>
 <id>sonatype-nexus-snapshots</id>
 <name>Sonatype Nexus Snapshots</name>
-<url>http://oss.sonatype.org/content/repositories/snapshots</url>
+<url>https://oss.sonatype.org/content/repositories/snapshots</url>
 </snapshotRepository>
 </distributionManagement>

4 changes: 2 additions & 2 deletions zetasql-toolkit-bigquery/pom.xml
@@ -24,12 +24,12 @@
 <parent>
 <groupId>com.google.zetasql.toolkit</groupId>
 <artifactId>zetasql-toolkit</artifactId>
-<version>0.4.1</version>
+<version>0.5.0</version>
 <relativePath>../pom.xml</relativePath>
 </parent>

 <artifactId>zetasql-toolkit-bigquery</artifactId>
-<version>0.4.1</version>
+<version>0.5.0</version>

 <name>${project.groupId}:${project.artifactId}</name>

@@ -116,7 +116,7 @@ private TypeKind convertBigqueryTypeNameToTypeKind(StandardSQLTypeName bigqueryT
 case INT64:
 return TypeKind.TYPE_INT64;
 case FLOAT64:
-return TypeKind.TYPE_FLOAT;
+return TypeKind.TYPE_DOUBLE;
 case NUMERIC:
 return TypeKind.TYPE_NUMERIC;
 case BIGNUMERIC:
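
Why the one-line change above matters can be sketched in plain Java. BigQuery's FLOAT64 is a 64-bit floating point type, while ZetaSQL's TYPE_FLOAT denotes a 32-bit float and TYPE_DOUBLE the 64-bit one. The hypothetical `Float64Sketch` helper below does not use ZetaSQL; it only shows that many 64-bit values do not survive being treated as 32-bit floats.

```java
final class Float64Sketch {
  // Returns true when the value is unchanged after a round trip through 32 bits.
  static boolean fitsInFloat32(double value) {
    return (double) (float) value == value;
  }
}
```

For example, `fitsInFloat32(0.5)` holds because 0.5 is exactly representable in both widths, while `fitsInFloat32(0.1)` does not.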
@@ -135,7 +135,7 @@ class BigQueryBuiltIns {
 ImmutableList.of(
 // BQ.ABORT_SESSION([STRING])
 new ProcedureInfo(
-ImmutableList.of("BQ", "ABORT_SESSION"),
+ImmutableList.of("BQ.ABORT_SESSION"),
 new FunctionSignature(
 new FunctionArgumentType(TypeFactory.createSimpleType(TypeKind.TYPE_STRING)),
 ImmutableList.of(
@@ -149,7 +149,7 @@ class BigQueryBuiltIns {
 -1)),
 // BQ.JOBS.CANCEL(STRING)
 new ProcedureInfo(
-ImmutableList.of("BQ", "JOBS", "CANCEL"),
+ImmutableList.of("BQ.JOBS.CANCEL"),
 new FunctionSignature(
 new FunctionArgumentType(TypeFactory.createSimpleType(TypeKind.TYPE_STRING)),
 ImmutableList.of(
@@ -163,7 +163,7 @@ class BigQueryBuiltIns {
 -1)),
 // BQ.REFRESH_EXTERNAL_METADATA_CACHE(STRING)
 new ProcedureInfo(
-ImmutableList.of("BQ", "REFRESH_EXTERNAL_METADATA_CACHE"),
+ImmutableList.of("BQ.REFRESH_EXTERNAL_METADATA_CACHE"),
 new FunctionSignature(
 new FunctionArgumentType(TypeFactory.createSimpleType(TypeKind.TYPE_STRING)),
 ImmutableList.of(
@@ -177,7 +177,7 @@ class BigQueryBuiltIns {
 -1)),
 // BQ.REFRESH_MATERIALIZED_VIEW(STRING)
 new ProcedureInfo(
-ImmutableList.of("BQ", "REFRESH_MATERIALIZED_VIEW"),
+ImmutableList.of("BQ.REFRESH_MATERIALIZED_VIEW"),
 new FunctionSignature(
 new FunctionArgumentType(TypeFactory.createSimpleType(TypeKind.TYPE_STRING)),
 ImmutableList.of(
@@ -199,15 +199,9 @@ class BigQueryBuiltIns {
 public static void addToCatalog(SimpleCatalog catalog) {
 TYPE_ALIASES.forEach(catalog::addType);
 FUNCTIONS.forEach(catalog::addFunction);
-
-for (ProcedureInfo procedureInfo : PROCEDURES) {
-List<String> namePath = procedureInfo.getNamePath();
-String procedureName = namePath.get(namePath.size() - 1);
-List<List<String>> procedurePaths =
-ImmutableList.of(namePath, ImmutableList.of(procedureName), ImmutableList.of(String.join(".", namePath)));
+PROCEDURES.forEach(procedure ->
 CatalogOperations.createProcedureInCatalog(
-catalog, procedurePaths, procedureInfo, CreateMode.CREATE_OR_REPLACE);
-}
+catalog, procedure.getFullName(), procedure, CreateMode.CREATE_DEFAULT));
 }

 private BigQueryBuiltIns() {}
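
The hunks above register each built-in procedure under a single dotted name instead of a nested name path. Read together with the commit message's note about fully quoting name paths before analysis, the idea appears to be that a reference such as `BQ.JOBS.CANCEL`, once wrapped in backticks, is a one-part identifier that matches a flat catalog entry. The sketch below illustrates that reading under stated assumptions: `FlatCatalogSketch` is hypothetical, it uses a plain map instead of ZetaSQL's `SimpleCatalog`, and it does not handle names that contain backticks.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FlatCatalogSketch {
  private final Map<String, String> proceduresByFullName = new HashMap<>();

  // Joins path segments into the single dotted name used as the catalog key.
  static String fullName(List<String> namePath) {
    return String.join(".", namePath);
  }

  // Wraps the whole dotted name in one pair of backticks so that it reads
  // as a single identifier rather than a multi-part path.
  static String quoteWhole(List<String> namePath) {
    String name = fullName(namePath);
    if (name.contains("`")) {
      throw new IllegalArgumentException("Backticks in names are not handled here");
    }
    return "`" + name + "`";
  }

  void register(List<String> namePath, String description) {
    proceduresByFullName.put(fullName(namePath), description);
  }

  // Looks up a reference as it would appear in a rewritten query.
  String find(String quotedReference) {
    String key = quotedReference.substring(1, quotedReference.length() - 1);
    return proceduresByFullName.get(key);
  }
}
```

Keeping one key per resource avoids registering the same procedure under several alternative paths, which is what the removed `procedurePaths` list did.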