Analyzer Raises Exception on Assignment of FLOAT64 to FLOAT Type Field #25

dion-ricky · 2023-11-07T02:57:24Z

Hi,

We have a MERGE query that updates or inserts rows into a BigQuery table, but the analyzer raises an exception when a FLOAT64 value is assigned to a FLOAT field. Given my limited knowledge of the inner workings of ZetaSQL, I have no idea how to fix it. I have looked at the LanguageOptions but haven't tried anything yet. Here's a snippet of the query, for example only:

MERGE INTO `bq-project.dataset.table_name` AS target
USING (
  SELECT
      1 AS id,
      0.99 AS amount,
      TIMESTAMP('2023-11-07 01:00:00') as date
) as source
ON target.id = source.id
WHEN MATCHED AND source.date >= target.date THEN
UPDATE SET id = source.id, amount = source.amount
WHEN NOT MATCHED BY TARGET THEN
INSERT ( id, amount ) VALUES ( source.id, source.amount )

Here's the exception raised by the Analyzer:

Exception in thread "main" com.google.zetasql.toolkit.AnalysisException: Value of type FLOAT64 cannot be assigned to amount which has type FLOAT [at xx:xx]
	at com.google.zetasql.toolkit.ZetaSQLToolkitAnalyzer$StatementAnalyzer.analyzeNextStatement(ZetaSQLToolkitAnalyzer.java:232)
	at com.google.zetasql.toolkit.ZetaSQLToolkitAnalyzer$StatementAnalyzer.next(ZetaSQLToolkitAnalyzer.java:211)
	at com.google.zetasql.toolkit.ZetaSQLToolkitAnalyzer$StatementAnalyzer.next(ZetaSQLToolkitAnalyzer.java:148)
	at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
	at com.example.data.Main.main(Main.java:28)


ppaglilla · 2023-11-07T03:58:19Z

Thank you for reporting this!

This is happening because the toolkit is using the ZetaSQL FLOAT type (i.e. a 32-bit float) to represent BigQuery FLOAT64 columns, when it should be using the DOUBLE type (i.e. an actual 64-bit float). I just fixed it in the version/v0.5.0 branch. I'm hoping that version will be released to Maven within a week.
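
As a rough illustration of why a 64-bit value can't be silently assigned to a 32-bit field, here is a minimal plain-Java sketch. It does not use ZetaSQL at all: the class name FloatVsDouble and the helper survivesNarrowing are made up for this example, and Java's float and double are only stand-ins for the FLOAT and DOUBLE types, assuming both follow the usual 32-bit and 64-bit IEEE 754 formats.

```java
public class FloatVsDouble {

    // Hypothetical helper: returns true only if the 64-bit value can be
    // narrowed to 32 bits and widened back without changing.
    static boolean survivesNarrowing(double value) {
        float narrowed = (float) value; // Java requires an explicit cast here, because narrowing can lose precision
        return (double) narrowed == value;
    }

    public static void main(String[] args) {
        // 0.99 (the amount from the query above) has no exact 32-bit representation.
        System.out.println(survivesNarrowing(0.99)); // prints "false"
        // 0.5 is a power of two, so it is exact in both widths.
        System.out.println(survivesNarrowing(0.5));  // prints "true"
    }
}
```

The analyzer error is the same kind of guard: the query produces a 64-bit value, while the catalog (incorrectly) declared the column as 32-bit.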

In the meantime, I can offer this somewhat hacky workaround. Before analyzing, you could find the table that has the FLOAT column, change that column to DOUBLE, and replace the table in the catalog. Of course, this won't be necessary once the new version is released.

// 1. Find the table that needs the column changed from FLOAT to DOUBLE
SimpleTable tableToUpdate = catalog.getZetaSQLCatalog()
    .getTable("bq-project.dataset.table_name", null);

// 2. Build the new column list, replacing FLOAT columns with DOUBLE columns
List<SimpleColumn> updatedColumns = tableToUpdate.getColumnList()
    .stream()
    .map(column -> {
      if (!column.getType().isFloat()) {
        return column;
      }

      return new SimpleColumn(
          tableToUpdate.getFullName(),
          column.getName(),
          TypeFactory.createSimpleType(TypeKind.TYPE_DOUBLE));
    })
    .collect(Collectors.toList());

// 3. Replace the table
SimpleTable updatedTable = new SimpleTable(tableToUpdate.getFullName(), updatedColumns);
catalog.register(updatedTable, CreateMode.CREATE_OR_REPLACE, CreateScope.CREATE_DEFAULT_SCOPE);

dion-ricky · 2023-11-07T05:31:03Z

Hi, thanks for your quick response and support. I've tried your code and it works. Looking forward to the release of 0.5.0.

dion-ricky · 2023-12-18T07:05:56Z

Hi @ppaglilla, I hope you're doing well. I've been watching for the release of v0.5.0, but I'm not seeing much activity from you. Do you have an updated estimate for the release of v0.5.0? Also, I noticed that on branch version/v0.5.0 the column lineage output doesn't include the dataset name and project name. I just want to make sure that isn't broken in the next release 😄.

dion-ricky · 2024-02-07T02:43:35Z

Hi @ppaglilla, I'm not seeing many updates from you. I hope you're doing well. I just want to follow up on the v0.5.0 release. It's been almost two months since I last checked, and I still don't see any new commits on branch version/v0.5.0. Can you give an update on this?

* Update ZetaSQL to version 2023.10.1
* Rewrite queries to fully quote all name paths before analysis
* Avoid no longer necessary nesting of tables in catalogs
* Update query rewritting to only re-quote name paths that refer to resources (i.e. tables, functions, etc)
* Avoid no longer necessary nesting of functions in catalogs
* Avoid no longer necessary nesting of TVFs in catalogs
* Remove unnecessary slf4j dependency

  Closes #24
* Reduce code duplication in BigQueryCatalog
* Reduce the amount of nesting for procedures in catalogs
* Avoid duplicate code when creating different types of resources in catalogs
* Use the DOUBLE type kind for BigQuery FLOAT64 columns

  Closes #25
* Added extractColumnLevelLineage for ResolvedQueryStmt

  Closes #28
* Changed the access modifier of ParentColumnFinder to public

  Closes #28
* Make ColumnLineageExtractor accept only concrete statement types

  Previously, the API for ColumnLineageExtractor used the method ::extractColumnLevelLineage(ResolvedStatement). Since introducing support for ResolvedQueryStmts, which needs specifying an output table separately; maintaining the generic ResolvedStatement API required making it confusing, since it would optionally need to accept an output table.

  This makes it so that teams building lineage applications need to explicitly determine the statements they support and call the corresponding ::extractColumnLevelLineage() method. Such as ::extractColumnLevelLineage(ResolvedInsertStmt) or ::extractColumnLevelLineage(ResolvedQueryStmt, String).
* Bump development version to 0.5.0-SNAPSHOT
* vuln-fix: Use HTTPS instead of HTTP to resolve deps CVE-2021-26291 (#30)

  This fixes a security vulnerability in this project where the `pom.xml` files were configuring Maven to resolve dependencies over HTTP instead of HTTPS.

  Weakness: CWE-829: Inclusion of Functionality from Untrusted Control Sphere
  Severity: High
  CVSS: 8.1
  Detection: CodeQL (https://codeql.github.com/codeql-query-help/java/java-maven-non-https-url/) & OpenRewrite (https://app.moderne.io/recipes/org.openrewrite.maven.security.UseHttpsForRepositories)
  Reported-by: Jonathan Leitschuh
  Bug-tracker: JLLeitschuh/security-research#8

  Use this link to re-run the recipe: https://app.moderne.io/recipes/org.openrewrite.maven.security.UseHttpsForRepositories?organizationId=R29vZ2xl

  Co-authored-by: Moderne
* Make the type parser case insensitive (#35)

  The type parser was previously case sensitive, while SQL types are case insensitive. This went unnoticed for a while since upper-cased types are usually always used, but is fundamentally incorrect.

  Fixes #32
* Add reflection-based patching of GRPC's default max nesting depth (#36)

  ZetaSQL's Java API uses a GRPC service to call into the actual C++ implementation of ZetaSQL. By default, the serialization logic of that communication allows for a nesting depth in protobuf messages of up to 100. However, long queries can exceed that level of nesting and as a result cannot be analyzed by default.

  This implements a reflection-based patch that allows users to override that limit to a greater number. This is brittle by design and should be used with caution.

  Fixes #31
* Upgrade to zetasql-2024-03-01 and bump deps (#33)

  * Upgrade to zetasql-2024-03-01 and bump deps
  * Enable all features
  * Rollback Mockito to version 4.11.0
  * Remove some v1.4 language options not supported by BigQuery

  Co-authored-by: Pablo Paglilla
* Add missing @return in catalog Javadocs
* Update version to v0.5.0

Co-authored-by: Dion Ricky Saputra
Co-authored-by: Jonathan Leitschuh
Co-authored-by: Moderne
Co-authored-by: Erlend Hamnaberg

ppaglilla · 2024-05-06T16:47:01Z

I'm really sorry this release was so delayed. Version 0.5.0 is now available through Maven; see the release.

ppaglilla added a commit that referenced this issue Nov 7, 2023

Use the DOUBLE type kind for BigQuery FLOAT64 columns

a782a23

Closes #25

ppaglilla mentioned this issue May 6, 2024

Release v0.5.0 #37

Merged

ppaglilla closed this as completed in #37 May 6, 2024
