
sdfsdf#2

Merged
arkadiuszbalcerek merged 104 commits into arkadiuszbalcerek:master from
trinodb:master
Sep 12, 2022

Conversation

@arkadiuszbalcerek
Owner

merging

Laonel and others added 30 commits September 5, 2022 19:38
The Toxiproxy Docker container in version 2.4.0 has arm64 architecture
support, which allows running tests that use Toxiproxy on a MacBook Pro M1
Use of `Optional.orElseThrow` verifies that the optional is not empty
before assigning to a variable. This subsequently removes the need for
`Optional.get()` calls.
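As an illustration of this pattern (a hypothetical snippet, not code from this PR), the difference looks like:

```java
import java.util.Optional;

public class OrElseThrowExample
{
    public static void main(String[] args)
    {
        Optional<String> catalog = Optional.of("iceberg");

        // Before: the Optional is assigned to a variable and unwrapped later
        // with Optional.get(), which can fail far from the assignment site.
        Optional<String> held = catalog;
        String late = held.get();

        // After: orElseThrow() checks for emptiness at assignment time, so
        // the variable holds the value itself and no get() call is needed.
        String value = catalog.orElseThrow();

        System.out.println(value); // prints "iceberg"
    }
}
```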
Ensure that only a single thread
will execute sendUpdate
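The commit message does not show the mechanism; one common way to guarantee that only one thread runs a method at a time is a compare-and-set guard. The sketch below uses hypothetical names and is not the PR's actual code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SendUpdateGuard
{
    // Guards sendUpdate so concurrent callers never run it simultaneously;
    // late callers simply skip the call instead of blocking.
    private final AtomicBoolean updateInProgress = new AtomicBoolean();
    private int updatesSent;

    public boolean trySendUpdate()
    {
        // compareAndSet flips false -> true atomically; only one thread wins
        if (!updateInProgress.compareAndSet(false, true)) {
            return false;
        }
        try {
            updatesSent++; // stand-in for the actual update work
            return true;
        }
        finally {
            updateInProgress.set(false);
        }
    }
}
```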
This makes further refactorings easier to read.
Since the split into spilling and non-spilling variants, creation of join
operators involved some unnecessary casting. This commit cleans it up and
makes the spilling/non-spilling code paths in LocalExecutionPlanner clear.

Changes in this class are strictly mechanical. No logical changes are made.
The function is misnamed. It should have been $like all along.
$like_pattern is the name of the internal function to convert
from a varchar to the compiled form of a pattern (i.e., LikeMatcher).
Read UUID values into Int128ArrayBlock instead of
VariableWidthBlock to improve performance and
to use only the type preferred by UUIDType.
This is only supported in the Iceberg connector,
as only the Iceberg specification has explicit
UUID support for ORC.

Benchmark                                         (compression)  Mode  Cnt  Score   Error  Units
BenchmarkColumnReaders.readUuidNoNull                      NONE  avgt   20  4.425 ± 0.044  ns/op
BenchmarkColumnReaders.readUuidNoNull                      ZLIB  avgt   20  4.393 ± 0.051  ns/op
BenchmarkColumnReaders.readUuidWithNull                    NONE  avgt   20  6.459 ± 0.127  ns/op
BenchmarkColumnReaders.readUuidWithNull                    ZLIB  avgt   20  6.465 ± 0.103  ns/op
BenchmarkColumnReaders.readVarbinaryUuidNoNull             NONE  avgt   20  4.965 ± 0.077  ns/op
BenchmarkColumnReaders.readVarbinaryUuidNoNull             ZLIB  avgt   20  4.932 ± 0.088  ns/op
BenchmarkColumnReaders.readVarbinaryUuidWithNull           NONE  avgt   20  6.966 ± 0.174  ns/op
BenchmarkColumnReaders.readVarbinaryUuidWithNull           ZLIB  avgt   20  6.895 ± 0.221  ns/op
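The representation change above can be illustrated with plain JDK types (an illustrative sketch; Trino's Int128ArrayBlock internals are not reproduced here): a UUID is exactly 128 bits, so it decomposes losslessly into two fixed-width longs rather than a variable-width byte slice.

```java
import java.util.UUID;

public class UuidAsInt128Example
{
    // A UUID is 128 bits wide, so it splits losslessly into two longs,
    // matching a fixed-width Int128 layout instead of variable-width bytes.
    public static long[] toLongs(UUID uuid)
    {
        return new long[] {uuid.getMostSignificantBits(), uuid.getLeastSignificantBits()};
    }

    public static UUID fromLongs(long high, long low)
    {
        return new UUID(high, low);
    }

    public static void main(String[] args)
    {
        UUID original = UUID.fromString("12345678-1234-1234-1234-123456789abc");
        long[] parts = toLongs(original);
        // Round-trips without loss
        System.out.println(original.equals(fromLongs(parts[0], parts[1]))); // prints "true"
    }
}
```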
The method was a leftover from before the utf8Slice()
convenience method existed.
Add requireNonNull on parquetPredicate and columnIndexStore in ParquetReader,
and pass Optional.empty() from Iceberg for the parquetPredicate parameter
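The pattern here is the standard eager null check in constructors (an illustrative sketch; the field type below is a placeholder, not the real ParquetReader signature):

```java
import static java.util.Objects.requireNonNull;

import java.util.Optional;

public class ParquetReaderArgsExample
{
    private final Optional<String> parquetPredicate;

    public ParquetReaderArgsExample(Optional<String> parquetPredicate)
    {
        // Fail at construction time with a descriptive message instead of a
        // bare NullPointerException later, when the field is first used.
        this.parquetPredicate = requireNonNull(parquetPredicate, "parquetPredicate is null");
    }

    public Optional<String> getParquetPredicate()
    {
        return parquetPredicate;
    }
}
```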
A recursive rewrite call on the function argument was missing, so the
result was potentially not canonical.
Per code style rule
Support `ANALYZE` in the Iceberg connector. This collects the number of
distinct values (NDV) for selected columns and stores it in table properties.
This is an interim solution until the Iceberg library has first-class
support for statistics files.
Remove `ConnectorMetadata.getStatisticsCollectionMetadata`
and `ConnectorMetadata.getTableHandleForStatisticsCollection` that were
deprecated in cbe2dca.
findinpath and others added 29 commits September 9, 2022 10:15
This allows specifying the default for each connector separately
This removes the `MetastoreTypeConfig` binding added in
7229738 and replaces it with an
annotated boolean, decoupling the `HiveMetadata` logic from metastore
implementations.
Call ExchangeDataSource#noMoreInputs only after all ExchangeOperators
have been created and have received noMoreSplits from the engine
To further allow releasing query inputs as they are being consumed.
In addition, make sure the inputs are not retained in memory
longer than necessary. While it is unlikely that the number of inputs
will be large with pipelined execution, this may no longer be true with
fault-tolerant execution.
Different versions of ClickHouse may support different min/max values
for the same data type; refer to the table below:

| version | column type | min value           | max value            |
|---------|-------------|---------------------|----------------------|
| any     | UInt8       | 0                   | 255                  |
| any     | UInt16      | 0                   | 65535                |
| any     | UInt32      | 0                   | 4294967295           |
| any     | UInt64      | 0                   | 18446744073709551615 |
| < 21.4  | Date        | 1970-01-01          | 2106-02-07           |
| < 21.4  | DateTime    | 1970-01-01 00:00:00 | 2106-02-06 06:28:15  |
| >= 21.4 | Date        | 1970-01-01          | 2149-06-06           |
| >= 21.4 | DateTime    | 1970-01-01 00:00:00 | 2106-02-07 06:28:15  |

When a value written to ClickHouse is out of range, ClickHouse
stores an incorrect result, so we introduced
`TrinoToClickHouseWriteChecker` to check the range of written values
and prevent ClickHouse from storing incorrect values.

Introducing `TrinoToClickHouseWriteChecker` is also a preparation for
supporting `DateTime[timezone]` and `DateTime64(precision, [timezone])`.

The next several commits will use `TrinoToClickHouseWriteChecker` to
verify the values written to ClickHouse.
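A minimal sketch of such a range check, using the "< 21.4 | Date" bounds from the table above (hypothetical class and method names, not the actual TrinoToClickHouseWriteChecker):

```java
import java.time.LocalDate;

public class DateWriteCheckExample
{
    // Bounds from the "< 21.4 | Date" row of the table above
    private static final LocalDate MIN_DATE = LocalDate.of(1970, 1, 1);
    private static final LocalDate MAX_DATE = LocalDate.of(2106, 2, 7);

    public static void validate(LocalDate value)
    {
        // Reject values that ClickHouse would silently store incorrectly
        if (value.isBefore(MIN_DATE) || value.isAfter(MAX_DATE)) {
            throw new IllegalArgumentException("Date out of ClickHouse range: " + value);
        }
    }

    public static void main(String[] args)
    {
        validate(LocalDate.of(2000, 1, 1)); // in range, passes silently
        System.out.println("ok");
    }
}
```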
This is a preparatory commit to use `TrinoToClickHouseWriteChecker` to
validate Date.
This is a preparatory commit to use `TrinoToClickHouseWriteChecker` to
validate DateTime.
@arkadiuszbalcerek arkadiuszbalcerek merged commit af32051 into arkadiuszbalcerek:master Sep 12, 2022