Skip to content

Enable writer scaling by default#10614

Merged
arhimondr merged 1 commit intotrinodb:masterfrom
radek-kondziolka:rk/enable_by_default_scale_writers
May 5, 2022
Merged

Enable writer scaling by default#10614
arhimondr merged 1 commit intotrinodb:masterfrom
radek-kondziolka:rk/enable_by_default_scale_writers

Conversation

@radek-kondziolka
Copy link
Copy Markdown
Contributor

@radek-kondziolka radek-kondziolka commented Jan 14, 2022

  • Add supportsReportingWrittenBytes method to Metadata interface
    • Implementing a supportsReportingWrittenBytes method for Iceberg and Hive connectors
  • Enable scaleWriters by default
  • Apply scaleWriters only for supporting connectors and tablesThe scaleWriters was being enabled by default but this strategy should be applied only for connectors that supports reporting written bytes (for now)
  • Add ValidateScaledWritersRightnessUsage to validate whether SCALED_WRITER_DISTRIBUTION partition scheme should be used actually.
  • Add tests for this feature and fix some other tests
    • The test for validator was added too

@cla-bot cla-bot bot added the cla-signed label Jan 14, 2022
@findepi
Copy link
Copy Markdown
Member

findepi commented Jan 14, 2022

cc @alexjo2144 @homar @losipiuk

@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from 6fa2dc1 to d068d3c Compare January 14, 2022 13:41
@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from d068d3c to fd86491 Compare January 14, 2022 14:13
@radek-kondziolka radek-kondziolka marked this pull request as ready for review January 17, 2022 08:14
@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch 4 times, most recently from 463d3e9 to 848ff39 Compare January 17, 2022 21:51
Copy link
Copy Markdown
Contributor

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is my understanding of the most recent changes:

  • supportsReportingWrittenBytes replaces a connector capabilities flag. We chose this approach to allow connector return different results for different tables.
  • supportsReportingWrittenBytes has to accept TableMetadata, as TableHandle is not available for tables that are about to be created (e.g.: CREATE TABLE AS SELECT statement)
  • TableMetadata has to be included in WriterTarget which must be JSON serializable. The TableMetadata object has to be JSON serializable itself.
  • TableMetadata includes property map that is currently excluded from serialization. The property map is technically designed to store information such as table format. If this information is lost during serialization supportsReportingWrittenBytes may not function as intended. However at this moment supportsReportingWrittenBytes is invoked before the serialization round trip.

Relying on the fact that TableMetadata is not serialized / deserialized before making a supportsReportingWrittenBytes seems to be a little fragile. To make it less fragile we have to make sure the table properties stored in the TableMetadata are JSON serializable.

@radek-starburst, @martint How strongly do we feel about being able to return different results for different tables? Is there a real precedent when we would like to return different results in an existing connector? I'm just wondering whether it is worth messing with the table properties serialization to provide this capability?

@findepi
Copy link
Copy Markdown
Member

findepi commented Jan 18, 2022

Here is my understanding of the most recent changes: [...]

thanks @arhimondr for listing the issues with metadata / TableMetadata / serialization here.

Writer scaling is somewhat similar to (and currently exclusive with) writing layout.

@radek-starburst did you consider such approach?

@findepi findepi changed the title Rk/enable by default scale writers Enable writer scaling by default Jan 18, 2022
@findepi findepi added enhancement New feature or request performance labels Jan 18, 2022
@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

radek-kondziolka commented Jan 18, 2022

Here is my understanding of the most recent changes:

* `supportsReportingWrittenBytes` replaces a connector capabilities flag. We chose this approach to allow connector return different results for different tables.

Firstly, thanks for you review :)

Yes

* `supportsReportingWrittenBytes` has to accept `TableMetadata`, as `TableHandle` is not available for tables that are about to be created (e.g.: CREATE TABLE AS SELECT statement)

Yes, exactly

* `TableMetadata` has to be included in `WriterTarget` which must be JSON serializable. The `TableMetadata` object has to be JSON serializable itself.

Yes

* `TableMetadata` includes property map that is currently excluded from serialization. The property map is technically designed to store information such as table format. If this information is lost during serialization `supportsReportingWrittenBytes` may not function as intended. However at this moment `supportsReportingWrittenBytes` is invoked before the serialization round trip.

No, it is not invoked before serialization round trip (not in tests). This object is serializable but not completelty serializable what means that just some fields will be null.

Relying on the fact that TableMetadata is not serialized / deserialized before making a supportsReportingWrittenBytes seems to be a little fragile. To make it less fragile we have to make sure the table properties stored in the TableMetadata are JSON serializable.

Yes, it is not so easy to make it completelty serializable because of Map<String, Object> but it is doable (I think so).

@radek-starburst, @martint How strongly do we feel about being able to return different results for different tables? Is there a real precedent when we would like to return different results in an existing connector? I'm just wondering whether it is worth messing with the table properties serialization to provide this capability?

For me, I cannot imagine a such situation that we can have different results for different tables.

@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from 848ff39 to d13e1b4 Compare January 18, 2022 10:40
@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

Here is my understanding of the most recent changes: [...]

thanks @arhimondr for listing the issues with metadata / TableMetadata / serialization here.

Writer scaling is somewhat similar to (and currently exclusive with) writing layout.

* Modeling -- support for scaling (support for reporting written bytes) could be informed about by the connector in the same place where it chooses the layout.

* Future development -- the fact that write layout precludes writer scaling is a bit limiting. At some point we could shift the requirement and make the partitioned writers adaptive too.
  See https://trinodb.slack.com/archives/CGB0QHWSW/p1642481584086800 cc @pangyifish @raunaqmorarka @losipiuk  @alexjo2144

@radek-starburst did you consider such approach?

@findepi , I did not. I could try to implement it as you suggest, but firstly, we would better wait for all to agree the common version.

@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from d13e1b4 to dd68992 Compare January 18, 2022 10:51
@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from dd68992 to 2ef4539 Compare February 17, 2022 11:04
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TableMetadata is a container object that describes the shape of a table (columns, properties, etc). Using it to identify a table is not appropriate, especially given that a caller can make up any TableMetadata it wants.

Tables should be identified either by name (catalog/schema/name) or by handle (i.e., and opaque identifier).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok @martint we go with that

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhimondr @radek-starburst @martint

i happened to discuss this with @radek-starburst today

  • for existing tables (INSERT) the ConnectorTableHandle should be presented to the ConnectorMetadata to drive the decision. ConnectorTableHandle is the best info we can provide, so let's use it,
  • for new tables (CTAS) the ConnectorTableMetadata should be presented to the connector. It contains the best information possible. Showing only properties is not sufficient -- it allows connector to inspect eg partitioning, but eg doesn't allow to inspect types.
  • for UPDATE and DELETE nothing is needed since writer scaling is not applicable for these operations -- the operations happen within source split
  • for TABLE EXECUTE -- ideally we should provide ConnectorTableHandle + ConnectorTableExecuteHandle, since connector's decision may depend on the actual operation being executed. However, since we don't have such use-case today AND it is easy to backwards-compatibly add a new such API in the future, let's provide ConnectorTableHandle only as in INSERT case

Copy link
Copy Markdown
Contributor

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % nits

We usually try to avoid (but not always do) having commits like Implement class A. Generally the goal should be for every commit to be self-sufficient and represent a meaningful self containing change.

I would recommend squashing the commits into a single one with a simple title Enable scaled_writers by default for supported connectors

@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from 209079b to 732a85d Compare May 4, 2022 06:14
@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

I would recommend squashing the commits into a single one with a simple title Enable scaled_writers by default for supported connectors

Do you mean to squash all commits to the one?

@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from 732a85d to 8f5bd98 Compare May 4, 2022 06:36
@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch 5 times, most recently from 1e0ae04 to f355e69 Compare May 4, 2022 11:55
Copy link
Copy Markdown
Contributor

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to squash all commits to the one?

Yes, they are all part of the same logical change

@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch 3 times, most recently from b43d561 to 87a6587 Compare May 4, 2022 16:11
@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from 87a6587 to 8032524 Compare May 4, 2022 19:04
This PR enable scale writers by defult, adds
supportsReportingWrittenBytes method to Metadata
and implements a supportsReportingWrittenBytes method for
Iceberg, DeltaLake and Hive connectors.

Additionally, validator ValidateScaledWritersUsage
was added to validate whether the SCALED_WRITER_DISTRIBUTION
partition scheme should be used actually.

Adding tests for this feature and fix some other tests.
The test for validator was added too.
@radek-kondziolka radek-kondziolka force-pushed the rk/enable_by_default_scale_writers branch from 8032524 to 2e96713 Compare May 5, 2022 06:14
@arhimondr arhimondr merged commit 39c0273 into trinodb:master May 5, 2022
@github-actions github-actions bot added this to the 380 milestone May 5, 2022
@mosabua mosabua mentioned this pull request May 5, 2022
@mosabua
Copy link
Copy Markdown
Member

mosabua commented May 5, 2022

I think we need a release notes entry for this. Any suggestions?

@mosabua
Copy link
Copy Markdown
Member

mosabua commented May 5, 2022

First cut of docs update .. #12261

@arhimondr
Copy link
Copy Markdown
Contributor

@mosabua

I think we need a release notes entry for this. Any suggestions?

Core

* Scaled writers are now enabled by default for supporting connectors (such as Hive, Deltalake, Iceberg)

.setSystemProperty(TASK_WRITER_COUNT, "2")
.build();
getQueryRunner().execute(session, format("CREATE TABLE IF NOT EXISTS %s AS SELECT * FROM %s", "linetime_multiple_file_backed", "tpch.tiny.lineitem")).getMaterializedRows();
getQueryRunner().execute(session, format("CREATE TABLE IF NOT EXISTS %s AS SELECT * FROM %s", "orders_multiple_file_backed", "tpch.tiny.orders")).getMaterializedRows();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just use assertUpdate

Comment on lines +2547 to +2549
// We need to prepare tables for this test. The test is required to use tables that are backed by at lest two files
Session session = Session.builder(getSession())
.setSystemProperty(TASK_WRITER_COUNT, "2")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think writer count >= 2 guarantees number of files >= 2.
It still may be possible that all data ends up on a single machine and single thread (just not very likely).

Explicit partitioning, or using small target-file-size can guarantee 2+ files.

Also, if the test relies on # of files, it should explicitly validate the state of the table, eg via SELECT count(DISTINCT "$path").

cc @alexjo2144

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not know that. I was thinking that it depends on number of writers. I am going to analyze it more deeply.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

6 participants