Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Sink Writer to accept blocks inplace of bytes #86

Merged
merged 14 commits into from
Jun 7, 2022

Conversation

tardunge
Copy link
Collaborator

No description provided.

@kelindar
Copy link
Collaborator

Looks like we need to fix some tests

@tardunge tardunge requested review from atris and kelindar January 4, 2022 08:00
@tardunge
Copy link
Collaborator Author

Tests are fixed.

* Add yaml inline tag to BaseSink

* Add inline tag to BaseSink

* Revert sample yaml configuration
@tardunge
Copy link
Collaborator Author

Looks like we need to fix some tests

Fixed

@kelindar
Copy link
Collaborator

kelindar commented May 7, 2022

There's a few conflicts @tardunge

@tardunge
Copy link
Collaborator Author

There's a few conflicts @tardunge

will fix.

tardunge and others added 3 commits May 19, 2022 17:31
* Add yaml inline tag to BaseSink

* Add inline tag to BaseSink

* Revert sample yaml configuration

* fix conflict
* Add yaml inline tag to BaseSink

* Add inline tag to BaseSink

* Revert sample yaml configuration

* fix conflict

* add contributors in readme
)

// FromBlockBy creates and returns a list of new block.Row for a block.
func FromBlockBy(blk Block, schema typeof.Schema) ([]Row, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be a method on a block instead?

@@ -33,6 +35,30 @@ type Block struct {
schema typeof.Schema `binary:"-"` // The cached schema of the block
}

// Create a base block for testing purpose
func Base() ([]Block, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this function exported or we can move it to a test file itself?

if err != nil {
return err
}
// buffer, err := s.merge(blocks, schema)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this is commented out?

@@ -16,7 +19,20 @@ import (
)

// ToParquet merges multiple blocks together and outputs a key and merged Parquet data
func ToParquet(blocks []block.Block, schema typeof.Schema) ([]byte, error) {
func ToParquet(input interface{}) ([]byte, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit wary of removing schema. Semantically taking a schema from the first block is different from providing a schema to the function itself. Why do we need to remove schema?

@kelindar kelindar merged commit d40b2a4 into talariadb:master Jun 7, 2022
@kelindar
Copy link
Collaborator

kelindar commented Jun 7, 2022

@tardunge please address comments in a separate PR(s)

@tardunge
Copy link
Collaborator Author

tardunge commented Jun 8, 2022

Sure.

@atlas-comstock
Copy link
Collaborator

This MR introduced an issue that the Azure writer would not initiate. @tardunge

@tardunge
Copy link
Collaborator Author

This MR introduced an issue that the Azure writer would not initiate. @tardunge

Can you create an issue and post the error or behaviour you are encountering and also the expected behaviour?

@atlas-comstock
Copy link
Collaborator

@tardunge hi, here it is #103

jeffreylean added a commit to jeffreylean/talaria that referenced this pull request Jun 27, 2022
kelindar pushed a commit that referenced this pull request Jun 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants