Sort df.compileTimeSchema() columns according to df.schema() so they're easier to compare #990

Open
wants to merge 1 commit into
base: master
from

Conversation

koperagen
Collaborator

This is a quality-of-life improvement. I print both schemas when I test the implementation of the join operation. Until now, compileTimeSchema came out in a somewhat random order, which made the diff useless. With this change, columns that are present in both schemas appear in a definite order.

@koperagen koperagen added the enhancement New feature or request label Dec 6, 2024
@koperagen koperagen added this to the 0.16.0 milestone Dec 6, 2024
@koperagen koperagen requested a review from Jolanrensen December 6, 2024 12:35
@koperagen koperagen self-assigned this Dec 6, 2024
@koperagen koperagen force-pushed the compileTimeSchemaOrder branch from 3689c11 to adca94b Compare December 6, 2024 13:19
return compileSchema.sorted(order, path = root)
}

internal fun DataFrameSchema.putColumnsOrder(order: MutableMap<ColumnPath, Int>, path: ColumnPath) {
Collaborator


It's kind of strange to see DataFrameSchema as the receiver. I'd expect the mutable map to be the receiver, since that's what you're modifying. That would also enable this notation:

val order = buildMap<ColumnPath, Int> {
    putColumnsInOrderOf(runtimeSchema, path)
}
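
To make the suggestion above concrete, here is a minimal sketch of what a map-receiver version could look like. The types SketchPath, SketchColumn, SketchValue, SketchGroup and SketchSchema are hypothetical stand-ins invented for this example, not the library's ColumnPath or DataFrameSchema, and the indexing scheme (position within the parent schema) is an assumption rather than what the PR necessarily does.

```kotlin
// Stand-in types for illustration only; these are NOT the DataFrame library's classes.
typealias SketchPath = List<String>

sealed interface SketchColumn
object SketchValue : SketchColumn
data class SketchGroup(val schema: SketchSchema) : SketchColumn

data class SketchSchema(val columns: Map<String, SketchColumn>)

// The map being modified is the receiver, so the function reads naturally inside buildMap { }.
// Assumption: each column is indexed by its position within its parent schema.
fun MutableMap<SketchPath, Int>.putColumnsInOrderOf(schema: SketchSchema, path: SketchPath) {
    schema.columns.entries.forEachIndexed { index, (name, column) ->
        val columnPath = path + name
        this[columnPath] = index
        // Recurse into nested column groups so their children get an order too.
        if (column is SketchGroup) putColumnsInOrderOf(column.schema, columnPath)
    }
}

// Builds the order map for a whole schema, starting from an empty root path.
fun buildOrder(runtimeSchema: SketchSchema): Map<SketchPath, Int> =
    buildMap {
        putColumnsInOrderOf(runtimeSchema, emptyList())
    }
```

With the map as the receiver, the caller never has to declare a separate mutable variable: the mutation stays scoped to the buildMap block and the result is exposed as a read-only Map.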

}
}

internal fun DataFrameSchema.sorted(order: Map<ColumnPath, Int>, path: ColumnPath): DataFrameSchema {
Collaborator


Here it's okay to have it as the receiver :) I'd just name it "sortedBy(order)" to be more expressive.
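
As a rough illustration of how a "sortedBy(order)" call might read, here is a small sketch on a flat stand-in schema. FlatSchema is a hypothetical type made up for this example, nested column groups are ignored, and placing columns that are missing from the order map at the end is an assumption, not necessarily what the PR does.

```kotlin
// Hypothetical stand-in for a schema: column name -> type name, flat (no nested groups).
data class FlatSchema(val columns: Map<String, String>)

// Returns a copy whose columns follow `order`.
// Assumption: columns absent from `order` go last and keep their relative order,
// which holds because the stdlib sortedBy is a stable sort.
fun FlatSchema.sortedBy(order: Map<String, Int>): FlatSchema =
    FlatSchema(
        columns.entries
            .sortedBy { order[it.key] ?: Int.MAX_VALUE }
            .associate { it.key to it.value } // associate preserves iteration order
    )
```

The name mirrors the standard library's sortedBy, which signals that the receiver is left unchanged and a reordered copy is returned.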

Collaborator

@Jolanrensen Jolanrensen left a comment


just some small syntactical tips, otherwise lgtm :)

Labels
enhancement New feature or request
2 participants