Add a typed 'substr' column method #263
Connects to #164
Codecov Report
@@            Coverage Diff             @@
##           master     #263      +/-   ##
==========================================
+ Coverage   96.21%   96.22%   +<.01%
==========================================
  Files          52       52
  Lines         924      926       +2
  Branches        9       11       +2
==========================================
+ Hits          889      891       +2
  Misses         35       35
/**
 * An expression that returns a substring
 * {{{
 *   df.select(df('a).substr(0, 5))
Question: in Spark, is it possible to mix literals and columns, e.g. .substr(0, df('a))?
Nope, Spark only provides substr(startPos: Int, len: Int) and substr(startPos: Column, len: Column).
 * @param startPos expression for the starting position
 * @param len expression for the length of the substring
 */
def substr[TT, W](startPos: ThisType[TT, Int], len: ThisType[TT, Int])
I think this is the first column method we have that involves 3 columns. The way you wrote it here, with TT used in both startPos and len, you are forcing these two columns to come from the same dataset. Something like the following wouldn't typecheck:
ds1.joins(ds2)(ds1('a) === ds1('a).substr(ds1('b), ds2('c)))
I know it's a contrived example, but to make the above work you would need something like the following:
def substr[TT1, TT2, W1, W2](startPos: ThisType[TT1, Int], len: ThisType[TT2, Int])
  (implicit
    i0: U =:= String,
    i1: With.Aux[T, TT1, W1],
    i2: With.Aux[W1, TT2, W2]
  ) = ...
Makes sense, I will fix it. Thanks.
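For readers following the thread, the difference between the two signatures can be reproduced in a small standalone sketch. Everything below is a toy model under stated assumptions, not the frameless code: Col stands in for a typed column, With and its two instances are a simplified, hypothetical version of the type class named in the review, and substrShared / substrChained are illustrative names for the original and the suggested shape. The implicit With.Aux[T, TT, W] on substrShared is also an assumption, since the thread only shows the first parameter list.

```scala
// Toy model only: these are NOT the frameless definitions.
// Col[T, U] stands for a column of type U belonging to a dataset of rows T.
final case class Col[T, U](expr: String) {

  // Shape of the original signature: one TT shared by both arguments.
  def substrShared[TT, W](startPos: Col[TT, Int], len: Col[TT, Int])(
      implicit i0: U =:= String, i1: With.Aux[T, TT, W]
  ): Col[W, String] =
    Col(s"substr($expr, ${startPos.expr}, ${len.expr})")

  // Shape of the suggested signature: each argument has its own dataset
  // type, and two With instances fold them into the result type W2.
  def substrChained[TT1, TT2, W1, W2](startPos: Col[TT1, Int], len: Col[TT2, Int])(
      implicit i0: U =:= String, i1: With.Aux[T, TT1, W1], i2: With.Aux[W1, TT2, W2]
  ): Col[W2, String] =
    Col(s"substr($expr, ${startPos.expr}, ${len.expr})")
}

// Hypothetical stand-in for With: computes the dataset type obtained
// when columns from A and B appear in the same expression.
trait With[A, B] { type Out }

trait LowPrioWith {
  type Aux[A, B, W] = With[A, B] { type Out = W }
  // Fallback: columns from two different datasets yield a pair.
  implicit def combine[A, B]: Aux[A, B, (A, B)] =
    new With[A, B] { type Out = (A, B) }
}

object With extends LowPrioWith {
  // Preferred when both sides are the same dataset: the type is unchanged.
  implicit def identity[T]: Aux[T, T, T] =
    new With[T, T] { type Out = T }
}

object SubstrDemo {
  final case class X(a: String, b: Int)
  final case class Y(c: Int)

  val a = Col[X, String]("a")
  val b = Col[X, Int]("b")
  val c = Col[Y, Int]("c")

  // Both arguments come from X, so the shared-TT shape is enough.
  val sameDataset: Col[X, String] = a.substrShared(b, b)

  // a.substrShared(b, c) is rejected by the compiler:
  // no single TT fits both Col[X, Int] and Col[Y, Int].

  // The chained shape accepts columns from X and Y and tracks both.
  val joined: Col[(X, Y), String] = a.substrChained(b, c)
}
```

In this sketch the result type records which datasets took part in the expression: X alone when every column comes from one dataset, (X, Y) once a column from the second dataset is involved.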
LGTM, thanks!