Skip to content

Conversation

@pgabara
Copy link
Contributor

@pgabara pgabara commented Mar 6, 2018

No description provided.

@pgabara
Copy link
Contributor Author

pgabara commented Mar 6, 2018

Connects to #164

@codecov-io
Copy link

codecov-io commented Mar 6, 2018

Codecov Report

Merging #263 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #263      +/-   ##
==========================================
+ Coverage   96.21%   96.22%   +<.01%     
==========================================
  Files          52       52              
  Lines         924      926       +2     
  Branches        9       11       +2     
==========================================
+ Hits          889      891       +2     
  Misses         35       35
Impacted Files Coverage Δ
dataset/src/main/scala/frameless/TypedColumn.scala 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 992248c...25145af. Read the comment docs.

/**
* An expression that returns a substring
* {{{
* df.select(df('a).substr(0, 5))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: in Spark, is it possible to we mix literals and columns? (.substr(0, df('a)))

Copy link
Contributor Author

@pgabara pgabara Mar 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, there are only substr(startPos: Int, len: Int) and substr(startPos: Column, len: Column)

* @param startPos expression for the starting position
* @param len expression for the length of the substring
*/
def substr[TT, W](startPos: ThisType[TT, Int], len: ThisType[TT, Int])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the first columns method we have that involves 3 columns. The way you wrote it here, with TT used in both startPos and len, you are forcing these two columns to come from the same dataset. Something like the following wouldn't typecheck:

ds1.joins(ds2)(ds1('a) === ds1('a).sustr(ds1('b), ds2('c))

I know it's a contrive example, but to make the above working you could need something like the following:

def substr[TT1, TT2, W1, W2](startPos: ThisType[TT1, Int], len: ThisType[TT2, Int])
  (implicit
    i0: U =:= String,
    i1: With.Aux[T, TT1, W1],
    i2: With.Aux[W1, TT2, W2]
  ) = ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense, I will fix it. thanks

@OlivierBlanvillain
Copy link
Contributor

LGTM, thanks!

@OlivierBlanvillain OlivierBlanvillain merged commit 7be4d63 into typelevel:master Mar 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants