[SPARK-25202] [SQL] Implements split with limit sql function #22227
Changes from 22 commits
15362be
ceb3f41
e564a68
4e10733
5135cb2
e8c8c8c
8e16328
ca23ea3
96bc875
79599eb
a27c848
fa128db
a641106
7e4ba98
d80b1a1
d17d2df
64b0afc
4e84df0
b12ee88
69d2190
b5994ad
5c8f487
34ba74f
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3404,19 +3404,27 @@ setMethod("collect_set", | |
| #' Equivalent to \code{split} SQL function. | ||
| #' | ||
| #' @rdname column_string_functions | ||
| #' @param limit determines the length of the returned array. | ||
| #' \itemize{ | ||
| #' \item \code{limit > 0}: length of the array will be at most \code{limit} | ||
| #' \item \code{limit <= 0}: the returned array can have any length | ||
| #' } | ||
| #' | ||
| #' @aliases split_string split_string,Column-method | ||
| #' @examples | ||
| #' | ||
| #' \dontrun{ | ||
| #' head(select(df, split_string(df$Sex, "a"))) | ||
| #' head(select(df, split_string(df$Class, "\\d"))) | ||
| #' head(select(df, split_string(df$Class, "\\d", 2))) | ||
|
||
| #' # This is equivalent to the following SQL expression | ||
| #' head(selectExpr(df, "split(Class, '\\\\d')"))} | ||
|
Member
Hmm, I think L3418 should be followed by L3420?
Member
Good point. Also, the examples should run in the order documented.
Author
Yes, will make that change @viirya @felixcheung |
||
| #' @note split_string 2.3.0 | ||
| setMethod("split_string", | ||
| signature(x = "Column", pattern = "character"), | ||
| function(x, pattern) { | ||
| jc <- callJStatic("org.apache.spark.sql.functions", "split", x@jc, pattern) | ||
| function(x, pattern, limit = -1) { | ||
| jc <- callJStatic("org.apache.spark.sql.functions", | ||
| "split", x@jc, pattern, as.integer(limit)) | ||
| column(jc) | ||
| }) | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1803,6 +1803,14 @@ test_that("string operators", { | |
| collect(select(df4, split_string(df4$a, "\\\\")))[1, 1], | ||
| list(list("[email protected] 1", "b")) | ||
| ) | ||
| expect_equal( | ||
| collect(select(df4, split_string(df4$a, "\\.", 2)))[1, 1], | ||
| list(list("a", "[email protected] 1\\b")) | ||
|
Member
let's add a test for
Author
added for |
||
| ) | ||
| expect_equal( | ||
| collect(select(df4, split_string(df4$a, "b", 0)))[1, 1], | ||
|
Member
I wouldn't add all of those cases for R. One test case to check if they can be called should be good enough.
Author
per @felixcheung's I added back the
Member
For context, we've had some cases in the past where the wrong value was passed for a parameter, so let's at least get one with and one without any optional parameter.
Author
@felixcheung just to confirm, do things look ok here to you? We now have test cases both with and without the optional parameter. |
||
| list(list("a.", "@c.d 1\\", "")) | ||
| ) | ||
|
|
||
| l5 <- list(list(a = "abc")) | ||
| df5 <- createDataFrame(l5) | ||
|
|
||
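The request in the thread above (one call with and one without the optional parameter) guards against a wrapper forwarding the wrong value. Below is a minimal Python sketch of that idea. `backend_split` is a hypothetical recording stub standing in for the JVM call, not a Spark API, and `split_string` here only mirrors the shape of the R wrapper in this diff.

```python
# Hypothetical stand-in for the JVM-side function; it only records its arguments.
received = []


def backend_split(column, pattern, limit):
    # Assumption: the backend expects an integer limit, so reject anything else.
    if not isinstance(limit, int):
        raise TypeError("limit must be an int")
    received.append((column, pattern, limit))
    return (column, pattern, limit)


def split_string(column, pattern, limit=-1):
    # Mirror the wrapper in the diff: coerce limit to an integer and forward it.
    return backend_split(column, pattern, int(limit))


# One call without the optional parameter, one with it (passed as a float).
split_string("a", "\\.")
split_string("a", "\\.", 2.0)
print(received)
```

The first recorded call shows the default being forwarded, and the second shows that an explicitly passed limit reaches the backend as an integer.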
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -394,12 +394,14 @@ public void substringSQL() { | |
|
|
||
| @Test | ||
| public void split() { | ||
| assertTrue(Arrays.equals(fromString("ab,def,ghi").split(fromString(","), -1), | ||
| new UTF8String[]{fromString("ab"), fromString("def"), fromString("ghi")})); | ||
| assertTrue(Arrays.equals(fromString("ab,def,ghi").split(fromString(","), 2), | ||
| new UTF8String[]{fromString("ab"), fromString("def,ghi")})); | ||
| assertTrue(Arrays.equals(fromString("ab,def,ghi").split(fromString(","), 2), | ||
| new UTF8String[]{fromString("ab"), fromString("def,ghi")})); | ||
| UTF8String[] negativeAndZeroLimitCase = | ||
| new UTF8String[]{fromString("ab"), fromString("def"), fromString("ghi"), fromString("")}; | ||
| assertTrue(Arrays.equals(fromString("ab,def,ghi,").split(fromString(","), 0), | ||
| negativeAndZeroLimitCase)); | ||
| assertTrue(Arrays.equals(fromString("ab,def,ghi,").split(fromString(","), -1), | ||
|
Member
Why should we change the existing tests? Just add one test to check
Author
@HyukjinKwon the last two were duplicates: And I also thought it better to include the case where you do get an empty string (adding one more instance of the regex at the end). Want me to revert? My view is it's more exhaustive of the expected behavior, and also easier to see that limit = -1 should behave exactly like limit = 0.
Member
Let's fix the indentation to show less diff. |
||
| negativeAndZeroLimitCase)); | ||
| assertTrue(Arrays.equals(fromString("ab,def,ghi,").split(fromString(","), 2), | ||
| new UTF8String[]{fromString("ab"), fromString("def,ghi,")})); | ||
| } | ||
|
|
||
| @Test | ||
|
|
||
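The limit semantics exercised by the test cases above can be approximated outside Spark. The following is a Python sketch, not Spark's implementation: `split_with_limit` is a hypothetical helper built on Python's `re.split`, which differs from Java regex splitting in some respects (for example, Python includes capturing groups in the result). It mirrors the expectations in the test: a positive limit caps the array length, and a limit of 0 behaves like -1.

```python
import re


def split_with_limit(s, regex, limit=-1):
    """Approximate the limit behavior described in this PR (hypothetical helper)."""
    if limit > 0:
        if limit == 1:
            # A single-entry result means no split at all; note that
            # maxsplit=0 in re.split would mean "no limit", so handle it here.
            return [s]
        # At most `limit` entries means at most `limit - 1` splits.
        return re.split(regex, s, maxsplit=limit - 1)
    # limit <= 0: split as many times as possible, keeping trailing empty strings.
    return re.split(regex, s)


print(split_with_limit("ab,def,ghi", ",", 2))
print(split_with_limit("ab,def,ghi,", ",", 0))
print(split_with_limit("ab,def,ghi,", ",", -1))
```

With a positive limit, the last entry keeps everything after the final applied match, including any further separators.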
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1671,18 +1671,32 @@ def repeat(col, n): | |
|
|
||
| @since(1.5) | ||
| @ignore_unicode_prefix | ||
| def split(str, pattern): | ||
| def split(str, pattern, limit=-1): | ||
| """ | ||
| Splits str around pattern (pattern is a regular expression). | ||
| Splits str around matches of the given pattern. | ||
|
|
||
| .. note:: pattern is a string represent the regular expression. | ||
| :param str: a string expression to split | ||
| :param pattern: a string representing a regular expression. The regex string should be | ||
| a Java regular expression. | ||
| :param limit: an integer which controls the number of times `pattern` is applied. | ||
|
|
||
| >>> df = spark.createDataFrame([('ab12cd',)], ['s',]) | ||
| >>> df.select(split(df.s, '[0-9]+').alias('s')).collect() | ||
| [Row(s=[u'ab', u'cd'])] | ||
| * ``limit > 0``: The resulting array's length will not be more than `limit`, and the | ||
| resulting array's last entry will contain all input beyond the last | ||
| matched pattern. | ||
| * ``limit <= 0``: `pattern` will be applied as many times as possible, and the resulting | ||
| array can be of any size. | ||
|
|
||
| .. versionchanged:: 3.0 | ||
| `split` now takes an optional `limit` field. If not provided, default limit value is -1. | ||
|
|
||
| >>> df = spark.createDataFrame([('oneAtwoBthreeC',)], ['s',]) | ||
| >>> df.select(split(df.s, '[ABC]', 2).alias('s')).collect() | ||
| [Row(s=[u'one', u'twoBthreeC'])] | ||
| >>> df.select(split(df.s, '[ABC]', -1).alias('s')).collect() | ||
|
Member
Let's turn this into an example without the limit argument. |
||
| [Row(s=[u'one', u'two', u'three', u''])] | ||
| """ | ||
| sc = SparkContext._active_spark_context | ||
| return Column(sc._jvm.functions.split(_to_java_column(str), pattern)) | ||
| return Column(sc._jvm.functions.split(_to_java_column(str), pattern, limit)) | ||
|
|
||
|
|
||
| @ignore_unicode_prefix | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -157,7 +157,7 @@ case class Like(left: Expression, right: Expression) extends StringRegexExpressi | |
| arguments = """ | ||
| Arguments: | ||
| * str - a string expression | ||
| * regexp - a string expression. The pattern string should be a Java regular expression. | ||
| * regexp - a string expression. The regex string should be a Java regular expression. | ||
|
|
||
| Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL | ||
| parser. For example, to match "\abc", a regular expression for `regexp` can be | ||
|
|
@@ -229,33 +229,53 @@ case class RLike(left: Expression, right: Expression) extends StringRegexExpress | |
|
|
||
|
|
||
| /** | ||
| * Splits str around pat (pattern is a regular expression). | ||
| * Splits str around matches of the given regex. | ||
| */ | ||
| @ExpressionDescription( | ||
| usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match `regex`.", | ||
| usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences that match `regex`" + | ||
| " and returns an array with a length of at most `limit`", | ||
| arguments = """ | ||
| Arguments: | ||
| * str - a string expression to split. | ||
| * regex - a string representing a regular expression. The regex string should be a | ||
| Java regular expression. | ||
| * limit - an integer expression which controls the number of times the regex is applied. | ||
| * limit > 0: The resulting array's length will not be more than `limit`, | ||
| and the resulting array's last entry will contain all input | ||
| beyond the last matched regex. | ||
| * limit <= 0: `regex` will be applied as many times as possible, and | ||
| the resulting array can be of any size. | ||
| """, | ||
|
Member
How about this formatting? |
||
| examples = """ | ||
| Examples: | ||
| > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]'); | ||
| ["one","two","three",""] | ||
| > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', -1); | ||
|
Member
I think it is better to keep the original example for the default value. |
||
| ["one","two","three",""] | ||
| > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', 2); | ||
| ["one","twoBthreeC"] | ||
|
Member
Add the negative case? |
||
| """) | ||
| case class StringSplit(str: Expression, pattern: Expression) | ||
| extends BinaryExpression with ImplicitCastInputTypes { | ||
| case class StringSplit(str: Expression, regex: Expression, limit: Expression) | ||
| extends TernaryExpression with ImplicitCastInputTypes { | ||
|
|
||
| override def left: Expression = str | ||
| override def right: Expression = pattern | ||
| override def dataType: DataType = ArrayType(StringType) | ||
| override def inputTypes: Seq[DataType] = Seq(StringType, StringType) | ||
| override def inputTypes: Seq[DataType] = Seq(StringType, StringType, IntegerType) | ||
| override def children: Seq[Expression] = str :: regex :: limit :: Nil | ||
|
|
||
| override def nullSafeEval(string: Any, regex: Any): Any = { | ||
| val strings = string.asInstanceOf[UTF8String].split(regex.asInstanceOf[UTF8String], -1) | ||
| def this(exp: Expression, regex: Expression) = this(exp, regex, Literal(-1)); | ||
|
|
||
| override def nullSafeEval(string: Any, regex: Any, limit: Any): Any = { | ||
|
Member
I think we still need to do some check on
Author
@viirya the underlying implementation of this method is |
||
| val strings = string.asInstanceOf[UTF8String].split( | ||
| regex.asInstanceOf[UTF8String], limit.asInstanceOf[Int]) | ||
| new GenericArrayData(strings.asInstanceOf[Array[Any]]) | ||
| } | ||
|
|
||
| override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { | ||
| val arrayClass = classOf[GenericArrayData].getName | ||
| nullSafeCodeGen(ctx, ev, (str, pattern) => | ||
| nullSafeCodeGen(ctx, ev, (str, regex, limit) => { | ||
| // Array in java is covariant, so we don't need to cast UTF8String[] to Object[]. | ||
| s"""${ev.value} = new $arrayClass($str.split($pattern, -1));""") | ||
| s"""${ev.value} = new $arrayClass($str.split($regex,$limit));""".stripMargin | ||
| }) | ||
| } | ||
|
|
||
| override def prettyName: String = "split" | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -2546,15 +2546,39 @@ object functions { | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def soundex(e: Column): Column = withExpr { SoundEx(e.expr) } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| /** | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * Splits str around pattern (pattern is a regular expression). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * Splits str around matches of the given regex. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @note Pattern is a string representation of the regular expression. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @param str a string expression to split | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @param regex a string representing a regular expression. The regex string should be | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * a Java regular expression. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @group string_funcs | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @since 1.5.0 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| */ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def split(str: Column, pattern: String): Column = withExpr { | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| StringSplit(str.expr, lit(pattern).expr) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def split(str: Column, regex: String): Column = withExpr { | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Member
Shall we just keep it as
Author
The reason I changed it is that every time we mentioned just felt like unnecessary explanation needed if we called the variable
Member
Yea, I don't think we should change the name in case either makes sense in a way.
Author
by having it as Happy to revert as well of course
Member
Is this an API breaking change?
Contributor
Yes, it is for source compatibility in Scala.
Contributor
Yea, Scala is sensitive to parameter names, as the caller can do: so this is binary-compatible but not source-compatible. @HyukjinKwon can you help revert this line?
Member
Okay, but for the record, such changes have already been made, not only on the SQL side but also on the SS side if I remember correctly, because users are expected to edit their source when they compile against Spark 3.0, and it doesn't break existing compiled apps. I am not sure why this one is special, but sure, it's easy to keep the compat with a minimal change. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| StringSplit(str.expr, Literal(regex), Literal(-1)) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| /** | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * Splits str around matches of the given regex. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @param str a string expression to split | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @param regex a string representing a regular expression. The regex string should be | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * a Java regular expression. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @param limit an integer expression which controls the number of times the regex is applied. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * <ul> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * <li>limit greater than 0: The resulting array's length will not be more than limit, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * and the resulting array's last entry will contain all input beyond the last | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * matched regex.</li> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * <li>limit less than or equal to 0: `regex` will be applied as many times as | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * possible, and the resulting array can be of any size.</li> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Member
I think we don't need
Author
I was asked to do
Member
I mean we may not need ending tag
Author
ah, I'll look into that
Author
@viirya throughout this repository, the
Member
Ok. Then it's fine. Thanks for looking at it. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * </ul> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Member
Let's use the same way to make it multiple lines (see spark/sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala, lines 338 to 386 in e754887).
Member
I think you can just:
Author
oh I thought you wanted to have the explanations as sub bullets, will make that change @HyukjinKwon |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @group string_funcs | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| * @since 3.0.0 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| */ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def split(str: Column, regex: String, limit: Int): Column = withExpr { | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| StringSplit(str.expr, Literal(regex), Literal(limit)) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| /** | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
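The source-compatibility point from the review thread on renaming `pattern` to `regex` can be illustrated with keyword arguments. This is a Python analogy under stated assumptions, not Scala: `split_v1` and `split_v2` are hypothetical functions, and Python reports the mismatch at call time whereas Scala would report it at compile time.

```python
def split_v1(s, pattern):
    # Original signature: the second parameter is named `pattern`.
    return s.split(pattern)


def split_v2(s, regex):
    # Renamed signature: same behavior, but the second parameter is now `regex`.
    return s.split(regex)


def keyword_call_breaks(func):
    # Return True if calling func with the old keyword name `pattern` fails.
    try:
        func("a,b", pattern=",")
    except TypeError:
        return True
    return False


# Positional calls are unaffected by the rename.
print(split_v1("a,b", ","), split_v2("a,b", ","))
# Calls that name the parameter only work against the original signature.
print(keyword_call_breaks(split_v1), keyword_call_breaks(split_v2))
```

Positional callers keep working after the rename, while callers that name the parameter must edit their source. That matches the thread's description of the change as binary-compatible but not source-compatible.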
shall we mention this is an optional param?
going to include this in the @details section, as other functions like ltrim handle optionality of one of their params there.