Missing Columns method #164

@OlivierBlanvillain

Exhaustive status of the API implemented by frameless.TypedColumn compared to Spark's Column. It is split into two parts: the methods implemented directly on Column, and the methods coming from org.apache.spark.sql.functions._

Column methods

Won't fix:

  • Column alias(String alias) inherently unsafe
  • Column apply(Object extraction) inherently unsafe
  • Column as(String alias) inherently unsafe
  • Column name(String alias) inherently unsafe
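
One plausible reading of "inherently unsafe" is that these methods identify or rename columns with plain strings, which the compiler cannot check against the schema. The sketch below illustrates that general problem in plain Scala. It is not Spark or frameless code: `Person`, `byName` and `typed` are hypothetical names invented for this illustration.

```scala
// Hypothetical miniature model; none of these names exist in Spark or frameless.
final case class Person(name: String, age: Int)

object StringlyVsTyped {
  // String-keyed access: a misspelled column name still compiles
  // and is only noticed at runtime, when the lookup returns None.
  def byName(p: Person, column: String): Option[Any] = column match {
    case "name" => Some(p.name)
    case "age"  => Some(p.age)
    case _      => None
  }

  // Typed access: the selector is an ordinary function, so a misspelled
  // field (for example `_.agee`) is rejected by the compiler.
  def typed[A](p: Person)(selector: Person => A): A = selector(p)
}
```

With `val p = Person("Ada", 36)`, the call `StringlyVsTyped.byName(p, "agee")` compiles and returns `None`, whereas `StringlyVsTyped.typed(p)(_.age)` returns `36` and the misspelled `_.agee` would not compile at all.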

TODO / done:

  • Column asc_nulls_first()
  • Column asc_nulls_last()
  • Column desc_nulls_first()
  • Column desc_nulls_last()
  • void explain(boolean extended)
  • Column eqNullSafe(Object other)
  • Column getField(String fieldName)
  • Column getItem(Object key)
  • Column isNotNull()
  • Column isNull()
  • Column like(String literal)
  • Column over()
  • Column over(WindowSpec window)
  • Column rlike(String literal)
  • Column isNaN()
  • Column substr(Column startPos, Column len) (WIP Add a typed 'substr' column method #263)
  • Column substr(int startPos, int len) (WIP Add a typed 'substr' column method #263)
  • Column mod(Object other) (WIP Added mod operator to TypedColumn.scala #296)
  • Column between(Object lowerBound, Object upperBound)
  • Column multiply(Object other)
  • Column endsWith(String literal)
  • Column isin(Object... list) (add isin to TypedColumn with restriction to primitive types #254)
  • Column startsWith(Column other)
  • Column startsWith(String literal)
  • Column otherwise(Object value)
  • Column when(Column condition, Object value)
  • Column and(Column other)
  • Column contains(Object other)
  • Column or(Column other)
  • Column bitwiseAND(Object other)
  • Column bitwiseOR(Object other)
  • Column bitwiseXOR(Object other)
  • TypedColumn<Object,U> as(Encoder evidence$1) (as cast)
  • Column asc() (as sortAscending)
  • Column cast(DataType to)
  • Column desc() (as sortDescending)
  • Column divide(Object other)
  • boolean equals(Object that) (as ===)
  • Column equalTo(Object other) (as ===)
  • org.apache.spark.sql.catalyst.expressions.Expression expr()
  • Column geq(Object other) (as >=)
  • Column gt(Object other) (as >)
  • Column leq(Object other) (as <=)
  • Column lt(Object other) (as <)
  • Column minus(Object other)
  • Column notEqual(Object other) (as =!=)
  • Column plus(Object other)
  • String toString()

org.apache.spark.sql.functions

TODO / done:

  • Column col(String colName) to be implemented using shapeless.Witness
  • Column add_months(Column startDate, int numMonths)
  • Column array(String colName, String... colNames)
  • Column asc_nulls_first(String columnName)
  • Column asc_nulls_last(String columnName)
  • Column asc(String columnName)
  • Dataset broadcast(Dataset df)
  • Column ceil(String columnName)
  • Column coalesce(Column... e)
  • Column cume_dist()
  • Column current_date()
  • Column current_timestamp()
  • Column date_add(Column start, int days)
  • Column date_format(Column dateExpr, String format)
  • Column date_sub(Column start, int days)
  • Column datediff(Column end, Column start)
  • Column dayofmonth(Column e)
  • Column dayofyear(Column e)
  • Column decode(Column value, String charset)
  • Column dense_rank()
  • Column desc_nulls_first(String columnName)
  • Column desc_nulls_last(String columnName)
  • Column desc(String columnName)
  • Column encode(Column value, String charset)
  • Column expm1(String columnName)
  • Column expr(String expr)
  • Column factorial(Column e)
  • Column first(String columnName, boolean ignoreNulls)
  • Column floor(String columnName)
  • Column format_number(Column x, int d)
  • Column format_string(String format, Column... arguments)
  • Column from_json(Column e, StructType schema, scala.collection.immutable.Map<String,String> options)
  • Column from_unixtime(Column ut, String f)
  • Column from_utc_timestamp(Column ts, String tz)
  • Column get_json_object(Column e, String path)
  • Column greatest(String columnName, String... columnNames)
  • Column grouping_id(String colName, scala.collection.Seq colNames)
  • Column grouping(String columnName)
  • Column hash(Column... cols)
  • Column hash(scala.collection.Seq cols)
  • Column hex(Column column)
  • Column hour(Column e)
  • Column initcap(Column e)
  • Column input_file_name()
  • Column isnan(Column e)
  • Column isnull(Column e)
  • Column json_tuple(Column json, String... fields)
  • Column lag(String columnName, int offset, Object defaultValue)
  • Column last_day(Column e)
  • Column last(String columnName, boolean ignoreNulls)
  • Column lead(String columnName, int offset, Object defaultValue)
  • Column least(String columnName, String... columnNames)
  • Column lit(Object literal)
  • Column locate(String substr, Column str, int pos)
  • Column map(Column... cols)
  • Column map(scala.collection.Seq cols)
  • Column md5(Column e)
  • Column mean(String columnName)
  • Column minute(Column e)
  • Column monotonicallyIncreasingId()
  • Column month(Column e)
  • Column months_between(Column date1, Column date2)
  • Column nanvl(Column col1, Column col2)
  • Column next_day(Column date, String dayOfWeek)
  • Column ntile(int n)
  • Column percent_rank()
  • Column posexplode(Column e)
  • Column quarter(Column e)
  • Column radians(String columnName)
  • Column rand()
  • Column rand(long seed)
  • Column randn()
  • Column randn(long seed)
  • Column rank()
  • Column regexp_extract(Column e, String exp, int groupIdx)
  • Column repeat(Column str, int n)
  • Column rint(String columnName)
  • Column round(Column e, int scale)
  • Column row_number()
  • Column second(Column e)
  • Column signum(String columnName)
  • Column sort_array(Column e, boolean asc)
  • Column soundex(Column e)
  • Column spark_partition_id()
  • Column split(Column str, String pattern)
  • Column struct(Column... cols)
  • Column struct(scala.collection.Seq cols)
  • Column struct(String colName, scala.collection.Seq colNames)
  • Column struct(String colName, String... colNames)
  • Column substring_index(Column str, String delim, int count)
  • Column sumDistinct(Column e)
  • Column sumDistinct(String columnName)
  • Column to_date(Column e)
  • Column to_json(Column e, Map<String,String> options)
  • Column to_utc_timestamp(Column ts, String tz)
  • Column translate(Column src, String matchingString, String replaceString)
  • Column trunc(Column date, String format)
  • Column unbase64(Column e)
  • Column unhex(Column column)
  • Column unix_timestamp()
  • Column unix_timestamp(Column s)
  • Column unix_timestamp(Column s, String p)
  • Column var_pop(String columnName)
  • Column var_samp(String columnName)
  • Column weekofyear(Column e)
  • Column when(Column condition, Object value)
  • Column window(Column timeColumn, String windowDuration)
  • Column window(Column timeColumn, String windowDuration, String slideDuration)
  • Column window(Column timeColumn, String windowDuration, String slideDuration, String startTime)
  • Column year(Column e)
  • Column conv(Column num, int fromBase, int toBase)
  • Column degrees(String columnName)
  • Column negate(Column e)
  • Column not(Column e)
  • Column hypot(String leftName, String rightName)
  • Column log(double base, String columnName)
  • Column log(String columnName)
  • Column log10(Column e)
  • Column log1p(Column e)
  • Column log2(Column expr)
  • Column pmod(Column dividend, Column divisor)
  • Column pow(String leftName, String rightName)
  • Column bround(Column e, int scale)
  • Column cbrt(String columnName)
  • Column crc32(Column e)
  • Column exp(String columnName)
  • Column sha1(Column e)
  • Column sha2(Column e, int numBits)
  • Column shiftLeft(Column e, int numBits)
  • Column shiftRight(Column e, int numBits)
  • Column shiftRightUnsigned(Column e, int numBits)
  • Column sqrt(String colName)
  • Column cos(String columnName)
  • Column cosh(String columnName)
  • Column sin(String columnName)
  • Column sinh(String columnName)
  • Column tan(String columnName)
  • Column tanh(String columnName)
  • Column approxCountDistinct(String columnName, double rsd)
  • Column avg(String columnName)
  • Column callUDF(String udfName, Column... cols)
  • Column collect_list(String columnName) (as collectList)
  • Column collect_set(String columnName) (as collectSet)
  • Column corr(String columnName1, String columnName2)
  • Column count(Column e)
  • Column countDistinct(String columnName, String... columnNames)
  • Column explode(Column e)
  • Column first(String columnName)
  • Column last(String columnName)
  • Column max(String columnName)
  • Column min(String columnName)
  • Column size(Column e)
  • Column stddev(String columnName)
  • Column sum(Column e)
  • UserDefinedFunction udf(scala.Function0 f, scala.reflect.api.TypeTags.TypeTag evidence$1)
  • <RT,A1> UserDefinedFunction udf(scala.Function1<A1,RT> f, scala.reflect.api.TypeTags.TypeTag evidence$2, scala.reflect.api.TypeTags.TypeTag evidence$3)
  • UserDefinedFunction udf(Object f, DataType dataType)
  • Column variance(String columnName)
  • Column stddev_pop(String columnName)
  • Column stddev_samp(String columnName)
  • Column covar_pop(String columnName1, String columnName2)
  • Column covar_samp(String columnName1, String columnName2)
  • Column kurtosis(String columnName)
  • Column skewness(String columnName)
  • Column abs(Column e)
  • Column acos(String columnName)
  • Column array_contains(Column column, Object value)
  • Column ascii(Column e)
  • Column asin(String columnName)
  • Column atan(String columnName)
  • Column atan2(String leftName, String rightName)
  • Column base64(Column e)
  • Column bin(String columnName)
  • Column bitwiseNOT(Column e)
  • Column concat_ws(String sep, Column... exprs)
  • Column concat(Column... exprs)
  • Column instr(Column str, String substring)
  • Column length(Column e)
  • Column levenshtein(Column l, Column r)
  • Column lower(Column e)
  • Column lpad(Column str, int len, String pad)
  • Column ltrim(Column e)
  • Column regexp_replace(Column e, String pattern, String replacement)
  • Column reverse(Column str)
  • Column rpad(Column str, int len, String pad)
  • Column rtrim(Column e)
  • Column substring(Column str, int pos, int len)
  • Column trim(Column e)
  • Column upper(Column e)
