@cloud-fan
Contributor

We need a new data type to represent time intervals. Because the number of days in a month is not fixed, an interval needs two values: an int for months and a long for microseconds.

The interval literal syntax looks like:
interval 3 years -4 month 4 weeks 3 second

Because TimestampType stores its value as a count of 100ns units, it may not make sense to support a nanosecond unit.
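As a rough illustration of how the literal in the description folds into the two stored values, here is a minimal Java sketch. The class `IntervalLiteralSketch`, its `parse` method, and the set of accepted units are hypothetical and are not the parser added by this PR; the sketch only assumes that year and month units accumulate into months and that every other unit accumulates into microseconds.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not the parser from this PR.
final class IntervalLiteralSketch {
    static final long MICROS_PER_SECOND = 1_000_000L;
    static final long MICROS_PER_MINUTE = 60L * MICROS_PER_SECOND;
    static final long MICROS_PER_HOUR = 60L * MICROS_PER_MINUTE;
    static final long MICROS_PER_DAY = 24L * MICROS_PER_HOUR;
    static final long MICROS_PER_WEEK = 7L * MICROS_PER_DAY;

    /** Returns {months, microseconds} for a literal such as "interval 3 years -4 month". */
    static long[] parse(String literal) {
        String[] tokens = literal.trim().toLowerCase(Locale.ROOT).split("\\s+");
        // Expect the keyword followed by one or more (number, unit) pairs.
        if (tokens.length < 3 || tokens.length % 2 == 0 || !tokens[0].equals("interval")) {
            throw new IllegalArgumentException("Not an interval literal: " + literal);
        }
        long months = 0L;
        long micros = 0L;
        for (int i = 1; i < tokens.length; i += 2) {
            long n = Long.parseLong(tokens[i]);
            String unit = tokens[i + 1];
            // Accept both singular and plural unit names.
            if (unit.endsWith("s")) {
                unit = unit.substring(0, unit.length() - 1);
            }
            switch (unit) {
                // Calendar units have no fixed length, so they go into months.
                case "year": months += 12L * n; break;
                case "month": months += n; break;
                // Fixed-length units go into microseconds.
                case "week": micros += n * MICROS_PER_WEEK; break;
                case "day": micros += n * MICROS_PER_DAY; break;
                case "hour": micros += n * MICROS_PER_HOUR; break;
                case "minute": micros += n * MICROS_PER_MINUTE; break;
                case "second": micros += n * MICROS_PER_SECOND; break;
                default:
                    throw new IllegalArgumentException("Unknown unit: " + tokens[i + 1]);
            }
        }
        return new long[] {months, micros};
    }

    public static void main(String[] args) {
        long[] r = parse("interval 3 years -4 month 4 weeks 3 second");
        System.out.println(r[0] + " months, " + r[1] + " microseconds");
    }
}
```

Under these assumptions the example literal yields 32 months (3 × 12 − 4) and 2,419,203,000,000 microseconds (4 weeks plus 3 seconds).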

@cloud-fan
Contributor Author

cc @rxin

@SparkQA

SparkQA commented Jul 5, 2015

Test build #36544 has finished for PR 7226 at commit 0502af9.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 5, 2015

Test build #36547 has finished for PR 7226 at commit 1adbae1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

let's create an abstract data type for IntervalType.

@rxin
Contributor

rxin commented Jul 6, 2015

We should add a way to cast interval into string and string to interval too. That can go in a separate pull request though.

@SparkQA

SparkQA commented Jul 6, 2015

Test build #36572 has finished for PR 7226 at commit 9129ab0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Jul 6, 2015

@cloud-fan actually upon further thought, I think it makes more sense to have a single IntervalType and internally represent it as a 12-byte value (1 long + 1 int). We can create a new class for it.
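To make the 12-byte representation concrete, here is a minimal Java sketch that packs one long and one int into a byte array and reads them back. The class `IntervalBytesSketch`, the field order (microseconds first, then months), and the default big-endian byte order of `ByteBuffer` are assumptions for illustration; the comment above only fixes the total size as 1 long + 1 int.

```java
import java.nio.ByteBuffer;

// Hypothetical layout for illustration: 8 bytes of microseconds, then 4 bytes of months.
final class IntervalBytesSketch {
    static final int SIZE = Long.BYTES + Integer.BYTES; // 8 + 4 = 12 bytes

    static byte[] encode(int months, long microseconds) {
        return ByteBuffer.allocate(SIZE)
                .putLong(microseconds)
                .putInt(months)
                .array();
    }

    static long microseconds(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong(0); // absolute read at offset 0
    }

    static int months(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt(Long.BYTES); // absolute read at offset 8
    }

    public static void main(String[] args) {
        byte[] packed = encode(32, 2_419_203_000_000L);
        System.out.println(packed.length + " bytes: "
                + months(packed) + " months, " + microseconds(packed) + " microseconds");
    }
}
```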

cloud-fan changed the title from "[SPARK-8753][SQL][WIP] Create an IntervalType data type" to "[SPARK-8753][SQL] Create an IntervalType data type" on Jul 7, 2015
@SparkQA

SparkQA commented Jul 7, 2015

Test build #36683 has finished for PR 7226 at commit 3c641a0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36684 has finished for PR 7226 at commit 296dfb6.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36685 has finished for PR 7226 at commit 504456c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36691 has finished for PR 7226 at commit b1fc20f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Interval(months: Int, microseconds: Long)

Contributor

this is not a legit comparison actually -- i'm not sure if you can sort directly on interval data types this way.

(the problem is that the seconds or days component can add up to more than a month)

Contributor Author

Can we just assume there are 30 days in a month and transform months to microseconds and compare them?

Contributor Author

I didn't make it extend AtomicType here, as I haven't figured out how to compare intervals. 30 days and 1 month may compare differently depending on context.
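The context dependence can be seen with plain calendar arithmetic. The Java sketch below uses `java.time` to count how many days "1 month" spans from different start dates, and then shows what the fixed 30-day assumption suggested earlier in the thread would conclude. The class `MonthLengthSketch` and both method names are hypothetical and are not part of this PR.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration of why months and days cannot be ordered in general.
final class MonthLengthSketch {
    static final long MICROS_PER_DAY = 24L * 60L * 60L * 1_000_000L;
    static final long MICROS_PER_30_DAY_MONTH = 30L * MICROS_PER_DAY;

    /** Number of days covered by adding one calendar month to the given start date. */
    static long daysInOneMonthFrom(LocalDate start) {
        return ChronoUnit.DAYS.between(start, start.plusMonths(1));
    }

    /** Comparison under the simplifying assumption that every month has 30 days. */
    static int compareAssuming30DayMonths(int months1, long micros1, int months2, long micros2) {
        long a = Math.addExact(Math.multiplyExact((long) months1, MICROS_PER_30_DAY_MONTH), micros1);
        long b = Math.addExact(Math.multiplyExact((long) months2, MICROS_PER_30_DAY_MONTH), micros2);
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        // One month is shorter than, equal to, or longer than 30 days depending on the start date.
        System.out.println(daysInOneMonthFrom(LocalDate.of(2015, 2, 1)));
        System.out.println(daysInOneMonthFrom(LocalDate.of(2015, 4, 1)));
        System.out.println(daysInOneMonthFrom(LocalDate.of(2015, 1, 1)));
        // The fixed 30-day assumption treats "1 month" and "30 days" as equal in every case.
        System.out.println(compareAssuming30DayMonths(1, 0L, 0, 30L * MICROS_PER_DAY));
    }
}
```

Starting from February 1, April 1, and January 1 of 2015, one month covers 28, 30, and 31 days respectively, so "1 month" is shorter than, equal to, or longer than "30 days" depending on the date it is applied to. The 30-day comparator gives a single answer regardless, which is the trade-off under discussion.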

@SparkQA

SparkQA commented Jul 8, 2015

Test build #36757 has finished for PR 7226 at commit ac348c3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Interval(months: Int, microseconds: Long) extends Serializable

Contributor

when is this called? if it is called during analysis, it'd make more sense to throw AnalysisException, since that has better error reporting in Python.

@rxin
Contributor

rxin commented Jul 8, 2015

looks good otherwise.

@SparkQA

SparkQA commented Jul 8, 2015

Test build #36787 has finished for PR 7226 at commit 43ccc80.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 8, 2015

Test build #36789 has finished for PR 7226 at commit 632062d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public final class Interval implements Serializable

@rxin
Contributor

rxin commented Jul 8, 2015

Thanks - merging this in.

asfgit closed this in 0ba98c0 on Jul 8, 2015
cloud-fan deleted the interval branch on July 9, 2015 01:16
Contributor

can you implement toString here?

@liancheng
Contributor

Should we make IntervalType an AtomicType?

@cloud-fan
Contributor Author

Hi @liancheng, I didn't make it extend AtomicType, as I haven't figured out how to compare intervals. 30 days and 1 month may compare differently depending on context.

@liancheng
Contributor

@cloud-fan Thanks for the explanation :)
