Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support converting null TTLs to default TTL relative to write timestamp #66

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tarzanek
Copy link
Contributor

No description provided.

@tarzanek tarzanek force-pushed the TTLdiff branch 2 times, most recently from 34dfc8d to 4babb71 Compare January 27, 2022 20:46
@tarzanek
Copy link
Contributor Author

tarzanek commented Mar 7, 2022

before merging
please add a disclaimer that for safety reasons such a migration where we tamper with writetimestamps and TTLs should be always done into an empty table

@tarzanek
Copy link
Contributor Author

this merge is delayed, this approach uncovered some bugs in scylla, so merge will happen once Scylla has all the patches to support this injection

@tarzanek
Copy link
Contributor Author

all officially supported branches have now fixes, we should revisit this and merge

@julienrf julienrf self-assigned this Jul 30, 2024
Copy link
Collaborator

@julienrf julienrf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be interesting to explain in more details the motivation behind this feature: why would one create a TTL in the target table if it did not exist in the source table?

Also, would it be possible to add tests for this feature? You can draw inspiration from the tests in tests/src/test/scala/com/scylladb/migrator/scylla.

Last, you need to document the defaultTTL option in the documentation. See files docs/source/migrate-from-cassandra-or-parquet.rst and docs/source/configuration.rst.

@@ -28,6 +28,10 @@ source:
# Preserve TTLs and WRITETIMEs of cells in the source database. Note that this
# option is *incompatible* when copying tables with collections (lists, maps, sets).
preserveTimestamps: true
# optional, default TTL in seconds to use if source.preserveTimestamps AND IF original TTL is null, final TTL to be set, if 0 or unset it will be ignored
# final write TTL set will be relative to writetimestamp, so: defaultTTL - now - writetimestamp , rounded to milliseconds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# final write TTL set will be relative to writetimestamp, so: defaultTTL - now - writetimestamp , rounded to milliseconds
# final write TTL set will be relative to writetimestamp, so: defaultTTL - (now - writetimestamp), rounded to milliseconds

case ((ttl, writetime), fields) =>
case ((ttl, writetime), fields) => {
val writetimestamp = writetime.getOrElse(CassandraOption.Unset)
val baseTTL = if (defaultTTL > 0) defaultTTL else 0L
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be clearer to keep the Option in the defaultTTL parameter.

So, the parameter would have type defaultTTL: Option[Long], and here you would write val baseTTL = defaultTTL.getOrElse(0).

@@ -126,7 +128,17 @@ object Cassandra {

timestampsToFields
.map {
case ((ttl, writetime), fields) =>
case ((ttl, writetime), fields) => {
val writetimestamp = writetime.getOrElse(CassandraOption.Unset)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks suspicious because writetime has type Option[Long] and CassandraOption.Unset has a type that is disjoint from Long. See my comment below for a suggestion.

Comment on lines +134 to +138
val deltaTTL = if (writetimestamp == CassandraOption.Unset) {
0
} else {
baseTTL - (System.currentTimeMillis - writetimestamp.asInstanceOf[Long] / 1000) / 1000
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend to avoid using asInstanceOf. I suggest the following changes, which I believe are more readable:

Suggested change
val deltaTTL = if (writetimestamp == CassandraOption.Unset) {
0
} else {
baseTTL - (System.currentTimeMillis - writetimestamp.asInstanceOf[Long] / 1000) / 1000
}
val deltaTTL = writetime match {
case None => 0
case Some(writetimestamp) =>
baseTTL - (System.currentTimeMillis - writetimestamp / 1000) / 1000
}

}
// expire the record in 1s in case delta is negative, use given TTL if write timestamp wasn't found
val finalttl =
if (deltaTTL > 0) deltaTTL else if (deltaTTL == 0) baseTTL else 1L
Copy link
Collaborator

@julienrf julienrf Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here use assume that deltaTTL == 0 is always the case when the write timestamp wasn’t found. But it seems it could also happen e.g. if baseTTL == 5 and the write timestamp was 5 seconds ago.

Maybe use the type Option again to model the fact that the deltaTTL might be unset. So, you would have something like:

Suggested change
if (deltaTTL > 0) deltaTTL else if (deltaTTL == 0) baseTTL else 1L
deltaTTL.map(_.max(1L)).getOrElse(baseTTL)

@julienrf julienrf assigned tarzanek and unassigned julienrf Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants