Allow fetch to be called from zipline run outside of Driver.scala so that spark is not required
#306
davidhan@Davids-MacBook-Pro: ~/zipline/chronon (davidhan/fetch_no_spark) $ java -cp $SERVICE:$CLOUD_GCP ai.chronon.online.FetcherMain fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl --conf-type=group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}' --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
16:41:37.223 [main] INFO ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] ---
16:41:37.226 [main] INFO a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
16:41:37.359 [main] INFO ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH
16:41:37.428 [main] INFO a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
16:41:37.531 [pool-22-thread-1] INFO ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5)
16:41:37.547 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
16:41:37.547 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - Window Tails:
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_3d -> Some(2023-11-29 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_14d -> Some(2023-11-18 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_30d -> Some(2023-11-02 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_3d -> Some(2023-11-29 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_14d -> Some(2023-11-18 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_30d -> Some(2023-11-02 00:00:00)
16:41:37.549 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_last10 -> Some(unbounded)
16:41:37.576 [main] INFO ai.chronon.online.FetcherMain$ - --- [FETCHED RESULT] ---
{
"purchase_price_average_14d" : 72.5,
"purchase_price_average_30d" : 250.6,
"purchase_price_average_3d" : null,
"purchase_price_count_14d" : 2,
"purchase_price_count_30d" : 5,
"purchase_price_count_3d" : null,
"purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
"purchase_price_sum_14d" : 145,
"purchase_price_sum_30d" : 1253,
"purchase_price_sum_3d" : null
}
16:41:37.576 [main] INFO ai.chronon.online.FetcherMain$ - Fetched in: 339.71775 ms
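The fetched values above can be cross-checked against each other as a quick sanity test. The sketch below assumes (the PR does not state this) that each `average` equals `sum / count` over the same window and that a window with no data reports `null` for all three aggregates. The numbers are copied from the JSON output; `check_window` is a hypothetical helper, not part of Chronon.

```python
# Values copied from the [FETCHED RESULT] JSON in the PR description.
result = {
    "purchase_price_average_14d": 72.5,
    "purchase_price_average_30d": 250.6,
    "purchase_price_average_3d": None,
    "purchase_price_count_14d": 2,
    "purchase_price_count_30d": 5,
    "purchase_price_count_3d": None,
    "purchase_price_last10": [76, 69, 367, 466, 275],
    "purchase_price_sum_14d": 145,
    "purchase_price_sum_30d": 1253,
    "purchase_price_sum_3d": None,
}


def check_window(result, window):
    """Return True if average == sum / count for the window,
    or if all three aggregates are null (assumed to mean an empty window)."""
    avg = result[f"purchase_price_average_{window}"]
    cnt = result[f"purchase_price_count_{window}"]
    total = result[f"purchase_price_sum_{window}"]
    if avg is None or cnt is None or total is None:
        return avg is None and cnt is None and total is None
    return abs(total / cnt - avg) < 1e-9


checks = {w: check_window(result, w) for w in ("3d", "14d", "30d")}
print(checks)
```

In this sample the five `purchase_price_last10` values also add up to the 30-day sum (1253), which is consistent with all five purchases falling inside that window.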
Walkthrough
This pull request introduces enhancements to Chronon's GCP integration and configuration handling across multiple files. The changes focus on improving the deployment and fetching mechanisms for services.
Actionable comments posted: 3
🧹 Nitpick comments (3)
online/src/main/scala/ai/chronon/online/FetcherMain.scala (1)
113-139: Potential long run without timeouts. The fetch loop may run indefinitely (especially when `loop` is true) with minimal breaks. Consider graceful interruption or shorter intervals for production use.
api/py/ai/chronon/repo/run.py (2)
477-484: Clarify fallback logic. Switching entrypoints between `FetcherMain` and `Driver` can be confusing. A brief comment explaining which modes use each would aid comprehension.
821-833: Rename method to be consistent. `download_service_jar` stands out from the existing `download_chronon_gcp_jar`. Use consistent naming for clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (4)
- api/py/ai/chronon/repo/run.py (6 hunks)
- distribution/build_and_upload_gcp_artifacts.sh (3 hunks)
- online/src/main/scala/ai/chronon/online/FetcherMain.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/Driver.scala (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: mutation_spark_tests
- GitHub Check: fetcher_spark_tests
- GitHub Check: table_utils_delta_format_spark_tests
- GitHub Check: other_spark_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: no_spark_scala_tests
- GitHub Check: join_spark_tests
🔇 Additional comments (11)
online/src/main/scala/ai/chronon/online/FetcherMain.scala (2)
31-70: Validate error handling. The trait defines many optional fields but doesn’t fully validate incompatible usage (e.g., missing `keyJson` alongside `keyJsonFile`). If an invalid combination is provided, the user might get a misleading error.
190-196: Check exit usage. `System.exit(0)` abruptly halts the JVM. Ensure no essential cleanup is bypassed.
api/py/ai/chronon/repo/run.py (3)
668-674: Validate GCP env variables. Ensure presence of project ID and instance ID. If missing, consider raising an error rather than silently continuing.
903-907: Parameter grouping. The `--is-gcp` addition is useful, but re-check that relevant code paths handle the new parameter consistently without duplication.
920-922: Combine jars carefully. Joining two jar paths with `:` can be OS-dependent. For Windows, use `;`. Consider a safer approach for cross-platform usage.
spark/src/main/scala/ai/chronon/spark/Driver.scala (2)
30-30: Good addition. Importing `FetcherMain` clarifies concurrency with the main fetch logic. Ensures reuse of tested code.
637-639: Delegation is neat. Using `FetcherMain.run(args)` avoids code duplication. This is a maintainability win.
distribution/build_and_upload_gcp_artifacts.sh (4)
44-44: LGTM! Service JAR build command follows existing pattern.
48-48: LGTM! Service JAR path follows existing convention.
60-64: LGTM! Service JAR existence check matches existing pattern.
81-81: Verify GCP storage permissions. Upload command looks good, but ensure service JAR upload permissions are configured.
    def impl(props: Map[String, String]): Api = {
      val urls = Array(new File(onlineJar()).toURI.toURL)
      val cl = ScalaClassLoader.fromURLs(urls, this.getClass.getClassLoader)
      val cls = cl.loadClass(onlineClass())
      val constructor = cls.getConstructors.apply(0)
      val onlineImpl = constructor.newInstance(props)
      onlineImpl.asInstanceOf[Api]
    }
Ensure reflection is safe.
Dynamically loading classes via ScalaClassLoader can introduce security risks if the jar is untrusted. Confirm that only trusted jars are used.
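One way to act on this concern from the Python launcher side, before a jar is handed to the JVM, is to compare the jar's digest against a known-good value. This is a minimal sketch, assuming a pinned SHA-256 is available for each release; `sha256_of_file`, `verify_jar_sha256`, and the idea of a pinned digest are hypothetical and not part of this PR.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large assembly jars are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_jar_sha256(path, expected_hex):
    """Return `path` if its SHA-256 matches `expected_hex`; raise ValueError otherwise.

    `expected_hex` is assumed to come from a trusted source (hypothetical here),
    such as a checksum published alongside the release artifact.
    """
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError(f"Checksum mismatch for {path}: got {actual}")
    return path
```

A caller would run `verify_jar_sha256` on the online jar before building the `java -cp ...` command, so that an unexpected jar fails fast instead of being loaded reflectively.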
    val result = Await.result(resultFuture, 5.seconds)
    val awaitTimeMs = (System.nanoTime - startNs) / 1e6d

    // treeMap to produce a sorted result
🛠️ Refactor suggestion
Avoid fixed 5-second timeout.
Large data fetches could exceed 5 seconds. Consider configurable timeouts or a retry mechanism for robust handling.
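The suggestion can be illustrated with a small timeout-plus-retry wrapper. The PR's code is Scala (`Await.result(resultFuture, 5.seconds)`); the Python sketch below only shows the general pattern, and `fetch_with_timeout` with its parameter names and defaults is hypothetical rather than anything in Chronon.

```python
import concurrent.futures
import time


def fetch_with_timeout(fetch_fn, timeout_s=5.0, retries=2, backoff_s=0.5):
    """Run fetch_fn() with a configurable timeout, retrying on timeout.

    Each attempt runs in its own worker thread. A timed-out attempt is
    abandoned, not killed: Python cannot interrupt a running thread, so
    fetch_fn should be safe to leave running in the background.
    """
    last_error = None
    for attempt in range(retries + 1):
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = executor.submit(fetch_fn)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError as err:
            last_error = err
            future.cancel()  # no effect if the call has already started
        finally:
            executor.shutdown(wait=False)
        if attempt < retries:
            # Exponential backoff between attempts.
            time.sleep(backoff_s * (2 ** attempt))
    raise TimeoutError(
        f"fetch did not complete within {timeout_s}s after {retries + 1} attempts"
    ) from last_error
```

Exposing `timeout_s` and `retries` as command-line options would let large fetches opt into a longer wait without changing the default behaviour.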
Actionable comments posted: 0
🧹 Nitpick comments (1)
online/src/main/scala/ai/chronon/online/FetcherMain.scala (1)
113-191: Fetch logic is solid; consider configurable Await timeout.
📒 Files selected for processing (3)
- api/py/ai/chronon/repo/run.py (6 hunks)
- online/src/main/scala/ai/chronon/online/FetcherMain.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/Driver.scala (2 hunks)
🧰 Additional context used
📓 Learnings (1)
spark/src/main/scala/ai/chronon/spark/Driver.scala (3)
Learnt from: chewy-zlai
PR: zipline-ai/chronon#62
File: spark/src/main/scala/ai/chronon/spark/stats/drift/SummaryUploader.scala:9-10
Timestamp: 2024-11-12T09:38:33.532Z
Learning: In Spark applications, when defining serializable classes, passing an implicit `ExecutionContext` parameter can cause serialization issues. In such cases, it's acceptable to use `scala.concurrent.ExecutionContext.Implicits.global`.
Learnt from: chewy-zlai
PR: zipline-ai/chronon#50
File: spark/src/main/scala/ai/chronon/spark/stats/drift/SummaryUploader.scala:37-40
Timestamp: 2024-11-12T09:38:33.532Z
Learning: Avoid using `Await.result` in production code; prefer handling `Future`s asynchronously when possible to prevent blocking.
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#50
File: spark/src/main/scala/ai/chronon/spark/stats/drift/SummaryUploader.scala:19-47
Timestamp: 2024-11-12T09:38:33.532Z
Learning: In Scala, the `grouped` method on collections returns an iterator, allowing for efficient batch processing without accumulating all records in memory.
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: table_utils_delta_format_spark_tests
- GitHub Check: other_spark_tests
- GitHub Check: mutation_spark_tests
- GitHub Check: join_spark_tests
- GitHub Check: no_spark_scala_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: fetcher_spark_tests
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (13)
api/py/ai/chronon/repo/run.py (7)
411-412: Looks straightforward.
477-480: Mode-based entrypoint switch is correct. Also applies to: 482-482, 484-484
668-674: Conditional GCP args insertion is neat.
821-833: GCS download logic aligns with existing pattern.
903-903: CLI flag addition looks fine.
907-907: Parameter inclusion is consistent.
920-922: Validate Jar existence before concatenation.
online/src/main/scala/ai/chronon/online/FetcherMain.scala (4)
31-101: Trait covers options cleanly; reflection approach is okay.
103-108: Subcommand integration is standard.
110-112: Thrift parse helper looks good.
193-197: Main method is straightforward.
spark/src/main/scala/ai/chronon/spark/Driver.scala (2)
28-28: Importing FetcherMain is appropriate.
630-632: Delegation to FetcherMain simplifies fetch logic.
piyush-zlai
left a comment
Clarification q on the service jar, otherwise looks good to me
        return chronon_gcp_jar_destination_path


    def download_service_jar(destination_dir: str, customer_id: str):
we don't need this for the fetch verb right? Fetch just uses the online & gcp jars
ah yes it does. so i tried to just use the online assembly jar but was missing some deps like apache commons here https://github.com/zipline-ai/chronon/blob/main/build.sbt#L423
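Since fetch needs both the service jar and the GCP jar on the classpath (as in the `java -cp $SERVICE:$CLOUD_GCP` command in the PR description), the two earlier review notes about checking that the jars exist and about the OS-dependent `:` separator can be handled in one place. A minimal sketch, assuming the jar paths are already known locally; `build_classpath` is a hypothetical helper and not the code in `run.py`.

```python
import os


def build_classpath(jar_paths, must_exist=True):
    """Join jar paths with the platform's classpath separator.

    os.pathsep is ':' on POSIX and ';' on Windows, which matches what
    `java -cp` expects on each platform.
    """
    if must_exist:
        missing = [p for p in jar_paths if not os.path.isfile(p)]
        if missing:
            raise FileNotFoundError("Missing jar(s): " + ", ".join(missing))
    return os.pathsep.join(jar_paths)
```

For example, `build_classpath([service_jar, cloud_gcp_jar])` would produce the value passed to `-cp`, failing early with a clear message if either download did not complete.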
…o that spark is not required (#306) ## Summary Tested fetch with integration script below. (Also tested via Driver.scala and also works pasting further down) ``` <<<<<.....................................FETCH.....................................>>>>> + touch tmp_fetch.out + zipline run --mode fetch --conf-type group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}' + tee /dev/tty tmp_fetch.out + grep -q purchase_price_average_14d Running with args: {'mode': 'fetch', 'conf_type': 'group_bys', 'conf': None, 'env': 'dev', 'dataproc': False, 'ds': None, 'app_name': None, 'start_ds': None, 'end_ds': None, 'parallelism': None, 'repo': '.', 'online_jar': 'cloud_gcp-assembly-0.1.0-SNAPSHOT.jar', 'online_class': 'ai.chronon.integrations.cloud_gcp.GcpApiImpl', 'version': None, 'spark_version': '2.4.0', 'spark_submit_path': None, 'spark_streaming_submit_path': None, 'online_jar_fetch': None, 'sub_help': False, 'online_args': None, 'chronon_jar': None, 'release_tag': None, 'list_apps': None, 'render_info': None, 'is_gcp': False} 18:50:29,965 |-INFO in ch.qos.logback.classic.LoggerContext[default] - This is logback-classic version 0.1.0-SNAPSHOT 18:50:29,965 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - No custom configurators were discovered as a service. 
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.joran.SerializedModelConfigurator 18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.joran.SerializedModelConfigurator 18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.scmo] 18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.scmo] 18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.joran.SerializedModelConfigurator.configure() call lasted 2 milliseconds. ExecutionStatus=INVOKE_NEXT_IF_ANY 18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.util.DefaultJoranConfigurator 18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.util.DefaultJoranConfigurator 18:50:29,973 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml] 18:50:29,975 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] 18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs multiple times on the classpath. 
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar!/logback.xml] 18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] 18:50:29,977 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@3d37203b - URL [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] is not of type file 18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [STDOUT] 18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender] 18:50:30,013 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property 18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [FILE] 18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.rolling.RollingFileAppender] 18:50:30,025 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - setting totalSizeCap to 10 GB 18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Archive files will be limited to [100 MB] each. 
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - No compression will be used 18:50:30,027 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Will use the pattern /tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log for the active file 18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - The date pattern is 'yyyy-dd-MM' from file name pattern '/tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log'. 18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Roll-over at midnight. 18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Setting initial period to 2025-01-31T02:50:30.028Z 18:50:30,028 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property 18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - Active log file name: /tmp/chronon/logs/chronon-fs.log 18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - File property is set to [/tmp/chronon/logs/chronon-fs.log] 18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCFILE] 18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender] 18:50:30,033 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [FILE] to ch.qos.logback.classic.AsyncAppender[ASYNCFILE] 18:50:30,033 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Attaching appender named [FILE] to AsyncAppender. 
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Setting discardingThreshold to 0 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCSTDOUT] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [STDOUT] to ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] 18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Attaching appender named [STDOUT] to AsyncAppender. 18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Setting discardingThreshold to 0 18:50:30,034 |-INFO in ch.qos.logback.classic.model.processor.RootLoggerModelHandler - Setting level of ROOT logger to INFO 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCSTDOUT] to Logger[ROOT] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCFILE] to Logger[ROOT] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.DefaultProcessor@14b0e127 - End of configuration. 18:50:30,035 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@10823d72 - Registering current configuration as safe fallback point 18:50:30,035 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.util.DefaultJoranConfigurator.configure() call lasted 62 milliseconds. 
ExecutionStatus=DO_NOT_INVOKE_NEXT_IF_ANY 18:50:34.810 [main] INFO ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] --- 18:50:34.812 [main] INFO a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests 18:50:34.979 [main] INFO ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH 18:50:35.048 [main] INFO a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests 18:50:35.165 [pool-22-thread-1] INFO ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5) 18:50:35.178 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00 18:50:35.178 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - Window Tails: 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_3d -> Some(2023-11-29 00:00:00) 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_14d -> Some(2023-11-18 00:00:00) 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_30d -> Some(2023-11-02 00:00:00) 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_3d -> Some(2023-11-29 00:00:00) 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_14d -> Some(2023-11-18 00:00:00) 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_30d -> Some(2023-11-02 00:00:00) 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_3d -> Some(2023-11-29 00:00:00) 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_14d -> Some(2023-11-18 00:00:00) 18:50:35.180 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_30d -> Some(2023-11-02 00:00:00) + cat tmp_fetch.out + grep 
purchase_price_average_14d
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_14d -> Some(2023-11-18 00:00:00)
+ '[' 0 -ne 0 ']'
+ echo -e '\033[0;32m<<<<<.....................................SUCCEEDED!!!.....................................>>>>>\033[0m'
<<<<<.....................................SUCCEEDED!!!.....................................>>>>>
```

Test via Driver.scala:

```
davidhan@Mac: ~/zipline/chronon (davidhan/zipline_integration_script) $ java -Dlog4j.configuration=log4j.properties -cp $CLOUD_GCP:$SPARK_HOME/jars/* ai.chronon.spark.Driver fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl --conf-type=group_bys --name quickstart/purchases.v1 -k '{"user_id":"5"}' --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-01-30 18:57:04 INFO FetcherMain$:145 - --- [START FETCHING for Map(user_id -> 5)] ---
2025-01-30 18:57:04 INFO BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO Fetcher:188 - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_BATCH
2025-01-30 18:57:04 INFO BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO Fetcher:453 - Constructing response for groupBy: quickstart.purchases.v1 for keys: Map(user_id -> 5)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:57 - Batch End: 2023-12-02 00:00:00
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:58 - Window Tails:
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_last10 -> Some(unbounded)
2025-01-30 18:57:04 INFO FetcherMain$:174 - --- [FETCHED RESULT] ---
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
2025-01-30 18:57:04 INFO FetcherMain$:176 - Fetched in: 366.143459 ms
```

## Checklist

- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Added support for Google Cloud Platform (GCP) integration in the application.
  - Introduced a new command-line tool for fetching data with flexible configuration options.
- **Improvements**
  - Enhanced configuration handling for more robust parameter processing.
  - Streamlined jar downloading and deployment processes.
  - Improved command-line argument parsing for better usability.
  - Integrated service component into the build and upload process.
- **Technical Updates**
  - Updated build and upload scripts to include service JAR artifacts.
  - Refactored fetcher functionality to centralize data retrieval logic.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
purchase_price_average_14d 18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_14d -> Some(2023-11-18 00:00:00) + '[' 0 -ne 0 ']' + echo -e '\033[0;32m<<<<<.....................................SUCCEEDED!!!.....................................>>>>>\033[0m' <<<<<.....................................SUCCEEDED!!!.....................................>>>>> ``` Test via Driver.scala: ``` davidhan@Mac: ~/zipline/chronon (davidhan/zipline_integration_script) $ java -Dlog4j.configuration=log4j.properties -cp $CLOUD_GCP:$SPARK_HOME/jars/* ai.chronon.spark.Driver fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl --conf-type=group_bys --name quickstart/purchases.v1 -k '{"user_id":"5"}' --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance. 2025-01-30 18:57:04 INFO FetcherMain$:145 - --- [START FETCHING for Map(user_id -> 5)] --- 2025-01-30 18:57:04 INFO BigTableKVStoreImpl:119 - Performing multi-get for 1 requests 2025-01-30 18:57:04 INFO Fetcher:188 - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_BATCH 2025-01-30 18:57:04 INFO BigTableKVStoreImpl:119 - Performing multi-get for 1 requests 2025-01-30 18:57:04 INFO Fetcher:453 - Constructing response for groupBy: quickstart.purchases.v1 for keys: Map(user_id -> 5) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:57 - Batch End: 2023-12-02 00:00:00 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:58 - Window Tails: 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_3d -> Some(2023-11-29 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_14d -> Some(2023-11-18 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_30d -> Some(2023-11-02 00:00:00) 2025-01-30 18:57:04 INFO 
SawtoothOnlineAggregator:60 - purchase_price_count_3d -> Some(2023-11-29 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_14d -> Some(2023-11-18 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_30d -> Some(2023-11-02 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_3d -> Some(2023-11-29 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_14d -> Some(2023-11-18 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_30d -> Some(2023-11-02 00:00:00) 2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_last10 -> Some(unbounded) 2025-01-30 18:57:04 INFO FetcherMain$:174 - --- [FETCHED RESULT] --- { "purchase_price_average_14d" : 72.5, "purchase_price_average_30d" : 250.6, "purchase_price_average_3d" : null, "purchase_price_count_14d" : 2, "purchase_price_count_30d" : 5, "purchase_price_count_3d" : null, "purchase_price_last10" : [ 76, 69, 367, 466, 275 ], "purchase_price_sum_14d" : 145, "purchase_price_sum_30d" : 1253, "purchase_price_sum_3d" : null } 2025-01-30 18:57:04 INFO FetcherMain$:176 - Fetched in: 366.143459 ms ``` ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **New Features** - Added support for Google Cloud Platform (GCP) integration in the application. - Introduced a new command-line tool for fetching data with flexible configuration options. - **Improvements** - Enhanced configuration handling for more robust parameter processing. - Streamlined jar downloading and deployment processes. - Improved command-line argument parsing for better usability. - Integrated service component into the build and upload process. 
- **Technical Updates** - Updated build and upload scripts to include service JAR artifacts. - Refactored fetcher functionality to centralize data retrieval logic. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
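
As a quick sanity check on the fetched result from the Driver.scala run, the sketch below verifies that each windowed average equals sum / count and that empty windows are null across all three aggregations. It is only an illustration, not part of this PR: the helper name `check_windows` is hypothetical, and the JSON is copied by hand from the log output above.

```python
import json
import math

# Fetched result copied from the Driver.scala run in this PR description.
FETCHED = json.loads("""
{
  "purchase_price_average_14d": 72.5,
  "purchase_price_average_30d": 250.6,
  "purchase_price_average_3d": null,
  "purchase_price_count_14d": 2,
  "purchase_price_count_30d": 5,
  "purchase_price_count_3d": null,
  "purchase_price_last10": [76, 69, 367, 466, 275],
  "purchase_price_sum_14d": 145,
  "purchase_price_sum_30d": 1253,
  "purchase_price_sum_3d": null
}
""")


def check_windows(result, windows=("3d", "14d", "30d")):
    """Return the windows whose average disagrees with sum / count.

    Hypothetical helper for illustration; not part of Chronon or zipline.
    """
    mismatched = []
    for w in windows:
        total = result[f"purchase_price_sum_{w}"]
        count = result[f"purchase_price_count_{w}"]
        average = result[f"purchase_price_average_{w}"]
        values = (total, count, average)
        if any(v is None for v in values):
            # An empty window is expected to be null for all three aggregations.
            if not all(v is None for v in values):
                mismatched.append(w)
            continue
        if not math.isclose(average, total / count):
            mismatched.append(w)
    return mismatched


if __name__ == "__main__":
    # An empty list means every window is internally consistent.
    print(check_windows(FETCHED))
```

For the result above, the 14d window gives 145 / 2 = 72.5 and the 30d window gives 1253 / 5 = 250.6, both matching the reported averages.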
…o that spark is not required (#306) ## Summary Tested fetch with integration script below. (Also tested via Driver.scala and also works pasting further down) ``` <<<<<.....................................FETCH.....................................>>>>> + touch tmp_fetch.out + zipline run --mode fetch --conf-type group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}' + tee /dev/tty tmp_fetch.out + grep -q purchase_price_average_14d Running with args: {'mode': 'fetch', 'conf_type': 'group_bys', 'conf': None, 'env': 'dev', 'dataproc': False, 'ds': None, 'app_name': None, 'start_ds': None, 'end_ds': None, 'parallelism': None, 'repo': '.', 'online_jar': 'cloud_gcp-assembly-0.1.0-SNAPSHOT.jar', 'online_class': 'ai.chronon.integrations.cloud_gcp.GcpApiImpl', 'version': None, 'spark_version': '2.4.0', 'spark_submit_path': None, 'spark_streaming_submit_path': None, 'online_jar_fetch': None, 'sub_help': False, 'online_args': None, 'chronon_jar': None, 'release_tag': None, 'list_apps': None, 'render_info': None, 'is_gcp': False} 18:50:29,965 |-INFO in ch.qos.logback.classic.LoggerContext[default] - This is logback-classic version 0.1.0-SNAPSHOT 18:50:29,965 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - No custom configurators were discovered as a service. 
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.joran.SerializedModelConfigurator 18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.joran.SerializedModelConfigurator 18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.scmo] 18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.scmo] 18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.joran.SerializedModelConfigurator.configure() call lasted 2 milliseconds. ExecutionStatus=INVOKE_NEXT_IF_ANY 18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.util.DefaultJoranConfigurator 18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.util.DefaultJoranConfigurator 18:50:29,973 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml] 18:50:29,975 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] 18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs multiple times on the classpath. 
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar!/logback.xml] 18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] 18:50:29,977 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@3d37203b - URL [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] is not of type file 18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [STDOUT] 18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender] 18:50:30,013 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property 18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [FILE] 18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.rolling.RollingFileAppender] 18:50:30,025 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - setting totalSizeCap to 10 GB 18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Archive files will be limited to [100 MB] each. 
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - No compression will be used 18:50:30,027 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Will use the pattern /tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log for the active file 18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - The date pattern is 'yyyy-dd-MM' from file name pattern '/tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log'. 18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Roll-over at midnight. 18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Setting initial period to 2025-01-31T02:50:30.028Z 18:50:30,028 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property 18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - Active log file name: /tmp/chronon/logs/chronon-fs.log 18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - File property is set to [/tmp/chronon/logs/chronon-fs.log] 18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCFILE] 18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender] 18:50:30,033 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [FILE] to ch.qos.logback.classic.AsyncAppender[ASYNCFILE] 18:50:30,033 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Attaching appender named [FILE] to AsyncAppender. 
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Setting discardingThreshold to 0 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCSTDOUT] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [STDOUT] to ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] 18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Attaching appender named [STDOUT] to AsyncAppender. 18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Setting discardingThreshold to 0 18:50:30,034 |-INFO in ch.qos.logback.classic.model.processor.RootLoggerModelHandler - Setting level of ROOT logger to INFO 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCSTDOUT] to Logger[ROOT] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCFILE] to Logger[ROOT] 18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.DefaultProcessor@14b0e127 - End of configuration. 18:50:30,035 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@10823d72 - Registering current configuration as safe fallback point 18:50:30,035 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.util.DefaultJoranConfigurator.configure() call lasted 62 milliseconds. 
ExecutionStatus=DO_NOT_INVOKE_NEXT_IF_ANY
18:50:34.810 [main] INFO ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] ---
18:50:34.812 [main] INFO a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:34.979 [main] INFO ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH
18:50:35.048 [main] INFO a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:35.165 [pool-22-thread-1] INFO ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5)
18:50:35.178 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
18:50:35.178 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - Window Tails:
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_count_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_14d -> Some(2023-11-18 00:00:00)
18:50:35.180 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_30d -> Some(2023-11-02 00:00:00)
+ cat tmp_fetch.out
+ grep purchase_price_average_14d
18:50:35.179 [pool-22-thread-1] INFO a.c.a.w.SawtoothOnlineAggregator - purchase_price_average_14d -> Some(2023-11-18 00:00:00)
+ '[' 0 -ne 0 ']'
+ echo -e '\033[0;32m<<<<<.....................................SUCCEEDED!!!.....................................>>>>>\033[0m'
<<<<<.....................................SUCCEEDED!!!.....................................>>>>>
```

Test via Driver.scala:

```
davidhan@Mac: ~/zipline/chronon (davidhan/zipline_integration_script) $ java -Dlog4j.configuration=log4j.properties -cp $CLOUD_GCP:$SPARK_HOME/jars/* ai.chronon.spark.Driver fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl --conf-type=group_bys --name quickstart/purchases.v1 -k '{"user_id":"5"}' --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-01-30 18:57:04 INFO FetcherMain$:145 - --- [START FETCHING for Map(user_id -> 5)] ---
2025-01-30 18:57:04 INFO BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO Fetcher:188 - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_BATCH
2025-01-30 18:57:04 INFO BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO Fetcher:453 - Constructing response for groupBy: quickstart.purchases.v1 for keys: Map(user_id -> 5)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:57 - Batch End: 2023-12-02 00:00:00
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:58 - Window Tails:
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_count_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_average_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO SawtoothOnlineAggregator:60 - purchase_price_last10 -> Some(unbounded)
2025-01-30 18:57:04 INFO FetcherMain$:174 - --- [FETCHED RESULT] ---
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
2025-01-30 18:57:04 INFO FetcherMain$:176 - Fetched in: 366.143459 ms
```

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Added support for Google Cloud Platform (GCP) integration in the application.
  - Introduced a new command-line tool for fetching data with flexible configuration options.
- **Improvements**
  - Enhanced configuration handling for more robust parameter processing.
  - Streamlined jar downloading and deployment processes.
  - Improved command-line argument parsing for better usability.
  - Integrated service component into the build and upload process.
- **Technical Updates**
  - Updated build and upload scripts to include service JAR artifacts.
  - Refactored fetcher functionality to centralize data retrieval logic.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
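
As a quick sanity check on the `[FETCHED RESULT]` payload from the Driver.scala run, the sketch below verifies that each window's average equals its sum divided by its count, and that a window with no data reports `null` for all three fields. The JSON is copied from the log output; the helper name `window_is_consistent` is hypothetical and is not part of Chronon or the `zipline` CLI.

```python
import json
import math

# Payload copied verbatim from the "[FETCHED RESULT]" log output.
fetched = json.loads("""
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
""")


def window_is_consistent(result, window):
    """Hypothetical helper: check average == sum / count for one window."""
    total = result[f"purchase_price_sum_{window}"]
    count = result[f"purchase_price_count_{window}"]
    average = result[f"purchase_price_average_{window}"]
    values = (total, count, average)
    if any(v is None for v in values):
        # An empty window should be null across all three aggregations.
        return all(v is None for v in values)
    return math.isclose(total / count, average)


checks = {w: window_is_consistent(fetched, w) for w in ("3d", "14d", "30d")}
print(checks)
```

For this sample, 145 / 2 = 72.5 and 1253 / 5 = 250.6, so the 14d and 30d windows agree with their reported averages, and the 3d window is null throughout.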