@david-zlai david-zlai commented Jan 31, 2025

Summary

Tested fetch with the integration script below. (Also tested via Driver.scala, which works as well; that output is pasted further down.)

<<<<<.....................................FETCH.....................................>>>>>
+ touch tmp_fetch.out
+ zipline run --mode fetch --conf-type group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}'
+ tee /dev/tty tmp_fetch.out
+ grep -q purchase_price_average_14d
Running with args: {'mode': 'fetch', 'conf_type': 'group_bys', 'conf': None, 'env': 'dev', 'dataproc': False, 'ds': None, 'app_name': None, 'start_ds': None, 'end_ds': None, 'parallelism': None, 'repo': '.', 'online_jar': 'cloud_gcp-assembly-0.1.0-SNAPSHOT.jar', 'online_class': 'ai.chronon.integrations.cloud_gcp.GcpApiImpl', 'version': None, 'spark_version': '2.4.0', 'spark_submit_path': None, 'spark_streaming_submit_path': None, 'online_jar_fetch': None, 'sub_help': False, 'online_args': None, 'chronon_jar': None, 'release_tag': None, 'list_apps': None, 'render_info': None, 'is_gcp': False}
18:50:29,965 |-INFO in ch.qos.logback.classic.LoggerContext[default] - This is logback-classic version 0.1.0-SNAPSHOT
18:50:29,965 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - No custom configurators were discovered as a service.
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.scmo]
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.scmo]
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.joran.SerializedModelConfigurator.configure() call lasted 2 milliseconds. ExecutionStatus=INVOKE_NEXT_IF_ANY
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
18:50:29,975 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs multiple times on the classpath.
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,977 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@3d37203b - URL [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] is not of type file
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [STDOUT]
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
18:50:30,013 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [FILE]
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.rolling.RollingFileAppender]
18:50:30,025 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - setting totalSizeCap to 10 GB
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Archive files will be limited to [100 MB] each.
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - No compression will be used
18:50:30,027 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Will use the pattern /tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log for the active file
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - The date pattern is 'yyyy-dd-MM' from file name pattern '/tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log'.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Roll-over at midnight.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Setting initial period to 2025-01-31T02:50:30.028Z
18:50:30,028 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - Active log file name: /tmp/chronon/logs/chronon-fs.log
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - File property is set to [/tmp/chronon/logs/chronon-fs.log]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCFILE]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,033 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [FILE] to ch.qos.logback.classic.AsyncAppender[ASYNCFILE]
18:50:30,033 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Attaching appender named [FILE] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [STDOUT] to ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Attaching appender named [STDOUT] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.classic.model.processor.RootLoggerModelHandler - Setting level of ROOT logger to INFO
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCSTDOUT] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCFILE] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.DefaultProcessor@14b0e127 - End of configuration.
18:50:30,035 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@10823d72 - Registering current configuration as safe fallback point
18:50:30,035 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.util.DefaultJoranConfigurator.configure() call lasted 62 milliseconds. ExecutionStatus=DO_NOT_INVOKE_NEXT_IF_ANY

18:50:34.810 [main] INFO  ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] ---
18:50:34.812 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:34.979 [main] INFO  ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH
18:50:35.048 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:35.165 [pool-22-thread-1] INFO  ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5)
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Window Tails: 
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
18:50:35.180 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
+ cat tmp_fetch.out
+ grep purchase_price_average_14d
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
+ '[' 0 -ne 0 ']'
+ echo -e '\033[0;32m<<<<<.....................................SUCCEEDED!!!.....................................>>>>>\033[0m'
<<<<<.....................................SUCCEEDED!!!.....................................>>>>>
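For reviewers skimming the log: the logged window tails look consistent with "batch end minus window length". Below is a minimal Python sketch of that check, using a few lines copied from the log above. The helper names (`parse_window_tails`, `window_days`) are hypothetical and not part of the PR, and the relationship is only verified against this sample output.

```python
import re
from datetime import datetime, timedelta

# Sample lines copied from the fetch log above (timestamp/logger prefix trimmed).
LOG = """\
Batch End: 2023-12-02 00:00:00
  purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
  purchase_price_average_14d -> Some(2023-11-18 00:00:00)
  purchase_price_count_30d -> Some(2023-11-02 00:00:00)
"""

FMT = "%Y-%m-%d %H:%M:%S"


def parse_window_tails(log: str):
    """Return (batch_end, {feature_name: tail}) parsed from the log text."""
    batch_end = datetime.strptime(
        re.search(r"Batch End: (.+)", log).group(1).strip(), FMT
    )
    # Only timestamped tails are matched; "Some(unbounded)" entries are skipped.
    tails = {
        name: datetime.strptime(ts, FMT)
        for name, ts in re.findall(r"(\w+) -> Some\((\d{4}-[^)]*)\)", log)
    }
    return batch_end, tails


def window_days(feature: str) -> int:
    """Read the trailing '<N>d' suffix of a feature name, e.g. '_14d' -> 14."""
    return int(re.search(r"_(\d+)d$", feature).group(1))


batch_end, tails = parse_window_tails(LOG)
for feature, tail in tails.items():
    # In this sample, each tail equals the batch end minus the window length.
    assert tail == batch_end - timedelta(days=window_days(feature)), feature

# The integration script only greps for this feature name.
assert "purchase_price_average_14d" in tails
```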

Test via Driver.scala:

davidhan@Mac: ~/zipline/chronon (davidhan/zipline_integration_script) $ java -Dlog4j.configuration=log4j.properties -cp $CLOUD_GCP:$SPARK_HOME/jars/* ai.chronon.spark.Driver fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl  --conf-type=group_bys --name quickstart/purchases.v1 -k '{"user_id":"5"}'  --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-01-30 18:57:04 INFO  FetcherMain$:145 - --- [START FETCHING for Map(user_id -> 5)] ---
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:188 - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_BATCH
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:453 - Constructing response for groupBy: quickstart.purchases.v1 for keys: Map(user_id -> 5)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:57 - Batch End: 2023-12-02 00:00:00
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:58 - Window Tails: 
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_last10 -> Some(unbounded)
2025-01-30 18:57:04 INFO  FetcherMain$:174 - --- [FETCHED RESULT] ---
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
2025-01-30 18:57:04 INFO  FetcherMain$:176 - Fetched in: 366.143459 ms
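As a quick sanity check on the fetched values, the averages can be recomputed from the sums and counts. This is a small Python sketch over the JSON pasted above; `check_windows` is a hypothetical helper, and the assumption that a window with a null count should be null across all three aggregations is inferred from this one result only.

```python
import json
import math

# Fetched result copied from the output above.
RESULT = json.loads("""
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
""")


def check_windows(result, prefix="purchase_price", windows=("3d", "14d", "30d")):
    """Check average == sum / count per window; return the windows that had data."""
    checked = []
    for w in windows:
        avg, cnt, total = (
            result[f"{prefix}_{agg}_{w}"] for agg in ("average", "count", "sum")
        )
        if cnt is None:
            # Assumption: an empty window is null for every aggregation.
            assert avg is None and total is None, w
            continue
        assert math.isclose(avg, total / cnt), w
        checked.append(w)
    return checked


print(check_windows(RESULT))
```

For this sample, 145 / 2 = 72.5 and 1253 / 5 = 250.6, so both non-empty windows agree with the reported averages.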

Checklist

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested
  • Documentation update

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for Google Cloud Platform (GCP) integration in the application.
    • Introduced a new command-line tool for fetching data with flexible configuration options.
  • Improvements

    • Enhanced configuration handling for more robust parameter processing.
    • Streamlined jar downloading and deployment processes.
    • Improved command-line argument parsing for better usability.
    • Integrated service component into the build and upload process.
  • Technical Updates

    • Updated build and upload scripts to include service JAR artifacts.
    • Refactored fetcher functionality to centralize data retrieval logic.

davidhan@Davids-MacBook-Pro: ~/zipline/chronon (davidhan/fetch_no_spark) $ java -cp $SERVICE:$CLOUD_GCP ai.chronon.online.FetcherMain fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl  --conf-type=group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}' --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
16:41:37.223 [main] INFO  ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] ---
16:41:37.226 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
16:41:37.359 [main] INFO  ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH
16:41:37.428 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
16:41:37.531 [pool-22-thread-1] INFO  ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5)
16:41:37.547 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
16:41:37.547 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Window Tails:
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
16:41:37.548 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
16:41:37.549 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_last10 -> Some(unbounded)
16:41:37.576 [main] INFO  ai.chronon.online.FetcherMain$ - --- [FETCHED RESULT] ---
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
16:41:37.576 [main] INFO  ai.chronon.online.FetcherMain$ - Fetched in: 339.71775 ms

coderabbitai bot commented Jan 31, 2025

Walkthrough

This pull request introduces enhancements to Chronon's GCP integration and configuration handling across multiple files. The changes focus on improving the deployment and fetching mechanisms for services, with key modifications in run.py, build_and_upload_gcp_artifacts.sh, FetcherMain.scala, and Driver.scala. The updates streamline GCP-related operations, introduce more flexible configuration management, and provide a more robust command-line interface for feature fetching.

Changes

File: Change Summary

api/py/ai/chronon/repo/run.py
  • Added is_gcp attribute
  • Modified conf_type handling
  • Enhanced GCP-specific command argument processing
  • Added download_service_jar method
  • Updated main function to include service_jar_path

distribution/build_and_upload_gcp_artifacts.sh
  • Added SERVICE_JAR variable
  • Integrated service JAR build and upload process

online/src/main/scala/ai/chronon/online/FetcherMain.scala
  • New command-line application for data fetching
  • Implemented flexible argument parsing
  • Added support for GCP configuration

spark/src/main/scala/ai/chronon/spark/Driver.scala
  • Refactored FetcherCli to use FetcherMain.FetcherArgs
  • Simplified argument handling

Suggested Reviewers

  • piyush-zlai
  • nikhil-zlai

Poem

🚀 In clouds of code, we dance and weave,
GCP's embrace, our systems now achieve.
Jars download, configs align with grace,
Chronon's magic leaves no empty space!
Fetching data with a programmer's might 🌟

Warning

Review ran into problems

🔥 Problems

GitHub Actions: Resource not accessible by integration - https://docs.github.com/rest/actions/workflow-runs#list-workflow-runs-for-a-repository.

Please grant the required permissions to the CodeRabbit GitHub App under the organization or repository settings.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 1d2a026 and d9262ea.

📒 Files selected for processing (1)
  • api/py/ai/chronon/repo/run.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/py/ai/chronon/repo/run.py
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: table_utils_delta_format_spark_tests
  • GitHub Check: other_spark_tests
  • GitHub Check: mutation_spark_tests
  • GitHub Check: fetcher_spark_tests
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: join_spark_tests
  • GitHub Check: enforce_triggered_workflows

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
online/src/main/scala/ai/chronon/online/FetcherMain.scala (1)

113-139: Potential long run without timeouts.

The fetch loop may run indefinitely (especially when loop is true) with minimal breaks. Consider graceful interruption or shorter intervals for production use.
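One way to picture the "graceful interruption" suggested here is a loop driven by a stop signal and an optional iteration cap. The following is a Python illustration only, not FetcherMain's Scala code; `fetch_loop`, `fetch_once`, and `max_iterations` are hypothetical names.

```python
import threading


def fetch_loop(fetch_once, interval_s, stop, max_iterations=None):
    """Call fetch_once repeatedly until `stop` is set or max_iterations is reached.

    Returns the number of completed iterations.
    """
    done = 0
    while not stop.is_set() and (max_iterations is None or done < max_iterations):
        fetch_once()
        done += 1
        # Event.wait acts as an interruptible sleep: it returns early once
        # another thread (or a signal handler) calls stop.set().
        stop.wait(interval_s)
    return done


stop = threading.Event()
calls = []
completed = fetch_loop(lambda: calls.append("fetched"), 0.01, stop, max_iterations=3)
print(completed, len(calls))
```

Using an event instead of a plain sleep means a shutdown request does not have to wait out the full interval.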

api/py/ai/chronon/repo/run.py (2)

477-484: Clarify fallback logic.

Switching entrypoints between FetcherMain and Driver can be confusing. A brief comment explaining which modes use each would aid comprehension.
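A short sketch of what such a documented switch could look like. The two class names are taken from the commands in the PR description; the helper `entrypoint_for_mode` and the assumption that only `fetch` runs without Spark are illustrative, not run.py's actual code.

```python
# Class names as they appear in the PR's test commands.
FETCHER_MAIN = "ai.chronon.online.FetcherMain"
SPARK_DRIVER = "ai.chronon.spark.Driver"

# Assumption: only the "fetch" mode can run without Spark on the classpath.
SPARK_FREE_MODES = {"fetch"}


def entrypoint_for_mode(mode: str) -> str:
    """Pick the JVM main class for a run mode (hypothetical helper)."""
    # Spark-free modes go through FetcherMain; everything else falls back
    # to the Spark Driver.
    return FETCHER_MAIN if mode in SPARK_FREE_MODES else SPARK_DRIVER


print(entrypoint_for_mode("fetch"))
```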


821-833: Rename method to be consistent.

download_service_jar stands out from the existing download_chronon_gcp_jar. Use consistent naming for clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 2658a80 and 5866a0a.

📒 Files selected for processing (4)
  • api/py/ai/chronon/repo/run.py (6 hunks)
  • distribution/build_and_upload_gcp_artifacts.sh (3 hunks)
  • online/src/main/scala/ai/chronon/online/FetcherMain.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Driver.scala (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: mutation_spark_tests
  • GitHub Check: fetcher_spark_tests
  • GitHub Check: table_utils_delta_format_spark_tests
  • GitHub Check: other_spark_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: join_spark_tests
🔇 Additional comments (11)
online/src/main/scala/ai/chronon/online/FetcherMain.scala (2)

31-70: Validate error handling.

The trait defines many optional fields but doesn’t fully validate incompatible usage (e.g., missing keyJson alongside keyJsonFile). If an invalid combination is provided, the user might get a misleading error.
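To make the concern concrete, here is a Python sketch of an up-front check that fails with a clear message. It assumes exactly one of the two key sources should be supplied, which the PR does not state; `resolve_keys` and its parameter names are hypothetical stand-ins for the Scala options.

```python
import json
from typing import Optional


def resolve_keys(key_json: Optional[str], key_json_file: Optional[str]) -> dict:
    """Require exactly one key source and return the parsed key map."""
    # Both set or both missing are rejected with the same explicit error.
    if (key_json is None) == (key_json_file is None):
        raise ValueError("Provide exactly one of key_json or key_json_file")
    if key_json is not None:
        return json.loads(key_json)
    with open(key_json_file) as f:
        return json.load(f)


print(resolve_keys('{"user_id":"5"}', None))
```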


190-196: Check exit usage.

System.exit(0) abruptly halts the JVM. Ensure no essential cleanup is bypassed.

api/py/ai/chronon/repo/run.py (3)

668-674: Validate GCP env variables.

Ensure presence of project ID and instance ID. If missing, consider raising an error rather than silently continuing.
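A minimal sketch of failing fast on missing GCP settings. The three flags are the ones used in the PR's test commands; the environment variable names `GCP_PROJECT_ID` and `GCP_BIGTABLE_INSTANCE_ID` and the helper `gcp_args` are assumptions for illustration.

```python
from typing import List, Mapping

# Assumed variable names; the PR text does not spell them out.
PROJECT_VAR = "GCP_PROJECT_ID"
INSTANCE_VAR = "GCP_BIGTABLE_INSTANCE_ID"


def gcp_args(env: Mapping[str, str]) -> List[str]:
    """Build the GCP flags, raising if a required setting is missing or empty."""
    missing = [name for name in (PROJECT_VAR, INSTANCE_VAR) if not env.get(name)]
    if missing:
        raise ValueError("Missing required GCP settings: " + ", ".join(missing))
    return [
        "--is-gcp",
        f"--gcp-project-id={env[PROJECT_VAR]}",
        f"--gcp-bigtable-instance-id={env[INSTANCE_VAR]}",
    ]


print(gcp_args({PROJECT_VAR: "canary-443022", INSTANCE_VAR: "zipline-canary-instance"}))
```

Passing the environment in as a mapping (for example `os.environ`) keeps the check easy to test.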


903-907: Parameter grouping.

The --is-gcp addition is useful, but re-check that relevant code paths handle the new parameter consistently without duplication.


920-922: Combine jars carefully.

Joining two jar paths with : can be OS-dependent. For Windows, use ;. Consider a safer approach for cross-platform usage.
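In Python, the standard library already exposes the platform's path-list separator as `os.pathsep` (`:` on POSIX, `;` on Windows), so a cross-platform join could be sketched as below. `build_classpath` is a hypothetical helper, not the function in run.py.

```python
import os


def build_classpath(*jars: str) -> str:
    """Join jar paths with the platform's classpath separator."""
    return os.pathsep.join(jars)


print(build_classpath("service-0.1.0-SNAPSHOT.jar", "cloud_gcp-assembly-0.1.0-SNAPSHOT.jar"))
```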

spark/src/main/scala/ai/chronon/spark/Driver.scala (2)

30-30: Good addition.

Importing FetcherMain clarifies concurrency with the main fetch logic. Ensures reuse of tested code.


637-639: Delegation is neat.

Using FetcherMain.run(args) avoids code duplication. This is a maintainability win.

distribution/build_and_upload_gcp_artifacts.sh (4)

44-44: LGTM!

Service JAR build command follows existing pattern.


48-48: LGTM!

Service JAR path follows existing convention.


60-64: LGTM!

Service JAR existence check matches existing pattern.


81-81: Verify GCP storage permissions.

Upload command looks good, but ensure service JAR upload permissions are configured.

Comment on lines +81 to +88
def impl(props: Map[String, String]): Api = {
  val urls = Array(new File(onlineJar()).toURI.toURL)
  val cl = ScalaClassLoader.fromURLs(urls, this.getClass.getClassLoader)
  val cls = cl.loadClass(onlineClass())
  val constructor = cls.getConstructors.apply(0)
  val onlineImpl = constructor.newInstance(props)
  onlineImpl.asInstanceOf[Api]
}

⚠️ Potential issue

Ensure reflection is safe.

Dynamically loading classes via ScalaClassLoader can introduce security risks if the jar is untrusted. Confirm that only trusted jars are used.

Comment on lines +161 to +164
val result = Await.result(resultFuture, 5.seconds)
val awaitTimeMs = (System.nanoTime - startNs) / 1e6d

// treeMap to produce a sorted result

🛠️ Refactor suggestion

Avoid fixed 5-second timeout.

Large data fetches could exceed 5 seconds. Consider configurable timeouts or a retry mechanism for robust handling.
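The shape of a configurable timeout with a simple retry can be sketched as follows. This is a Python analogue for discussion, not the Scala `Await.result` call under review; `fetch_with_timeout`, `timeout_s`, and `retries` are hypothetical names, with the 5-second default mirroring the current hard-coded value.

```python
import concurrent.futures as cf


def fetch_with_timeout(fetch, timeout_s=5.0, retries=0):
    """Run `fetch` in a worker thread, waiting up to timeout_s per attempt."""
    last_error = None
    for _ in range(retries + 1):
        pool = cf.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fetch)
        try:
            return future.result(timeout=timeout_s)
        except cf.TimeoutError as err:
            # The timed-out call is not cancelled; it keeps running in its
            # worker thread while the next attempt starts.
            last_error = err
        finally:
            pool.shutdown(wait=False)
    raise last_error


print(fetch_with_timeout(lambda: {"purchase_price_count_14d": 2}, timeout_s=1.0))
```

Exposing the timeout and retry count as parameters lets callers with larger fetches raise the limit without a code change.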

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
online/src/main/scala/ai/chronon/online/FetcherMain.scala (1)

113-191: Fetch logic is solid; consider configurable Await timeout.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 2ce379f and 1d2a026.

📒 Files selected for processing (3)
  • api/py/ai/chronon/repo/run.py (6 hunks)
  • online/src/main/scala/ai/chronon/online/FetcherMain.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Driver.scala (2 hunks)
🧰 Additional context used
📓 Learnings (1)
spark/src/main/scala/ai/chronon/spark/Driver.scala (3)
Learnt from: chewy-zlai
PR: zipline-ai/chronon#62
File: spark/src/main/scala/ai/chronon/spark/stats/drift/SummaryUploader.scala:9-10
Timestamp: 2024-11-12T09:38:33.532Z
Learning: In Spark applications, when defining serializable classes, passing an implicit `ExecutionContext` parameter can cause serialization issues. In such cases, it's acceptable to use `scala.concurrent.ExecutionContext.Implicits.global`.
Learnt from: chewy-zlai
PR: zipline-ai/chronon#50
File: spark/src/main/scala/ai/chronon/spark/stats/drift/SummaryUploader.scala:37-40
Timestamp: 2024-11-12T09:38:33.532Z
Learning: Avoid using `Await.result` in production code; prefer handling `Future`s asynchronously when possible to prevent blocking.
Learnt from: nikhil-zlai
PR: zipline-ai/chronon#50
File: spark/src/main/scala/ai/chronon/spark/stats/drift/SummaryUploader.scala:19-47
Timestamp: 2024-11-12T09:38:33.532Z
Learning: In Scala, the `grouped` method on collections returns an iterator, allowing for efficient batch processing without accumulating all records in memory.
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: table_utils_delta_format_spark_tests
  • GitHub Check: other_spark_tests
  • GitHub Check: mutation_spark_tests
  • GitHub Check: join_spark_tests
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: fetcher_spark_tests
  • GitHub Check: enforce_triggered_workflows
🔇 Additional comments (13)
api/py/ai/chronon/repo/run.py (7)

411-412: Looks straightforward.


477-480: Mode-based entrypoint switch is correct.

Also applies to: 482-482, 484-484


668-674: Conditional GCP args insertion is neat.


821-833: GCS download logic aligns with existing pattern.


903-903: CLI flag addition looks fine.


907-907: Parameter inclusion is consistent.


920-922: Validate Jar existence before concatenation.

online/src/main/scala/ai/chronon/online/FetcherMain.scala (4)

31-101: Trait covers options cleanly; reflection approach is okay.


103-108: Subcommand integration is standard.


110-112: Thrift parse helper looks good.


193-197: Main method is straightforward.

spark/src/main/scala/ai/chronon/spark/Driver.scala (2)

28-28: Importing FetcherMain is appropriate.


630-632: Delegation to FetcherMain simplifies fetch logic.

@david-zlai david-zlai requested a review from tchow-zlai January 31, 2025 03:52

@piyush-zlai piyush-zlai left a comment

Clarification q on the service jar, otherwise looks good to me

return chronon_gcp_jar_destination_path


def download_service_jar(destination_dir: str, customer_id: str):

we don't need this for the fetch verb right? Fetch just uses the online & gcp jars

Contributor Author

Ah, yes it does. I tried using just the online assembly jar, but it was missing some deps like Apache Commons (see https://github.com/zipline-ai/chronon/blob/main/build.sbt#L423).

@david-zlai david-zlai merged commit e9f3dae into main Jan 31, 2025
12 checks passed
@david-zlai david-zlai deleted the davidhan/fetch_no_spark branch January 31, 2025 21:44
nikhil-zlai pushed a commit that referenced this pull request Feb 4, 2025
…o that spark is not required (#306)

## Summary

Tested fetch with the integration script below. (Also tested via
Driver.scala, which works as well; output pasted further down.)

```
<<<<<.....................................FETCH.....................................>>>>>
+ touch tmp_fetch.out
+ zipline run --mode fetch --conf-type group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}'
+ tee /dev/tty tmp_fetch.out
+ grep -q purchase_price_average_14d
Running with args: {'mode': 'fetch', 'conf_type': 'group_bys', 'conf': None, 'env': 'dev', 'dataproc': False, 'ds': None, 'app_name': None, 'start_ds': None, 'end_ds': None, 'parallelism': None, 'repo': '.', 'online_jar': 'cloud_gcp-assembly-0.1.0-SNAPSHOT.jar', 'online_class': 'ai.chronon.integrations.cloud_gcp.GcpApiImpl', 'version': None, 'spark_version': '2.4.0', 'spark_submit_path': None, 'spark_streaming_submit_path': None, 'online_jar_fetch': None, 'sub_help': False, 'online_args': None, 'chronon_jar': None, 'release_tag': None, 'list_apps': None, 'render_info': None, 'is_gcp': False}
18:50:29,965 |-INFO in ch.qos.logback.classic.LoggerContext[default] - This is logback-classic version 0.1.0-SNAPSHOT
18:50:29,965 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - No custom configurators were discovered as a service.
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.scmo]
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.scmo]
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.joran.SerializedModelConfigurator.configure() call lasted 2 milliseconds. ExecutionStatus=INVOKE_NEXT_IF_ANY
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
18:50:29,975 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs multiple times on the classpath.
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,977 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@3d37203b - URL [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] is not of type file
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [STDOUT]
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
18:50:30,013 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [FILE]
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.rolling.RollingFileAppender]
18:50:30,025 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - setting totalSizeCap to 10 GB
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Archive files will be limited to [100 MB] each.
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - No compression will be used
18:50:30,027 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Will use the pattern /tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log for the active file
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - The date pattern is 'yyyy-dd-MM' from file name pattern '/tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log'.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Roll-over at midnight.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Setting initial period to 2025-01-31T02:50:30.028Z
18:50:30,028 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - Active log file name: /tmp/chronon/logs/chronon-fs.log
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - File property is set to [/tmp/chronon/logs/chronon-fs.log]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCFILE]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,033 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [FILE] to ch.qos.logback.classic.AsyncAppender[ASYNCFILE]
18:50:30,033 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Attaching appender named [FILE] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [STDOUT] to ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Attaching appender named [STDOUT] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.classic.model.processor.RootLoggerModelHandler - Setting level of ROOT logger to INFO
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCSTDOUT] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCFILE] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.DefaultProcessor@14b0e127 - End of configuration.
18:50:30,035 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@10823d72 - Registering current configuration as safe fallback point
18:50:30,035 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.util.DefaultJoranConfigurator.configure() call lasted 62 milliseconds. ExecutionStatus=DO_NOT_INVOKE_NEXT_IF_ANY

18:50:34.810 [main] INFO  ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] ---
18:50:34.812 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:34.979 [main] INFO  ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH
18:50:35.048 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:35.165 [pool-22-thread-1] INFO  ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5)
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Window Tails: 
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
18:50:35.180 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
+ cat tmp_fetch.out
+ grep purchase_price_average_14d
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
+ '[' 0 -ne 0 ']'
+ echo -e '\033[0;32m<<<<<.....................................SUCCEEDED!!!.....................................>>>>>\033[0m'
<<<<<.....................................SUCCEEDED!!!.....................................>>>>>

```
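
The "Window Tails" lines in the log above look consistent with a simple rule: each tail is the batch end minus the window length. The sketch below reproduces the three logged dates under that assumption; whether `SawtoothOnlineAggregator` computes them exactly this way is not confirmed by this PR, and `window_tail` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def window_tail(batch_end: datetime, days: int) -> datetime:
    """Assumed rule: the tail is the batch end shifted back by the window."""
    return batch_end - timedelta(days=days)


# "Batch End: 2023-12-02 00:00:00" from the log output
batch_end = datetime(2023, 12, 2)

for days in (3, 14, 30):
    tail = window_tail(batch_end, days)
    print(f"{days}d -> {tail:%Y-%m-%d %H:%M:%S}")
# 3d -> 2023-11-29 00:00:00
# 14d -> 2023-11-18 00:00:00
# 30d -> 2023-11-02 00:00:00
```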


Test via Driver.scala:

```
davidhan@Mac: ~/zipline/chronon (davidhan/zipline_integration_script) $ java -Dlog4j.configuration=log4j.properties -cp $CLOUD_GCP:$SPARK_HOME/jars/* ai.chronon.spark.Driver fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl  --conf-type=group_bys --name quickstart/purchases.v1 -k '{"user_id":"5"}'  --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-01-30 18:57:04 INFO  FetcherMain$:145 - --- [START FETCHING for Map(user_id -> 5)] ---
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:188 - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_BATCH
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:453 - Constructing response for groupBy: quickstart.purchases.v1 for keys: Map(user_id -> 5)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:57 - Batch End: 2023-12-02 00:00:00
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:58 - Window Tails: 
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_last10 -> Some(unbounded)
2025-01-30 18:57:04 INFO  FetcherMain$:174 - --- [FETCHED RESULT] ---
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
2025-01-30 18:57:04 INFO  FetcherMain$:176 - Fetched in: 366.143459 ms

```
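
As a quick sanity check on the fetched result above, the averages should equal sum divided by count for each window, and a window with no data should be null throughout. The sketch below copies the values from the JSON output; the helper name `average_matches` is hypothetical and not part of Chronon.

```python
import math

# Values copied from the [FETCHED RESULT] block above (null -> None)
fetched = {
    "purchase_price_average_14d": 72.5,
    "purchase_price_average_30d": 250.6,
    "purchase_price_average_3d": None,
    "purchase_price_count_14d": 2,
    "purchase_price_count_30d": 5,
    "purchase_price_count_3d": None,
    "purchase_price_last10": [76, 69, 367, 466, 275],
    "purchase_price_sum_14d": 145,
    "purchase_price_sum_30d": 1253,
    "purchase_price_sum_3d": None,
}


def average_matches(result: dict, window: str) -> bool:
    """Check that average == sum / count, or that all three are null."""
    total = result[f"purchase_price_sum_{window}"]
    count = result[f"purchase_price_count_{window}"]
    average = result[f"purchase_price_average_{window}"]
    if total is None or count is None:
        return total is None and count is None and average is None
    return math.isclose(average, total / count)


for window in ("3d", "14d", "30d"):
    print(window, average_matches(fetched, window))
# 3d True
# 14d True
# 30d True
```

In this sample the five values in `purchase_price_last10` also add up to 1253, the 30-day sum. That is an observation about this particular key, not something the feature definitions guarantee.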
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Added support for Google Cloud Platform (GCP) integration in the application.
  - Introduced a new command-line tool for fetching data with flexible configuration options.

- **Improvements**
  - Enhanced configuration handling for more robust parameter processing.
  - Streamlined jar downloading and deployment processes.
  - Improved command-line argument parsing for better usability.
  - Integrated service component into the build and upload process.

- **Technical Updates**
  - Updated build and upload scripts to include service JAR artifacts.
  - Refactored fetcher functionality to centralize data retrieval logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
kumar-zlai pushed a commit that referenced this pull request Apr 25, 2025
…o that spark is not required (#306)

kumar-zlai pushed a commit that referenced this pull request Apr 29, 2025
…o that spark is not required (#306)

## Summary

Tested fetch with integration script below. (Also tested via
Driver.scala and also works pasting further down)

```
<<<<<.....................................FETCH.....................................>>>>>
+ touch tmp_fetch.out
+ zipline run --mode fetch --conf-type group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}'
+ tee /dev/tty tmp_fetch.out
+ grep -q purchase_price_average_14d
Running with args: {'mode': 'fetch', 'conf_type': 'group_bys', 'conf': None, 'env': 'dev', 'dataproc': False, 'ds': None, 'app_name': None, 'start_ds': None, 'end_ds': None, 'parallelism': None, 'repo': '.', 'online_jar': 'cloud_gcp-assembly-0.1.0-SNAPSHOT.jar', 'online_class': 'ai.chronon.integrations.cloud_gcp.GcpApiImpl', 'version': None, 'spark_version': '2.4.0', 'spark_submit_path': None, 'spark_streaming_submit_path': None, 'online_jar_fetch': None, 'sub_help': False, 'online_args': None, 'chronon_jar': None, 'release_tag': None, 'list_apps': None, 'render_info': None, 'is_gcp': False}
18:50:29,965 |-INFO in ch.qos.logback.classic.LoggerContext[default] - This is logback-classic version 0.1.0-SNAPSHOT
18:50:29,965 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - No custom configurators were discovered as a service.
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.scmo]
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.scmo]
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.joran.SerializedModelConfigurator.configure() call lasted 2 milliseconds. ExecutionStatus=INVOKE_NEXT_IF_ANY
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
18:50:29,975 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs multiple times on the classpath.
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,977 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@3d37203b - URL [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] is not of type file
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [STDOUT]
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
18:50:30,013 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [FILE]
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.rolling.RollingFileAppender]
18:50:30,025 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - setting totalSizeCap to 10 GB
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Archive files will be limited to [100 MB] each.
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - No compression will be used
18:50:30,027 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Will use the pattern /tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log for the active file
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - The date pattern is 'yyyy-dd-MM' from file name pattern '/tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log'.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Roll-over at midnight.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Setting initial period to 2025-01-31T02:50:30.028Z
18:50:30,028 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - Active log file name: /tmp/chronon/logs/chronon-fs.log
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - File property is set to [/tmp/chronon/logs/chronon-fs.log]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCFILE]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,033 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [FILE] to ch.qos.logback.classic.AsyncAppender[ASYNCFILE]
18:50:30,033 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Attaching appender named [FILE] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [STDOUT] to ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Attaching appender named [STDOUT] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.classic.model.processor.RootLoggerModelHandler - Setting level of ROOT logger to INFO
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCSTDOUT] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCFILE] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.DefaultProcessor@14b0e127 - End of configuration.
18:50:30,035 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@10823d72 - Registering current configuration as safe fallback point
18:50:30,035 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.util.DefaultJoranConfigurator.configure() call lasted 62 milliseconds. ExecutionStatus=DO_NOT_INVOKE_NEXT_IF_ANY

18:50:34.810 [main] INFO  ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] ---
18:50:34.812 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:34.979 [main] INFO  ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH
18:50:35.048 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:35.165 [pool-22-thread-1] INFO  ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5)
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Window Tails: 
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
18:50:35.180 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
+ cat tmp_fetch.out
+ grep purchase_price_average_14d
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
+ '[' 0 -ne 0 ']'
+ echo -e '\033[0;32m<<<<<.....................................SUCCEEDED!!!.....................................>>>>>\033[0m'
<<<<<.....................................SUCCEEDED!!!.....................................>>>>>

```
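The `Window Tails` lines in the log above are consistent with each tail being the batch end minus the window length. A minimal Python sketch of that arithmetic follows, assuming day-granularity windows. The helper name `window_tail` is hypothetical, and this is only a check against the logged values, not Chronon's implementation:

```python
from datetime import datetime, timedelta

# Batch end as logged by SawtoothOnlineAggregator in the output above.
BATCH_END = datetime(2023, 12, 2)


def window_tail(batch_end: datetime, window_days: int) -> datetime:
    """Hypothetical helper: tail = batch end minus the window length in days."""
    return batch_end - timedelta(days=window_days)


# Print the tail for each window length that appears in the log.
for days in (3, 14, 30):
    print(f"{days}d -> {window_tail(BATCH_END, days):%Y-%m-%d %H:%M:%S}")
```

The three printed tails can be compared line by line with the `purchase_price_*_3d`, `_14d`, and `_30d` entries in the log.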


Test via Driver.scala:

```
davidhan@Mac: ~/zipline/chronon (davidhan/zipline_integration_script) $ java -Dlog4j.configuration=log4j.properties -cp $CLOUD_GCP:$SPARK_HOME/jars/* ai.chronon.spark.Driver fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl  --conf-type=group_bys --name quickstart/purchases.v1 -k '{"user_id":"5"}'  --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-01-30 18:57:04 INFO  FetcherMain$:145 - --- [START FETCHING for Map(user_id -> 5)] ---
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:188 - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_BATCH
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:453 - Constructing response for groupBy: quickstart.purchases.v1 for keys: Map(user_id -> 5)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:57 - Batch End: 2023-12-02 00:00:00
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:58 - Window Tails: 
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_last10 -> Some(unbounded)
2025-01-30 18:57:04 INFO  FetcherMain$:174 - --- [FETCHED RESULT] ---
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
2025-01-30 18:57:04 INFO  FetcherMain$:176 - Fetched in: 366.143459 ms

```
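The fetched values above are internally consistent, which gives a quick sanity check on the result: each non-null windowed average equals sum divided by count, and the five `purchase_price_last10` entries add up to the 30-day sum (here `purchase_price_count_30d` is also 5). A small Python sketch of that check, with the JSON copied from the output above. The helper name `check_averages` is illustrative and not part of Chronon, and treating an all-null window as valid is an assumption:

```python
import json

# Result copied verbatim from the Driver.scala fetch output above.
FETCHED = json.loads("""
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
""")


def check_averages(result: dict, windows=("3d", "14d", "30d")) -> bool:
    """Return True if every non-null average equals sum / count for its window."""
    for w in windows:
        avg = result[f"purchase_price_average_{w}"]
        total = result[f"purchase_price_sum_{w}"]
        count = result[f"purchase_price_count_{w}"]
        if avg is None:
            # Assumption: an empty window reports null for sum and count too.
            if total is not None or count is not None:
                return False
            continue
        if abs(avg - total / count) > 1e-9:
            return False
    return True


print(check_averages(FETCHED))
print(sum(FETCHED["purchase_price_last10"]) == FETCHED["purchase_price_sum_30d"])
```

A check like this only confirms that the returned fields agree with each other; it does not validate them against the source data.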
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Added support for Google Cloud Platform (GCP) integration in the application.
  - Introduced a new command-line tool for fetching data with flexible configuration options.

- **Improvements**
  - Enhanced configuration handling for more robust parameter processing.
  - Streamlined JAR downloading and deployment processes.
  - Improved command-line argument parsing for better usability.
  - Integrated service component into the build and upload process.

- **Technical Updates**
  - Updated build and upload scripts to include service JAR artifacts.
  - Refactored fetcher functionality to centralize data retrieval logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
chewy-zlai pushed a commit that referenced this pull request May 15, 2025
…o that spark is not required (#306)

chewy-zlai pushed a commit that referenced this pull request May 15, 2025
…o that spark is not required (#306)

## Summary

Tested fetch with the integration script below. (Also tested via
Driver.scala, which works as well; output pasted further down.)

```
<<<<<.....................................FETCH.....................................>>>>>
+ touch tmp_fetch.out
+ zipline run --mode fetch --conf-type group_bys --name quickstart/purchases.v1_test -k '{"user_id":"5"}'
+ tee /dev/tty tmp_fetch.out
+ grep -q purchase_price_average_14d
Running with args: {'mode': 'fetch', 'conf_type': 'group_bys', 'conf': None, 'env': 'dev', 'dataproc': False, 'ds': None, 'app_name': None, 'start_ds': None, 'end_ds': None, 'parallelism': None, 'repo': '.', 'online_jar': 'cloud_gcp-assembly-0.1.0-SNAPSHOT.jar', 'online_class': 'ai.chronon.integrations.cloud_gcp.GcpApiImpl', 'version': None, 'spark_version': '2.4.0', 'spark_submit_path': None, 'spark_streaming_submit_path': None, 'online_jar_fetch': None, 'sub_help': False, 'online_args': None, 'chronon_jar': None, 'release_tag': None, 'list_apps': None, 'render_info': None, 'is_gcp': False}
18:50:29,965 |-INFO in ch.qos.logback.classic.LoggerContext[default] - This is logback-classic version 0.1.0-SNAPSHOT
18:50:29,965 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - No custom configurators were discovered as a service.
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,966 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.joran.SerializedModelConfigurator
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.scmo]
18:50:29,968 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.scmo]
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.joran.SerializedModelConfigurator.configure() call lasted 2 milliseconds. ExecutionStatus=INVOKE_NEXT_IF_ANY
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Trying to configure with ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - Constructed configurator of type class ch.qos.logback.classic.util.DefaultJoranConfigurator
18:50:29,973 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
18:50:29,975 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs multiple times on the classpath.
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,976 |-WARN in ch.qos.logback.classic.util.DefaultJoranConfigurator@2b34e38c - Resource [logback.xml] occurs at [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml]
18:50:29,977 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@3d37203b - URL [jar:file:/private/var/folders/2p/h5v8s0515xv20cgprdjngttr0000gn/T/tmpssh6bz80/service-0.1.0-SNAPSHOT.jar!/logback.xml] is not of type file
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [STDOUT]
18:50:30,009 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
18:50:30,013 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [FILE]
18:50:30,023 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.core.rolling.RollingFileAppender]
18:50:30,025 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - setting totalSizeCap to 10 GB
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Archive files will be limited to [100 MB] each.
18:50:30,026 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - No compression will be used
18:50:30,027 |-INFO in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@2144496344 - Will use the pattern /tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log for the active file
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - The date pattern is 'yyyy-dd-MM' from file name pattern '/tmp/chronon/logs/chronon-fs.%d{yyyy-dd-MM}-%i.log'.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Roll-over at midnight.
18:50:30,028 |-INFO in ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP@1894593a - Setting initial period to 2025-01-31T02:50:30.028Z
18:50:30,028 |-INFO in ch.qos.logback.core.model.processor.ImplicitModelHandler - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - Active log file name: /tmp/chronon/logs/chronon-fs.log
18:50:30,030 |-INFO in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - File property is set to [/tmp/chronon/logs/chronon-fs.log]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCFILE]
18:50:30,031 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,033 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [FILE] to ch.qos.logback.classic.AsyncAppender[ASYNCFILE]
18:50:30,033 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Attaching appender named [FILE] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCFILE] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - Processing appender named [ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderModelHandler - About to instantiate appender of type [ch.qos.logback.classic.AsyncAppender]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [STDOUT] to ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT]
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Attaching appender named [STDOUT] to AsyncAppender.
18:50:30,034 |-INFO in ch.qos.logback.classic.AsyncAppender[ASYNCSTDOUT] - Setting discardingThreshold to 0
18:50:30,034 |-INFO in ch.qos.logback.classic.model.processor.RootLoggerModelHandler - Setting level of ROOT logger to INFO
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCSTDOUT] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.AppenderRefModelHandler - Attaching appender named [ASYNCFILE] to Logger[ROOT]
18:50:30,034 |-INFO in ch.qos.logback.core.model.processor.DefaultProcessor@14b0e127 - End of configuration.
18:50:30,035 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@10823d72 - Registering current configuration as safe fallback point
18:50:30,035 |-INFO in ch.qos.logback.classic.util.ContextInitializer@a0db585 - ch.qos.logback.classic.util.DefaultJoranConfigurator.configure() call lasted 62 milliseconds. ExecutionStatus=DO_NOT_INVOKE_NEXT_IF_ANY

18:50:34.810 [main] INFO  ai.chronon.online.FetcherMain$ - --- [START FETCHING for Map(user_id -> 5)] ---
18:50:34.812 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:34.979 [main] INFO  ai.chronon.online.Fetcher - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_TEST_BATCH
18:50:35.048 [main] INFO  a.c.i.cloud_gcp.BigTableKVStoreImpl - Performing multi-get for 1 requests
18:50:35.165 [pool-22-thread-1] INFO  ai.chronon.online.Fetcher - Constructing response for groupBy: quickstart.purchases.v1_test for keys: Map(user_id -> 5)
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
18:50:35.178 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator - Window Tails: 
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
18:50:35.180 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
+ cat tmp_fetch.out
+ grep purchase_price_average_14d
18:50:35.179 [pool-22-thread-1] INFO  a.c.a.w.SawtoothOnlineAggregator -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
+ '[' 0 -ne 0 ']'
+ echo -e '\033[0;32m<<<<<.....................................SUCCEEDED!!!.....................................>>>>>\033[0m'
<<<<<.....................................SUCCEEDED!!!.....................................>>>>>

```
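
In the output above, each logged window tail is the batch end minus the window length (for example, 2023-12-02 minus 14 days is 2023-11-18). Below is a minimal Python sketch that checks this against a few of the logged lines. It assumes only the log format shown above; the helper names `parse_batch_end` and `tail_offsets` are hypothetical and are not part of Chronon or the `zipline` CLI. It describes this one log, not how `SawtoothOnlineAggregator` computes tails in general.

```python
import re
from datetime import datetime

# Lines copied from the integration-script output above (logger prefix shortened).
LOG = """\
SawtoothOnlineAggregator - Batch End: 2023-12-02 00:00:00
SawtoothOnlineAggregator -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
SawtoothOnlineAggregator -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
SawtoothOnlineAggregator -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
"""

FMT = "%Y-%m-%d %H:%M:%S"
STAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
# Matches e.g. "purchase_price_sum_14d -> Some(2023-11-18 00:00:00)".
TAIL = re.compile(r"(\w+)_(\d+)d -> Some\((" + STAMP + r")\)")


def parse_batch_end(log: str) -> datetime:
    """Extract the timestamp from the 'Batch End' line (hypothetical helper)."""
    match = re.search(r"Batch End: (" + STAMP + r")", log)
    if match is None:
        raise ValueError("no 'Batch End' line found")
    return datetime.strptime(match.group(1), FMT)


def tail_offsets(log: str) -> dict:
    """Map each feature to (window length in days, days from tail to batch end)."""
    batch_end = parse_batch_end(log)
    offsets = {}
    for prefix, days, tail in TAIL.findall(log):
        gap = batch_end - datetime.strptime(tail, FMT)
        offsets[f"{prefix}_{days}d"] = (int(days), gap.days)
    return offsets


offsets = tail_offsets(LOG)
for name, (window, gap) in offsets.items():
    print(f"{name}: window={window}d, batch_end - tail={gap}d")
```

For the three lines sampled here, the window length and the gap agree. The unbounded `purchase_price_last10` entry that appears in the Driver.scala run has no day suffix, so the pattern skips it.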


Test via Driver.scala:

```
davidhan@Mac: ~/zipline/chronon (davidhan/zipline_integration_script) $ java -Dlog4j.configuration=log4j.properties -cp $CLOUD_GCP:$SPARK_HOME/jars/* ai.chronon.spark.Driver fetch --online-jar=cloud_gcp-assembly-0.1.0-SNAPSHOT.jar --online-class=ai.chronon.integrations.cloud_gcp.GcpApiImpl  --conf-type=group_bys --name quickstart/purchases.v1 -k '{"user_id":"5"}'  --is-gcp --gcp-project-id=canary-443022 --gcp-bigtable-instance-id=zipline-canary-instance
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-01-30 18:57:04 INFO  FetcherMain$:145 - --- [START FETCHING for Map(user_id -> 5)] ---
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:188 - Fetched group_by_serving_info from : QUICKSTART_PURCHASES_V1_BATCH
2025-01-30 18:57:04 INFO  BigTableKVStoreImpl:119 - Performing multi-get for 1 requests
2025-01-30 18:57:04 INFO  Fetcher:453 - Constructing response for groupBy: quickstart.purchases.v1 for keys: Map(user_id -> 5)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:57 - Batch End: 2023-12-02 00:00:00
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:58 - Window Tails: 
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_sum_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_count_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_3d -> Some(2023-11-29 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_14d -> Some(2023-11-18 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_average_30d -> Some(2023-11-02 00:00:00)
2025-01-30 18:57:04 INFO  SawtoothOnlineAggregator:60 -   purchase_price_last10 -> Some(unbounded)
2025-01-30 18:57:04 INFO  FetcherMain$:174 - --- [FETCHED RESULT] ---
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
2025-01-30 18:57:04 INFO  FetcherMain$:176 - Fetched in: 366.143459 ms

```
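
The fetched result above is internally consistent: for each window, the average equals the sum divided by the count (145 / 2 = 72.5 and 1253 / 5 = 250.6), and the 3d window is null throughout. A minimal Python sketch of that sanity check follows. The function name `average_matches` is hypothetical and not part of Chronon; the JSON is copied from the output above.

```python
import json
import math

# Fetched result copied verbatim from the Driver.scala run above.
FETCHED = json.loads("""
{
  "purchase_price_average_14d" : 72.5,
  "purchase_price_average_30d" : 250.6,
  "purchase_price_average_3d" : null,
  "purchase_price_count_14d" : 2,
  "purchase_price_count_30d" : 5,
  "purchase_price_count_3d" : null,
  "purchase_price_last10" : [ 76, 69, 367, 466, 275 ],
  "purchase_price_sum_14d" : 145,
  "purchase_price_sum_30d" : 1253,
  "purchase_price_sum_3d" : null
}
""")


def average_matches(result: dict, window: str):
    """Return whether average == sum / count for a window.

    Returns None when any of the three values is null or the count is zero,
    since there is nothing to compare in that case.
    """
    total = result[f"purchase_price_sum_{window}"]
    count = result[f"purchase_price_count_{window}"]
    average = result[f"purchase_price_average_{window}"]
    if None in (total, count, average) or count == 0:
        return None
    return math.isclose(average, total / count)


checks = {w: average_matches(FETCHED, w) for w in ("3d", "14d", "30d")}
print(checks)
```

A check like this could complement the `grep` in the integration script, which matches the feature name in the log output but does not inspect the fetched values.
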
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Added support for Google Cloud Platform (GCP) integration in the application.
  - Introduced a new command-line tool for fetching data with flexible configuration options.

- **Improvements**
  - Enhanced configuration handling for more robust parameter processing.
  - Streamlined jar downloading and deployment processes.
  - Improved command-line argument parsing for better usability.
  - Integrated service component into the build and upload process.

- **Technical Updates**
  - Updated build and upload scripts to include service JAR artifacts.
  - Refactored fetcher functionality to centralize data retrieval logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
chewy-zlai pushed a commit that referenced this pull request May 16, 2025
…o that spark is not required (#306)
