Update builder/server DuckDB dependencies #662
Conversation
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Pull request overview
This PR updates DuckDB dependency handling in the datamanager Dockerfile to fix a breaking build. The changes install DuckDB v1.1.3 libraries explicitly in both builder and server stages, and reorder COPY statements in the planner stage to be more granular.
Key Changes:
- Install DuckDB v1.1.3 shared libraries from GitHub releases in both builder and server stages
- Reorder COPY statements in planner stage to copy source files separately from Cargo files
- Add wget and unzip dependencies, with cleanup in server stage
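The install step summarized above presumably looks something like the following sketch. The release URL follows DuckDB's published asset naming for `libduckdb-linux-amd64.zip`; the `/usr/local/lib` destination and layer layout are assumptions, not taken from the diff.

```dockerfile
# Sketch of the DuckDB v1.1.3 install step described in this PR.
# Destination path /usr/local/lib is an assumption.
RUN wget -q https://github.com/duckdb/duckdb/releases/download/v1.1.3/libduckdb-linux-amd64.zip \
    && unzip libduckdb-linux-amd64.zip -d /usr/local/lib \
    && rm libduckdb-linux-amd64.zip \
    && ldconfig
```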
🚀 The Update (preview) for forstmeier/pocketsizefund/production (at 7681a48) was successful.

✨ Neo Explanation
Initial production environment deployment for PocketSizeFund, creating a complete ECS-based microservices architecture with three containerized services (datamanager, equitypricemodel, portfoliomanager) behind an Application Load Balancer, running in a new VPC with private subnets and service discovery.

Root Cause Analysis
This deployment was triggered by a code change in the repository. The developer is provisioning a completely new production environment for the PocketSizeFund application from scratch. This appears to be either an initial production deployment or a full infrastructure rebuild, as all 52 resources are being created fresh with no existing state.

Dependency Chain
The infrastructure follows a logical bottom-up build sequence:
Risk analysis
Risk Level: MODERATE. This is a greenfield deployment creating all new resources. While there is no risk of data loss or service disruption (nothing exists yet), the moderate risk stems from the complexity of coordinating 52 interdependent resources. Any misconfiguration in networking, security groups, or IAM permissions could result in a non-functional deployment requiring troubleshooting or rollback.

Resource Changes

Name Type Operation
+ vpc aws:ec2/vpc:Vpc create
+ nat_elastic_ip aws:ec2/eip:Eip create
+ alb_sg aws:ec2/securityGroup:SecurityGroup create
+ private_subnet_1_rta aws:ec2/routeTableAssociation:RouteTableAssociation create
+ http_listener aws:lb/listener:Listener create
+ execution_role_policy aws:iam/rolePolicyAttachment:RolePolicyAttachment create
+ datamanager_logs aws:cloudwatch/logGroup:LogGroup create
+ portfoliomanager_task aws:ecs/taskDefinition:TaskDefinition create
+ ecs_cluster aws:ecs/cluster:Cluster create
+ private_route_table aws:ec2/routeTable:RouteTable create
+ public_subnet_2 aws:ec2/subnet:Subnet create
+ public_internet_route aws:ec2/route:Route create
+ equitypricemodel_logs aws:cloudwatch/logGroup:LogGroup create
+ portfoliomanager_logs aws:cloudwatch/logGroup:LogGroup create
+ public_subnet_2_rta aws:ec2/routeTableAssociation:RouteTableAssociation create
+ datamanager_sd aws:servicediscovery/service:Service create
+ task_role_s3_policy aws:iam/rolePolicy:RolePolicy create
+ execution_role aws:iam/role:Role create
+ public_route_table aws:ec2/routeTable:RouteTable create
+ service_discovery aws:servicediscovery/privateDnsNamespace:PrivateDnsNamespace create
+ vpc_endpoints_sg aws:ec2/securityGroup:SecurityGroup create
+ datamanager_tg aws:lb/targetGroup:TargetGroup create
+ vpc_endpoints_ingress aws:ec2/securityGroupRule:SecurityGroupRule create
+ ecs_self_ingress aws:ec2/securityGroupRule:SecurityGroupRule create
+ execution_role_secrets_policy aws:iam/rolePolicy:RolePolicy create
+ s3_gateway_endpoint aws:ec2/vpcEndpoint:VpcEndpoint create
+ equitypricemodel_sd aws:servicediscovery/service:Service create
+ nat_gateway aws:ec2/natGateway:NatGateway create
+ alb aws:lb/loadBalancer:LoadBalancer create
+ equitypricemodel_task aws:ecs/taskDefinition:TaskDefinition create
+ datamanager_task aws:ecs/taskDefinition:TaskDefinition create
+ equitypricemodel_service aws:ecs/service:Service create
+ igw aws:ec2/internetGateway:InternetGateway create
+ private_subnet_1 aws:ec2/subnet:Subnet create
+ public_subnet_1 aws:ec2/subnet:Subnet create
+ ecs_sg aws:ec2/securityGroup:SecurityGroup create
+ public_subnet_1_rta aws:ec2/routeTableAssociation:RouteTableAssociation create
+ ecs_from_alb aws:ec2/securityGroupRule:SecurityGroupRule create
+ ecr_dkr_endpoint aws:ec2/vpcEndpoint:VpcEndpoint create
+ nat_route aws:ec2/route:Route create
+ task_role aws:iam/role:Role create
... and 12 other changes
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @applications/datamanager/Dockerfile:
- Around line 11-14: The Dockerfile currently downloads
libduckdb-linux-amd64.zip (v1.1.3) without integrity checking; update the
download to the v1.4.3 release URL (replace the version in the RUN command) and
add SHA256 checksum verification before unzipping: download the zip, verify it
with the official v1.4.3 SHA256 (replace EXPECTED_SHA256 with the checksum from
the DuckDB v1.4.3 release), fail the build if verification fails, then unzip and
clean up as before (affecting the RUN line that references
libduckdb-linux-amd64.zip and ldconfig).
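A hedged sketch of the verified download this comment asks for. `EXPECTED_SHA256` is a placeholder to be replaced with the checksum from the official DuckDB v1.4.3 release assets, exactly as the comment says; the destination path is an assumption.

```dockerfile
# Pin the release and verify the archive before unzipping.
# sha256sum -c exits non-zero on a mismatch, failing the build.
RUN wget -q https://github.com/duckdb/duckdb/releases/download/v1.4.3/libduckdb-linux-amd64.zip \
    && echo "EXPECTED_SHA256  libduckdb-linux-amd64.zip" | sha256sum -c - \
    && unzip libduckdb-linux-amd64.zip -d /usr/local/lib \
    && rm libduckdb-linux-amd64.zip \
    && ldconfig
```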
🧹 Nitpick comments (1)
applications/datamanager/Dockerfile (1)
7-9: LGTM! Dependencies added for DuckDB installation.

The addition of `wget` and `unzip` is necessary for downloading and extracting the DuckDB binary in the subsequent steps.

Note: These tools remain in the chef stage (which is used as a base for planner and builder). If you want to minimize intermediate image sizes, consider adding cleanup after DuckDB installation, though this is optional.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
applications/datamanager/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Rust code checks
🔇 Additional comments (4)
applications/datamanager/Dockerfile (4)
24-24: LGTM! Targeted COPY improves build efficiency.

The explicit copying of Cargo.toml, Cargo.lock, and the src directory follows Docker best practices and works well with the cargo-chef pattern for dependency caching.

Also applies to: 28-28
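For context, the cargo-chef pattern referenced here canonically looks like the sketch below (this is the generic layering from cargo-chef's documentation, not the exact stages of this Dockerfile). The recipe is prepared in a planner stage, then dependencies are cooked in the builder before the source is copied, so the dependency layer stays cached across source-only changes.

```dockerfile
# Generic cargo-chef layering (stage names and paths are illustrative).
FROM chef AS planner
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
# Builds dependencies only; cached until recipe.json changes.
RUN cargo chef cook --release --recipe-path recipe.json
COPY . .
RUN cargo build --release
```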
53-58: LGTM! Appropriate packages for runtime stage.

The package installation includes necessary SSL/TLS dependencies and temporary tools (wget, unzip) that will be cleaned up after DuckDB installation.
67-67: LGTM! Ensures SSL certificates are properly configured.

The explicit call to `update-ca-certificates` is necessary to ensure the certificate store is current and properly configured for SSL/TLS connections.
71-71: LGTM! Standard SSL/TLS configuration.

Adding `SSL_CERT_DIR` provides better compatibility across different SSL/TLS implementations. Some libraries prefer this environment variable over `SSL_CERT_FILE`, and setting both ensures maximum compatibility.

If you want to verify that your application or its dependencies specifically benefit from this variable, you can check the SSL/TLS library documentation (e.g., OpenSSL, rustls, reqwest).
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @applications/datamanager/Dockerfile:
- Around line 9-10: Update the DuckDB download and checksum in the Dockerfile:
replace the wget URL that references v1.0.0 and the archive name
(libduckdb-linux-amd64.zip) with a current stable release (e.g., v1.4.3) and the
corresponding archive name/URL, and replace the existing sha256sum string with
the official checksum for that release; after updating, verify the checksum and
rebuild to ensure compatibility and successful verification.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
applications/datamanager/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Rust code checks
🔇 Additional comments (3)
applications/datamanager/Dockerfile (3)
14-16: Good layer hygiene.

Purging temporary tools (wget, unzip) and cleaning the apt cache in the same layer as their installation properly minimizes image size.
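That install-use-purge pattern, kept inside a single `RUN`, looks roughly like the sketch below (a generic illustration, not the exact instruction from the diff; the download URL and paths are assumptions). Because Docker snapshots the filesystem per instruction, files deleted in the same layer they were created never persist in the image.

```dockerfile
# Install temporary tools, use them, and purge them in one layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends wget unzip \
    && wget -q https://github.com/duckdb/duckdb/releases/download/v1.4.3/libduckdb-linux-amd64.zip \
    && unzip libduckdb-linux-amd64.zip -d /usr/local/lib \
    && rm libduckdb-linux-amd64.zip \
    && apt-get purge -y --auto-remove wget unzip \
    && rm -rf /var/lib/apt/lists/*
```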
26-32: LGTM!

The planner stage correctly copies the manifest files and source directory needed for `cargo chef prepare` to generate the recipe.
53-67: Well-structured server stage.

The multi-stage approach correctly:
- Copies only the DuckDB shared library (not build tools) from the chef stage
- Runs `ldconfig` to register the library with the dynamic linker
- Configures SSL certificates for secure connections

This keeps the final image minimal while ensuring runtime dependencies are available.
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @applications/datamanager/Dockerfile:
- Around line 26-32: Remove the unnecessary source copy in the planner stage:
delete the COPY applications/datamanager/src/ applications/datamanager/src/ line
so the planner only copies Cargo.toml/Cargo.lock (and the per-crate Cargo.toml
already copied) before running cargo chef prepare --recipe-path recipe.json;
leave the subsequent builder stage that copies src/ intact so source changes
don’t bust the cargo-chef dependency cache.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (2)
applications/datamanager/Cargo.toml
applications/datamanager/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Rust code checks
🔇 Additional comments (4)
applications/datamanager/Dockerfile (3)
53-63: Clean runtime stage setup.

The server stage correctly:
- Uses a minimal debian:trixie-slim base for smaller image size
- Installs only runtime dependencies (ca-certificates, openssl)
- Copies the DuckDB shared library from the chef stage (line 61)
- Runs `ldconfig` to register the library in the dynamic linker cache (line 63)

This ensures the compiled Rust binary can find and load the DuckDB library at runtime.
65-67: SSL certificate configuration is correct.

The environment variables point to Debian's standard certificate locations, which work correctly with the `update-ca-certificates` command executed earlier (line 58). This ensures proper SSL/TLS certificate validation for outbound connections.
7-16: Excellent security improvements addressing prior concerns.

The DuckDB installation now includes:
- SHA256 checksum verification (line 10) - addresses the security concern from past reviews
- Upgrade to v1.4.3, the latest stable release - addresses the outdated version concern
- Proper cleanup of temporary tools (wget/unzip) and apt cache - addresses image size concerns

The checksum `3c7bb8d586d39ccce56442c3f6bafa97f53d7b6bb5405dac827442fcb31494aa` matches the official DuckDB v1.4.3 release. The multi-stage approach correctly installs the library in the chef stage and makes it available for subsequent stages.

applications/datamanager/Cargo.toml (1)
33-33: Version alignment is correct and the update is safe.The duckdb crate version 1.4.3 correctly matches the native library version (v1.4.3) in the Dockerfile. The codebase uses only standard, stable duckdb-rs APIs (Connection::open_in_memory, execute_batch, prepare, query_map) that have remained unchanged across the 1.x series. The duckdb-rs Rust bindings maintain API stability throughout the 1.x release line, so this version bump introduces no breaking API changes.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @applications/datamanager/Dockerfile:
- Around line 7-16: The Dockerfile currently verifies libduckdb-linux-amd64.zip
using an unverifiable SHA256 value
("3c7bb8d586d39ccce56442c3f6bafa97f53d7b6bb5405dac827442fcb31494aa") which is
not published by DuckDB v1.4.3; either replace that sha256 check with a checksum
obtained from an official DuckDB source and add a comment documenting the
source, or remove the echo | sha256sum -c - step entirely and rely on HTTPS (or
switch to an alternative distribution that publishes verifiable checksums);
update the Dockerfile sequence that references libduckdb-linux-amd64.zip and the
sha256 snippet accordingly so the build no longer validates against an
unverifiable hash.
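Whichever checksum ends up in the Dockerfile, the verify-or-fail behavior of `sha256sum -c` can be sanity-checked locally. This sketch uses a stand-in file and a hash computed on the spot, not the real DuckDB archive:

```shell
# Demonstrate that sha256sum -c passes on a matching hash and
# exits non-zero (failing a build) on a mismatch.
set -e
printf 'stand-in archive contents' > /tmp/archive.zip
expected=$(sha256sum /tmp/archive.zip | awk '{print $1}')
echo "${expected}  /tmp/archive.zip" | sha256sum -c -   # prints: /tmp/archive.zip: OK
# A wrong hash makes sha256sum -c exit non-zero:
if ! echo "0000000000000000000000000000000000000000000000000000000000000000  /tmp/archive.zip" | sha256sum -c - >/dev/null 2>&1; then
  echo "mismatch detected"
fi
```

Note the two-space separator between hash and filename, which `sha256sum -c` expects in its checklist format.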
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
applications/datamanager/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Rust code checks
🔇 Additional comments (2)
applications/datamanager/Dockerfile (2)
26-30: Perfect cargo-chef optimization.

The planner stage now correctly copies only the manifest files (Cargo.toml and Cargo.lock), excluding source code. This prevents cache invalidation on every source change and properly leverages cargo-chef's dependency caching, addressing the previous review concern.
51-63: Well-structured server stage with proper DuckDB and SSL integration.

The server stage is correctly configured:
- Minimal base image with only runtime dependencies
- SSL certificates properly installed and updated (lines 53-57)
- DuckDB shared library copied from chef stage (line 59)
- ldconfig ensures library availability (line 61)
- SSL_CERT_FILE explicitly set for runtime clarity (line 63)
The reorganization properly separates build-time and runtime dependencies, ensuring a lean production image.
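A server stage matching the description above might look like this sketch. The stage name `chef`, the library path `/usr/local/lib/libduckdb.so`, and the certificate path are assumptions for illustration, not copied from the diff:

```dockerfile
FROM debian:trixie-slim AS server
# Runtime-only dependencies; update-ca-certificates refreshes the cert store.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates openssl \
    && update-ca-certificates \
    && rm -rf /var/lib/apt/lists/*
# Shared library installed earlier in the chef stage (path assumed)
COPY --from=chef /usr/local/lib/libduckdb.so /usr/local/lib/
# Register the library in the dynamic linker cache
RUN ldconfig
# Debian's standard CA bundle location
ENV SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
```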
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.
Overview
Changes
- `COPY` statements order
Fixing a breaking build.