Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
🚀 The Update (preview) for forstmeier/pocketsizefund/production (at 071e83e) was successful. ✨ Neo ExplanationInitial production infrastructure deployment for PocketSizeFund, creating a complete AWS environment with VPC networking, ECS container orchestration, load balancing, and storage for a three-service portfolio management application.Root Cause AnalysisThis is an initial infrastructure deployment for the PocketSizeFund application on AWS. The developer is provisioning an entirely new production environment from scratch - there are no existing resources, and all 64 resources shown are being created for the first time. This represents the foundation of a microservices-based portfolio management system with three main services: a data manager, an equity price model, and a portfolio manager. Dependency ChainThe deployment follows a standard AWS infrastructure pattern:
Risk analysisLow Risk - This is a greenfield deployment with no existing infrastructure. There are no resources being replaced or deleted, eliminating concerns about data loss or service disruption. Resource Changes Name Type Operation
+ pocketsizefund-production pulumi:pulumi:Stack create
+ execution_role aws:iam/role:Role create
+ public_route_table aws:ec2/routeTable:RouteTable create
+ private_route_table aws:ec2/routeTable:RouteTable create
+ service_discovery aws:servicediscovery/privateDnsNamespace:PrivateDnsNamespace create
+ portfoliomanager_tg aws:lb/targetGroup:TargetGroup create
+ execution_role_secrets_policy aws:iam/rolePolicy:RolePolicy create
+ public_subnet_2_rta aws:ec2/routeTableAssociation:RouteTableAssociation create
+ portfoliomanager_sd aws:servicediscovery/service:Service create
+ datamanager_sd aws:servicediscovery/service:Service create
+ portfoliomanager_rule aws:lb/listenerRule:ListenerRule create
+ portfoliomanager_service aws:ecs/service:Service create
+ datamanager_task aws:ecs/taskDefinition:TaskDefinition create
+ private_subnet_1_rta aws:ec2/routeTableAssociation:RouteTableAssociation create
+ ecs_egress aws:ec2/securityGroupRule:SecurityGroupRule create
+ portfoliomanager_task aws:ecs/taskDefinition:TaskDefinition create
+ equitypricemodel_service aws:ecs/service:Service create
+ datamanager_logs aws:cloudwatch/logGroup:LogGroup create
+ execution_role_policy aws:iam/rolePolicyAttachment:RolePolicyAttachment create
+ model_artifacts_bucket_versioning aws:s3/bucketVersioningV2:BucketVersioningV2 create
+ datamanager_service aws:ecs/service:Service create
+ vpc_endpoints_sg aws:ec2/securityGroup:SecurityGroup create
+ private_subnet_2_rta aws:ec2/routeTableAssociation:RouteTableAssociation create
+ nat_elastic_ip aws:ec2/eip:Eip create
+ data_bucket_versioning aws:s3/bucketVersioningV2:BucketVersioningV2 create
+ igw aws:ec2/internetGateway:InternetGateway create
+ datamanager_tg aws:lb/targetGroup:TargetGroup create
+ public_internet_route aws:ec2/route:Route create
+ ecs_from_alb aws:ec2/securityGroupRule:SecurityGroupRule create
+ http_listener aws:lb/listener:Listener create
+ data_bucket aws:s3/bucketV2:BucketV2 create
+ model_artifacts_bucket aws:s3/bucketV2:BucketV2 create
+ equitypricemodel_repository aws:ecr/repository:Repository create
+ task_role aws:iam/role:Role create
+ ecs_cluster aws:ecs/cluster:Cluster create
+ sagemaker_ecr_policy aws:iam/rolePolicy:RolePolicy create
+ ecr_api_endpoint aws:ec2/vpcEndpoint:VpcEndpoint create
+ sagemaker_execution_role aws:iam/role:Role create
+ private_subnet_2 aws:ec2/subnet:Subnet create
+ alb_sg aws:ec2/securityGroup:SecurityGroup create
+ alb aws:lb/loadBalancer:LoadBalancer create
... and 24 other changes |
Confidence Score: 4/5
|
| ticker = ticker_data.get("ticker", "") | ||
| # Polygon uses 'sic_description' for industry, but we can also check other fields | ||
| # The primary_exchange and type fields help filter | ||
| if ticker_data.get("type") not in ("CS", "ADRC"): # Common Stock or ADR |
There was a problem hiding this comment.
Corrected spelling of 'ADRC' to 'ADR' in comment. The ticker type code is 'ADRC' but the comment should say 'ADR Common' or just keep 'ADR' as written.
| if ticker_data.get("type") not in ("CS", "ADRC"): # Common Stock or ADR | |
| if ticker_data.get("type") not in ("CS", "ADRC"): # Common Stock or ADR Common |
| output_key: str, | ||
| ) -> str: | ||
| """Write consolidated training data to S3 as parquet.""" | ||
| import io |
There was a problem hiding this comment.
The io module is imported inside the function write_training_data_to_s3 rather than at the module level. Move this import to the top of the file with other imports for consistency and better code organization.
| let start_date_int = start_timestamp.format("%Y%m%d").to_string().parse::<i32>().unwrap_or(0); | ||
| let end_date_int = end_timestamp.format("%Y%m%d").to_string().parse::<i32>().unwrap_or(99999999); |
There was a problem hiding this comment.
The magic number 99999999 is used as a fallback for end_date_int. Consider defining this as a named constant (e.g., MAX_DATE_INT) to improve code readability and maintainability.
forstmeier
left a comment
There was a problem hiding this comment.
Bots left good cleanup comments. Overall looks good especially the categories stuff.
| output_key: str, | ||
| ) -> str: | ||
| """Write consolidated training data to S3 as parquet.""" | ||
| import io |
|
|
||
| logger = structlog.get_logger() | ||
|
|
||
| POLYGON_BASE_URL = "https://api.polygon.io" |
There was a problem hiding this comment.
This can probably be fetched from Secrets Manager and should be the massive URL and variable name.
| export AWS_S3_DATA_BUCKET="$(pulumi stack output aws_s3_data_bucket)" | ||
| export AWS_S3_MODEL_ARTIFACTS_BUCKET="$(pulumi stack output aws_s3_model_artifacts_bucket)" |
There was a problem hiding this comment.
These expect the stack to be up rather than permanent buckets just FYI.
There was a problem hiding this comment.
buckets being made permanent
|
Oh, and one general thing (I can't remember if this is in the pull request changes or not) but any file/variable names that reference the model should be |
This pull request enhances logging, error handling, and data normalization in the
datamanagerRust application, especially around DataFrame creation and S3/HTTP interactions. It also updates the expected data types for equity bar fields and adds new environment variables for DuckDB library paths in the Dockerfile. The changes improve observability, debugging, and robustness of the application's data pipelines.Logging and Error Handling Improvements:
debug,info, andwarn) throughout DataFrame creation functions inapplications/datamanager/src/data.rsto trace row counts, column normalization, errors, and final DataFrame shapes. Error handling now logs warnings before returning errors. [1] [2] [3] [4] [5] [6] [7]applications/datamanager/src/equity_bars.rs, including API request details, response sizes, JSON parsing, and S3 upload status. Errors now usewarn!instead ofinfo!for failures. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]Data Normalization and Type Updates:
ticker,side,action,sector,industry) to uppercase in DataFrames for consistency. [1] [2] [3] [4]BarResultstruct inapplications/datamanager/src/equity_bars.rsfromOption<u64>toOption<f64>to better reflect the expected data types from the API. [1] [2]Docker and Environment Configuration:
applications/datamanager/Dockerfileto support database operations.Local Settings Update:
.claude/settings.local.jsonto include additional AWS and Python commands, improving local development capabilities.