Commit

glue crawlers
cnfait committed Mar 26, 2024
1 parent 9547fc6 commit 612e154
Showing 6 changed files with 12 additions and 8 deletions.
2 changes: 1 addition & 1 deletion sdlf-cicd/template-cicd-domain-team-role.yaml
@@ -258,7 +258,7 @@ Resources:
- glue:StopCrawler
- glue:TagResource
- glue:UntagResource
-Resource: !Sub arn:${AWS::Partition}:glue:${AWS::Region}:${AWS::AccountId}:crawler/sdlf-${pTeamName}-*
+Resource: !Sub arn:${AWS::Partition}:glue:${AWS::Region}:${AWS::AccountId}:crawler/*
- Effect: Allow
Action:
- glue:CreateDatabase
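The `Resource` change in this hunk widens the statement from crawlers named `sdlf-${pTeamName}-*` to every crawler in the account and region. A rough sketch of the effect, using Python's `fnmatchcase` as an approximation of IAM wildcard matching (an assumption; IAM does not use `fnmatch`). The account ID, region, team name, and the generated-looking crawler name are hypothetical:

```python
from fnmatch import fnmatchcase


def arn_matches(pattern, arn):
    """Approximate IAM resource matching with shell-style globbing.

    Assumption: only the `*` wildcard is used, as in the templates of
    this commit, so fnmatch's extra syntax does not come into play.
    """
    return fnmatchcase(arn, pattern)


# Hypothetical partition, region, and account ID, for illustration only.
prefix = "arn:aws:glue:eu-west-1:111122223333:crawler/"
old_pattern = prefix + "sdlf-engineering-*"
new_pattern = prefix + "*"

# A crawler following the former naming convention.
named = prefix + "sdlf-engineering-legislators-post-stage-crawler"
# A crawler whose name was not set explicitly (hypothetical generated name).
generated = prefix + "rPostStageCrawler-EXAMPLE"

results = {
    "old/named": arn_matches(old_pattern, named),
    "old/generated": arn_matches(old_pattern, generated),
    "new/named": arn_matches(new_pattern, named),
    "new/generated": arn_matches(new_pattern, generated),
}
print(results)
```

The broader pattern also covers crawlers that do not belong to the team, which is the trade-off of dropping the name prefix.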
1 change: 0 additions & 1 deletion sdlf-dataset/template.yaml
@@ -75,7 +75,6 @@ Resources:
Role: !Sub "{{resolve:ssm:/SDLF2/IAM/${pTeamName}/CrawlerRoleArn}}"
CrawlerSecurityConfiguration: !Sub "{{resolve:ssm:/SDLF2/Glue/${pTeamName}/SecurityConfigurationId}}"
DatabaseName: !Ref rGlueDataCatalog
-Name: !Sub sdlf-${pTeamName}-${pDatasetName}-post-stage-crawler
Targets:
S3Targets:
- Path: !Sub s3://${pStageBucket}/post-stage/${pTeamName}/${pDatasetName}
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
import os

import boto3
from datalake_library import octagon
from datalake_library.commons import init_logger
from datalake_library.configuration.resource_configs import DynamoConfiguration
from datalake_library.interfaces.dynamo_interface import DynamoInterface

logger = init_logger(__name__)

ssm_endpoint_url = "https://ssm." + os.getenv("AWS_REGION") + ".amazonaws.com"
ssm = boto3.client("ssm", endpoint_url=ssm_endpoint_url)

def get_glue_transform_details(bucket, team, dataset, pipeline, stage):
dynamo_config = DynamoConfiguration()
@@ -67,7 +71,8 @@ def lambda_handler(event, context):
event["body"]["glue"] = get_glue_transform_details(
bucket, team, dataset, pipeline, stage
) # custom user code called
-event["body"]["glue"]["crawler_name"] = "-".join(["sdlf", team, dataset, "post-stage-crawler"])
+event["body"]["glue"]["crawler_name"] = ssm.get_parameter(Name=f"/SDLF2/Glue/{team}/{dataset}/GlueCrawler")["Parameter"]["Value"]
+"-".join(["sdlf", team, dataset, "post-stage-crawler"])
event["body"]["peh_id"] = peh_id
octagon_client.update_pipeline_execution(
status="{} {} Processing".format(stage, component), component=component
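The hunk above replaces the hard-coded crawler name with a lookup of the SSM parameter `/SDLF2/Glue/{team}/{dataset}/GlueCrawler`. The bare `"-".join([...])` line that follows the lookup is an expression statement whose result is discarded, so it has no effect. A minimal sketch of the lookup as a standalone helper, assuming the parameter exists and holds the crawler name as its value. The helper name `get_crawler_name`, the stub client, and the sample team and dataset are hypothetical and not part of the commit:

```python
def get_crawler_name(ssm_client, team, dataset):
    """Return the crawler name stored in SSM for a team/dataset pair.

    ssm_client is any object exposing get_parameter(Name=...) with the
    response shape of boto3's SSM client, e.g. boto3.client("ssm").
    """
    # Parameter path as used in the commit.
    name = f"/SDLF2/Glue/{team}/{dataset}/GlueCrawler"
    response = ssm_client.get_parameter(Name=name)
    return response["Parameter"]["Value"]


class StubSsmClient:
    """Hypothetical in-memory stand-in for the SSM client, for local runs."""

    def __init__(self, parameters):
        self._parameters = parameters

    def get_parameter(self, Name):
        # Mirrors only the part of the real response that is read above.
        return {"Parameter": {"Name": Name, "Value": self._parameters[Name]}}


stub = StubSsmClient(
    {"/SDLF2/Glue/engineering/legislators/GlueCrawler": "example-crawler-name"}
)
crawler_name = get_crawler_name(stub, "engineering", "legislators")
print(crawler_name)
```

Passing the client in as an argument keeps the helper runnable without AWS credentials; inside the Lambda it would receive the module-level `ssm` client created at import time.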
2 changes: 1 addition & 1 deletion sdlf-stageB/template.yaml
@@ -584,7 +584,7 @@ Resources:
- glue:StartCrawler
- glue:GetCrawler
Resource:
-- !Sub arn:${AWS::Partition}:glue:${AWS::Region}:${AWS::AccountId}:crawler/sdlf-${pTeamName}-*
+- !Sub arn:${AWS::Partition}:glue:${AWS::Region}:${AWS::AccountId}:crawler/*
- Effect: Allow
Action:
- xray:PutTraceSegments # W11 exception
4 changes: 2 additions & 2 deletions sdlf-team/template.yaml
@@ -336,7 +336,7 @@ Resources:
Properties:
Name: !Sub /SDLF2/Glue/${pTeamName}/SecurityConfigurationId
Type: String
-Value: !Sub sdlf-${pTeamName}-glue-security-config # unfortunately AWS::Glue::SecurityConfiguration doesn't provide any return value
+Value: !Sub sdlf2-${pTeamName}-glue-security-config # unfortunately AWS::Glue::SecurityConfiguration doesn't provide any return value
Description: !Sub Name of the ${pTeamName} Glue security configuration
#### END KMS STACK

@@ -481,7 +481,7 @@ Resources:
- glue:StartCrawler
- glue:GetCrawler
Resource:
-- !Sub arn:${AWS::Partition}:glue:${AWS::Region}:${AWS::AccountId}:crawler/sdlf-${pTeamName}-*
+- !Sub arn:${AWS::Partition}:glue:${AWS::Region}:${AWS::AccountId}:crawler/*
- Effect: Allow
Action:
- glue:GetTable
Original file line number Diff line number Diff line change
@@ -80,5 +80,5 @@ Resources:
MaxCapacity: 2.0
GlueVersion: "4.0"
Name: !Sub sdlf-${pTeamName}-${pDatasetName}-glue-job
-SecurityConfiguration: !Sub "{{resolve:ssm:/SDLF/Glue/${pTeamName}/SecurityConfigurationId:1}}"
+SecurityConfiguration: !Sub "{{resolve:ssm:/SDLF2/Glue/${pTeamName}/SecurityConfigurationId:1}}"
Role: !Ref rGlueRole
