Assessment for RunSubmit API usages #266

Closed
5 tasks
renardeinside opened this issue Sep 22, 2023 · 3 comments
Assignees
Labels
cloud/azure issues related to Azure enhancement New feature or request feat/crawler step/assessment go/uc/upgrade - Assessment Step step/assign metastore go/uc/upgrade Assign Metastore

Comments

@renardeinside
Contributor

renardeinside commented Sep 22, 2023

Problem statement

Existing processes that create RunSubmit jobs (so-called ephemeral jobs) may break once UC is enabled in a workspace (that is, once the workspace is assigned to a metastore) if the jobs have the following properties:

  • DBR > 11.X
  • [AWS-specific] Uses instance_profile
  • Has no data_security_mode property specified in the job creation request
  • [Azure-specific] Some configurations in the spark_conf (to be confirmed)
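
To make the properties listed above concrete, here is a minimal sketch of a check over a single `new_cluster` specification taken from a job creation request. It is an illustration only, not the assessment implementation: the helper names (`dbr_major`, `assess_cluster`, `is_at_risk_aws`) are hypothetical, the threshold of major version 11 is an assumption based on the "DBR 11.X" wording in this issue, and the Azure `spark_conf` condition is left out because it is still to be confirmed.

```python
import re
from typing import Any, Dict, Mapping, Optional


def dbr_major(spark_version: str) -> Optional[int]:
    """Return the leading major version of a string such as '11.3.x-scala2.12'.

    Returns None when the string does not start with '<digits>.'.
    """
    match = re.match(r"(\d+)\.", spark_version)
    return int(match.group(1)) if match else None


def assess_cluster(new_cluster: Mapping[str, Any], min_major: int = 11) -> Dict[str, bool]:
    """Evaluate each property from the problem statement separately."""
    major = dbr_major(new_cluster.get("spark_version", ""))
    aws_attributes = new_cluster.get("aws_attributes") or {}
    return {
        # Assumption: "DBR 11.X" is treated as major version >= min_major.
        "dbr_in_scope": major is not None and major >= min_major,
        "missing_data_security_mode": "data_security_mode" not in new_cluster,
        "uses_instance_profile": bool(aws_attributes.get("instance_profile_arn")),
    }


def is_at_risk_aws(new_cluster: Mapping[str, Any], min_major: int = 11) -> bool:
    """True when all AWS-relevant properties hold for this cluster spec."""
    return all(assess_cluster(new_cluster, min_major).values())
```

Keeping the three checks separate in `assess_cluster` makes it easier to report why a given request was flagged, and to add the Azure-specific condition later without changing the callers.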

Why does this happen?

When UC is enabled, all RunSubmit jobs with DBR 11.X+ will be UC-enabled by default. If there is a conflict in permissions between UC and the service principal (e.g. instance profile), the job will fail.

The reason and the change are described here.

How can we identify identical RunSubmit jobs?

This requires additional internal discussion.

TODO:

  • get a list of all (persisted) jobs
  • get a list of all job runs
  • find job runs that have no persisted job (out of workflow)
  • group all job runs from non-persisted jobs to approximate the number of unique Airflow/Azure Data Factory DAGs.
  • identify job runs that do not include data_security_mode in the job creation request and run on 11.x compute
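
The TODO steps above could be sketched roughly as follows, working on plain dictionaries that have already been fetched from the workspace. This is a hedged outline, not the agreed design: the function names are hypothetical, grouping by `run_name` is only one possible way to approximate unique DAGs, and the `cluster_spec.new_cluster` shape is a simplified assumption about how a run exposes its cluster.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def non_persisted_runs(jobs: Iterable[Mapping[str, Any]],
                       runs: Iterable[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
    """Keep only runs whose job_id does not belong to a persisted job."""
    persisted_ids = {job["job_id"] for job in jobs}
    return [run for run in runs if run.get("job_id") not in persisted_ids]


def group_runs(runs: Iterable[Mapping[str, Any]],
               key: str = "run_name") -> Dict[str, List[Mapping[str, Any]]]:
    """Group runs by a key (assumption: run_name approximates one DAG/pipeline)."""
    groups: Dict[str, List[Mapping[str, Any]]] = defaultdict(list)
    for run in runs:
        groups[run.get(key, "<unnamed>")].append(run)
    return dict(groups)


def lacks_security_mode(run: Mapping[str, Any], min_major: int = 11) -> bool:
    """True when the run's cluster is on DBR >= min_major without data_security_mode."""
    cluster = run.get("cluster_spec", {}).get("new_cluster", {})
    head = cluster.get("spark_version", "").split(".", 1)[0]
    in_scope = head.isdigit() and int(head) >= min_major
    return in_scope and "data_security_mode" not in cluster
```

A caller would chain these: filter with `non_persisted_runs`, count the keys returned by `group_runs` to estimate the number of distinct submitters, and apply `lacks_security_mode` to each remaining run to list the ones that need attention.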
@renardeinside renardeinside modified the milestones: 1 week, 1 month Sep 22, 2023
@larsgeorge-db larsgeorge-db added enhancement New feature or request to be discussed labels Sep 22, 2023
@zpappa

zpappa commented Sep 28, 2023

@nfx let's prioritize this one for next week

@nfx
Collaborator

nfx commented Sep 28, 2023

@tamilselvanveeramani is assigned to (and working on) it, as you can see.

Right, @tamilselvanveeramani ?

@pohlposition pohlposition added step/assessment go/uc/upgrade - Assessment Step step/assign metastore go/uc/upgrade Assign Metastore labels Sep 28, 2023
@nfx
Collaborator

nfx commented Oct 2, 2023

@tamilselvanveeramani could not make any progress on this one, up for the next pickup
