[FEATURE]: Detect notebook include graph by analysing dbutils.notebook.run(...) calls #1200

Closed
1 task done
nfx opened this issue Apr 1, 2024 · 3 comments
Labels
migrate/code Abstract Syntax Trees and other dark magic

Comments

@nfx
Collaborator

nfx commented Apr 1, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

From https://docs.databricks.com/en/notebooks/notebook-workflows.html:

The dbutils.notebook API is a complement to %run because it lets you pass parameters to and return values from a notebook. This allows you to build complex workflows and pipelines with dependencies. For example, you can get a list of files in a directory and pass the names to another notebook, which is not possible with %run. You can also create if-then-else workflows based on return values or call other notebooks using relative paths. Unlike %run, the dbutils.notebook.run() method starts a new job to run the notebook. These methods, like all of the dbutils APIs, are available only in Python and Scala. However, you can use dbutils.notebook.run() to invoke an R notebook.

Proposed Solution

Scan Python code for dbutils.notebook.run(...) calls and treat each called notebook as a dependency.
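A minimal sketch of what such a scan could look like, using only Python's standard `ast` module. The function name `find_notebook_run_paths` is hypothetical and is not taken from the UCX codebase. It assumes the notebook path is the first positional argument, and it reports `None` when that argument is not a string literal, since a computed path cannot be resolved statically.

```python
import ast
from typing import List, Optional, Tuple


def find_notebook_run_paths(source: str) -> List[Tuple[int, Optional[str]]]:
    """Return (line number, path) for each dbutils.notebook.run(...) call.

    The path is None when the first positional argument is missing or is
    not a string literal (hypothetical convention for this sketch).
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match the attribute chain dbutils.notebook.run exactly.
        if not (isinstance(func, ast.Attribute) and func.attr == "run"):
            continue
        owner = func.value
        if not (isinstance(owner, ast.Attribute) and owner.attr == "notebook"):
            continue
        if not (isinstance(owner.value, ast.Name) and owner.value.id == "dbutils"):
            continue
        path = None
        if node.args:
            first = node.args[0]
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                path = first.value
        found.append((node.lineno, path))
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(found, key=lambda item: item[0])
```

Each literal path found this way could become an edge in the include graph; calls with computed paths would need to be flagged rather than resolved.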

Additional Context

No response

@nfx nfx added the migrate/code Abstract Syntax Trees and other dark magic label Apr 1, 2024
@nfx nfx added this to UCX Apr 1, 2024
@github-project-automation github-project-automation bot moved this to Triage in UCX Apr 1, 2024
@ericvergnaud
Contributor

What about Scala code?

@nfx
Collaborator Author

nfx commented Apr 2, 2024

We don't have a Scala parser yet. For now, let's create a higher-level epic and postpone this until the Python and SQL migrations are fully operational. Scala also has a good, almost built-in parser that is more type-safe, so it is just a matter of priorities.

In this task, let's yield an Advisory when a non-Python, non-SQL notebook gets called.
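As a rough illustration of that rule, assuming the language of each called notebook is already known: the `Advisory` dataclass, the `unsupported-language` code, and the `advise_on_callees` helper below are all hypothetical stand-ins for this sketch and do not reflect the actual UCX types.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

# Languages the linter can currently analyse, per the comment above.
SUPPORTED_LANGUAGES = {"python", "sql"}


@dataclass(frozen=True)
class Advisory:
    """Hypothetical stand-in for a lint advisory record."""
    code: str
    message: str


def advise_on_callees(callees: Iterable[Tuple[str, str]]) -> Iterator[Advisory]:
    """Yield an Advisory for each called notebook in an unsupported language.

    Each callee is a (path, language) pair; how the language is
    determined is outside the scope of this sketch.
    """
    for path, language in callees:
        if language.lower() not in SUPPORTED_LANGUAGES:
            yield Advisory(
                code="unsupported-language",
                message=f"Cannot analyse {language} notebook: {path}",
            )
```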

@nfx
Collaborator Author

nfx commented Apr 22, 2024

closed in:

@nfx nfx closed this as completed Apr 22, 2024
@github-project-automation github-project-automation bot moved this from Triage to Archive in UCX Apr 22, 2024