[FEATURE]: Detect usage of streaming checkpoint in Python code onto DBFS or other storage #1103

Closed
1 task done
Tracked by #1085
nfx opened this issue Mar 25, 2024 · 1 comment
Labels
migrate/code Abstract Syntax Trees and other dark magic

Comments

@nfx
Collaborator

nfx commented Mar 25, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

In code such as

# Function to create and setup a new StreamingContext
def functionToCreateContext():
    sc = SparkContext(...)  # new context
    ssc = StreamingContext(...)
    lines = ssc.socketTextStream(...)  # create DStreams
    ...
    ssc.checkpoint(checkpointDirectory)  # set checkpoint directory
    return ssc

# Get StreamingContext from checkpoint data or create a new one
context = StreamingContext.getOrCreate(checkpointDirectory, functionToCreateContext)

the checkpoint directory can be on a DBFS mount point or in a storage account.

Proposed Solution

Raise an alert when Python code sets a streaming checkpoint directory on DBFS or other storage.
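
A minimal sketch of how such a check could look, using Python's standard `ast` module. This is an illustration under stated assumptions, not how UCX implements it: the function name `find_streaming_checkpoints`, the `SUSPECT_PREFIXES` list, and the sample path are all hypothetical. It matches any `.checkpoint(...)` or `.getOrCreate(...)` call by attribute name alone and resolves only string literals and simple module-level or local `name = "literal"` assignments, so it can both miss cases and flag unrelated methods that share these names.

```python
import ast

# Hypothetical prefixes treated as DBFS or direct cloud storage locations.
SUSPECT_PREFIXES = ("dbfs:/", "/dbfs/", "/mnt/", "s3://", "s3a://", "abfss://", "wasbs://", "gs://")


def find_streaming_checkpoints(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, message) pairs for checkpoint calls found in source."""
    tree = ast.parse(source)

    # Collect simple `name = "literal"` assignments so a variable argument can be resolved.
    constants = {}
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        ):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    constants[target.id] = node.value.value

    alerts = []
    for node in ast.walk(tree):
        # Only method-style calls such as ssc.checkpoint(...) are considered.
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Attribute):
            continue
        if node.func.attr not in ("checkpoint", "getOrCreate") or not node.args:
            continue
        arg = node.args[0]
        if isinstance(arg, ast.Constant):
            if not isinstance(arg.value, str):
                continue  # e.g. DataFrame.checkpoint(True) takes a flag, not a path
            path = arg.value
        elif isinstance(arg, ast.Name):
            path = constants.get(arg.id)
        else:
            path = None
        if path is None:
            alerts.append((node.lineno, f"{node.func.attr}: checkpoint directory could not be resolved"))
        elif path.startswith(SUSPECT_PREFIXES):
            alerts.append((node.lineno, f"{node.func.attr}: checkpoint directory on {path}"))
    return sorted(alerts)


# The sample is only parsed, never executed; the path is made up for illustration.
sample = '''
checkpointDirectory = "dbfs:/mnt/checkpoints/app"

def functionToCreateContext():
    ssc = StreamingContext(sc, 1)
    ssc.checkpoint(checkpointDirectory)
    return ssc

context = StreamingContext.getOrCreate(checkpointDirectory, functionToCreateContext)
'''

for line, message in find_streaming_checkpoints(sample):
    print(f"line {line}: {message}")
```

Reporting unresolved arguments separately, rather than dropping them, keeps cases like the `checkpointDirectory` variable in the snippet above visible even when its value is computed elsewhere.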

Additional Context

Related issues:

nfx added the enhancement (New feature or request) and needs-triage labels on Mar 25, 2024
nfx added this to UCX on Mar 25, 2024
github-project-automation bot moved this to Triage in UCX on Mar 25, 2024
nfx added the migrate/code (Abstract Syntax Trees and other dark magic) label and removed the enhancement (New feature or request) and needs-triage labels on Mar 25, 2024
nfx moved this from Triage to Quarter Backlog in UCX on Apr 10, 2024
nfx moved this from Quarter Backlog to Month Backlog in UCX on Jul 4, 2024
nfx moved this from Month Backlog to Active Backlog in UCX on Jul 4, 2024
@nfx
Collaborator Author

nfx commented Nov 5, 2024

Duplicate of #492

nfx marked this as a duplicate of #492 on Nov 5, 2024
nfx closed this as completed on Nov 5, 2024
github-project-automation bot moved this from Active Backlog to Archive in UCX on Nov 5, 2024
nfx removed this from UCX on Nov 5, 2024