What needs to happen?
This issue is to track and refer to other issues/PRs for various Prism features. This issue generally shouldn't be commented on; instead, this top entry should be edited as needed, referring to granular issues for individual features and support.
Ultimately, this will track support in the Beam Compatibility Matrix, and keep the Prism README up to date.
Completed items should be checked, and have links to their completing PR or closed primary tracking issue.
Items marked should only have an issue filed when the work has started: typically there's a meaningful design proposal and an understanding of what the closing criteria are. These can be "X set of existing SDK tests now pass", or that a given capability is possible (e.g. UI-related features).
Prism Areas for Contribution
Beam Core Priorities
These are features that prevent Prism use and adoption.
In progress by @lostluck
Beam Feature Burn Down (from Java and Python Validates Runner Tests)
The goal in this section is to correctly implement Beam features in Prism such that the Validates Runner suites of each SDK pass. The following issues were produced by examining the test output of the failing tests. This list will be refined as new failures are discovered and the initial blocking features are implemented.
Features
Metrics
Windowing
Various Coders or PreProcessing issues.
State and Timer issues.
Next Steps - only once the filtered suite fully passes.
Non-Go Blockers
Notable issues found in trying to run the non-Go SDKs (Java, Python, or others). Tracked in #28187, and more granular issues should be referred to here.
Other Beam Core
This is an incomplete list of Beam features that would be nice to have.
Persistence & Reliability Features
Prism currently stores everything in memory: all element data, in-progress bundle data, pipeline info, artifacts, etc. This is fast, but not the best use of memory when using Prism long term as a standalone runner.
Per-pipeline data should be moved to a local file cache (a rough sketch appears below, after these items):
It isn't stored in memory when not needed. E.g. artifacts shouldn't live in memory once the necessary environments are spun up.
Garbage collect artifacts after pipeline termination.
Garbage collect older pipelines after some threshold.
Separate Prism management logs and pipeline logs, with rolling log files.
Optimized stages need to be stored, so no complex mapping needs to occur for any persisted state.
Per-stage pending elements and state need to be stored so bundles can be re-computed on restarts.
It should be possible for a pipeline to be aborted and Prism torn down, and then for that pipeline to be restarted from where it left off, with new worker processes.
FrostDB is an embeddable-in-Go, write-optimized, in-memory + persistence, columnar database that might be worth looking at to enable these features.
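As a concrete illustration of the file cache and garbage collection items above, here is a minimal Go sketch. It is hypothetical and not Prism's actual code; the package, type, and method names are all invented for this example.

```go
// Hypothetical sketch only, not Prism's actual code: a per-pipeline on-disk
// cache so artifacts and other per-pipeline data don't have to stay in memory,
// with garbage collection on termination and by age threshold.
package sketch

import (
	"os"
	"path/filepath"
	"time"
)

// PipelineCache stores per-pipeline blobs under root/<jobID>/<key>.
type PipelineCache struct {
	root string
}

func NewPipelineCache(root string) (*PipelineCache, error) {
	if err := os.MkdirAll(root, 0o755); err != nil {
		return nil, err
	}
	return &PipelineCache{root: root}, nil
}

// Put spills a blob (e.g. a staged artifact) to disk so it no longer needs to
// be held in memory.
func (c *PipelineCache) Put(jobID, key string, data []byte) error {
	dir := filepath.Join(c.root, jobID)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, key), data, 0o644)
}

// Get reads a blob back on demand.
func (c *PipelineCache) Get(jobID, key string) ([]byte, error) {
	return os.ReadFile(filepath.Join(c.root, jobID, key))
}

// DropPipeline garbage collects everything cached for a terminated pipeline.
func (c *PipelineCache) DropPipeline(jobID string) error {
	return os.RemoveAll(filepath.Join(c.root, jobID))
}

// DropOlderThan garbage collects pipelines whose cache directory hasn't been
// modified within the given threshold.
func (c *PipelineCache) DropOlderThan(threshold time.Duration) error {
	entries, err := os.ReadDir(c.root)
	if err != nil {
		return err
	}
	for _, e := range entries {
		info, err := e.Info()
		if err != nil {
			continue
		}
		if time.Since(info.ModTime()) > threshold {
			if err := os.RemoveAll(filepath.Join(c.root, e.Name())); err != nil {
				return err
			}
		}
	}
	return nil
}
```

A real implementation would also need to cover the other items above, such as persisting optimized stages and per-stage pending elements so bundles can be rebuilt after a restart.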
Bundle Retries
Prism currently doesn't retry failed bundles. A bundle failure fails the pipeline.
Adding a sensible retry policy would improve bundle reliability.
This affects how elements are divided into bundles, and how they are scheduled.
E.g. a failed bundle could be split into smaller and smaller bundles until the failing elements are isolated. Such a strategy would also enable implementation of error tolerance policies, for example (see the sketch below).
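To illustrate the bisection idea, here is a hedged Go sketch. It is not Prism's scheduler code; the processBundle type, the budget parameter, and the function name are invented for this example. On failure, a bundle is split in half and each half is retried, recursing until single failing elements are isolated or the retry budget runs out.

```go
// Hypothetical sketch only, not Prism's scheduler: retry a failed bundle by
// bisecting it until the failing elements are isolated or a retry budget is
// exhausted. All names are invented for this example.
package sketch

import "errors"

// processBundle stands in for whatever actually executes a bundle of elements.
type processBundle func(elements [][]byte) error

var errRetriesExhausted = errors.New("bundle retries exhausted")

// runWithBisection runs the bundle; on failure it splits the bundle in half
// and recurses on each half. It returns the isolated failing elements, or an
// error once the retry budget runs out.
func runWithBisection(run processBundle, elements [][]byte, budget int) (failed [][]byte, err error) {
	if budget <= 0 {
		return nil, errRetriesExhausted
	}
	if run(elements) == nil {
		return nil, nil // The bundle succeeded; nothing to isolate.
	}
	if len(elements) == 1 {
		return elements, nil // Cannot split further: a confirmed failing element.
	}
	mid := len(elements) / 2
	for _, half := range [][][]byte{elements[:mid], elements[mid:]} {
		sub, err := runWithBisection(run, half, budget-1)
		if err != nil {
			return nil, err
		}
		failed = append(failed, sub...)
	}
	return failed, nil
}
```

The budget bounds the extra work a retry policy can incur: bisection re-runs elements that already succeeded, so a real policy would likely cap both the recursion depth and the total number of bundle attempts.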
Improve (static) Bundle Splitting
Prism currently schedules all available pending elements into a single bundle.
Instead, it could use a heuristic to split pending elements into multiple bundles, improving worker-level parallelism before channel or sub-element splitting occurs (see the sketch below).
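One possible heuristic, shown as a hedged Go sketch (not Prism's implementation; the function and parameter names are invented): cap the number of bundles at the available parallelism while enforcing a minimum bundle size so per-bundle overhead doesn't dominate.

```go
// Hypothetical sketch only, not Prism's implementation: statically split the
// pending elements into several bundles up front instead of one large bundle.
package sketch

// splitPending divides the pending elements into at most maxParallelism
// bundles, while keeping each bundle at or above minBundleSize elements.
// Bundles are returned in element order.
func splitPending(pending [][]byte, maxParallelism, minBundleSize int) [][][]byte {
	if len(pending) == 0 || maxParallelism < 1 {
		return nil
	}
	if minBundleSize < 1 {
		minBundleSize = 1
	}
	n := maxParallelism
	if limit := len(pending) / minBundleSize; limit < n {
		n = limit
	}
	if n < 1 {
		n = 1
	}
	size := (len(pending) + n - 1) / n // ceiling division: elements per bundle
	var bundles [][][]byte
	for start := 0; start < len(pending); start += size {
		end := start + size
		if end > len(pending) {
			end = len(pending)
		}
		bundles = append(bundles, pending[start:end])
	}
	return bundles
}
```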
SDK-side logs sent to the runner via the Beam logging service should be available via the API. This may just require an SDK-side change, or at least a toggle on the Prism standalone binary.
Performance features
These are non-user-facing Beam features that Dataflow implements. For Prism to serve the purpose of validating pipelines locally before production runner execution, these are required to reduce worker-side execution differences.
Stand Alone UI Based Features
These are features that are best tied to the ability to understand a job in the UI.
Other features
The following are known issues/desires without a specific categorization at present.
Completed Work
This section should be structured similarly to the Beam Compatibility Matrix for ease of transition to populating it there.