Using pipestat to report results to PEPhub #125

Closed
nsheff opened this issue Dec 5, 2023 · 1 comment
Labels
enhancement New feature or request likely-solved
Comments

nsheff commented Dec 5, 2023

It would be nice if a pipestat-aware pipeline could update a result table in PEPhub somehow.

Some questions:

  1. Are result tables the same as input tables? Right now we think of PEPhub as serving input tables. Would reported results add columns to these tables, or should they go in separate "result tables"?
  2. PEPhub connects to a database with pepdbagent, which relies on SQLAlchemy. pipestat connects to a database with a DBBackend, using SQLModel. How do those link?
  3. Would we write a "PEPhubBackend" class, to go alongside the FileBackend and DBBackend? If we did that, would the reporting happen through the PEPhub API, or through a direct database connection?
    3a. The former would require PEPhub API changes, to allow users to update values via the API.
    3b. The latter would be an admin-only flow. We could already do it for our internal use cases, but it's less useful, because users couldn't use it for their own pipelines.
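
To make the trade-off in question 3 concrete, here is a rough sketch of the two options as interchangeable backends. Everything in it is hypothetical: the `ResultBackend` interface, both class names, the URL layout, and the injected `send` callable are illustration only, not existing pipestat or PEPhub code. A plain dict stands in for the database so the sketch runs without any services.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict


class ResultBackend(ABC):
    """Hypothetical minimal interface a pipestat-style backend might expose."""

    @abstractmethod
    def report(self, record_identifier: str, values: Dict[str, Any]) -> None:
        """Store the reported values for one record (e.g. one sample)."""


class PEPhubAPIBackend(ResultBackend):
    """Option 3a: report through an HTTP API (needs an update endpoint)."""

    def __init__(self, registry_path: str,
                 send: Callable[[str, Dict[str, Any]], None]) -> None:
        self.registry_path = registry_path
        # Injected transport, e.g. a thin wrapper around an authenticated
        # HTTP request. Injecting it keeps this sketch runnable offline.
        self._send = send

    def report(self, record_identifier: str, values: Dict[str, Any]) -> None:
        # Hypothetical URL layout; the real PEPhub API may look different.
        url = f"/projects/{self.registry_path}/samples/{record_identifier}"
        self._send(url, values)


class PEPhubDirectDBBackend(ResultBackend):
    """Option 3b: write straight to the database (admin-only flow)."""

    def __init__(self, table: Dict[str, Dict[str, Any]]) -> None:
        # A dict stands in for the database table in this sketch.
        self._table = table

    def report(self, record_identifier: str, values: Dict[str, Any]) -> None:
        self._table.setdefault(record_identifier, {}).update(values)


# Usage: the pipeline code is identical for both options.
sent = []
table: Dict[str, Dict[str, Any]] = {}
backends = [
    PEPhubAPIBackend("nsheff/my_demo_project",
                     send=lambda url, payload: sent.append((url, payload))),
    PEPhubDirectDBBackend(table),
]
for backend in backends:
    backend.report("sample1", {"some_attribute": "some computed result"})
```

The point of the shared interface is that the pipeline only ever calls `report`; whether the value travels through the API or a direct connection becomes a configuration choice.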

Updating a local PEP

What if a pipeline could just update its local PEP? There could be a PEPBackend. You'd point looper to the PEP as its input, and then also configure pipestat with a pointer to the same PEP, so the PEP would be both the input and the output of the pipeline. But basically, looper table already does this.
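
As a rough illustration of the local-PEP idea, the sketch below writes a reported result back into a PEP's sample table as a new column. It assumes the sample table is a CSV file with a `sample_name` column. The function name `report_to_sample_table` is made up for this example, and it uses only the standard library rather than peppy or pipestat.

```python
import csv
from pathlib import Path
from typing import Any, Dict


def report_to_sample_table(csv_path: Path, sample_name: str,
                           values: Dict[str, Any]) -> None:
    """Add or update result columns for one sample in a CSV sample table."""
    with csv_path.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # Reported attributes become new columns if they do not exist yet.
    for key in values:
        if key not in fieldnames:
            fieldnames.append(key)

    matched = False
    for row in rows:
        if row.get("sample_name") == sample_name:
            row.update({k: str(v) for k, v in values.items()})
            matched = True
    if not matched:
        raise KeyError(f"sample {sample_name!r} not found in {csv_path}")

    # Rows without a value for a new column are written with an empty cell.
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

Rewriting the whole file on every report is simple but not safe for pipelines that process samples in parallel, which is one reason a real backend would need locking or a different storage layout.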

@nsheff nsheff added the enhancement New feature or request label Dec 5, 2023
nsheff commented Feb 14, 2024

Here's an example API for this:

import pipestat

# Proposed interface only: pephub_registry_path is the new argument being suggested here.
sample_name = "my sample"

psm = pipestat.PipestatManager(
    sample_name=sample_name,
    schema_path="pipeline/output_schema.yaml",  # maybe not required?
    pephub_registry_path="nsheff/my_demo_project",  # Here's where the result will be pushed
    pipeline_type="sample",
)

result_value = "some computed result"

psm.report({"some_attribute": result_value}, sample_name=sample_name)  # adds result to PEPhub
