MLflow Pipelines Batch Scoring Template

The MLflow Batch Scoring Pipeline is an MLflow Pipeline for applying a registered MLflow model to a specified dataset.

For more information about the MLflow Batch Scoring Pipeline, check out the documentation at https://mlflow.org/docs/latest/pipelines.html#batch-scoring-pipeline. For more information about MLflow Pipelines, see https://mlflow.org/docs/latest/pipelines.html.

Installation instructions

1. Install MLflow Pipelines: pip install mlflow[pipelines]
2. Clone the MLflow Batch Scoring Pipeline template repository locally: git clone https://github.com/mlflow/mlp-batch-scoring-template.git
3. Enter the root directory of the MLflow Batch Scoring Pipeline template: cd mlp-batch-scoring-template
4. Install MLflow Batch Scoring Pipeline dependencies: pip install -r requirements.txt
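
After installing, it can help to confirm that the MLflow Pipelines module is visible to the Python interpreter you plan to use. The following is a minimal sketch, not part of the template; the helper name module_available is hypothetical, and it only checks that the modules can be located, not that every dependency in requirements.txt is satisfied.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the module `name` can be located on the current path.

    Hypothetical helper for illustration. For a dotted name such as
    "mlflow.pipelines", find_spec imports the parent package first.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when the parent package (e.g. "mlflow") is itself missing.
        return False


if __name__ == "__main__":
    # Module names taken from the install step above.
    for module in ("mlflow", "mlflow.pipelines"):
        status = "found" if module_available(module) else "MISSING"
        print(f"{module}: {status}")
```

If either line reports MISSING, the most likely cause is that pip installed into a different environment than the one running the script.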

Development Environment -- Databricks

Sync this repository with Databricks Repos, then run the notebooks/databricks notebook on a Databricks cluster running version 11.0 or greater of the Databricks Runtime (or the Databricks Runtime for Machine Learning), with workspace files support enabled.

Note: When changing pipelines on Databricks, we recommend that you either edit files on your local machine and use dbx to sync them to Databricks Repos, or edit files directly in Databricks Repos, opening a separate browser tab for each YAML file or Python code module that you wish to modify.

For the latter approach, we recommend opening at least three browser tabs:

One tab for modifying configurations in pipeline.yaml and/or profiles/{profile}.yaml
One tab for modifying step function(s) defined in steps/{step}.py
One tab for modifying and running the driver notebook (notebooks/databricks)

Development Environment -- Local machine

Jupyter

Launch the Jupyter Notebook environment via the jupyter notebook command.
Open and run the notebooks/jupyter.ipynb notebook in the Jupyter environment.

Command-Line Interface (CLI)

First, enter the template root directory via cd mlp-batch-scoring-template. Then, try running the following MLflow CLI commands to get started. Note that the --step argument is optional; pipeline commands that are run without a --step act on the entire pipeline.

export MLFLOW_PIPELINES_PROFILE=local
mlflow pipelines --help
mlflow pipelines inspect --step step_name
mlflow pipelines run --step step_name
mlflow pipelines clean --step step_name
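
The same commands can be driven from a Python script, which makes the optional --step behavior explicit. This is a minimal sketch under stated assumptions: the helper names build_pipeline_command and run_pipeline_command are hypothetical, the set of actions is limited to the three shown above, and the mlflow executable is assumed to be on your PATH.

```python
import os
import subprocess
from typing import List, Optional

# Actions shown in the CLI examples above.
ACTIONS = {"inspect", "run", "clean"}


def build_pipeline_command(action: str, step: Optional[str] = None) -> List[str]:
    """Build the argument list for an `mlflow pipelines` command.

    Hypothetical helper: when `step` is None, --step is omitted and the
    command acts on the entire pipeline.
    """
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    command = ["mlflow", "pipelines", action]
    if step is not None:
        command += ["--step", step]
    return command


def run_pipeline_command(
    action: str, step: Optional[str] = None, profile: str = "local"
) -> subprocess.CompletedProcess:
    """Run the command with MLFLOW_PIPELINES_PROFILE set, as `export` does above."""
    env = dict(os.environ, MLFLOW_PIPELINES_PROFILE=profile)
    return subprocess.run(build_pipeline_command(action, step), env=env, check=True)


if __name__ == "__main__":
    # Print the commands rather than executing them; step_name is a placeholder.
    print(" ".join(build_pipeline_command("run")))
    print(" ".join(build_pipeline_command("inspect", step="step_name")))
```

Call run_pipeline_command from the template root directory, since the CLI expects to find pipeline.yaml there.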

