Commit 729da27

Build docs for stable branch and make default (#1116)
* Update Merlin links from main to stable branch

* Match only release tags for docs builds

* Setup local branches for docs multi build

* Setup docs redirect page to link to stable branch

* Remove stable reference from docs link in readme (rely on redirect)

---------

Co-authored-by: edknv <[email protected]>
oliverholworthy and edknv authored Jun 12, 2023
1 parent cba6697 commit 729da27
Showing 21 changed files with 78 additions and 72 deletions.
6 changes: 5 additions & 1 deletion .github/workflows/docs-sched-rebuild.yaml
@@ -26,6 +26,10 @@ jobs:
 - name: Install dependencies
   run: |
     python -m pip install --upgrade pip tox
+- name: Setup local branches for docs build
+  run: |
+    git branch --track main origin/main || true
+    git branch --track stable origin/stable || true
 - name: Building docs (multiversion)
   run: |
     tox -vv -e docs-multi
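The new step exists because sphinx-multiversion matches `smv_branch_whitelist` against local branches, while a CI checkout typically leaves `main` and `stable` available only as `origin/*` remote-tracking refs. A minimal Python sketch (not part of the commit) that reproduces the preparation and checks it against the whitelist added to `docs/source/conf.py` below, assuming a clone that has both remote branches:

```python
import re
import subprocess

# Same commands as the workflow step; `|| true` becomes check=False so an
# already-existing local branch does not abort the script.
for branch in ("main", "stable"):
    subprocess.run(["git", "branch", "--track", branch, f"origin/{branch}"], check=False)

# List local branches and filter them the way sphinx-multiversion will.
local_branches = subprocess.check_output(
    ["git", "branch", "--format=%(refname:short)"]
).decode().split()

smv_branch_whitelist = r"^(main|stable)$"  # value added to docs/source/conf.py in this commit
print([b for b in local_branches if re.match(smv_branch_whitelist, b)])
```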
@@ -83,7 +87,7 @@ jobs:
   exit 0
 fi
 # If any of these commands fail, fail the build.
-def_branch=$(gh api "repos/${GITHUB_REPOSITORY}" --jq ".default_branch")
+def_branch="stable"
 html_url=$(gh api "repos/${GITHUB_REPOSITORY}/pages" --jq ".html_url")
 cat > index.html << EOF
 <!DOCTYPE html>
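The second hunk hard-codes `stable` as the target of the top-level redirect page instead of querying the repository's default branch through the `gh` API. The heredoc that writes `index.html` is truncated in this view; the sketch below only approximates, in Python, the kind of meta-refresh page such a step generates, with a placeholder URL standing in for the `gh api` lookup:

```python
# Illustrative only: approximates the truncated `cat > index.html << EOF` heredoc.
# The real workflow writes the file from shell; the exact markup may differ.
def_branch = "stable"
html_url = "https://nvidia-merlin.github.io/models/"  # placeholder for the gh api pages lookup

redirect_page = f"""<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="refresh" content="0; URL={html_url}{def_branch}/">
    <title>Redirecting to {html_url}{def_branch}/</title>
  </head>
  <body>
    <p>Redirecting to <a href="{html_url}{def_branch}/">{html_url}{def_branch}/</a>.</p>
  </body>
</html>
"""

with open("index.html", "w", encoding="utf-8") as f:
    f.write(redirect_page)
```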
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -23,7 +23,7 @@ into three categories:

### Your first issue

-1. Read the project's [README.md](https://github.com/NVIDIA-Merlin/models/blob/main/README.md)
+1. Read the project's [README.md](https://github.com/NVIDIA-Merlin/models/blob/stable/README.md)
to learn how to setup the development environment.
2. Find an issue to work on. The best way is to look for the [good first issue](https://github.com/NVIDIA-Merlin/models/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
or [help wanted](https://github.com/NVIDIA-Merlin/models/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) labels.
@@ -116,7 +116,7 @@ deep_block: Block
```

The [Intersphinx](https://docs.readthedocs.io/en/stable/guides/intersphinx.html)
-extension truncates the text to [Schema](https://nvidia-merlin.github.io/core/main/api/merlin.schema.html)
+extension truncates the text to [Schema](https://nvidia-merlin.github.io/core/stable/api/merlin.schema.html)
and makes it a link.

## Attribution
16 changes: 8 additions & 8 deletions README.md
@@ -2,7 +2,7 @@

[![PyPI version shields.io](https://img.shields.io/pypi/v/merlin-models.svg)](https://pypi.python.org/pypi/merlin-models/)
![GitHub License](https://img.shields.io/github/license/NVIDIA-Merlin/models)
-[![Documentation](https://img.shields.io/badge/documentation-blue.svg)](https://nvidia-merlin.github.io/models/main/)
+[![Documentation](https://img.shields.io/badge/documentation-blue.svg)](https://nvidia-merlin.github.io/models/)

The Merlin Models library provides standard models for recommender systems with an aim for high-quality implementations
that range from classic machine learning models to highly-advanced deep learning models.
@@ -17,7 +17,7 @@ In our initial releases, Merlin Models features a TensorFlow API. The PyTorch AP

### Benefits of Merlin Models

-**[RecSys model implementations](https://nvidia-merlin.github.io/models/main/models_overview.html)** - The library provides a high-level API for classic and state-of-the-art deep learning architectures for recommender models.
+**[RecSys model implementations](https://nvidia-merlin.github.io/models/stable/models_overview.html)** - The library provides a high-level API for classic and state-of-the-art deep learning architectures for recommender models.
These models include both retrieval (e.g. Matrix Factorization, Two tower, YouTube DNN, ..) and ranking (e.g. DLRM, DCN-v2, DeepFM, ...) models.

**Building blocks** - Within Merlin Models, recommender models are built on reusable building blocks.
@@ -28,7 +28,7 @@ The library provides model definition blocks (MLP layers, factorization layers,
For example, models depend on NVTabular for pre-processing and integrate easily with Merlin Systems for inference.
The thoughtfully-designed integration makes it straightforward to build performant end-to-end RecSys pipelines.

-**[Merlin Models DataLoaders](https://nvidia-merlin.github.io/models/main/api.html#loader-utility-functions)** - Merlin provides seamless integration with common deep learning frameworks, such as TensorFlow, PyTorch, and HugeCTR.
+**[Merlin Models DataLoaders](https://nvidia-merlin.github.io/models/stable/api.html#loader-utility-functions)** - Merlin provides seamless integration with common deep learning frameworks, such as TensorFlow, PyTorch, and HugeCTR.
When training deep learning recommender system models, data loading can be a bottleneck.
To address the challenge, Merlin has custom, highly-optimized dataloaders to accelerate existing TensorFlow and PyTorch training pipelines.
The Merlin dataloaders can lead to a speedup that is nine times faster than the same training pipeline used with the GPU.
@@ -40,7 +40,7 @@ With the Merlin dataloaders, you can:
- Prepare batches asynchronously into the GPU to avoid CPU-to-GPU communication.
- Integrate easily into existing TensorFlow or PyTorch training pipelines by using a similar API.

-To learn about the core features of Merlin Models, see the [Models Overview](https://nvidia-merlin.github.io/models/main/models_overview.html) page.
+To learn about the core features of Merlin Models, see the [Models Overview](https://nvidia-merlin.github.io/models/stable/models_overview.html) page.

### Installation

@@ -59,7 +59,7 @@ pip install merlin-models

Merlin Models is included in the Merlin Containers.

-Refer to the [Merlin Containers](https://nvidia-merlin.github.io/Merlin/main/containers.html) documentation page for information about the Merlin container names, URLs to the container images on the NVIDIA GPU Cloud catalog, and key Merlin components.
+Refer to the [Merlin Containers](https://nvidia-merlin.github.io/Merlin/stable/containers.html) documentation page for information about the Merlin container names, URLs to the container images on the NVIDIA GPU Cloud catalog, and key Merlin components.

#### Installing Merlin Models from Source

@@ -75,7 +75,7 @@ cd models && pip install -e .
Merlin Models makes it straightforward to define architectures that adapt to different input features.
This adaptability is provided by building on a core feature of the NVTabular library.
When you use NVTabular for feature engineering, NVTabular creates a schema that identifies the input features.
-You can see the `Schema` object in action by looking at the [From ETL to Training RecSys models - NVTabular and Merlin Models integrated example](https://nvidia-merlin.github.io/models/main/examples/02-Merlin-Models-and-NVTabular-integration.html) example notebook.
+You can see the `Schema` object in action by looking at the [From ETL to Training RecSys models - NVTabular and Merlin Models integrated example](https://nvidia-merlin.github.io/models/stable/examples/02-Merlin-Models-and-NVTabular-integration.html) example notebook.

You can easily build popular RecSys architectures like [DLRM](http://arxiv.org/abs/1906.00091), as shown in the following code sample.
After you define the model, you can train and evaluate it with a typical Keras model.
@@ -107,11 +107,11 @@ eval_metrics = model.evaluate(valid, batch_size=1024, return_dict=True)
The target binary feature is also inferred from the schema (i.e., tagged as 'TARGET').

You can find more details and information about a low-level API in our overview of the
-[Deep Learning Recommender Model](https://nvidia-merlin.github.io/models/main/models_overview.html#deep-learning-recommender-model).
+[Deep Learning Recommender Model](https://nvidia-merlin.github.io/models/stable/models_overview.html#deep-learning-recommender-model).

### Notebook Examples and Tutorials

-View the example notebooks in the [documentation](https://nvidia-merlin.github.io/models/main/examples/README.html) to help you become familiar with Merlin Models.
+View the example notebooks in the [documentation](https://nvidia-merlin.github.io/models/stable/examples/README.html) to help you become familiar with Merlin Models.

The same notebooks are available in the `examples` directory from the [Merlin Models](https://github.com/NVIDIA-Merlin/models) GitHub repository.

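The `@@ -75,7 +75,7 @@` hunk above sits just before the README's DLRM code sample, which GitHub truncates in this view (only `eval_metrics = model.evaluate(valid, batch_size=1024, return_dict=True)` survives in the next hunk header). For orientation, a sketch of such a definition with the Merlin Models TensorFlow API; the block sizes, optimizer, and the explicit `BinaryClassificationTask("click")` target are illustrative assumptions (the README text notes the actual sample infers the target from the schema):

```python
import merlin.models.tf as mm
from merlin.io import Dataset

# Parquet data preprocessed with NVTabular, so each Dataset carries a schema.
train = Dataset("train/*.parquet")
valid = Dataset("valid/*.parquet")

model = mm.DLRMModel(
    train.schema,                          # input features are inferred from the schema
    embedding_dim=64,
    bottom_block=mm.MLPBlock([128, 64]),
    top_block=mm.MLPBlock([128, 64, 32]),
    prediction_tasks=mm.BinaryClassificationTask("click"),  # assumed target column name
)

model.compile(optimizer="adagrad", run_eagerly=False)
model.fit(train, batch_size=1024)
eval_metrics = model.evaluate(valid, batch_size=1024, return_dict=True)  # quoted in the hunk header
```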
4 changes: 2 additions & 2 deletions docs/README.md
@@ -102,7 +102,7 @@ the link is to the repository:

```markdown
Refer to the sample Python programs in the
-[examples/blah](https://github.com/NVIDIA-Merlin/models/tree/main/examples/blah)
+[examples/blah](https://github.com/NVIDIA-Merlin/models/tree/stable/examples/blah)
directory of the repository.
```

@@ -139,7 +139,7 @@ a relative path works both in the HTML docs page and in the repository browsing
Use a link to the HTML page like the following:

```markdown
-<https://nvidia-merlin.github.io/NVTabular/main/Introduction.html>
+<https://nvidia-merlin.github.io/NVTabular/stable/Introduction.html>
```

> I'd like to change this in the future. My preference would be to use a relative
8 changes: 5 additions & 3 deletions docs/source/conf.py
@@ -27,6 +27,7 @@
 # documentation root, use os.path.abspath to make it absolute, like shown here.
 #
 import os
+import re
 import subprocess
 import sys
 
@@ -115,24 +116,25 @@
 
 if os.path.exists(gitdir):
     tag_refs = subprocess.check_output(["git", "tag", "-l", "v*"]).decode("utf-8").split()
+    tag_refs = [tag for tag in tag_refs if re.match(r"^v[0-9]+.[0-9]+.[0-9]+$", tag)]
     tag_refs = natsorted(tag_refs)[-6:]
     smv_tag_whitelist = r"^(" + r"|".join(tag_refs) + r")$"
 else:
     smv_tag_whitelist = r"^v.*$"
 
-smv_branch_whitelist = r"^main$"
+smv_branch_whitelist = r"^(main|stable)$"
 
 smv_refs_override_suffix = r"-docs"
 
 html_sidebars = {"**": ["versions.html"]}
-html_baseurl = "https://nvidia-merlin.github.io/models/main"
+html_baseurl = "https://nvidia-merlin.github.io/models/stable/"
 
 intersphinx_mapping = {
     "python": ("https://docs.python.org/3", None),
     "cudf": ("https://docs.rapids.ai/api/cudf/stable/", None),
     "distributed": ("https://distributed.dask.org/en/latest/", None),
     "torch": ("https://pytorch.org/docs/stable/", None),
-    "merlin-core": ("https://nvidia-merlin.github.io/core/main/", None),
+    "merlin-core": ("https://nvidia-merlin.github.io/core/stable/", None),
 }
 
 autodoc_inherit_docstrings = False
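The new `re.match` filter implements the "Match only release tags for docs builds" item from the commit message: pre-release or otherwise suffixed tags no longer reach the sphinx-multiversion whitelist, and only the last six release tags are kept. A standalone check of the same expressions (the tag names below are made up for illustration):

```python
import re

from natsort import natsorted  # conf.py already depends on natsorted

candidate_tags = ["v0.9.0", "v23.02.00", "v23.04.00", "v23.04.00rc0", "nightly-2023-06-12"]

# Same pattern as conf.py. The dots are unescaped, so they match any character,
# but the anchored "$" still rejects suffixed tags such as "v23.04.00rc0".
release_tags = [tag for tag in candidate_tags if re.match(r"^v[0-9]+.[0-9]+.[0-9]+$", tag)]

latest = natsorted(release_tags)[-6:]            # ['v0.9.0', 'v23.02.00', 'v23.04.00']
smv_tag_whitelist = r"^(" + r"|".join(latest) + r")$"
print(smv_tag_whitelist)                         # ^(v0.9.0|v23.02.00|v23.04.00)$
```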
2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -15,7 +15,7 @@ Merlin Models GitHub Repository

About Merlin
Merlin is the overarching project that brings together the Merlin projects.
-See the `documentation <https://nvidia-merlin.github.io/Merlin/main/README.html>`_
+See the `documentation <https://nvidia-merlin.github.io/Merlin/stable/README.html>`_
or the `repository <https://github.com/NVIDIA-Merlin/Merlin>`_ on GitHub.

Developer website for Merlin
2 changes: 1 addition & 1 deletion examples/02-Merlin-Models-and-NVTabular-integration.ipynb
@@ -1409,7 +1409,7 @@
"\n",
"In the next notebooks, we will explore multiple ranking models with Merlin Models.\n",
"\n",
"You can learn more about NVTabular, its functionality and supported ops by visiting our [github repository](https://github.com/NVIDIA-Merlin/NVTabular/) or exploring the [examples](https://github.com/NVIDIA-Merlin/NVTabular/tree/main/examples), such as [`Getting Started MovieLens`](https://github.com/NVIDIA-Merlin/NVTabular/blob/main/examples/getting-started-movielens/02-ETL-with-NVTabular.ipynb) or [`Scaling Criteo`](https://github.com/NVIDIA-Merlin/NVTabular/tree/main/examples/scaling-criteo)."
"You can learn more about NVTabular, its functionality and supported ops by visiting our [github repository](https://github.com/NVIDIA-Merlin/NVTabular/) or exploring the [examples](https://github.com/NVIDIA-Merlin/NVTabular/tree/stable/examples), such as [`Getting Started MovieLens`](https://github.com/NVIDIA-Merlin/NVTabular/blob/stable/examples/getting-started-movielens/02-ETL-with-NVTabular.ipynb) or [`Scaling Criteo`](https://github.com/NVIDIA-Merlin/NVTabular/tree/stable/examples/scaling-criteo)."
]
}
],
4 changes: 2 additions & 2 deletions examples/03-Exploring-different-models.ipynb
@@ -47,7 +47,7 @@
"\n",
"In this example, we'll demonstrate how to build and train several popular deep learning-based ranking model architectures. Merlin Models provides a high-level API to define those architectures, but allows for customization as they are composed by reusable building blocks.\n",
"\n",
"In this example notebook, we use for training and evaluation synthetic data that mimics the schema (features and cardinalities) of [Ali-CCP dataset](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1): Alibaba Click and Conversion Prediction dataset. The Ali-CCP is a dataset gathered from real-world traffic logs of the recommender system in Taobao, the largest online retail platform in the world. To download the raw Ali-CCP training and test datasets visit [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can get the raw dataset via this [get_aliccp() function](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/datasets/ecommerce/aliccp/dataset.py#L43) and generate the parquet files from it to be used in this example.\n",
"In this example notebook, we use for training and evaluation synthetic data that mimics the schema (features and cardinalities) of [Ali-CCP dataset](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1): Alibaba Click and Conversion Prediction dataset. The Ali-CCP is a dataset gathered from real-world traffic logs of the recommender system in Taobao, the largest online retail platform in the world. To download the raw Ali-CCP training and test datasets visit [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can get the raw dataset via this [get_aliccp() function](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/datasets/ecommerce/aliccp/dataset.py#L43) and generate the parquet files from it to be used in this example.\n",
"\n",
"### Learning objectives\n",
"- Preparing the data with NVTabular\n",
@@ -432,7 +432,7 @@
}
},
"source": [
"We're ready to start training, for that, we create our dataset objects, and under the hood we use Merlin `Loader` class for reading chunks of parquet files. `Loader` asynchronously iterate through CSV or Parquet dataframes on GPU by leveraging an NVTabular `Dataset`. To read more about Merlin optimized dataloaders visit [here](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/models/tf/dataset.py#L141)."
"We're ready to start training, for that, we create our dataset objects, and under the hood we use Merlin `Loader` class for reading chunks of parquet files. `Loader` asynchronously iterate through CSV or Parquet dataframes on GPU by leveraging an NVTabular `Dataset`. To read more about Merlin optimized dataloaders visit [here](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/models/tf/dataset.py#L141)."
]
},
{
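The second hunk in this notebook points at the Merlin `Loader` that feeds batches to `model.fit`. A brief sketch of using the dataloader directly, assuming a recent Merlin Models release that exposes it as `mm.Loader` and preprocessed parquet data; the keyword arguments shown are common ones, not the full signature:

```python
import merlin.models.tf as mm
from merlin.io import Dataset

train = Dataset("train/*.parquet")

# The loader streams GPU batches from the parquet files; models accept either the
# Dataset directly or an explicit Loader when you want control over batching/shuffling.
loader = mm.Loader(train, batch_size=1024, shuffle=True)

features, target = next(iter(loader))  # one batch as (inputs dict, target tensor)
```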
6 changes: 3 additions & 3 deletions examples/04-Exporting-ranking-models.ipynb
@@ -141,7 +141,7 @@
"source": [
"We use the synthetic train and test datasets generated by mimicking the real [Ali-CCP: Alibaba Click and Conversion Prediction](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1) dataset to build our recommender system ranking models. \n",
"\n",
"If you would like to use real Ali-CCP dataset instead, you can download the training and test datasets on [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can then use [get_aliccp()](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/datasets/ecommerce/aliccp/dataset.py#L43) function to curate the raw csv files and save them as parquet files."
"If you would like to use real Ali-CCP dataset instead, you can download the training and test datasets on [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can then use [get_aliccp()](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/datasets/ecommerce/aliccp/dataset.py#L43) function to curate the raw csv files and save them as parquet files."
]
},
{
@@ -459,7 +459,7 @@
}
},
"source": [
"In this example, we build, train, and export a Deep Learning Recommendation Model [(DLRM)](https://arxiv.org/abs/1906.00091) architecture. To learn more about how to train different deep learning models, how easily transition from one model to another and the seamless integration between data preparation and model training visit [03-Exploring-different-models.ipynb](https://github.com/NVIDIA-Merlin/models/blob/main/examples/03-Exploring-different-models.ipynb) notebook."
"In this example, we build, train, and export a Deep Learning Recommendation Model [(DLRM)](https://arxiv.org/abs/1906.00091) architecture. To learn more about how to train different deep learning models, how easily transition from one model to another and the seamless integration between data preparation and model training visit [03-Exploring-different-models.ipynb](https://github.com/NVIDIA-Merlin/models/blob/stable/examples/03-Exploring-different-models.ipynb) notebook."
]
},
{
@@ -693,7 +693,7 @@
"source": [
"We trained and exported our ranking model and NVTabular workflow. In the next step, we will learn how to deploy our trained DLRM model into [Triton Inference Server](https://github.com/triton-inference-server/server) with [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library. NVIDIA Triton Inference Server (TIS) simplifies the deployment of AI models at scale in production. TIS provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. It supports a number of different machine learning frameworks such as TensorFlow and PyTorch.\n",
"\n",
"For the next step, visit [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library and execute [Serving-Ranking-Models-With-Merlin-Systems](https://github.com/NVIDIA-Merlin/systems/blob/main/examples/Serving-Ranking-Models-With-Merlin-Systems.ipynb) notebook to deploy our saved DLRM and NVTabular workflow models as an ensemble to TIS and obtain prediction results for a qiven request. In doing so, you need to mount the saved DLRM and NVTabular workflow to the inference container following the instructions in the [README.md](https://github.com/NVIDIA-Merlin/systems/blob/main/examples/README.md)."
"For the next step, visit [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library and execute [Serving-Ranking-Models-With-Merlin-Systems](https://github.com/NVIDIA-Merlin/systems/blob/stable/examples/Serving-Ranking-Models-With-Merlin-Systems.ipynb) notebook to deploy our saved DLRM and NVTabular workflow models as an ensemble to TIS and obtain prediction results for a qiven request. In doing so, you need to mount the saved DLRM and NVTabular workflow to the inference container following the instructions in the [README.md](https://github.com/NVIDIA-Merlin/systems/blob/stable/examples/README.md)."
]
}
],
2 changes: 1 addition & 1 deletion examples/05-Retrieval-Model.ipynb
@@ -997,7 +997,7 @@
"id": "155af447-97c4-4875-97ad-84e678fd7b40",
"metadata": {},
"source": [
"Note that above when we set `validation_data=valid` in the `model.fit()`, we compute evaluation metrics on validation set using the negative sampling strategy used for training. To determine the exact accuracy of our trained retrieval model, we need to compute the similarity score between a given query and all possible candidates. The higher the score of the positive candidate (the one that is already interacted with, i.e. target item_id returned by dataloader), the more accurate the model is. We can do this using the `topk_model` model that we create below via `to_top_k_encoder` method, and the following section shows how to instantiate it. The `to_top_k_encoder()` is a method of the [RetrievalModelV2](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/models/tf/models/base.py) class. \n",
"Note that above when we set `validation_data=valid` in the `model.fit()`, we compute evaluation metrics on validation set using the negative sampling strategy used for training. To determine the exact accuracy of our trained retrieval model, we need to compute the similarity score between a given query and all possible candidates. The higher the score of the positive candidate (the one that is already interacted with, i.e. target item_id returned by dataloader), the more accurate the model is. We can do this using the `topk_model` model that we create below via `to_top_k_encoder` method, and the following section shows how to instantiate it. The `to_top_k_encoder()` is a method of the [RetrievalModelV2](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/models/tf/models/base.py) class. \n",
"\n",
"`unique_rows_by_features` : A utility function allows extracting both unique user and item features tables as Merlin Dataset object that can easily be converted to a cuDF data frame. The function extracts unique rows from a specified dataset (transformed train set) based on a specified id-column tags (`ITEM` and `ITEM_ID`)."
]
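The retrieval notebook hunk above describes evaluating against all candidate items via `unique_rows_by_features` and `to_top_k_encoder`. A condensed sketch of that flow, assuming `model` is a trained `RetrievalModelV2` and `train`/`valid` are the transformed datasets; `k` and the batch sizes are illustrative:

```python
from merlin.models.utils.dataset import unique_rows_by_features
from merlin.schema import Tags

# One row per item, selected by the ITEM / ITEM_ID tags, as the notebook text describes.
candidate_features = unique_rows_by_features(train, Tags.ITEM, Tags.ITEM_ID)

# Index every candidate item so evaluation scores each query against the full catalog.
topk_model = model.to_top_k_encoder(candidate_features, k=10, batch_size=128)
topk_model.compile(run_eagerly=False)

topk_metrics = topk_model.evaluate(valid, batch_size=1024, return_dict=True)
```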
(The diffs for the remaining changed files were not loaded in this view.)