add example workflows
* Partially addresses cylc/cylc-doc#627
* Add some examples of Cylc workflow implementation patterns.

---

Co-authored-by: Hilary James Oliver <[email protected]>
oliver-sanders and hjoliver committed Jan 24, 2024
1 parent ec8eec8 commit bf4c77c
Showing 26 changed files with 730 additions and 38 deletions.
22 changes: 4 additions & 18 deletions README.md
@@ -29,28 +29,14 @@ domains.
 # install cylc
 conda install cylc-flow

-# write your first workflow
-mkdir -p ~/cylc-src/example
-cat > ~/cylc-src/example/flow.cylc <<__CONFIG__
-[scheduling]
-initial cycle point = 1
-cycling mode = integer
-[[graph]]
-P1 = """
-a => b => c & d
-b[-P1] => b
-"""
-[runtime]
-[[a, b, c, d]]
-script = echo "Hello $CYLC_TASK_NAME"
-__CONFIG__
+# extract an example to run
+cylc get-resources examples/integer-cycling

 # install and run it
-cylc install example
-cylc play example
+cylc vip integer-cycling # vip = validate, install and play

 # watch it run
-cylc tui example
+cylc tui integer-cycling
 ```

 ### The Cylc Ecosystem
27 changes: 27 additions & 0 deletions cylc/flow/etc/examples/README.md
@@ -0,0 +1,27 @@
# Examples

These examples are intended to illustrate the major patterns for implementing
Cylc workflows. The hope is that users can find a workflow which fits their
pattern, make a copy and fill in the details. Keep the examples minimal and
abstract. We aren't trying to document every Cylc feature here, just the
major design patterns.

These examples are auto-documented in cylc-doc which looks for an `index.rst`
file in each example.

Users can extract them using `cylc get-resources` which will put them into the
configured Cylc source directory (`~/cylc-src` by default). They can then be
run using the directory name, e.g. `cylc vip hello-world`.

Files:

* `index.rst`
  This file is used to generate a page in the documentation for the example.
  It is excluded when the user extracts the example.
* `.validate`
  This is a test file; it is detected and run automatically.
  It is excluded when the user extracts the example.
* `README.rst`
  Examples can include a README file. To save duplication, you can
  `.. include::` it in the `index.rst` file (hence the use of reStructuredText
  rather than Markdown).
33 changes: 33 additions & 0 deletions cylc/flow/etc/examples/converging-workflow/.validate
@@ -0,0 +1,33 @@
#!/bin/bash
# THIS FILE IS PART OF THE CYLC WORKFLOW ENGINE.
# Copyright (C) NIWA & British Crown (Met Office) & Contributors.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -eux

ID="$(< /dev/urandom tr -dc A-Za-z | head -c6)"

# start the workflow
cylc vip --check-circular --no-run-name --no-detach --workflow-name "$ID"

# it should have reached the fourth cycle (and gone no further)
test -d "${HOME}/cylc-run/${ID}/log/job/4"
test ! -d "${HOME}/cylc-run/${ID}/log/job/5"

# lint
cylc lint "$ID"

# clean up
cylc clean "$ID"
35 changes: 35 additions & 0 deletions cylc/flow/etc/examples/converging-workflow/flow.cylc
@@ -0,0 +1,35 @@
[meta]
    title = Converging Workflow
    description = """
        A workflow which runs a pattern of tasks over and over until a
        convergence condition has been met.
    """

[scheduling]
    cycling mode = integer
    initial cycle point = 1
    [[graph]]
        P1 = """
            # run "increment" then check the convergence condition
            check_convergence[-P1]:not_converged? => increment => check_convergence

            # if the workflow has converged, then do nothing
            check_convergence:converged?
        """

[runtime]
    [[increment]]
        # a task which evolves the data
    [[check_convergence]]
        # a task which checks whether the convergence condition has been met
        script = """
            if (( CYLC_TASK_CYCLE_POINT == 4 )); then
                # for the purpose of example, assume convergence at cycle point 4
                cylc message -- 'convergence condition met'
            else
                cylc message -- 'convergence condition not met'
            fi
        """
        [[[outputs]]]
            converged = 'convergence condition met'
            not_converged = 'convergence condition not met'
39 changes: 39 additions & 0 deletions cylc/flow/etc/examples/converging-workflow/index.rst
@@ -0,0 +1,39 @@
Converging Workflow
===================

.. admonition:: Get a copy of this example
   :class: hint

   .. code-block:: console

      $ cylc get-resources examples/converging-workflow

A workflow which runs a pattern of tasks over and over until a convergence
condition has been met.

* The ``increment`` task runs some kind of model or process which increments
  us toward the solution.
* The ``check_convergence`` task checks whether the convergence condition has
  been met.
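
As a rough illustration of the control flow only, the loop below imitates this pattern in plain shell, outside of Cylc. The function names ``increment``, ``check_convergence`` and ``run_until_converged`` and the ``LAST_CYCLE`` variable are stand-ins invented for this sketch, and convergence at cycle 4 is an assumption that mirrors the example workflow.

```shell
#!/usr/bin/env bash
# Sketch only: a plain-shell imitation of the converging pattern, not Cylc.

# stand-in for the "increment" task (which would evolve the data)
increment() {
    echo "cycle $1: increment"
}

# stand-in for "check_convergence": report one of two outcomes
check_convergence() {
    if [ "$1" -eq 4 ]; then
        # assumption: converge at cycle 4, as in the example workflow
        echo 'converged'
    else
        echo 'not_converged'
    fi
}

# repeat increment => check_convergence until "converged" is reported
run_until_converged() {
    cycle=1
    while true; do
        increment "$cycle"
        if [ "$(check_convergence "$cycle")" = 'converged' ]; then
            break
        fi
        cycle=$(( cycle + 1 ))
    done
    LAST_CYCLE="$cycle"
}

run_until_converged
echo "stopped after cycle $LAST_CYCLE"
```

In the real workflow the scheduler does the looping: each cycle only runs if the previous ``check_convergence`` emitted the ``not_converged`` output, so no explicit loop is written anywhere.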

.. literalinclude:: flow.cylc
   :language: cylc

Run it with::

   $ cylc vip converging-workflow

.. admonition:: Example - Genetic algorithms
   :class: hint

   .. _genetic algorithm: https://en.wikipedia.org/wiki/Genetic_algorithm

   An example of a converging workflow might be a `genetic algorithm`_, where you
   "breed" entities, then test their "fitness", and breed again, over and over
   until you end up with an entity which is able to satisfy the requirement.

   .. digraph:: Example

      random_seed -> breed -> test_fitness
      test_fitness -> breed
      test_fitness -> stop
28 changes: 28 additions & 0 deletions cylc/flow/etc/examples/datetime-cycling/.validate
@@ -0,0 +1,28 @@
#!/bin/bash
# THIS FILE IS PART OF THE CYLC WORKFLOW ENGINE.
# Copyright (C) NIWA & British Crown (Met Office) & Contributors.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -eux

ID="$(< /dev/urandom tr -dc A-Za-z | head -c6)"
cylc vip \
    --check-circular \
    --no-detach \
    --no-run-name \
    --final-cycle-point "$(isodatetime now --format 'CCYYMMDD')T00" \
    --workflow-name "$ID"
cylc lint "$ID"
cylc clean "$ID"
44 changes: 44 additions & 0 deletions cylc/flow/etc/examples/datetime-cycling/flow.cylc
@@ -0,0 +1,44 @@
[meta]
    title = Datetime Cycling
    description = """
        A basic cycling workflow which runs the same set of tasks over
        and over. Each cycle will be given a datetime identifier.
        The task "a" will wait until the real-world (or wallclock) time passes
        the cycle time.
        Try changing the "initial cycle point" to "previous(T00) - P1D" to
        see how this works.
    """

[scheduling]
    # start the workflow cycling at 00:00 this morning
    initial cycle point = previous(T00)

    [[graph]]
        # repeat this with a "P"eriod of "1" "D"ay -> P1D
        P1D = """
            # this is the workflow we want to repeat:
            a => b => c & d

            # this is an "inter-cycle dependency", it makes the task "b"
            # wait until its previous instance has successfully completed:
            b[-P1D] => b

            # this makes the task "a" wait until its cycle point matches
            # the real world time - i.e. it prevents the workflow from getting
            # ahead of the clock. If the workflow is running behind (e.g. after
            # a delay, or from an earlier initial cycle point) it will catch
            # up until the clock-trigger constrains it again. To run entirely in
            # "simulated time" remove this line:
            @wall_clock => a
        """

[runtime]
    [[root]]
        # all tasks will "inherit" the configuration in the "root" section
        script = echo "Hello, I'm task $CYLC_TASK_NAME in cycle $CYLC_TASK_CYCLE_POINT!"
    [[a]]
    [[b]]
    [[c]]
    [[d]]
16 changes: 16 additions & 0 deletions cylc/flow/etc/examples/datetime-cycling/index.rst
@@ -0,0 +1,16 @@
Datetime Cycling
================

.. admonition:: Get a copy of this example
   :class: hint

   .. code-block:: console

      $ cylc get-resources examples/datetime-cycling

.. literalinclude:: flow.cylc
   :language: cylc

Run it with::

   $ cylc vip datetime-cycling
47 changes: 47 additions & 0 deletions cylc/flow/etc/examples/event-driven-cycling/.validate
@@ -0,0 +1,47 @@
#!/bin/bash
# THIS FILE IS PART OF THE CYLC WORKFLOW ENGINE.
# Copyright (C) NIWA & British Crown (Met Office) & Contributors.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -eux

ID="$(< /dev/urandom tr -dc A-Za-z | head -c6)"

# start the workflow
cylc vip --check-circular --no-run-name --workflow-name "$ID"
sleep 1 # give it a reasonable chance to start up

# kick off the first cycle
./bin/trigger "$ID" WORLD=earth

# wait for it to complete
cylc workflow-state "$ID" \
    --task=run \
    --point=1 \
    --status=succeeded \
    --max-polls=60 \
    --interval=1

# check the job received the environment variable we provided
grep 'Hello earth' "$HOME/cylc-run/$ID/log/job/1/run/NN/job.out"

# stop the workflow
cylc stop --kill --max-polls=10 --interval=2 "$ID"

# lint
cylc lint "$ID"

# clean up
cylc clean "$ID"
49 changes: 49 additions & 0 deletions cylc/flow/etc/examples/event-driven-cycling/README.rst
@@ -0,0 +1,49 @@
Cylc is good at orchestrating tasks to a schedule, e.g.:

* ``PT1H`` - every hour
* ``P1D`` - every day
* ``P1M`` - every month
* ``PT1H ! (T00, T12)`` - every hour, except midnight and midday.

But sometimes the things you want to run don't have a schedule.

This example uses ``cylc ext-trigger`` to establish a pattern where Cylc waits
for an external signal and starts a new cycle every time a signal is received.

The signal can carry data using the ext-trigger ID. This example sets the ID
as a file path containing some data that we want to make available to the tasks
that run in the cycle it triggers.
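
To see what such a data file looks like, the sketch below reproduces only the file-writing step of the ``./bin/trigger`` script, without contacting a workflow. The function name ``write_trigger_file``, the temporary directory and the file name ``example.cylc`` are assumptions made for this illustration; the real script writes into the workflow's run directory and then calls ``cylc ext-trigger``.

```shell
#!/usr/bin/env bash
# Sketch only: the file-writing step of ./bin/trigger, with no Cylc calls.

# write KEY=VALUE pairs into an [environment] section of the given file
# (hypothetical helper, following the loop in ./bin/trigger)
write_trigger_file() {
    trigger_file="$1"
    shift
    echo '[environment]' > "$trigger_file"
    for env in "$@"; do
        echo "    $env" >> "$trigger_file"
    done
}

# assumption: a temporary directory stands in for the workflow's
# "triggers" directory under ~/cylc-run/<workflow-id>/
EXT_TRIGGER_DIR="$(mktemp -d)"
TRIGGER_FILE="${EXT_TRIGGER_DIR}/example.cylc"

write_trigger_file "$TRIGGER_FILE" WORLD=earth
cat "$TRIGGER_FILE"
```

Each ``KEY=VALUE`` argument becomes one line under ``[environment]``, which is how the variables reach the tasks in the triggered cycle.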

To use this example, first start the workflow as normal::

   cylc vip event-driven-cycling

Then, when you're ready, kick off a new cycle, specifying any
environment variables you want to configure this cycle with::

   ./bin/trigger <workflow-id> WORLD=earth

Replace ``<workflow-id>`` with the ID you installed this workflow as.

.. admonition:: Example - CI/CD
   :class: hint

   This pattern is good for CI/CD type workflows where you're waiting on
   external events. This pattern is especially powerful when used with
   sub-workflows where it provides a solution to two-dimensional cycling
   problems.

.. admonition:: Example - Polar satellite data processing
   :class: hint

   Polar satellites pass overhead at irregular intervals. This makes it tricky
   to schedule data processing because you don't know when the satellite will
   pass over the receiver station. With the event driven cycling approach you
   could start a new cycle every time data arrives.

.. note::

   * The number of parallel cycles can be adjusted by changing the
     :cylc:conf:`[scheduling]runahead limit`.
   * To avoid hitting the runahead limit, ensure that failures are handled in
     the graph.
32 changes: 32 additions & 0 deletions cylc/flow/etc/examples/event-driven-cycling/bin/trigger
@@ -0,0 +1,32 @@
#!/usr/bin/env bash

set -eu

if [[ $# -lt 1 ]]; then
    echo 'Usage: ./trigger WORKFLOW_ID [KEY=VALUE ...]' >&2
    echo
    echo 'Trigger a new cycle in the target workflow.'
    echo 'Any environment variable KEY=VALUE pairs will be broadcast to'
    echo 'all tasks in the cycle.'
    exit 1
fi
WORKFLOW_ID="$1"
shift
WORKFLOW_RUN_DIR="${HOME}/cylc-run/${WORKFLOW_ID}"
EXT_TRIGGER_DIR="${WORKFLOW_RUN_DIR}/triggers"
mkdir -p "$EXT_TRIGGER_DIR"

# pick a trigger-id
TRIGGER_ID="$(isodatetime --print-format CCYYMMDDThhmmss)"

# write environment variables to a broadcast file
TRIGGER_FILE="${EXT_TRIGGER_DIR}/${TRIGGER_ID}.cylc"
echo '[environment]' >"$TRIGGER_FILE"
for env in "$@"; do
    echo " $env" >> "$TRIGGER_FILE"
done

# issue the ext-trigger
cylc ext-trigger "$WORKFLOW_ID" 'trigger' "$TRIGGER_ID"