Updates Fortran documentation for error codes #284

Merged 3 commits on May 11, 2023
11 changes: 8 additions & 3 deletions doc/changelog.rst
@@ -18,22 +18,27 @@ To be released at some future date

Note

This section details changes made in the development branch that have not yet been applied to a released version of the SmartSim library.
This section details changes made in the development branch that have not yet
been applied to a released version of the SmartSim library.

Description

A full list of changes and detailed notes can be found below:

- Update Fortran tutorials for SmartRedis
- Add support for multiple network interface binding in Orchestrator and Colocated DBs

Detailed notes

- Update the GitHub Actions runner image from ``macos-10.15`` to ``macos-12``. The
former began deprecation in May 2022 and was finally removed in May 2023 (PR285_)
- Orchestrator and Colocated DB now accept a list of interfaces to bind to. The argument name is still `interface`
for backward compatibility reasons. (PR281_)
- The Fortran tutorials had not been fully updated to show how to handle return/error
codes. These have now all been updated (PR284_)
- Orchestrator and Colocated DB now accept a list of interfaces to bind to. The
argument name is still `interface` for backward compatibility reasons. (PR281_)

.. _PR285: https://github.com/CrayLabs/SmartSim/pull/285
.. _PR284: https://github.com/CrayLabs/SmartSim/pull/284
.. _PR281: https://github.com/CrayLabs/SmartSim/pull/281

0.4.2
39 changes: 30 additions & 9 deletions doc/sr_fortran_walkthrough.rst
@@ -21,6 +21,18 @@ SmartRedis ``DataSet`` API is also provided.
Update the ``Client`` constructor ``cluster`` flag to ``.false.``
to connect to a single shard (single compute host) database.

Error handling
==============

The core of the SmartRedis library is written in C++, which uses the
exception handling features of the language to catch errors. Fortran has
no equivalent mechanism, so most SmartRedis methods are instead functions
that return error codes that can be checked. This has the added benefit
that Fortran programs can incorporate SmartRedis calls within their own
error handling methods. A full list of return codes for Fortran can be
found in ``enum_fortran.inc``. Additionally, the ``errors`` module has
``get_last_error`` and ``print_last_error`` to retrieve the text of the
error message emitted within the C++ code.
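
As a rough sketch of the pattern described above, the check on a return code
can be wrapped in small helper procedures so that every call is handled the
same way. The module name ``sr_error_sketch`` and the procedures
``code_is_ok`` and ``describe_code`` are hypothetical and not part of
SmartRedis. The locally defined ``SRNoError`` is only a stand-in (assumed
value 0) for the enumerator that real code would obtain from
``enum_fortran.inc``.

```fortran
module sr_error_sketch
  implicit none
  private
  public :: SRNoError, code_is_ok, describe_code

  ! Stand-in for the enumerator from enum_fortran.inc (assumed value for this
  ! sketch; real code should include that file instead of redefining it).
  integer, parameter :: SRNoError = 0

contains

  ! Returns .true. when a SmartRedis-style return code signals success.
  pure logical function code_is_ok(return_code)
    integer, intent(in) :: return_code
    code_is_ok = (return_code == SRNoError)
  end function code_is_ok

  ! Builds a short status message for the operation named in context.
  function describe_code(return_code, context) result(message)
    integer, intent(in) :: return_code
    character(len=*), intent(in) :: context
    character(len=:), allocatable :: message
    character(len=12) :: code_text

    if (code_is_ok(return_code)) then
      message = context // ': ok'
    else
      ! Convert the integer code to text without leading blanks
      write(code_text, '(I0)') return_code
      message = context // ': failed with return code ' // trim(code_text)
    end if
  end function describe_code

end module sr_error_sketch
```

A caller might then write ``return_code = client%initialize(.false.)``
followed by ``if (.not. code_is_ok(return_code)) stop 'Error in initializing client'``,
optionally writing out ``describe_code(return_code, 'initialize')`` first.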

Tensors
=======
@@ -60,8 +72,10 @@ if using a clustered database or ``.false.`` otherwise.
use smartredis_client, only : client_type

type(client_type) :: client
integer :: return_code

call client%initialize(.false.) ! Change .false. to true if using a clustered database
return_code = client%initialize(.false.) ! Change .false. to .true. if using a clustered database
if (return_code .ne. SRNoError) stop 'Error in initializing client'
end program example

**Putting a Fortran array into the database**
@@ -84,7 +98,7 @@ shape of the array.
.. literalinclude:: ../smartredis/examples/serial/fortran/smartredis_put_get_3D.F90
:linenos:
:language: fortran
:lines: 1-11,13-24,26-27
:lines: 46-54

**Unpacking an array stored in the database**

@@ -191,7 +205,8 @@ methods are used:

.. code-block:: Fortran

call client%initialize(.true.)
return_code = client%initialize(.true.)
if (return_code .ne. SRNoError) stop 'Error in initializing client'

The only optional argument to the initialize
routine is to determine whether the RedisAI
@@ -217,8 +232,10 @@ database cluster.
.. code-block:: Fortran

if (pe_id == 0) then
call client%set_model_from_file(model_key, model_file, "TORCH", "CPU")
call client%set_script_from_file(script_key, "CPU", script_file)
return_code = client%set_model_from_file(model_key, model_file, "TORCH", "CPU")
if (return_code .ne. SRNoError) stop 'Error in setting model'
return_code = client%set_script_from_file(script_key, "CPU", script_file)
if (return_code .ne. SRNoError) stop 'Error in setting script'
endif

This only needs to be done on the root MPI task because
@@ -306,7 +323,8 @@ into the Redis database.
.. code-block:: Fortran

call random_number(array)
call client%put_tensor(in_key, array, shape(array))
return_code = client%put_tensor(in_key, array, shape(array))
if (return_code .ne. SRNoError) stop 'Error putting tensor in the database'

The Redis database can now be called to run preprocessing
scripts on these data.
@@ -315,7 +333,8 @@ scripts on these data.

inputs(1) = in_key
outputs(1) = script_out_key
call client%run_script(script_name, "pre_process", inputs, outputs)
return_code = client%run_script(script_name, "pre_process", inputs, outputs)
if (return_code .ne. SRNoError) stop 'Error running script'

The call to ``client%run_script`` specifies the
key used to identify the script loaded during
@@ -345,7 +364,8 @@ and the output will be stored using the same key.

inputs(1) = script_out_key
outputs(1) = out_key
call client%run_model(model_name, inputs, outputs)
return_code = client%run_model(model_name, inputs, outputs)
if (return_code .ne. SRNoError) stop 'Error running model'

As before, the results of running the inference are
stored within the database and are not available to
@@ -354,7 +374,8 @@ the tensor from the database by using the ``unpack_tensor`` method.

.. code-block:: Fortran

call client%unpack_tensor(out_key, result, shape(result))
return_code = client%unpack_tensor(out_key, result, shape(result))
if (return_code .ne. SRNoError) stop 'Error retrieving the tensor'

The ``result`` array now contains the outcome of the inference.
It is a 10-element array representing the likelihood that the
23 changes: 16 additions & 7 deletions doc/sr_integration.rst
@@ -44,7 +44,7 @@ Fortran::

type(client_type) :: client
return_code = client%initialize(use_cluster)
if (return_code /= SRNoError) stop 'Error in initialization'
if (return_code .ne. SRNoError) stop 'Error in initialization'

C::

@@ -116,7 +116,7 @@ Python::
Fortran::

if (root_client) return_code = client%set_model_from_file(model_name, model_file, backend, device)
if (return_code /= SRNoError) stop 'Error setting model'
if (return_code .ne. SRNoError) stop 'Error setting model'

C::

@@ -181,23 +181,30 @@ performs the initialization of other components. ::

! Import SmartRedis modules
use smartredis_client, only : client_type
! Include the Fortran enumerators, which are needed for error checking
include "enum_fortran.inc"

! Declare a new variable called smartredis_client and a string used to build
! a unique prefix for tensor names
type(client_type) :: smartredis_client
character(len=7) :: name_prefix
integer :: mpi_rank, mpi_code, smartredis_code
integer :: mpi_rank, mpi_code, return_code

! Note adding use_cluster as an additional runtime argument for SmartRedis
call initialize_model(temperature, number_of_timesteps, use_cluster)
call smartredis_client%initialize(use_cluster)
return_code = smartredis_client%initialize(use_cluster)
if (return_code .ne. SRNoError) stop 'Error in init'
call MPI_Comm_rank(MPI_COMM_WORLD, mpi_rank, mpi_code)
! Build the prefix for all tensors set in this model
write(name_prefix,'(I6.6,A)') mpi_rank, '_'

! Assume all ranks will use the same machine learning model, so no need to
! add the prefix to the model name
if (mpi_rank==0) call set_model_from_file("example_model_name", "path/to/model.pt", "TORCH", "gpu")
if (mpi_rank==0) then
return_code = smartredis_client%set_model_from_file("example_model_name", "path/to/model.pt", "TORCH", "gpu")
if (return_code .ne. SRNoError) stop 'Error in setting model'
endif
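
The per-rank prefix built with the ``write`` statement above can be sketched
in isolation as follows. The module ``name_prefix_sketch`` and the functions
``rank_prefix`` and ``tensor_key`` are hypothetical helpers introduced only
for illustration. They use the same ``(I6.6,A)`` format, so the sketch assumes
fewer than one million ranks; a larger rank would overflow the six-digit field
and be written as asterisks.

```fortran
module name_prefix_sketch
  implicit none
contains

  ! Builds the 7-character prefix: zero-padded rank followed by an underscore.
  function rank_prefix(mpi_rank) result(name_prefix)
    integer, intent(in) :: mpi_rank
    character(len=7) :: name_prefix
    write(name_prefix, '(I6.6,A)') mpi_rank, '_'
  end function rank_prefix

  ! Joins the rank prefix and a field name into a key unique to this rank.
  function tensor_key(mpi_rank, field) result(key)
    integer, intent(in) :: mpi_rank
    character(len=*), intent(in) :: field
    character(len=:), allocatable :: key
    key = rank_prefix(mpi_rank) // field
  end function tensor_key

end module name_prefix_sketch
```

With these helpers, rank 12 would store its temperature under the key
``000012_temperature``, so tensors from different ranks do not overwrite each
other in the database.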


Next, add the calls in the main loop to send the temperature to the orchestrator ::

@@ -209,12 +216,14 @@ Next, add the calls in the main loop to send the temperature to the orchestrator
call write_current_state(temperature)
model_input(1) = name_prefix//"temperature"
model_output(1) = name_prefix//"temperature_out"
call smartredis_client%put_tensor(model_input(1), temperature)
return_code = smartredis_client%put_tensor(model_input(1), temperature)
if (return_code .ne. SRNoError) stop 'Error in putting tensor'

! Run the machine learning model
return_code = smartredis_client%run_model("example_model_name", model_input, model_output)
if (return_code .ne. SRNoError) stop 'Error in running model'
! The following line overwrites the prognostic temperature array
return_code = smartredis_client%unpack_tensor(model_output(1), temperature)
if (return_code .ne. SRNoError) stop 'Error in retrieving tensor'

! Call a time integrator to step the temperature field forward
call timestep_simulation(temperature)
@@ -226,4 +235,4 @@ temperature array in the orchestrator, instruct the orchestrator to call
a machine learning model for prediction/inference, and unpack the resulting
inference into the existing temperature array. For more complex examples,
please see some of the integrations in the SmartSim Zoo or feel free to
contact the team at [email protected]
contact the team at [email protected]