diff --git a/README.md b/README.md index df692468..1df24e2f 100644 --- a/README.md +++ b/README.md @@ -49,6 +49,7 @@ Table of Contents - [37/File Archive Format](spec_37.rst) - [38/Flux Security Key Value Encoding](spec_38.rst) - [39/Flux Security Signature](spec_39.rst) +- [40/Fluxion Resource Set Extension](spec_40.rst) Build Instructions ------------------ diff --git a/index.rst b/index.rst index f0d5f451..52d3a048 100644 --- a/index.rst +++ b/index.rst @@ -260,6 +260,13 @@ for a series of typed key-value pairs. The Flux Security Signature is a NUL terminated string that represents content secured with a digital signature. +:doc:`40/Fluxion Resource Set Extension ` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This specification defines the data format used by the Fluxion scheduler +to store resource graph data in RFC 20 *R* version 1 objects. + + .. Each file must appear in a toctree .. toctree:: :hidden: @@ -301,3 +308,4 @@ content secured with a digital signature. spec_37 spec_38 spec_39 + spec_40 diff --git a/spec_20.rst b/spec_20.rst index be1eb37a..6b79228e 100644 --- a/spec_20.rst +++ b/spec_20.rst @@ -2,8 +2,11 @@ GitHub is NOT the preferred viewer for this file. Please visit https://flux-framework.rtfd.io/projects/flux-rfc/en/latest/spec_20.html +.. default-domain:: js + +####################################### 20/Resource Set Specification Version 1 -======================================= +####################################### This specification defines the version 1 format of the resource-set representation or *R* in short. @@ -14,17 +17,17 @@ representation or *R* in short. - State: Raw - +******** Language --------- +******** The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. 
- +***************** Related Standards ------------------ +***************** - :doc:`4/Flux Resource Model ` @@ -38,9 +41,9 @@ Related Standards - :doc:`29/Hostlist Format ` - +******** Overview --------- +******** Flexible resource representation is important for some of the key components of Flux. @@ -58,9 +61,9 @@ Finally, when a Flux instance launches a child instance, *R* is passed down from the enclosing instance to the child instance, where it primes the child scheduler with a block of allocatable resources. - +************ Design Goals ------------- +************ The *R* format is designed with the following goals: @@ -74,9 +77,12 @@ The *R* format is designed with the following goals: - Allow the consumers of *R* to deserialize an *R* object while minimizing the parsing complexity and the data to read; +************** +Implementation +************** Producers and Consumers ------------------------ +======================= - The scheduler for a Flux instance (or instance scheduler) uses this format to serialize each resource allocation @@ -95,216 +101,157 @@ Producers and Consumers - The program execution service emits a valid *R* object to release a resource subset of an *R* to the instance scheduler. - Resource Set Format Definition ------------------------------- +============================== The JSON documents that conform to the *R* format SHALL be referred to as *R* JSON documents or in short *R* documents. An *R* JSON document SHALL consist of a dictionary with four -keys: ``version``, ``execution``, ``scheduling`` and ``attributes``. -It SHALL be valid if and only -if it contains the ``version`` key and either or both the ``execution`` -and ``scheduling`` keys. The value of the ``execution`` key SHALL contain -sufficient data for the execution system to perform its -core tasks. The value of ``scheduling`` SHALL contain sufficient data -for schedulers. 
Finally, the value of ``attributes`` SHALL provide -optional information including but not being limited -to data specific to the scheduler used to create -this JSON document. - - -Version -~~~~~~~ - -The value of the ``version`` key SHALL contain 1 to indicate -the format version. - +keys: :data:`version`, :data:`execution`, :data:`scheduling` and +:data:`attributes`. It SHALL be valid if and only +if it contains the :data:`version` key and either or both the :data:`execution` +and :data:`scheduling` keys. The value of the :data:`execution` key SHALL +contain sufficient data for the execution system to perform its core tasks. The +value of :data:`scheduling` SHALL contain sufficient data for schedulers. +Finally, the value of :data:`attributes` SHALL provide optional information +including but not being limited to data specific to the scheduler used to +create this JSON document. -Execution -~~~~~~~~~ +.. data:: version -The value of the ``execution`` key SHALL contain at least the keys -``R_lite``, and ``nodelist``, with optional keys ``properties``, -``starttime`` and ``expiration``. Other keys are reserved for future -extensions. + The value of the :data:`version` key SHALL contain 1 to indicate + the format version. -``R_lite`` is a strict list of dictionaries each of which SHALL contain -at least the following two keys: +.. data:: execution -**rank** - The value of the ``rank`` key SHALL be a string list of - broker rank identifiers in **idset format** (See RFC 22). This list - SHALL indicate the broker ranks to which other information in - the current entry applies. + The value of the :data:`execution` key SHALL contain at least the keys + :data:`R_lite`, and :data:`nodelist`, with optional keys :data:`properties`, + :data:`starttime` and :data:`expiration`. Other keys are reserved for future + extensions. -**children** - The ``children`` key encodes the information about certain compute resources - contained within this compute node. 
The value of this key SHALL contain a dictionary - with two keys: ``core`` and ``gpu``. Other keys are reserved for future - extensions. + .. data:: R_lite - **core** - The ``core`` key SHALL contain a logical compute core IDs string - in RFC 22 **idset format**. + :data:`R_lite` is a strict list of dictionaries each of which SHALL contain + at least the :data:`rank` and :data:`children` keys. - **gpu** - The OPTIONAL ``gpu`` key SHALL contain a logical GPU IDs string - in RFC 22 **idset format**. + .. data:: rank + The value of the :data:`rank` key SHALL be a string list of + broker rank identifiers in **idset format** (See RFC 22). This list + SHALL indicate the broker ranks to which other information in + the current entry applies. -The ``nodelist`` key SHALL be an array of hostnames which correspond to -the ``rank`` entries of the ``R_lite`` dictionary, and serves as a mapping -of ``R_lite`` ``rank`` entries to hostname. Each entry in ``nodelist`` MAY -contain a string in RFC 29 *Hostlist Format*, e.g. ``host[0-16]``. + .. data:: children -The ``execution`` key MAY also contain any of the following optional keys: + The :data:`children` key encodes the information about certain compute + resources contained within this compute node. The value of this key SHALL + contain a dictionary with two keys: :data:`core` and :data:`gpu`. Other + keys are reserved for future extensions. -**properties** - The optional properties key SHALL be a dictionary where each key maps a - single property name to a RFC 22 idset string. The idset string SHALL - represent a set of execution target ranks. A given execution target - rank MAY appear in multiple property mappings. Property names SHALL - be valid UTF-8, and MUST NOT contain the following illegal characters: + .. data:: core - :: - - ! & ' " ^ ` | ( ) - - Additionally, the ``@`` character is reserved for scheduler specific - property use. 
In this case, the literal property SHALL still apply - to the defined execution target ranks, but the scheduler MAY use the - suffix after ``@`` to apply the property to children resources of the - execution target or for another scheduler specific purpose. For example, - the property ``amd-mi50@gpu`` SHALL apply to the defined execution - target ranks, but a scheduler MAY use the ``gpu`` suffix to perform - scheduling optimization for gpus of the corresponding ranks. This MAY - result in both ``amd-mi50@gpu`` and ``amd-mi50`` being valid properties - for resources in the instance. - -**starttime** - The value of the ``starttime`` key, if present, SHALL - encode the start time at which the resource set is valid. The - value SHALL be the number of seconds elapsed since the Unix Epoch - (1970-01-01 00:00:00 UTC) with optional microsecond precision. - If ``starttime`` is unset, then the resource set has no specified - start time and is valid beginning at any time up to ``expiration``. - -**expiration** - The value of the ``expiration`` key, if present, SHALL - encode the end or expiration time of the resource set in seconds - since the Unix Epoch, with optional microsecond precision. If - ``starttime`` is also set, ``expiration`` MUST be greater than - ``starttime``. If ``expiration`` is unset, the resource set has no - specified end time and is valid beginning at ``starttime`` without - expiration. + The :data:`core` key SHALL contain a logical compute core IDs string + in RFC 22 **idset format**. + .. data:: gpu -Scheduling -~~~~~~~~~~ + The OPTIONAL :data:`gpu` key SHALL contain a logical GPU IDs string + in RFC 22 **idset format**. -The ``scheduling`` key allows RFC4-compliant schedulers to serialize any subset -of graph resource data into its value and later deserialize this value with -no data loss. The ``scheduling`` key contains a dictionary with a single key: ``graph``. -Other keys are reserved for future extensions. 
-The ``graph`` key SHALL conform to the latest version of the JSON Graph Format (JGF). -Thus, its value is a dictionary with two keys, ``nodes`` and ``edges``, -that encode the resource vertices and edges as described in RFC 4. + .. data:: nodelist + The :data:`nodelist` key SHALL be an array of hostnames which correspond to + the :data:`rank` entries of the :data:`R_lite` dictionary, and serves as a + mapping of :data:`R_lite` :data:`rank` entries to hostname. Each entry in + :data:`nodelist` MAY contain a string in RFC 29 *Hostlist Format*, e.g. + ``host[0-16]``. -Graph Vertices -^^^^^^^^^^^^^^ + The :data:`execution` key MAY also contain any of the following optional keys: -The value of the ``nodes`` key defined in JGF is a strict list -of graph vertices. Each list member is a vertex that contains -two keys: ``id`` and ``metadata``. -The ``id`` key SHALL contain a unique string ID for the containing vertex. -The value of the ``metadata`` key is a dictionary that encodes -the resource pool data described in RFC 4. -Thus, this dictionary SHALL contain the following -keys to describe the base data of a resource pool: + .. data:: properties -- ``type`` + The optional :data:`properties` key SHALL be a dictionary where each key + maps a single property name to a RFC 22 idset string. The idset string SHALL + represent a set of execution target ranks. A given execution target + rank MAY appear in multiple property mappings. Property names SHALL + be valid UTF-8, and MUST NOT contain the following illegal characters:: -- ``uuid`` - -- ``basename`` - -- ``name`` + ! & ' " ^ ` | ( ) -- ``id`` + Additionally, the ``@`` character is reserved for scheduler specific + property use. In this case, the literal property SHALL still apply + to the defined execution target ranks, but the scheduler MAY use the + suffix after ``@`` to apply the property to children resources of the + execution target or for another scheduler specific purpose. 
For example, + the property ``amd-mi50@gpu`` SHALL apply to the defined execution + target ranks, but a scheduler MAY use the ``gpu`` suffix to perform + scheduling optimization for gpus of the corresponding ranks. This MAY + result in both ``amd-mi50@gpu`` and ``amd-mi50`` being valid properties + for resources in the instance. -- ``properties`` + .. data:: starttime -- ``size`` + The value of the :data:`starttime` key, if present, SHALL + encode the start time at which the resource set is valid. The + value SHALL be the number of seconds elapsed since the Unix Epoch + (1970-01-01 00:00:00 UTC) with optional microsecond precision. + If :data:`starttime` is unset, then the resource set has no specified + start time and is valid beginning at any time up to :data:`expiration`. -- ``unit`` + .. data:: expiration -It MAY contain other OPTIONAL resource vertex data. + The value of the :data:`expiration` key, if present, SHALL + encode the end or expiration time of the resource set in seconds + since the Unix Epoch, with optional microsecond precision. If + :data:`starttime` is also set, :data:`expiration` MUST be greater than + :data:`starttime`. If :data:`expiration` is unset, the resource set has no + specified end time and is valid beginning at :data:`starttime` without + expiration. +.. data:: scheduling -Graph Edges -^^^^^^^^^^^ + The :data:`scheduling` key MAY contain scheduler-specific resource data. It + SHALL NOT be interpreted by other Flux components. When used, it SHALL ride + along on the resource acquisition protocol (RFC 28) and resource allocation + protocol (RFC 27) so that it may be included in static configuration, + allocated to jobs, and passed down a Flux instance hierarchy. -The value of the ``edges`` key defined in JGF SHALL be a strict list of graph edges. -Each list element SHALL be an edge that connects two graph vertices and -contains the ``source``, ``target`` and ``metadata`` keys. 
-The value of the ``source`` key SHALL contain the ID of the source graph vertex. -The value of the ``target`` key SHALL contain the ID of the target graph vertex. -The value of this ``metadata`` key SHALL contain a dictionary that encodes -the resource subsystem and relationship data for the containing edge -as described in RFC 4. It SHALL contain two keys: +.. data:: attributes -**subsystem** - The value of the ``subsystem`` key SHALL be a string that indicates - a specific subsystem to which this edge belongs. (e.g., containment - or power subsystems). + The purpose of the :data:`attributes` key is to provide optional + information on this *R* document. The :data:`attributes` key SHALL + be a dictionary of one key: :data:`system`. -**relationship** - The value of the ``relationship`` key SHALL be a string that indicates - a relationship between the source and target resource vertices. - The relationship SHALL only be defined within the subsystem defined - above. (e.g., "contains" relationship within the "containment" subsystem). + Other keys are reserved for future extensions. + .. data:: system -Attributes -~~~~~~~~~~ -The purpose of the ``attributes`` key is to provide optional -information on this *R* document. The ``attributes`` key SHALL -be a dictionary of one key: ``system``. -Other keys are reserved for future extensions. + Attributes in the :data:`system` dictionary provide additional system + information that has affected the creation of this *R* document. + All of the system attributes are optional. -**system** -Attributes in the ``system`` dictionary provide additional system -information that have affected the creation of this *R* document. -All of the system attributes are optional. + A common system attribute is: -A common system attribute is: + .. describe:: scheduler -**scheduler** -The value of the ``scheduler`` is a free-from dictionary that -may provide the information specific to the scheduler used -to produce this document. 
For example, a scheduler that -manages multiple job queues may add ``queue=batch`` -to indicate that this resource set was allocated from within -its ``batch`` queue. + The value of the :data:`scheduler` key is a free-form dictionary that + may provide the information specific to the scheduler used + to produce this document. For example, a scheduler that + manages multiple job queues may add ``queue=batch`` + to indicate that this resource set was allocated from within + its ``batch`` queue. Example R -~~~~~~~~~ +========= The following is an example of a version 1 resource specification. The example below indicates a resource set with the ranks 19 through 22. These ranks correspond to the nodes node186 through node189. Each of the nodes contains 48 cores (0-47) and 8 gpus (0-7). -The ``startime`` and ``expiration`` indicate the resources were valid +The :data:`starttime` and :data:`expiration` indicate the resources were valid for about 30 minutes on February 16, 2023. .. literalinclude:: data/spec_20/example1.json :language: json - -References ----------- - -`JSON Graph Format Github, Anthony Bargnesi, et al., Visited Jan. 2019 `__ diff --git a/spec_40.rst b/spec_40.rst new file mode 100644 index 00000000..661c37a7 --- /dev/null +++ b/spec_40.rst @@ -0,0 +1,139 @@ +.. github display + GitHub is NOT the preferred viewer for this file. Please visit + https://flux-framework.rtfd.io/projects/flux-rfc/en/latest/spec_40.html + + +################################# +40/Fluxion Resource Set Extension +################################# + +This specification defines the data format used by the Fluxion scheduler +to store resource graph data in RFC 20 *R* version 1 objects. + +- Name: github.com/flux-framework/rfc/spec_40.rst + +- Editor: Dong H. 
Ahn + +- State: Raw + + +******** +Language +******** + +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", +"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" +in this document are to be interpreted as described in RFC 2119. + + +***************** +Related Standards +***************** + +- :doc:`4/Flux Resource Model ` + +- :doc:`14/Canonical Job Specification ` + +- :doc:`20/Resource Set Specification Version 1 ` + +- :doc:`27/Flux Resource Allocation Protocol Version 1 ` + +- :doc:`28/Flux Resource Acquisition Protocol Version 1 ` + + +********** +Background +********** + +RFC 20 defines version 1 of Flux's JSON-based resource set representation *R*. +In version 1, resource types are purposefully constrained to nodes, cores, and +GPUs and there is no way to express a special relationship between nodes, or +between cores and GPUs within a node. However, a ``scheduling`` key is defined +which allows scheduler-specific extensions to be attached to *R*, allowing more +complex resource types and relationships to be included, if only for the +benefit of the scheduler. + +The ``scheduling`` key is opaque to the rest of Flux, but: + +- may be part of the static configuration of a Flux system instance +- passes through to the scheduler when it acquires its initial resource set +- passes through to jobs when resources are allocated to them +- passes through to Flux sub-instances for acquisition by their scheduler + +This document describes the resource graph representation used by the Fluxion +scheduler within the ``scheduling`` key of *R*. + +************** +Implementation +************** + +The Fluxion resource graph representation allows RFC4-compliant schedulers to +serialize any subset of graph resource data into the value of the +``scheduling`` key and later deserialize this value with no data loss. The +``scheduling`` key contains a dictionary with a single key: ``graph``. Other +keys are reserved for future extensions. 
The ``graph`` key SHALL conform to the latest version of the JSON +Graph Format (JGF). Thus, its value is a dictionary with two keys, ``nodes`` +and ``edges``, that encode the resource vertices and edges as described in +RFC 4. + + +Graph Vertices +============== + +The value of the ``nodes`` key defined in JGF is a strict list +of graph vertices. Each list member is a vertex that contains +two keys: ``id`` and ``metadata``. +The ``id`` key SHALL contain a unique string ID for the containing vertex. +The value of the ``metadata`` key is a dictionary that encodes +the resource pool data described in RFC 4. +Thus, this dictionary SHALL contain the following +keys to describe the base data of a resource pool: + +- ``type`` + +- ``uuid`` + +- ``basename`` + +- ``name`` + +- ``id`` + +- ``properties`` + +- ``size`` + +- ``unit`` + +It MAY contain other OPTIONAL resource vertex data. + + +Graph Edges +=========== + +The value of the ``edges`` key defined in JGF SHALL be a strict list of graph edges. +Each list element SHALL be an edge that connects two graph vertices and +contains the ``source``, ``target`` and ``metadata`` keys. +The value of the ``source`` key SHALL contain the ID of the source graph vertex. +The value of the ``target`` key SHALL contain the ID of the target graph vertex. +The value of this ``metadata`` key SHALL contain a dictionary that encodes +the resource subsystem and relationship data for the containing edge +as described in RFC 4. It SHALL contain two keys: + +**subsystem** + The value of the ``subsystem`` key SHALL be a string that indicates + a specific subsystem to which this edge belongs. (e.g., containment + or power subsystems). + +**relationship** + The value of the ``relationship`` key SHALL be a string that indicates + a relationship between the source and target resource vertices. + The relationship SHALL only be defined within the subsystem defined + above. (e.g., "contains" relationship within the "containment" subsystem). 
+ + +********** +References +********** + +`JSON Graph Format Github, Anthony Bargnesi, et al., Visited Jan. 2019 `__ diff --git a/spell.en.pws b/spell.en.pws index 32cb814e..1b08f060 100644 --- a/spell.en.pws +++ b/spell.en.pws @@ -474,3 +474,4 @@ printf usr ƒuzzybunny acceptor +Fluxion
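
As a reviewer's aid, the validity rules this diff states for an *R* document (RFC 20) and for the Fluxion ``graph`` extension (RFC 40) can be sketched in Python. This is a minimal sketch and not Flux code: the function name ``check_r``, the helper ``vertex``, the error strings, and all sample values (including the timestamps and vertex metadata) are hypothetical. The check that every edge ``source`` and ``target`` matches a vertex ``id`` is one reading of the text, not a rule the spec states in those words.

```python
# Keys the diff lists as required in vertex and edge metadata.
VERTEX_KEYS = {"type", "uuid", "basename", "name", "id", "properties", "size", "unit"}
EDGE_KEYS = {"subsystem", "relationship"}


def check_r(r):
    """Return a list of problems found in an R dict (empty if none were found)."""
    problems = []
    if r.get("version") != 1:
        problems.append("version must be 1")
    if "execution" not in r and "scheduling" not in r:
        problems.append("need execution and/or scheduling")

    if "execution" in r:
        execution = r["execution"]
        for key in ("R_lite", "nodelist"):
            if key not in execution:
                problems.append(f"execution missing {key}")
        for i, entry in enumerate(execution.get("R_lite", [])):
            if "rank" not in entry:
                problems.append(f"R_lite[{i}] missing rank")
            # gpu is OPTIONAL, so only core is checked here.
            if "core" not in entry.get("children", {}):
                problems.append(f"R_lite[{i}] missing children.core")
        start, end = execution.get("starttime"), execution.get("expiration")
        if start is not None and end is not None and end <= start:
            problems.append("expiration must be greater than starttime")

    # Fluxion extension: scheduling.graph holds JGF-style nodes and edges.
    graph = r.get("scheduling", {}).get("graph")
    if graph is not None:
        ids = {v.get("id") for v in graph.get("nodes", [])}
        for v in graph.get("nodes", []):
            missing = VERTEX_KEYS - set(v.get("metadata", {}))
            if missing:
                problems.append(f"vertex {v.get('id')} missing {sorted(missing)}")
        for e in graph.get("edges", []):
            for end_key in ("source", "target"):
                if e.get(end_key) not in ids:
                    problems.append(f"edge {end_key} {e.get(end_key)!r} is not a vertex id")
            missing = EDGE_KEYS - set(e.get("metadata", {}))
            if missing:
                problems.append(f"edge missing {sorted(missing)}")
    return problems


def vertex(vid, rtype, index):
    """Build a vertex with placeholder metadata values (all hypothetical)."""
    return {"id": vid, "metadata": {
        "type": rtype, "uuid": index, "basename": rtype, "name": f"{rtype}{index}",
        "id": index, "properties": {}, "size": 1, "unit": ""}}


# Illustrative document; these values are not the contents of example1.json.
example = {
    "version": 1,
    "execution": {
        "R_lite": [{"rank": "19-22", "children": {"core": "0-47", "gpu": "0-7"}}],
        "nodelist": ["node[186-189]"],
        "starttime": 1676560000.0,
        "expiration": 1676561800.0,
    },
    "scheduling": {"graph": {
        "nodes": [vertex("0", "node", 186), vertex("1", "core", 0)],
        "edges": [{"source": "0", "target": "1",
                   "metadata": {"subsystem": "containment", "relationship": "contains"}}],
    }},
}

print(check_r(example))
```

Returning a list of problems rather than raising on the first one lets a reviewer see every violation in a document at once.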