Type URI: https://in-toto.io/Provenance/v1
Describe how an artifact or set of artifacts was produced.
The primary focus is on automated builds that followed some "recipe".
{
"subject": [{ ... }],
"predicateType": "https://in-toto.io/Provenance/v1",
"predicate": { // required
"builder": { // required
"id": "<URI>" // required
},
"recipe": { // optional
"type": "<URI>", // required
"definedInMaterial": /* integer */, // optional
"entryPoint": "<STRING>", // optional
"arguments": { /* object */ }, // optional
"reproducibility": { /* object */ } // optional
},
"metadata": { // optional
"buildStartedOn": "<TIMESTAMP>", // optional
"buildFinishedOn": "<TIMESTAMP>", // optional
"materialsComplete": true/false // optional
},
"materials": [
{
"uri": "<URI>", // optional
"digest": { /* DigestSet */ }, // optional
"mediaType": "<MEDIA_TYPE>", // optional
"tags": [ "<STRING>" ] // optional
}
]
}
}
(Note: This is a Predicate type that fits within the larger Attestation framework.)
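The required/optional markers in the schema above can be checked mechanically. The following is a minimal sketch, not a normative validator: the function name `check_required_fields` and the wording of the messages are assumptions made for illustration, and only the two fields marked required inside `predicate` are checked.

```python
from typing import Any


def check_required_fields(predicate: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the required fields are present."""
    problems = []

    # builder and builder.id are both marked required in the schema.
    builder = predicate.get("builder")
    if not isinstance(builder, dict) or not isinstance(builder.get("id"), str):
        problems.append("builder.id is required and must be a string URI")

    # recipe is optional (and may be null), but recipe.type is required when it is present.
    recipe = predicate.get("recipe")
    if recipe is not None:
        if not isinstance(recipe, dict) or not isinstance(recipe.get("type"), str):
            problems.append("recipe.type is required when recipe is present")

    return problems
```

A predicate with only `builder.id` set passes this check, because every other field is optional.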
builder
object, required
Identifies the entity that executed the build steps, which is trusted to have performed the operation correctly and populated this provenance.
The identity MUST reflect the entire trust base that could influence the build. For example, GitHub Actions supports both GitHub-hosted runners, where the entire builder is under GitHub's control, and self-hosted runners, where users provide their own runner. In this case, the GitHub-hosted runner is one identity, while each self-hosted runner is its own identity.
Verifiers MUST only accept specific builders from specific signers.
Design rationale: The builder is distinct from the signer because one signer may generate attestations for more than one builder, as in the GitHub Actions example above. The field is required, even if it is implicit from the signer, to aid readability and debugging. It is an object to allow additional fields in the future, in case one URI is not sufficient.
builder.id
string (TypeURI), required
URI indicating the builder's identity.
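The rule that verifiers only accept specific builders from specific signers can be pictured as a lookup keyed by signer. This is a rough sketch under stated assumptions: the policy table `TRUSTED_BUILDERS`, the signer name, the builder URI, and the function `is_trusted_builder` are all hypothetical, and signature verification (which establishes the signer) is assumed to have happened already.

```python
# Hypothetical policy: signer identity -> builder IDs that signer may attest for.
TRUSTED_BUILDERS = {
    "signer-a.example.com": {"https://example.com/HostedBuilder@v1"},
}


def is_trusted_builder(signer: str, predicate: dict) -> bool:
    """Accept a predicate only if its builder.id is allowed for this signer."""
    builder = predicate.get("builder") or {}
    builder_id = builder.get("id") if isinstance(builder, dict) else None
    if not isinstance(builder_id, str):
        return False  # builder.id is required; a missing or malformed id is rejected
    return builder_id in TRUSTED_BUILDERS.get(signer, set())
```

Keying the policy by signer rather than by builder reflects the design rationale above: one signer may vouch for several builders, but a builder ID is only meaningful relative to the signer that asserted it.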
recipe
object, optional
Identifies the configuration used for the build. When combined with materials, this SHOULD fully describe the build, such that re-running this recipe results in bit-for-bit identical output (if the build is reproducible).
- The recipe.type, recipe.entryPoint, and recipe.definedInMaterial describe the location of the recipe.
- The recipe.arguments describes all user-controlled arguments to the recipe, meaning anything that is not fully under the control of the builder.
- The recipe.reproducibility describes all builder-controlled arguments to the recipe.
MAY be unset/null if unknown, but this is DISCOURAGED.
recipe.type
string (TypeURI), required
URI indicating what type of recipe was performed. It determines the meaning of recipe.entryPoint, recipe.arguments, recipe.reproducibility, and materials.
recipe.definedInMaterial
integer, optional
Index in materials containing the recipe steps that are not implied by recipe.type. For example, if the recipe type were "make", then this would point to the source containing the Makefile, not the make program itself.
Omit this field (or use null) if the recipe doesn't come from a material.
TODO: What if there is more than one material?
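Because recipe.definedInMaterial is an index rather than a URI, a consumer has to dereference it against materials. A minimal sketch of that lookup follows; the helper name `recipe_material` is an assumption for illustration, and raising ValueError for an out-of-range index is one possible choice that the text does not prescribe.

```python
from typing import Optional


def recipe_material(predicate: dict) -> Optional[dict]:
    """Return the material that holds the recipe, or None if the recipe is not in a material."""
    recipe = predicate.get("recipe") or {}
    index = recipe.get("definedInMaterial")
    if index is None:
        return None  # field omitted or null: the recipe does not come from a material

    materials = predicate.get("materials") or []
    # bool is a subclass of int in Python, so it is excluded explicitly.
    is_integer = isinstance(index, int) and not isinstance(index, bool)
    if not is_integer or not 0 <= index < len(materials):
        raise ValueError(f"definedInMaterial={index!r} does not index into materials")
    return materials[index]
```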
recipe.entryPoint
string, optional
String identifying the entry point. The meaning is defined by recipe.type. For example, if the recipe type were "make", then this would reference the directory in which to run make as well as which target to use.
MAY be omitted if the recipe type specifies a default value.
recipe.arguments
object, optional
Collection of all user-controlled inputs that influenced the build on top of recipe.definedInMaterial and recipe.entryPoint. The schema is defined by recipe.type. A "user" is any entity that is not the builder. For example, if GitHub Actions is the builder, then the "user" is anyone who is not GitHub.
Omit this field (or use null) to indicate "no arguments."
recipe.reproducibility
object, optional
Collection of all builder-controlled inputs that influenced the build on top of recipe.definedInMaterial and recipe.entryPoint. The schema is defined by recipe.type.
TODO: Is there a better name for this? "Reproducibility" sounds more like a property (enum or bool) rather than a set of things needed for reproduction.
metadata
object, optional
Other properties of the build.
metadata.buildStartedOn
string (Timestamp), optional
The timestamp of when the build started.
metadata.buildFinishedOn
string (Timestamp), optional
The timestamp of when the build completed.
metadata.materialsComplete
boolean, optional
If true, materials is claimed to be complete, usually through some controls to prevent network access.
materials
array of objects, optional
The collection of artifacts that influenced the build including sources, dependencies, build tools, base images, and so on.
materials[*].uri
string (ResourceURI), optional
The method by which this artifact was referenced during the build.
TODO: Should we differentiate between the "referenced" URI and the "resolved" URI, e.g. "latest" vs "3.4.1"?
TODO: Should wrap in a locator object to allow for extensibility, in case we add other types of URIs or other non-URI locators?
materials[*].digest
object (DigestSet), optional
Collection of cryptographic digests for the contents of this artifact.
materials[*].mediaType
string (Media Type), optional
The Media Type for this artifact, if known.
materials[*].tags
array (of strings), optional
Unordered set of labels whose meaning is dependent on recipe.type. SHOULD be sorted lexicographically.
TODO: Recommend specific conventions, e.g. source and dev-dependency.
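The material fields described above can be assembled as follows. This is a sketch under assumptions: the helper `make_material` is hypothetical, the DigestSet is assumed to map an algorithm name to a lowercase hex digest (this section does not define DigestSet itself), and deduplicating tags before sorting is a choice made here because tags are described as a set.

```python
import hashlib


def make_material(uri: str, content: bytes, tags=()) -> dict:
    """Build one entry for the materials array from an artifact's URI and bytes."""
    return {
        "uri": uri,
        # Assumed DigestSet shape: {"<algorithm>": "<hex digest>"}.
        "digest": {"sha256": hashlib.sha256(content).hexdigest()},
        # Tags are an unordered set, emitted sorted as the text recommends.
        "tags": sorted(set(tags)),
    }
```

Sorting the tags means two producers describing the same material emit the same serialized form, which makes attestations easier to compare.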
This section shows how builder and recipe would be populated for various common scenarios. Other fields are omitted because they do not vary significantly between systems.
WARNING: This is only for demonstration purposes. The GitHub Actions team has not yet reviewed or approved this design, and it is not yet implemented. Details are subject to change!
GitHub-Hosted runner:
"builder": {
"id": "https://github.com/Attestations/GitHubHostedActions@v1"
}
Self-hosted runner: Not yet supported. We need to figure out a URI scheme that represents what system hosted the runner, or perhaps add additional properties in builder.
GitHub Actions Workflow:
"recipe": {
// Build steps were defined in a GitHub Actions Workflow file ...
"type": "https://github.com/Attestations/GitHubActionsWorkflow@v1",
// ... in the git repo described by `materials[0]` ...
"definedInMaterial": 0,
// ... at the path .github/workflows/build.yaml, using the job "build".
"entryPoint": "build.yaml:build",
// The only possible user-defined parameters that can affect the build are the
// "inputs" to a workflow_dispatch event. This is unset/null for all other
// events.
"arguments": {
"inputs": { ... }
},
// TODO: Additional parameters needed to make the workflow reproducible.
"reproducibility": null
}
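The entryPoint in this example packs a workflow file name and a job name into one string. A consumer might split it as sketched below. This is purely illustrative: the function `parse_workflow_entry_point` is hypothetical, the "file:job" format and the .github/workflows/ prefix are taken only from the comments in the example above, and the design itself is subject to change.

```python
def parse_workflow_entry_point(entry_point: str) -> tuple[str, str]:
    """Split "build.yaml:build" into (workflow path, job name)."""
    # Split on the last colon so the job name is whatever follows it.
    filename, sep, job = entry_point.rpartition(":")
    if not sep or not filename or not job:
        raise ValueError(f"expected '<workflow file>:<job>', got {entry_point!r}")
    return f".github/workflows/{filename}", job
```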
WARNING: This is only for demonstration purposes. The Google Cloud Build team has not yet reviewed or approved this design, and it is not yet implemented. Details are subject to change!
Google-hosted worker:
"builder": {
"id": "https://cloudbuild.googleapis.com/GoogleHostedWorker@v1"
}
Custom worker: Not yet supported. We need to figure out a URI scheme that represents what system hosted the worker, or perhaps add additional properties in builder.
Cloud Build config-as-code (BuildTrigger with filename):
"recipe": {
// Build steps were defined in a cloudbuild.yaml file ...
"type": "https://cloudbuild.googleapis.com/CloudBuildYaml@v1",
// ... in the git repo described by `materials[0]` ...
"definedInMaterial": 0,
// ... at the path path/to/cloudbuild.yaml.
"entryPoint": "path/to/cloudbuild.yaml",
// The only possible user-defined parameters that can affect a BuildTrigger
// are the substitutions in the BuildTrigger.
"arguments": {
"substitutions": {...}
},
// TODO: Additional parameters needed to make the build reproducible.
"reproducibility": null
}
Cloud Build with steps defined in a trigger or over RPC:
"recipe": {
// Build steps were provided as an argument. No `definedInMaterial` or
// `entryPoint`.
"type": "https://cloudbuild.googleapis.com/CloudBuildSteps@v1",
"arguments": {
// The steps that were performed. (Format TBD.)
"steps": [...],
// The substitutions in the build trigger.
"substitutions": {...}
// TODO: Any other arguments?
},
// TODO: Additional parameters needed to make the build reproducible.
"reproducibility": null
}
WARNING: This is just a proof-of-concept. It is not yet standardized.
Individual performed the build, identified by email address:
"builder": {
"id": "mailto:[email protected]"
}
Execution of a bazel build command:
"recipe": {
// Execution of `bazel build` from within the `path/to/workspace`
// directory of `materials[0]`, which is downloaded, extracted (if
// appropriate), and changed into.
"type": "https://example.com/BazelBuild@v1",
"definedInMaterial": 0,
"entryPoint": "path/to/workspace://foo:bar",
"arguments": {
// List of startup options (before the "build" command).
"startupOptions": [],
// List of build flags (after the "build" command).
"buildFlags": []
}
}
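The comments in this recipe imply an ordering: startup options precede the build command and build flags follow it. A sketch of how a re-runner might rebuild the command line from recipe.arguments is given below. The function `bazel_command` is hypothetical, and the target is passed in separately because this proof-of-concept does not define how to split it out of entryPoint.

```python
def bazel_command(arguments: dict, target: str) -> list[str]:
    """Rebuild the argv for `bazel build` from the recipe's arguments object."""
    startup_options = arguments.get("startupOptions") or []
    build_flags = arguments.get("buildFlags") or []
    # Startup options go before the "build" command, build flags after it.
    return ["bazel", *startup_options, "build", *build_flags, target]
```

For example, `bazel_command({"startupOptions": ["--batch"], "buildFlags": ["-c", "opt"]}, "//foo:bar")` yields `["bazel", "--batch", "build", "-c", "opt", "//foo:bar"]`.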
Execution of arbitrary commands:
"recipe": {
// There was no entry point, and the commands were run in an ad-hoc fashion.
// There is no `definedInMaterial` or `entryPoint`.
"type": "https://example.com/ManuallyRunCommands@v1",
"arguments": {
// The list of commands that were executed.
"commands": [
"tar xvf foo-1.2.3.tar.gz",
"cd foo-1.2.3",
"./configure --enable-some-feature",
"make foo.zip"
],
// Indicates how to parse the strings in `commands`.
"shell": "bash"
}
}
See ci_survey.md for a list of well-known CI/CD systems, to make sure they all map cleanly into this schema.