Skip to content

Introduce trusted artifacts#712

Closed
zregvart wants to merge 1 commit into
konflux-ci:mainfrom
zregvart:issue/EC-251
Closed

Introduce trusted artifacts#712
zregvart wants to merge 1 commit into
konflux-ci:mainfrom
zregvart:issue/EC-251

Conversation

@zregvart
Copy link
Copy Markdown
Contributor

@zregvart zregvart commented Dec 19, 2023

Adds the use of trusted artifacts[1] to the Tasks and Pipelines. The prefetch-dependencies Task is no longer skipped based on the hermetic parameter -- it needs to run always to produce the trusted artifacts as the build* tasks depend on these, so if it were skipped the build* tasks would be skipped as well.

This also includes one small bugfix in the buildah task, it no longer tries to pull the scratch image.

Ref: https://issues.redhat.com/browse/EC-251

[1] https://github.com/redhat-appstudio/build-trusted-artifacts

@zregvart zregvart marked this pull request as draft December 19, 2023 16:49
@zregvart zregvart force-pushed the issue/EC-251 branch 13 times, most recently from 36ebdaf to d612aaa Compare December 19, 2023 17:28
Comment thread task/buildah/0.1/buildah.yaml Outdated
@stuartwdouglas
Copy link
Copy Markdown
Contributor

It looks like you have accidentally formatted everything? There is a huge amount of non-related formatting changes in the PR.

Comment thread task/buildah/0.1/buildah.yaml Outdated
@stuartwdouglas
Copy link
Copy Markdown
Contributor

Have you performance tested this? Last I heard we were IO bound on the PVCs, and this has the potential to add significantly more PVC IO, especially for builds with a large number of dependencies.

Is the plan to eventually move away from PVC storage altogether?

@zregvart
Copy link
Copy Markdown
Contributor Author

It looks like you have accidentally formatted everything? There is a huge amount of non-related formatting changes in the PR.

This was on purpose (see the separate commit), I would still like to avoid that and from what it seems the .dotfiles are not included in the archive.

Have you performance tested this? Last I heard we were IO bound on the PVCs, and this has the potential to add significantly more PVC IO, especially for builds with a large number of dependencies.

No I have not, but I can do some preliminary performance tests. I'm pretty sure that the performance will not improve with this.

Is the plan to eventually move away from PVC storage altogether?

Not something I can comment on, also not seeing how that would fit with Tekton's notion of a workspace, we would need to not use those to begin with.

@zregvart zregvart force-pushed the issue/EC-251 branch 10 times, most recently from 48c3cc8 to afb009b Compare December 21, 2023 14:21
Comment thread task/buildah-remote/0.1/buildah-remote.yaml
@chmeliik
Copy link
Copy Markdown
Contributor

Yes! We hope to eventually start using OCI archives. But we (mostly me) didn't want to add that additional layer of complexity to this first iteration.

Would it not be possible to do it without OCI archives? When creating the artifact, use create --store <yes-on-a-PVC>, when using it use --store <not-on-a-PVC>

For some filesystems, writing lots of small files is much worse than one large file, so the create step could be much cheaper than use.

Flashbacks to the time when extracting a 1.0 GiB git repo archive to an aws-efs volume took 30 minutes

@lcarva
Copy link
Copy Markdown
Contributor

lcarva commented Jan 19, 2024

Yes! We hope to eventually start using OCI archives. But we (mostly me) didn't want to add that additional layer of complexity to this first iteration.

Would it not be possible to do it without OCI archives? When creating the artifact, use create --store <yes-on-a-PVC>, when using it use --store <not-on-a-PVC>

For some filesystems, writing lots of small files is much worse than one large file, so the create step could be much cheaper than use.

Flashbacks to the time when extracting a 1.0 GiB git repo archive to an aws-efs volume took 30 minutes

Oh! This reminds me of something. @zregvart, when a trusted artifact is being used, it should get extracted to an emptyDir that only the Task has access to. Otherwise, a Task running in parallel can change the contents after they've been verified.

zregvart added a commit to zregvart/ec-policies that referenced this pull request Jan 29, 2024
Given the change in[1], and the new data available in the attestation,
we can create a chain of PipelineRun tasks that contributed to the image
artifact being built.

Namely, the `*_ARTIFACT` parameters of a task should reference the
`*_ARTIFACT` results of a task whose artifact(s) it depends on.

Given such a chain, we can examine if all the Tasks comprising that
chain are acceptable.

This assumes that any acceptable Task conforms to using _only_ the
trusted artifacts when performing the its undertaking.

reference: https://issues.redhat.com/browse/EC-252

[1] konflux-ci/build-definitions#712
@zregvart zregvart force-pushed the issue/EC-251 branch 7 times, most recently from 64b442e to 017c2d0 Compare January 29, 2024 15:32
@stuartwdouglas
Copy link
Copy Markdown
Contributor

Yes! We hope to eventually start using OCI archives. But we (mostly me) didn't want to add that additional layer of complexity to this first iteration.

Would it not be possible to do it without OCI archives? When creating the artifact, use create --store <yes-on-a-PVC>, when using it use --store <not-on-a-PVC>
For some filesystems, writing lots of small files is much worse than one large file, so the create step could be much cheaper than use.
Flashbacks to the time when extracting a 1.0 GiB git repo archive to an aws-efs volume took 30 minutes

Oh! This reminds me of something. @zregvart, when a trusted artifact is being used, it should get extracted to an emptyDir that only the Task has access to. Otherwise, a Task running in parallel can change the contents after they've been verified.

If we are worried about parallel tasks then shouldn't the 'store' task also do all its processing in an off-PVC location? At the moment it writes the archive to the store directory, the gets the digest and moves it/writes the results. If we are worried about parallel tasks then in theory a parallel task could modify the archive between creation and the original task generating the digest.

@zregvart
Copy link
Copy Markdown
Contributor Author

If we are worried about parallel tasks then shouldn't the 'store' task also do all its processing in an off-PVC location? At the moment it writes the archive to the store directory, the gets the digest and moves it/writes the results. If we are worried about parallel tasks then in theory a parallel task could modify the archive between creation and the original task generating the digest.

When restoring the archive from the store the hash of the archive is checked again, for this to be a concern a Task running in parallel would also need to be able to influence the *_ARTIFACT result of the Task creating the archive, I think that is possible only if the malicious Task also has permissions to modify the TaskRun.

Adds the use of trusted artifacts[1] to the Tasks and Pipelines. The
`prefetch-dependencies` Task is no longer skipped based on the
`hermetic` parameter -- it needs to run always to produce the trusted
artifacts as the build* tasks depend on these, so if it were skipped the
build* tasks would be skipped as well.

With this the default value is set for the `*_ARTIFACT` parameters and
the workspace will be unaltered by the `build-trusted-artifact` steps.
This way it is simpler to opt-in to using trusted artifacts, and the
default pipelines do so.

Ref: https://issues.redhat.com/browse/EC-251

[1] https://github.com/redhat-appstudio/build-trusted-artifacts
@sonarqubecloud
Copy link
Copy Markdown

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@zregvart zregvart marked this pull request as draft January 31, 2024 09:59
@zregvart
Copy link
Copy Markdown
Contributor Author

This will conflict with #771, so let's put this one on hold

@stuartwdouglas
Copy link
Copy Markdown
Contributor

If we are worried about parallel tasks then shouldn't the 'store' task also do all its processing in an off-PVC location? At the moment it writes the archive to the store directory, the gets the digest and moves it/writes the results. If we are worried about parallel tasks then in theory a parallel task could modify the archive between creation and the original task generating the digest.

When restoring the archive from the store the hash of the archive is checked again, for this to be a concern a Task running in parallel would also need to be able to influence the *_ARTIFACT result of the Task creating the archive, I think that is possible only if the malicious Task also has permissions to modify the TaskRun.

I mean if you have two task running in parallel, with either ReadWriteMany or ReadWriteOnce but on the same node, then in this code here there is a potential for a race: https://github.com/redhat-appstudio/build-trusted-artifacts/blob/main/create.sh#L46

Basically at line 46 the archive has been created, but we have not yet calculated the digest. A malicious or erroneous task could in theory modify the archive at this point and we would not be able to detect it, as we have not yet calculated the sha.

@zregvart
Copy link
Copy Markdown
Contributor Author

I mean if you have two task running in parallel, with either ReadWriteMany or ReadWriteOnce but on the same node, then in this code here there is a potential for a race: redhat-appstudio/build-trusted-artifacts@main/create.sh#L46

Basically at line 46 the archive has been created, but we have not yet calculated the digest. A malicious or erroneous task could in theory modify the archive at this point and we would not be able to detect it, as we have not yet calculated the sha.

Only if the $TEMP directory is on the same PVC, right? That will not be the case.

@stuartwdouglas
Copy link
Copy Markdown
Contributor

What do you mean by $TEMP? There is only a call to mktemp -p ${store}, which will create the temp file on the PVC pointed to by store. I think to avoid this you would need to call plain mktemp, create the archive in the /tmp dir, take the sha, then put them both onto the PVC.

@zregvart
Copy link
Copy Markdown
Contributor Author

zregvart commented Feb 1, 2024

What do you mean by $TEMP? There is only a call to mktemp -p ${store}, which will create the temp file on the PVC pointed to by store. I think to avoid this you would need to call plain mktemp, create the archive in the /tmp dir, take the sha, then put them both onto the PVC.

Right, thanks, I've created konflux-ci/build-trusted-artifacts#14 to fix that. I guess the original intent was to make sure that the archive was created on a filesystem with enough free space. Let's see if that becomes an issue at any point.

@zregvart
Copy link
Copy Markdown
Contributor Author

zregvart commented Feb 7, 2024

Submitted a followup in #791, please have a look. I'll close this one as the discussion here will not be applicable over there in most comments.

@zregvart zregvart closed this Feb 7, 2024
zregvart added a commit to zregvart/ec-policies that referenced this pull request Apr 18, 2024
Given the change in[1], and the new data available in the attestation,
we can create a chain of PipelineRun tasks that contributed to the image
artifact being built.

Namely, the `*_ARTIFACT` parameters of a task should reference the
`*_ARTIFACT` results of a task whose artifact(s) it depends on.

Given such a chain, we can examine if all the Tasks comprising that
chain are acceptable.

This assumes that any acceptable Task conforms to using _only_ the
trusted artifacts when performing the its undertaking.

reference: https://issues.redhat.com/browse/EC-252

[1] konflux-ci/build-definitions#712
zregvart added a commit to zregvart/ec-policies that referenced this pull request Apr 18, 2024
Given the change in[1], and the new data available in the attestation,
we can create a chain of PipelineRun tasks that contributed to the image
artifact being built.

Namely, the `*_ARTIFACT` parameters of a task should reference the
`*_ARTIFACT` results of a task whose artifact(s) it depends on.

Given such a chain, we can examine if all the Tasks comprising that
chain are acceptable.

This assumes that any acceptable Task conforms to using _only_ the
trusted artifacts when performing the its undertaking.

Reference: EC-252

[1] konflux-ci/build-definitions#712
zregvart added a commit to zregvart/ec-policies that referenced this pull request May 21, 2024
Given the change in[1], and the new data available in the attestation,
we can create a chain of PipelineRun tasks that contributed to the image
artifact being built.

Namely, the `*_ARTIFACT` parameters of a task should reference the
`*_ARTIFACT` results of a task whose artifact(s) it depends on.

Given such a chain, we can examine if all the Tasks comprising that
chain are acceptable.

This assumes that any acceptable Task conforms to using _only_ the
trusted artifacts when performing the its undertaking.

Reference: EC-252

[1] konflux-ci/build-definitions#712
zregvart added a commit to zregvart/ec-policies that referenced this pull request May 21, 2024
Given the change in[1], and the new data available in the attestation,
we can create a chain of PipelineRun tasks that contributed to the image
artifact being built.

Namely, the `*_ARTIFACT` parameters of a task should reference the
`*_ARTIFACT` results of a task whose artifact(s) it depends on.

Given such a chain, we can examine if all the Tasks comprising that
chain are acceptable.

This assumes that any acceptable Task conforms to using _only_ the
trusted artifacts when performing the its undertaking.

Reference: EC-252

[1] konflux-ci/build-definitions#712
zregvart added a commit to zregvart/ec-policies that referenced this pull request May 21, 2024
Given the change in[1], and the new data available in the attestation,
we can create a chain of PipelineRun tasks that contributed to the image
artifact being built.

Namely, the `*_ARTIFACT` parameters of a task should reference the
`*_ARTIFACT` results of a task whose artifact(s) it depends on.

Given such a chain, we can examine if all the Tasks comprising that
chain are acceptable.

This assumes that any acceptable Task conforms to using _only_ the
trusted artifacts when performing the its undertaking.

Reference: EC-252

[1] konflux-ci/build-definitions#712
lcarva pushed a commit to lcarva/ec-policies that referenced this pull request May 21, 2024
Given the change in[1], and the new data available in the attestation,
we can create a chain of PipelineRun tasks that contributed to the image
artifact being built.

Namely, the `*_ARTIFACT` parameters of a task should reference the
`*_ARTIFACT` results of a task whose artifact(s) it depends on.

Given such a chain, we can examine if all the Tasks comprising that
chain are acceptable.

This assumes that any acceptable Task conforms to using _only_ the
trusted artifacts when performing the its undertaking.

Reference: EC-252

[1] konflux-ci/build-definitions#712
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants