Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Components not properly in dep tree nor BOM #540

Open
jkugler opened this issue Jan 27, 2024 · 17 comments · May be fixed by #678
Open

Components not properly in dep tree nor BOM #540

jkugler opened this issue Jan 27, 2024 · 17 comments · May be fixed by #678
Assignees

Comments

@jkugler
Copy link
Contributor

jkugler commented Jan 27, 2024

This is version 6.4.0

Components with unique bom_refs, but the same name, will generate an error when trying to render a dependency tree. Given this script:

#!/usr/bin/env python3

from cyclonedx.model.bom import Bom
from cyclonedx.model.component import Component, ComponentType
from cyclonedx.output.json import JsonV1Dot5

bom = Bom()
bom.metadata.component = root_component = Component(
    name='myApp',
    type=ComponentType.APPLICATION,
    bom_ref="myApp"
)

component1 = Component(
    type=ComponentType.LIBRARY,
    name='some-component',
    bom_ref="some-component"
)
bom.components.add(component1)
bom.register_dependency(root_component, [component1])

component2 = Component(
    type=ComponentType.LIBRARY,
    name='some-library',
    bom_ref="some-library1"
)
bom.components.add(component2)
bom.register_dependency(component1, [component2])

component3 = Component(
    type=ComponentType.LIBRARY,
    name='some-library',
    bom_ref="some-library2"
)
bom.components.add(component3)
bom.register_dependency(component1, [component3])

print(JsonV1Dot5(bom).output_as_string(indent=2))

I get this error when I run it:

Traceback (most recent call last):
  File "/Users/tek30584/programming/cdx_lib_bugs/./duplicate_name_bug.py", line 38, in <module>
    print(JsonV1Dot5(bom).output_as_string(indent=2))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tek30584/programming/cdx_lib_bugs/.venv/lib/python3.11/site-packages/cyclonedx/output/json.py", line 82, in output_as_string
    self.generate()
  File "/Users/tek30584/programming/cdx_lib_bugs/.venv/lib/python3.11/site-packages/cyclonedx/output/json.py", line 70, in generate
    bom.validate()
  File "/Users/tek30584/programming/cdx_lib_bugs/.venv/lib/python3.11/site-packages/cyclonedx/model/bom.py", line 600, in validate
    raise UnknownComponentDependencyException(
cyclonedx.exception.model.UnknownComponentDependencyException: One or more Components have Dependency references to Components/Services that are not known in this BOM. They are: {<BomRef 'some-library2'>}
@jkowalleck jkowalleck self-assigned this Jan 27, 2024
@jkowalleck
Copy link
Member

jkowalleck commented Jan 27, 2024

will add test cases to older versions, for regression and showcasing-purposes,
and then forward-port those and add fixes.

I am planning to have all affected major versions fixed.

++++

the issue does not rely on same name of components, but the fact that both component have an equal (not identical) bom_ref - with a None value.
Need to showcase this in tests.

@jkowalleck jkowalleck changed the title Components with same names cause rendering errors Components with equal bom_ref.name break dependency tree Jan 27, 2024
@jkowalleck jkowalleck added the bug Something isn't working label Jan 27, 2024
@jkugler
Copy link
Contributor Author

jkugler commented Jan 27, 2024

the issue does not rely on same name of components, but the fact that both component have an equal (not identical) bom_ref - with a None value.

I'm not understanding this. All the components in the example script have unique bom_ref values.

@jkowalleck jkowalleck changed the title Components with equal bom_ref.name break dependency tree Components with equal bom_ref.value break dependency tree Jan 28, 2024
@jkowalleck jkowalleck changed the title Components with equal bom_ref.value break dependency tree Components not properly in dep tree Jan 28, 2024
@jkowalleck jkowalleck changed the title Components not properly in dep tree Components not properly in dep tree nor BOM Jan 29, 2024
@jkowalleck jkowalleck removed the bug Something isn't working label Jan 29, 2024
@jkowalleck
Copy link
Member

jkowalleck commented Jan 29, 2024

did some research and found, that component3 was not added to the BOM.
This is an expected behavior, as the properties relevant for equality of component2 are the same as of component3.

Per CycloneDX Specification, the minimal set of equality-properties of Components are: type, name`
Bom-ref is not an identity-property nor an equality property. Bom-ref's purpose is linking elements, it does not have any meaning besides this purpose.

As all is ass expected, I'll close this issue.
This Issue is not locked, so further discussion may take place here, instead of opening another issue.

@jkowalleck jkowalleck closed this as not planned Won't fix, can't repro, duplicate, stale Jan 29, 2024
@jkugler
Copy link
Contributor Author

jkugler commented Jan 29, 2024

Ah, that is interesting. I would expect a "guaranteed unique" property would be used for testing "unique-ness." But I understand. I guess I'm drawing too much on my database experience. :) Thanks for digging in to this and doing the research!

@eugenhoffmann
Copy link

I encountered a similar issue with a different component tree.
One of the features of CycloneDX is the ability to declare nested components, which is helpful when merging multiple BOM files into a single one. In some cases, these BOM files may contain identical components. However, depending on the resulting component tree, I observed inconsistent behavior - either receiving an exception or no exception.

  • Case 1: The BOM file contains two identical components (x, y). The only difference is the 'bom-ref'. The validation works fine, because the components are nested below different root components (A, B) --> No Exception

  • Case 2: The BOM file contains two identical components (x, y). The only difference is the 'bom-ref'. The validation raises an exception, because the components are both nested below the same root component (A) --> 'UnknownComponentDependencyException'

models

In my opinion, the component equality validation should be performed within the context of a parent component, rather than within the context of a root component. Am I misunderstanding something?

@jkowalleck
Copy link
Member

@madpah FYI

@jkowalleck
Copy link
Member

jkowalleck commented Apr 24, 2024

this issue should have been closed via #587 or so

@eugenhoffmann
Copy link

From my perspective, it seems that this issue does not specifically address the same problem.

To reproduce the second case mentioned here, you can use the provided example BOM file UnkCompDepEx.json.

Here is a code example that reads the BOM file and writes it to a string:

import json
from cyclonedx.model.bom import Bom
from cyclonedx.output.json import JsonV1Dot5

with open("UnkCompDepEx.json") as input_json:
    deserialized_bom = Bom.from_json(data=json.loads(input_json.read()))

print(JsonV1Dot5(deserialized_bom).output_as_string(indent=2))

When running this code, I get the 'UnknownComponentDependencyException' error message:

File "F:\Git\rsb-sbom\venv\lib\site-packages\cyclonedx\output\json.py", line 83, in output_as_string
  self.generate()
File "F:\Git\rsb-sbom\venv\lib\site-packages\cyclonedx\output\json.py", line 71, in generate
  bom.validate()
File "F:\Git\rsb-sbom\venv\lib\site-packages\cyclonedx\model\bom.py", line 658, in validate
  raise UnknownComponentDependencyException(
cyclonedx.exception.model.UnknownComponentDependencyException: One or more Components have Dependency references to Components/Services that are not known in this BOM. They are: {<BomRef 'y2' id=2229836430400>, <BomRef 'x2' id=2229836429824>}

Feel free to reach out if you need further assistance

@jkowalleck jkowalleck reopened this Apr 27, 2024
@wkoot
Copy link

wkoot commented Sep 11, 2024

I think the problem lies in the deduplication of SortedSet; elements are not added to the set when their value matches.
The equality operator in the Component class compares the objects' hash values, which ignores the nested BomRef object.

Example:

>>> from cyclonedx.model.component import Component, BomRef
>>> c1 = Component(bom_ref=BomRef(value='71dd3e04-ac87-4f23-818f-3cd9a17bc785'), name='pip')
>>> c2 = Component(bom_ref=BomRef(value='3d383d8b-af76-462f-9701-503cb6c5eb00'), name='pip')
>>> c1 == c2
True

If object equality is not defined by their BomRef, the assumption that BomRef is a foreign key does not hold.
The related PR #587 seems to deal with a similar issue, where certain fields were omitted from the lt operator.

wkoot added a commit to wkoot/cyclonedx-python-lib that referenced this issue Sep 11, 2024
@jkowalleck
Copy link
Member

jkowalleck commented Sep 11, 2024

I think the problem lies in the deduplication of SortedSet; elements are not added to the set when their value matches. The equality operator in the Component class compares the objects' hash values, which ignores the nested BomRef object.

Example:

>>> from cyclonedx.model.component import Component, BomRef
>>> c1 = Component(bom_ref=BomRef(value='71dd3e04-ac87-4f23-818f-3cd9a17bc785'), name='pip')
>>> c2 = Component(bom_ref=BomRef(value='3d383d8b-af76-462f-9701-503cb6c5eb00'), name='pip')
>>> c1 == c2
True

If object equality is not defined by their BomRef, the assumption that BomRef is a foreign key does not hold. The related PR #587 seems to deal with a similar issue, where certain fields were omitted from the lt operator.

bom-ref is intentional not part of the __eq__ or __hash__ method - it has no meaning, it is a mere reference-indicator.
c1 and c2 are the same component -> both are pip

@wkoot
Copy link

wkoot commented Sep 11, 2024

Fair point but when there are multiple BOM files sourced in the python script, this is no longer applicable.
When merging sets of Vulnerability objects from separate BOM files, they refer to their own Component.
If only one of these is added to the global list of Components, all references will fail since they assume they can use BomRef.

For example, in this case for already loaded bom_files 1 and 2:

for loaded_bom in [bom_file1, bom_file2]:
  root_bom.components |= loaded_bom.components
  root_bom.dependencies |= loaded_bom.dependencies

The set difference loaded_bom.dependencies - loaded_bom.components is empty, while root_bom.dependencies - root_bom.components may not be (due to deduplication based on values without BomRef).
The validate call on the root_bom will then throw a UnknownComponentDependencyException as described in the issue.

Is the BomRef excluded from the hash for optimization? It seems to work different in cyclonedx-dotnet-library.
The merge option in cyclonedx-cli does not seem to do this, allowing for duplicate components instead.

@jkowalleck
Copy link
Member

Fair point but when there are multiple BOM files sourced in the python script, this is no longer applicable. When merging sets of Vulnerability objects from separate BOM files, they refer to their own Component. If only one of these is added to the global list of Components, all references will fail since they assume they can use BomRef.

For example, in this case for already loaded bom_files 1 and 2:

for loaded_bom in [bom_file1, bom_file2]:
  root_bom.components |= loaded_bom.components
  root_bom.dependencies |= loaded_bom.dependencies

The set difference loaded_bom.dependencies - loaded_bom.components is empty, while root_bom.dependencies - root_bom.components may not be (due to deduplication based on values without BomRef). The validate call on the root_bom will then throw a UnknownComponentDependencyException as described in the issue.

then the python script you wrote needs top take care of this.

@wkoot
Copy link

wkoot commented Sep 11, 2024

bom-ref is intentional not part of the __eq__ or __hash__ method - it has no meaning, it is a mere reference-indicator. c1 and c2 are the same component -> both are pip

I have submitted CycloneDX/cyclonedx-cli#399 to address the discrepancy between the implementation in python and c# libraries.
It is however unclear to me as to where the intended behaviour is specified. Do you know if it is somewhere in the spec?
There's nothing mentioned in the CycloneDX JSON Reference, although that's arguably the wrong location to explain deduplication.

Note also that the MergeTests for the C# library explicitly do not take bom-ref of components into account.
For that matter, this also isn't tested in neither test_component.py nor test_model_component.py.
NB - I'm using v1.5 of the spec because the C# library (and therefore cyclonedx-cli) does not yet support v1.6.

@wkoot
Copy link

wkoot commented Sep 18, 2024

bom-ref is intentional not part of the __eq__ or __hash__ method - it has no meaning, it is a mere reference-indicator. c1 and c2 are the same component -> both are pip

I was wondering if duplicate components in nested objects are supposed to be deduplicated, or if they should not be.
Currently, the equality check is only called in set addition - so not in any parent component within the tree.
See also aquasecurity/trivy#7532 (comment); should nesting these components be a solution?
However, if nesting would be an allowed solution - wouldn't that also mean vulnerabilities have to be duplicated?

@jkowalleck
Copy link
Member

bom-ref is intentional not part of the __eq__ or __hash__ method - it has no meaning, it is a mere reference-indicator. c1 and c2 are the same component -> both are pip

I was wondering if duplicate components in nested objects are supposed to be deduplicated, or if they should not be. Currently, the equality check is only called in set addition - so not in any parent component within the tree. See also aquasecurity/trivy#7532 (comment); should nesting these components be a solution? However, if nesting would be an allowed solution - wouldn't that also mean vulnerabilities have to be duplicated?

not intended for nested of parallels, as they may contextualize the individual outer component.

@wkoot
Copy link

wkoot commented Sep 18, 2024

It seems that deserialization silently loads broken bom data, in the case of duplicate components as earlier referred to.

EDIT: Created separate issue - #677

@jkowalleck
Copy link
Member

I'm not sure if this still belongs in the current issue, or if I should open a separate ticket with these findings.

please open a dedicated issue for that.

wkoot added a commit to wkoot/cyclonedx-python-lib that referenced this issue Sep 19, 2024
wkoot added a commit to wkoot/cyclonedx-python-lib that referenced this issue Sep 19, 2024
@wkoot wkoot linked a pull request Sep 19, 2024 that will close this issue
wkoot added a commit to wkoot/cyclonedx-python-lib that referenced this issue Sep 19, 2024
wkoot added a commit to wkoot/cyclonedx-python-lib that referenced this issue Sep 19, 2024
wkoot added a commit to wkoot/cyclonedx-python-lib that referenced this issue Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants