8000 Slow for SBOMs with a large number of files + relationships · Issue #790 · spdx/tools-python · GitHub
[go: up one dir, main page]

Skip to content
Slow for SBOMs with a large number of files + relationships #790
Open
@jonjohnsonjr

Description

@jonjohnsonjr

This check iterates over the list existing_relationships up to 2 times:

def check_if_relationship_exists(
self, relationship: Relationship, existing_relationships: List[Relationship]
) -> bool:
# assume existing relationships are stripped of comments
if relationship in existing_relationships:
return True
relationship_inverted: Relationship = self.invert_relationship(relationship)
if relationship_inverted in existing_relationships:
return True
return False

And it's called for every file:

for file_spdx_id in contained_files:
try:
contains_relationship = Relationship(
spdx_element_id=package_spdx_id,
relationship_type=RelationshipType.CONTAINS,
related_spdx_element_id=file_spdx_id,
)
except ConstructorTypeErrors as err:
logger.append(err.get_messages())
continue
if not self.check_if_relationship_exists(
relationship=contains_relationship, existing_relationships=existing_relationships
):
contains_relationships.append(contains_relationship)

So if F is the number of files and R is the number of relationships, we are doing O(F * R) comparisons of relationships (which seem not cheap individually).

And given that this seems to expect to find a relationship for every file, we know that R is >= F, so this is at least O(F^2).

Looks quadratic to me.

With the caveat that I'm not a python expert, this being a list seems to be the issue:

existing_relationships_without_comments: List[Relationship] = self.get_all_relationships_without_comments(
relationships
)

Is it possible to turn that into a set so these membership tests are O(1) instead of O(N)?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0