8000 Fix and test the sharded file hasher by stefanberger · Pull Request #465 · sigstore/model-transparency · GitHub
[go: up one dir, main page]

Skip to content

Fix and test the sharded file hasher #465

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 21, 2025

Conversation

stefanberger
Copy link
Contributor

Summary

This PR brings the state of the sharded file hasher to that of the simple file hasher and adds a test case for the shareded file hasher derived from the certificate sign+verify test.

Checklist
  • All commits are signed-off, using DCO
  • All new code has docstrings and type annotations
  • All new code is covered by tests. Aim for at least 90% coverage. CI is configured to highlight lines not covered by tests.
  • Public facing changes are paired with documentation changes
  • Release note has been added to CHANGELOG.md if needed

@stefanberger stefanberger requested review from a team as code owners May 21, 2025 14:48
@stefanberger stefanberger marked this pull request as draft May 21, 2025 14:48
Bring the sharded hasher to the same state as the simple file hasher:
- Record ignored files in the predicate's serialization description
- Adjust the model name in case it is either empty or '..'

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
@stefanberger stefanberger force-pushed the sharded_file_hasher_test branch from cfd6979 to 8a099c7 Compare May 21, 2025 16:18
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
@stefanberger stefanberger force-pushed the sharded_file_hasher_test branch from 8a099c7 to 25a8080 Compare May 21, 2025 16:22
@stefanberger stefanberger marked this pull request as ready for review May 21, 2025 16:22
@stefanberger
Copy link
Contributor Author

Seeing these non-deterministic errors here:

self = <tuf.ngclient.updater.Updater object at 0x000001C69D7A5050>

    def _update_root_symlink(self) -> None:
        """Symlink root.json to current trusted root version in root_history/"""
        linkname = os.path.join(self._dir, "root.json")
        version = self._trusted_set.root.version
        current = os.path.join("root_history", f"{version}.root.json")
        with contextlib.suppress(FileNotFoundError):
            os.remove(linkname)
>       os.symlink(current, linkname)
E       FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'root_history\\12.root.json' -> 'C:\\Users\\runneradmin\\AppData\\Local\\sigstore\\sigstore-python\\tuf\\https%3A%2F%2Ftuf-repo-cdn.sigstore.dev\\root.json'

C:\Users\runneradmin\AppData\Local\hatch\env\virtual\model-signing\DK7Yf6Fk\hatch-test.py3.11\Lib\site-packages\tuf\ngclient\updater.py:367: FileExistsError
=========================== short test summary info ===========================
FAILED tests/api_test.py::TestKeySigning::test_sign_and_verify - FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'root_history\\12.root.json' -> 'C:\\Users\\runneradmin\\AppData\\Local\\sigstore\\sigstore-python\\tuf\\https%3A%2F%2Ftuf-repo-cdn.sigstore.dev\\root.json'
======================== 1 failed, 152 passed in 5.57s ========================
Error: Process completed with exit code 1.

@spencerschrock
Copy link
Contributor

Seeing these non-deterministic errors here

I recommend using unshare if you available.

unshare --map-root-user --net pytest -v -m "not integration"
~/model-transparency/tests/api_test.py:260: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
~/model-transparency/src/model_signing/signing.py:90: in __init__
    self.use_sigstore_signer()
~/model-transparency/src/model_signing/signing.py:147: in use_sigstore_signer
    self._signer = sigstore.Signer(
~/model-transparency/src/model_signing/_signing/sign_sigstore.py:93: in __init__
    self._signing_context = sigstore_signer.SigningContext.production()
~/.local/share/hatch/env/virtual/model-signing/VsHuyp0T/hatch-test.py3.12/lib/python3.12/site-packages/sigstore/sign.py:334: in production
    trusted_root=TrustedRoot.production(),
~/.local/share/hatch/env/virtual/model-signing/VsHuyp0T/hatch-test.py3.12/lib/python3.12/site-packages/sigstore/_internal/trust.py:351: in production
    return cls.from_tuf(DEFAULT_TUF_URL, offline)
~/.local/share/hatch/env/virtual/model-signing/VsHuyp0T/hatch-test.py3.12/lib/python3.12/site-packages/sigstore/_internal/trust.py:338: in from_tuf
    path = TrustUpdater(url, offline).get_trusted_root_path()

Since Sigstore is the default, the TUF fetching code is invoked when the tests call signing.Config().

signing.Config().use_certificate_signer(
private_key=private_key,
signing_certificate=signing_certificate,
certificate_chain=certificate_chain,
).set_hashing_config(

We have a few options:

  1. Mock all these failing tests with mocked_sigstore_signer like we do in the Sigstore unit tests.
  • scripts/tests/verify_test.py::TestVerify::test_verify_sigstore_v0_3_1
  • scripts/tests/verify_test.py::TestVerify::test_verify_sigstore_v1_0_0
  • tests/api_test.py::TestKeySigning::test_sign_and_verify
  • tests/api_test.py::TestCertificateSigning::test_sign_and_verify
  • tests/api_test.py::TestCertificateSigning::test_sign_and_verify_sharded
  1. Lazy init the signer so we don't reach out to network until we call .sign(), and then we can make the SigstoreSigner if none is provided

I lean towards option 2, @mihaimaruseac may have other thoughts.

@mihaimaruseac
Copy link
Collaborator

Yeah, let's do option 2. Merged the other PR already.

@stefanberger
Copy link
Contributor Author

Since Sigstore is the default, the TUF fetching code is invoked when the tests call signing.Config().

I happens very rarely on my development machine and more often in the github actions actually. I got this here:

E       FileExistsError: [Errno 17] File exists: 'root_history/12.root.json' -> '/home/stefanb/.local/share/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev/root.json'

If feels like this error is related to concurrency and could even show up if someone ran multiple instances of the model_signing library while creating signatures concurrently. If so, the TUF library should probably have a lock/lock-file that prevents concurrency inside this function.

@mihaimaruseac mihaimaruseac merged commit a85bb63 into sigstore:main May 21, 2025
51 checks passed
@stefanberger stefanberger deleted the sharded_file_hasher_test branch May 21, 2025 17:50
@spencerschrock
Copy link
Contributor

If so, the TUF library should probably have a lock/lock-file that prevents concurrency inside this function.

It sounds like this should be fixed by sigstore-python instead (I'll file an issue):

Note that applications using Updater should be 'single instance'
applications: running multiple instances that use the same cache directories at
the same time is not supported.

https://github.com/theupdateframework/python-tuf/blob/6fc2a3c275ce9911984e8d17578aeec51b506683/tuf/ngclient/updater.py#L32-L34

@jku
Copy link
Member
jku commented May 23, 2025

TUF library should probably have a lock/lock-file that prevents concurrency inside this function.

I don't disagree (it's just that implementing lockfiles well is not always trivial) theupdateframework/python-tuf#2836

As a potential fix for the parallel testing use case specifically: The conflict happens because the tuf client writes/reads files in the metadata store and artifact cache. These locations are decided by sigstore-python based on user data dir and user cache dir respectively (with https://github.com/tox-dev/platformdirs). So if you are able to set those to a unique location for each te 7AB4 st, then conflicts will not happen... Unfortunately that probably requires platform-specific shenanigans: setting e.g. $XDG_DATA_HOME is unlikely to do anything on Windows

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0