8000 Concurrency errors from multiple TUF Updaters · Issue #1403 · sigstore/sigstore-python · GitHub
[go: up one dir, main page]

Skip to content

Concurrency errors from multiple TUF Updaters #1403

New issue

Have a question abo 8000 ut this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
spencerschrock opened this issue May 21, 2025 · 8 comments
Open

Concurrency errors from multiple TUF Updaters #1403

spencerschrock opened this issue May 21, 2025 · 8 comments
Assignees
Labels
component:tuf TUF related components documentation Improvements or additions to documentation

Comments

@spencerschrock
Copy link
spencerschrock commented May 21, 2025

Description

Hello, Sigstore's model-signing library uses sigstore-python, which in turn uses python-tuf.

We've had some intermittent test failures (sigstore/model-transparency#465) when we have multiple Sigstore signers, which launch multiple trusted roots and lead to some known concurrency errors. TUF's documentation says not to have multiple Updaters

Note that applications using Updater should be 'single instance'
applications: running multiple instances that use the same cache directories at
the same time is not supported.

The callstack is roughly this:

~/model-transparency/src/model_signing/signing.py:147: in use_sigstore_signer
    self._signer = sigstore.Signer(
~/model-transparency/src/model_signing/_signing/sign_sigstore.py:93: in __init__
    self._signing_context = sigstore_signer.SigningContext.production()
~/.local/share/hatch/env/virtual/model-signing/VsHuyp0T/hatch-test.py3.12/lib/python3.12/site-packages/sigstore/sign.py:334: in production
    trusted_root=TrustedRoot.production(),

This scenario is in violation of TUF's concurrency support, but does/should Sigstore python support this scenario? If not, can you update your documentation to clarify this?

self = <tuf.ngclient.updater.Updater object at 0x000001C69D7A5050>

    def _update_root_symlink(self) -> None:
        """Symlink root.json to current trusted root version in root_history/"""
        linkname = os.path.join(self._dir, "root.json")
        version = self._trusted_set.root.version
        current = os.path.join("root_history", f"{version}.root.json")
        with contextlib.suppress(FileNotFoundError):
            os.remove(linkname)
>       os.symlink(current, linkname)
E       FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'root_history\\12.root.json' -> 'C:\\Users\\runneradmin\\AppData\\Local\\sigstore\\sigstore-python\\tuf\\https%3A%2F%2Ftuf-repo-cdn.sigstore.dev\\root.json'

I've only seen it on our Windows runners, but haven't looked into it too much yet.
https://github.com/sigstore/model-transparency/actions/runs/15144309667/job/42576120558#step:4:100
https://github.com/sigstore/model-transparency/actions/runs/15165373394/job/42641686030#step:4:102

Version

sigstore 3.6.2

@spencerschrock spencerschrock added the bug Something isn't working label May 21, 2025
@woodruffw
Copy link
Member

Hi @spencerschrock, thanks for opening an issue!

This scenario is in violation of TUF's concurrency support, but does/should Sigstore python support this scenario? If not, can you update your documentation to clarify this?

Yeah, I think this is outside of our intended support: we use the python-tuf APIs directly, so we inherit whatever concurrency constraints they impose.

I'll look at opening a docs PR to clarify this 🙂

@woodruffw woodruffw self-assigned this May 21, 2025
@woodruffw woodruffw added documentation Improvements or additions to documentation component:tuf TUF related components and removed bug Something isn't working labels May 21, 2025
@jku
Copy link
Member
jku commented May 22, 2025

I'm not sure if python-tuf can 100% guarantee safe concurrency but we could do a lot better (just a lock file per repository would handle most issues) it's just not been a priority. I'll file an issue in any case.

@spencerschrock can you clarify what your design looks like: do you have multiple processes or multiple threads or just multiple signingcontexts in a single thread?

@spencerschrock
Copy link
Author

Currently the issue was exposed in our test suite, where pytest launches multiple processes to shard the tests.

In terms of real usage, our project's CLI creates one signingcontext but nothing stops users from running multiple CLI processes. Similarly, I have no immediate insight into how people are using our library in parallel.

cc @mihaimaruseac @stefanberger

@mihaimaruseac
Copy link

We might actually have multiple processes try to sign various models in threads, using the same signing context. I have not run an experiment with that, though. But I imagine some model hub wanting to sign all past models in a pipeline, might perform signing in parallel.

I'll run some experiments and be back

@stefanberger
Copy link

I ran this experiment in two python interpreters:

from model_signing import signing
while True:
    signing.Config().use_sigstore_signer()

Sooner or later one of them trows an exception, oftentimes one related to that existing file.

FileExistsError: [Errno 17] File exists: 'root_history/12.root.json' -> '/home/stefanb/.local/share/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev/root.json'

I don't think the error will happen very often in practice... but it should have some sort of serialization.

Let's say we wanted to put a lock-file into /home/stefanb/.local/share/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev/ -- how would we create the path on the level of the model_signing library, especially the https%3A%2F%2Ftuf-repo-cdn.sigstore.dev part? We could put a lock file into /home/stefanb/.local/share/sigstore-python/tuf/.lock, maybe it would have less granularity, but it doesn't seem right for the model signing library to do this in .../tuf directory.

Full trace:

  File "<python-input-6>", line 2, in <module>
    signing.Config().use_sigstore_signer()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/stefanb/dev/model-transparency/src/model_signing/signing.py", line 150, in use_sigstore_signer
    self._signer = sigstore.Signer(
                   ~~~~~~~~~~~~~~~^
        oidc_issuer=oidc_issuer,
        ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
        identity_token=identity_token,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/stefanb/dev/model-transparency/src/model_signing/_signing/sign_sigstore.py", line 93, in __init__
    self._signing_context = sigstore_signer.SigningContext.production()
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/stefanb/dev/model-transparency/venv/lib64/python3.13/site-packages/sigstore/sign.py", line 334, in production
    trusted_root=TrustedRoot.production(),
                 ~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/stefanb/dev/model-transparency/venv/lib64/python3.13/site-packages/sigstore/_internal/trust.py", line 351, in production
    return cls.from_tuf(DEFAULT_TUF_URL, offline)
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefanb/dev/model-transparency/venv/lib64/python3.13/site-packages/sigstore/_internal/trust.py", line 338, in from_tuf
    path = TrustUpdater(url, offline).get_trusted_root_path()
           ~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/home/stefanb/dev/model-transparency/venv/lib64/python3.13/site-packages/sigstore/_internal/tuf.py", line 116, in __init__
    self._updater = Updater(
                    ~~~~~~~^
        metadata_dir=str(self._metadata_dir),
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
        bootstrap=root_json,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/stefanb/dev/model-transparency/venv/lib64/python3.13/site-packages/tuf/ngclient/updater.py", line 143, in __init__
    self._update_root_symlink()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/stefanb/dev/model-transparency/venv/lib64/python3.13/site-packages/tuf/ngclient/updater.py", line 367, in _update_root_symlink
    os.symlink(current, linkname)
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
FileExistsError: [Errno 17] File exists: 'root_history/12.root.json' -> '/home/stefanb/.local/share/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev/root.json'

It also can be reproduced with this:

from sigstore import sign
while True:
    sign.TrustedRoot.production()

@jku
Copy link
Member
jku commented May 26, 2025

Let's say we wanted to put a lock-file into /home/stefanb/.local/share/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev/ -- how would we create the path on the level of the model_signing library, especially the https%3A%2F%2Ftuf-repo-cdn.sigstore.dev part? We could put a lock file into /home/stefanb/.local/share/sigstore-python/tuf/.lock, maybe it would have less granularity, but it doesn't seem right for the model signing library to do this in .../tuf directory.

Yeah other software touching things under .local/share/sigstore-python is not great. There's two potential improvements that make sense that I can see:

  • the higher level component (like model-signing test suite as an example) can set $HOME or $XDG_DATA_HOME during runtime to make sure every instance gets their own directory
    • this is fine for a test suite but usually not so much for actual user software
    • multi-platform support is still work
  • actual lockfile support could be added to python-tuf
    • this means more 3rd party dependencies or very annoying multi-platform code, but is certainly an option

@stefanberger
Copy link

multi-platform support is still work

The first errors in the test suite actually occurred on Windows test machines in github actions...

actual lockfile support could be added to python-tuf

* this means more 3rd party dependencies or very annoying multi-platform code, but is certainly an option

Considering that I was able to reproduce the issue on the python-sigstore level here, which is part of the model signing code path, I would say that it should also be solved on this level. It would resolve it for all callers.

from sigstore import sign
while True:
    sign.TrustedRoot.production()

@woodruffw
Copy link
Member

Considering that I was able to reproduce the issue on the python-sigstore level here, which is part of the model signing code path, I would say that it should also be solved on this level. It would resolve it for all callers.

I believe @jku is saying that this should be solved in python-tuf, since that would transitively solve the problem for both sigstore-python and model-signing.

(I'm hesitant to have sigstore-python patch around it directly, since IMO we should peek under the veil of underlying TUF operations as little as possible -- python-tuf is expected to provide a sound abstraction for us here.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:tuf TUF related components documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

5 participants
0