10000 build: add freethreaded python ci and wheels by lgray · Pull Request #284 · cms-nanoAOD/correctionlib · GitHub
[go: up one dir, main page]

Skip to content

build: add freethreaded python ci and wheels #284

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: master
Choose a base branch
from

Conversation

lgray
Copy link
Contributor
@lgray lgray commented Mar 12, 2025
  • ci
  • wheels
  • tests

Will add wheels when they're fixed up by #283

@lgray
Copy link
Contributor Author
lgray commented Mar 12, 2025

Ah - right we are blocked because pydantic is not updated to be compatible with 3.13t at all.

Nope it's in beta now!

@lgray lgray changed the title add freethreaded python ci and wheels build: add freethreaded python ci and wheels Mar 12, 2025
@lgray
Copy link
Contributor Author
lgray commented Mar 17, 2025

refreshing this PR since there's now beta2 of the freethreaded pydantic

@lgray
Copy link
Contributor Author
lgray commented Mar 17, 2025

@mgorny @nsmith- @henryiii

I noticed when it's doing the freethreaded windows build that the MS linker is looking for python313.lib, rather than python313t.lib that's specified as PythonLib in CMake.

Results in the error (I think):

LINK : 
8000
fatal error LNK1104: cannot open file 'python313.lib' [C:\Users\runneradmin\AppData\Local\Temp\tmpu4rnegbz\build\_core.vcxproj]

I think that's the only blocker on windows.

@mgorny
Copy link
Contributor
mgorny commented Mar 18, 2025

That looks like https://gitlab.kitware.com/cmake/cmake/-/issues/26016, which is supposedly fixed but I'm not sure in which CMake version.

@mgorny
Copy link
Contributor
mgorny commented Mar 18, 2025

That looks like https://gitlab.kitware.com/cmake/cmake/-/issues/26016, which is supposedly fixed but I'm not sure in which CMake version.

Ah, sorry, now I see 3.30.3 — so supposedly it should work here.

@mgorny
Copy link
Contributor
mgorny commented Mar 18, 2025

Sorry for thinking loudly. I see that CMake is doing the right thing:

2025-03-17T20:15:06.0037191Z   -- Found PythonInterp: C:/hostedtoolcache/windows/Python/3.13.1/x64-freethreaded/python.exe (found suitable version "3.13.1", minimum required is "3.7")
2025-03-17T20:15:06.0038736Z   -- Found PythonLibs: C:/hostedtoolcache/windows/Python/3.13.1/x64-freethreaded/libs/python313t.lib

So looks like python313.lib is coming from elsewhere. pybind11 perhaps?

@lgray
Copy link
Contributor Author
lgray commented Mar 18, 2025

No problem - think out loud all you want. I'm just not sure where to look :-)

@henryiii
Copy link

You need to use the modern FindPython, not the old one. FindPythonLibs / FindPythonInterp was "removed" (sort of) in CMake 3.27, so that's not what 3.30.3 is referring to.

@henryiii
Copy link

Right above here

include(FetchContent)
FetchContent_Declare(pybind11
SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/pybind11
CMAKE_ARGS "-DBUILD_TESTING=OFF -DPYBIND11_NOPYTHON=ON"
)
FetchContent_MakeAvailable(pybind11)
you should set(PYBIND11_FINDPYTHON ON). pybind11 3.0 will change the default.

@lgray
Copy link
Contributor Author
lgray commented Mar 18, 2025

Thanks @henryiii, giving it a try.

@lgray
Copy link
Contributor Author
lgray commented Mar 18, 2025

3.13t manylinux wheels are dying on tests because of missing awkward 3.13t wheel. Otherwise they build fine!

@lgray
Copy link
Contributor Author
lgray commented Mar 18, 2025

@henryiii I guess the best way to test this in freethreaded mode for races and such would to first just try pytest-parallel? Probably mark the dask tests to not be run in parallel?

@henryiii
Copy link

Yes, I believe so.

@lgray
Copy link
Contributor Author
lgray commented Mar 18, 2025

Nice - parallel tests seem to go, except on windows. Looks like some issue with parallel processing on windows to begin with.

@lgray lgray closed this Mar 19, 2025
@lgray lgray reopened this Mar 19, 2025
@lgray lgray closed this Mar 19, 2025
@lgray lgray reopened this Mar 19, 2025
@lgray lgray closed this Apr 3, 2025
@lgray lgray reopened this Apr 3, 2025
@lgray
Copy link
Contributor Author
lgray commented Apr 4, 2025

I don't really understand this failure in macos.

Must be a threadsafety thing, but then why not in ubuntu?

@lgray
Copy link
Contributor Author
lgray commented Apr 4, 2025

Oh wow, it's definitely a thread safety issue (it passed in this latest commit!). Yikes!

@lgray
Copy link
Contributor Author
lgray commented Apr 4, 2025

@nsmith- Any first ideas on what needs a mutex around it?

@lgray
Copy link
Contributor Author
lgray commented Apr 4, 2025

For reference the error that crops up in the MT tests is:

self = <correctionlib.highlevel.CorrectionSet object at 0x2a9ea3d0150>

    def __iter__(self) -> Iterator[str]:
>       return iter(self._base)
E       TypeError: Object of type 'iterator' is not an instance of 'iterator'

/Library/Frameworks/PythonT.framework/Versions/3.13/lib/python3.13t/site-packages/correctionlib/highlevel.py:401: TypeError
________________________________ test_evaluator ________________________________

    def test_evaluator():
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{")
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{}")
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": "blah"}')
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": 2, "description": 3}')
    
        cset = core.CorrectionSet.from_string(
            '{"schema_version": 2, "description": "something", "corrections": []}'
        )
        assert cset.schema_version == 2
        assert cset.description == "something"
    
        cset = wrap(
            schema.Correction(
                name="test corr",
                version=2,
                inputs=[],
                output=schema.Variable(name="a scale", type="real"),
                data=1.234,
            )
        )
>       assert set(cset) == {"test corr"}
E       TypeError: Object of type 'iterator' is not an instance of 'iterator'

tests/test_core.py:47: TypeError

@nsmith-
Copy link
Collaborator
nsmith- commented Apr 15, 2025

My best guess would be something related to this:

correctionlib/src/python.cc

Lines 106 to 108 in 093ce46

.def("__iter__", [](const CorrectionSet &v) {
return py::make_key_iterator(v.begin(), v.end());
}, py::keep_alive<0, 1>())

is not threadsafe.

@lgray
Copy link
Contributor Author
lgray commented Apr 18, 2025

Digging around there seems to be something suspicious w.r.t. the py::keep_alive but I've not found an accurate description. make_key_iterator itself seems to be fine.

Will continue digging.

@lgray lgray closed this Jul 10, 2025
@lgray lgray reopened this Jul 10, 2025
@@ -50,16 +49,21 @@ jobs:
fetch-depth: 0
fetch-tags: true

- uses: actions/setup-python@v3
- uses: Quansight-Labs/setup-python@v5
Copy link
@henryiii henryiii Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normal setup-python works now (but not such an old version!)

Suggested change
- uses: Quansight-Labs/setup-python@v5
- uses: actions/setup-python@v5

@@ -94,7 +97,7 @@ skip = ["pp*-*"]
test-extras = "test"
test-command = "python -m pytest {package}/tests"
# When cpython 3.7 is dropped we can ignore i686 (numpy not built)
test-skip = ["pp*-*", "*-musllinux_*", "cp3{10,11,12}-win32", "cp3{8,9,10,11,12}-manylinux_i686"]
test-skip = ["pp*-*", "*-musllinux_*", "cp3{10,11,12}-win32", "cp3{8,9,10,11,12}-manylinux_i686", "*cp313t*"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
test-skip = ["pp*-*", "*-musllinux_*", "cp3{10,11,12}-win32", "cp3{8,9,10,11,12}-manylinux_i686", "*cp313t*"]
test-skip = ["pp*-*", "*-musllinux_*", "cp3{10,11,12}-win32", "cp3{8,9,10,11,12}-manylinux_i686", "*cp313t*"]

This is skipping the tests on 3.13t, so they aren't currently running. Ahh, you are running the normal tests in free-threading; probably makes sense until that is fixed.

run: python -m pytest -ra

- name: Test package (parallel)
if: matrix.python-version == '3.13t' && matrix.runs-on != 'windows-latest'
run: python -Xgil=0 -m pytest -ra --tests-per-worker auto
Copy link
@henryiii henryiii Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could increase this a bit, I think it will be around 2-4 on GHA, more might fail more consistently.

@henryiii
Copy link

I'd recommend retrying with pybind11 3.0, we did make a few fixes related to free-threading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0