[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT Directly cimport interfaces from std::algorithm #27682

Closed
jjerphan opened this issue Oct 29, 2023 · 15 comments · Fixed by #28489
Closed

MAINT Directly cimport interfaces from std::algorithm #27682

jjerphan opened this issue Oct 29, 2023 · 15 comments · Fixed by #28489
Labels
cython good first issue Easy with clear instructions to resolve

Comments

@jjerphan
Copy link
Member
jjerphan commented Oct 29, 2023

Some Cython implementations use interfaces from the standard library of C++, namely std::algorithm::move and std::algorithm::fill from std::algorithm.

Before Cython 3, those interfaces had to be imported directly using the verbose syntax from Cython:

  • # TODO: change for `libcpp.algorithm.move` once Cython 3 is used
    # Introduction in Cython:
    # https://github.com/cython/cython/blob/05059e2a9b89bf6738a7750b905057e5b1e3fe2e/Cython/Includes/libcpp/algorithm.pxd#L47 #noqa
    cdef extern from "<algorithm>" namespace "std" nogil:
    OutputIt move[InputIt, OutputIt](InputIt first, InputIt last, OutputIt d_first) except + #noqa
  • # TODO: change for `libcpp.algorithm.fill` once Cython 3 is used
    # Introduction in Cython:
    #
    # https://github.com/cython/cython/blob/05059e2a9b89bf6738a7750b905057e5b1e3fe2e/Cython/Includes/libcpp/algorithm.pxd#L50 #noqa
    cdef extern from "<algorithm>" namespace "std" nogil:
    void fill[Iter, T](Iter first, Iter last, const T& value) except + #noqa

Cython 3 introduced the following line natively, for those interfaces. Those interfaces should now be cimported directly. That is one can replace the line shown above respectively with:

from libcpp.algorithm cimport move
from libcpp.algorithm cimport fill

I believe this is a good first Cython issue.

Any reader should feel free to pick it up. It might be possible that there is some context missing.

Please let me know if you need help. 🙂

@jjerphan jjerphan added good first issue Easy with clear instructions to resolve cython labels Oct 29, 2023
@jjerphan jjerphan changed the title MAINT Use cimport new Cython 3 interfaces directly MAINT Directly cimport interfaces from std::algorithm Oct 29, 2023
@ArunimSamudra
Copy link

I am working on this.

@lesteve
Copy link
Member
lesteve commented Oct 31, 2023

Our minimum supported Cython version is 0.29.33 currently:

CYTHON_MIN_VERSION = "0.29.33"

Is there a way to use the new way for Cython >= 3 and the old way for Cython < 3?

I assume we want to keep supporting Cython < 3 for some time, in case some unintended Cython 3 bugs are discovered.

Lambxx added a commit to Lambxx/scikit-learn that referenced this issue Oct 31, 2023
@Lambxx
Copy link
Lambxx commented Oct 31, 2023

I have made a PR #27699
I hope that is ok, this is my first PR so I apologise for any errors. Any feedback welcome

@jjerphan
Copy link
Member Author
jjerphan commented Oct 31, 2023

Oh, I missed that. I thought we updated to Cython 3.X after the regressions were fixed.

I think maintaining code for Cython 0.29.X and Cython 3.X is costly. Since those maintainable changes aren't strictly required. I think we should wait for Cython 3.X to be used by default everywhere.

What do you think?

@betatim
Copy link
Member
betatim commented Nov 1, 2023

Is there a reason to support more than one version of cython? I think the cython dependency is different from Numpy or scipy in that the choice of cython version doesn't impact users. Whether you can install a particular version of scikit-learn or not does not depend on the version of cython. (Maybe this is not quite true for those who build scikit-learn from source)

The current minimum version of 0.29.x is the previous "major release" of cython, basically like 3.x now.

This makes me think we should only support one version and that in the past we've also only supported one version.

If there are problems with cython 3 (say the performance regressions come back) then we will have to implement a fix, but that is life.

@jjerphan
Copy link
Member Author

Is there a reason to support more than one version of cython? I think the cython dependency is different from Numpy or scipy in that the choice of cython version doesn't impact users. Whether you can install a particular version of scikit-learn or not does not depend on the version of cython. (Maybe this is not quite true for those who build scikit-learn from source)

I agree with your analysis. I also do not think there's a reason to support multiple major versions of Cython.

This makes me think we should only support one version and that in the past we've also only supported one version.

I also think we only must support one version of Cython, but we are not pressured to. I am not sure whether all the bugs fixes of Cython 3.X.X will be backported to Cython 0.29.X.

For instance, cython/cython#5230 (which was originally met in scikit-learn) just got addressed, but I wonder whether its solution (cython/cython#5233) will be ported to 0.29.X (I am invoking @da-woods to know more about Cython's policy 🙏).

I think the upgrade to Cython 3 must be considered with the release of NumPy 2. To me, this necessitates a comprehensive understanding of the changes for the compilation.

@jjerphan
Copy link
Member Author

Given the remark of the current development of Cython given in cython/cython#5233 (comment), I am in favor of now using Cython 3 and I think we should update scikit-learns' build system accordingly.

What do you think?

@lesteve
Copy link
Member
lesteve commented Dec 21, 2023

I would say, open a PR bumping the minimum cython version and see what happens?

Quickly looking at numpy and scipy, they require Cython 3 on main but not on their latest release (see https://github.com/numpy/numpy/blob/aa78349ca935d58d2903af3db278858613ee8910/pyproject.toml#L5 and https://github.com/scipy/scipy/blob/60de9b912acef5ba794de177416d6fb7d29486e6/pyproject.toml#L14), so I think it's fine for scikit-learn to do the same.

Side-comment: we are using Cython 3 already but we also make sure that our code compiles with Cython 0.29.33.

@jjerphan
Copy link
Member Author
jjerphan commented Jan 5, 2024

@Lambxx: are you interested in upgrading scikit-learn to use Cython 3 at minimum?

@lesteve
Copy link
Member
lesteve commented Jan 25, 2024

I opened #28259 to gather feed-back on updating Cython minimum supported version to 3.0.8

@jatindyerawadekar
Copy link

@jjerphan I'm a first-timer and would also like to contribute to this issue. Is the issue regarding upgrading sklearn to use Cython 3 still open?

@jjerphan
Copy link
Member Author

Hi @jatindyerawadekar,

Good to know that you are interested. Unfortunately this is already being addressed by #28284. I would recommend looking at the "good first issue" label in the issue tracker.

@adrinjalali
Copy link
Member

FYI We're now using cython=3.0.8 on the main branch, this work can now start.

@jjerphan
Copy link
Member Author
jjerphan commented Feb 1, 2024

This work was started with #27699, please do coordinate with @Lambxx.

@adam2392
Copy link
Member

@Lambxx I'm now continuing work on #26032 and saw this was still not merged. Are you interested in continuing the work?, or I can submit a quick PR to patch the code very quickly?

Otherwise, I'm also happy to review it if you submit a revised PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cython good first issue Easy with clear instructions to resolve
Projects
None yet
8 participants