8000 BUG: np.random.multinomial(<float>, ...) raises a TypeError in numpy 1.26 · Issue #25061 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: np.random.multinomial(<float>, ...) raises a TypeError in numpy 1.26 #25061

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1fish2 opened this issue Nov 3, 2023 · 11 comments · Fixed by #25090
Closed

BUG: np.random.multinomial(<float>, ...) raises a TypeError in numpy 1.26 #25061

1fish2 opened this issue Nov 3, 2023 · 11 comments · Fixed by #25090

Comments

@1fish2
Copy link
Contributor
1fish2 commented Nov 3, 2023

Describe the issue:

Up through numpy 1.25.2, random.multinomial() accepted a float as the first argument n, whereas numpy 1.26 does not. (In our code, that value comes from a computation involving scipy.constants.Avogadro.)

In numpy 1.25.2 (tweaking an example from the docs):

>>> import numpy as np
>>> np.random.multinomial(20.0, [1/6.]*6, size=1)
array([[4, 5, 2, 6, 3, 0]])

In numpy 1.26.0 and 1.26.1:

>>> import numpy as np
>>> np.random.multinomial(20.0, [1/6.]*6, size=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "numpy/random/mtrand.pyx", line 4256, in numpy.random.mtrand.RandomState.multinomial
TypeError: 'float' object cannot be interpreted as an integer

Q. Is this an intentional API change?

I don't see it in the release notes.

So I guess what changed was shifting API definitions from Cython's UFuncs.pyx to NumPy's __init__.cython-30.pxd

Environment: macOS 13.6, Intel CPU, Python 3.11.6.
Also: Ubuntu Linux, Intel CPU, Python 3.11.6.

Reproduce the code example:

import numpy as np
np.random.multinomial(20.0, [1/6.]*6, size=1)

Error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "numpy/random/mtrand.pyx", line 4256, in numpy.random.mtrand.RandomState.multinomial
TypeError: 'float' object cannot be interpreted as an integer

Runtime information:

import sys, numpy; print(numpy.__version__); print(sys.version)

1.26.1
3.11.6 (main, Oct 31 2023, 22:06:12) [Clang 15.0.0 (clang-1500.0.40.1)]

print(numpy.show_runtime())

[{'numpy_version': '1.26.1',
  'python': '3.11.6 (main, Oct 31 2023, 22:06:12) [Clang 15.0.0 '
            '(clang-1500.0.40.1)]',
  'uname': uname_result(system='Darwin', node='jerrys-mbp.lan', release='22.6.0', version='Darwin Kernel Version 22.6.0: Fri Sep 15 13:39:52 PDT 2023; root:xnu-8796.141.3.700.8~1/RELEASE_X86_64', machine='x86_64')},
 {'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2'],
                      'not_found': ['AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Haswell',
  'filepath': '/usr/local/var/pyenv/versions/3.11.6/envs/test1/lib/python3.11/site-packages/numpy/.dylibs/libopenblas64_.0.dylib',
  'internal_api': 'openblas',
  'num_threads': 1,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.23.dev'}]
None

Context for the issue:

I'll adjust our calling code to convert these values to int, so fixing this is not a priority.

I'd like to:

  1. let you know in case it is a bug to fix
  2. hear if this API change applies to other functions so we can fix them all while updating to numpy 1.26.1
@seberg
Copy link
Member
seberg commented Nov 3, 2023

Weird, on main I added an int() call due to a similar thing. I honestly think the error is an improvement, but since we try to change random as much little as possible, I would merge a fix that converts explicitly via int().

I would suspect it to be a cython 3 change about how it converts to integers which are not int, but I doubt the definitions matter too much.

The UFunc.pyx definition surprises me (did cython 3 change its default integer to be Py_ssize_t or intptr_t!?), but I don't think it matters for the issue.

@charris charris added this to the 1.26.2 release milestone Nov 3, 2023
@1fish2
Copy link
Contributor Author
1fish2 commented Nov 3, 2023

Reproducible example code:

# cython: language_level=3str
# distutils: define_macros=NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION

"""
Teasing out factors for numpy #25061, where passing a float to Cython code
[like np.random.RandomState.multinomial()] might raise
    TypeError: 'float' object cannot be interpreted as an integer
depending whether this code is compiled with Cython==0.29 + numpy==1.25.2
vs. Cython==3.0 + numpy==1.26.1, which declare npy_intp differently.
"""

import numpy as np
cimport numpy as np


def func_int(int length):
    """Accepts an int or a float."""
    return length

def func_Py_ssize_t(Py_ssize_t length):
    """Accepts an int; never a float."""
    return length

def func_npy_intp(np.npy_intp length):
    """Accepts an int; float depends on Cython and numpy versions since they
    define npy_intp differently."""
    return length


# Test these in Python:
"""
import testcase as tc
tc.func_int(1.0)         # should be OK
tc.func_Py_ssize_t(1.0)  # should raise TypeError
tc.func_npy_intp(1.0)    # depends on Cython 0.29 vs. 3.0
"""

@seberg
Copy link
Member
seberg commented Nov 4, 2023

Thanks, so only npy_intp changed!? I find that confusing, because the only type that I would expect to be special is at most anything that maps to C-long (which int does).
But npy_intp only maps to long on some platforms, so I don't see how Cython 0.29 knew about that.

Admittedly the definitions for intp seem outright wrong, not that it matters on most platforms... Unless somehow Py_intptr_t changed from being long sometimes (not windows) to never being long!?

@1fish2
Copy link
Contributor Author
1fish2 commented Nov 5, 2023

I think the story is:

  • Yes, Cython 3.0 did not change the semantics of parameter types int (allows a float) or Py_ssize_t (rejects a float).

  • Cython 3.0 made this smart change:

    The NumPy declarations (cimport numpy) were moved over to the NumPy project in order to allow version specific changes on their side.

    and that switched the definition of np.npy_intp.

  • NumPy 1.26.0 is the first release that's built with Cython 3.0.

  • np.npy_intp is defined by Cython 3.0 as int and defined in Cython 0.29 as Py_intptr_t which is defined as int (allows a float).

  • np.npy_intp is defined by NumPy for use with Cython 3.0 as Py_intptr_t which is defined as Py_ssize_t (rejects a float).

==> So if __init__.cython-30.pxd reverted the definition of Py_intptr_t back to int, it'd restore this class of (unintended?) API changes.

Whether that causes other problems, I can't say. A narrower fix would change multinomial to declare n as an int.

@seberg
Copy link
Member
seberg commented Nov 5, 2023

That doesn't add up. NumPy has been the primary source for the definitions for a long time, unless numpy.random is special as its build when numpy isn't yet installed.
Defining npy_intp to int, which is C-long is just wrong on windows and you should ignore the UFunc.pyx. It is wrong, but not used in any case.

@1fish2
Copy link
Contributor Author
1fish2 commented Nov 6, 2023

Do ask the Cython folks for expertise on this and let me know where I was wrong.

Here's a bit of info from a cython-users thread:

"npy_intp" is declared as "Py_ssize_t", which uses the Python "index"
protocol for the conversion, not "int".

ctypedef Py_intptr_t npy_intp

I'm unsure about when int in Cython is a Python int vs. a C int.

@seberg
Copy link
Member
seberg commented Nov 6, 2023

Maybe the difference is the language level default?

@1fish2
Copy link
Contributor Author
1fish2 commented Nov 6, 2023

I like a testable hypothesis! I wrote a unit test to test the limits.

Results:

  • func_int(int length) accepts a C int (32 bit on Intel Mac) or a float.
  • func_Py_ssize_t(Py_ssize_t length) accepts a C long (64 bit on Intel Mac); never a float.
  • func_npy_intp(np.npy_intp length) accepts a C long. It accepts a float when using Cython 0.29.36 and numpy 1.25.2. It rejects a float when using Cython 3.0.5 and numpy 1.26.1.
  • Cython language_level=3 and language_level=3str work the same in these tests.

@1fish2
Copy link
Contributor Author
1fish2 commented Nov 7, 2023

Another hypothesis test: @thalassemia verified that editing __init__.cython-30.pxd (in a clone of Numpy 2.0 main) to define npy_intp as int instead of Py_ssize_t allows np.random.multinomial(20.0, [1/6.]*6, size=1) with no error, and changing the line back to Py_ssize_t and reinstalling causes the function call to raise a TypeError again.

@thalassemia
Copy link
Contributor

The n parameter was changed to npy_intp to fix an ancient indexing bug (#227). Since n is never used as an index, this can be safely reverted to long to restore the graceful truncation of floating point values. I've created a PR doing exactly this.

@seberg
Copy link
Member
seberg commented Nov 9, 2023

The more thorough fix for this is: gh-25094 which fixes the underlying reason for why it changed, rather than a targeted specific one for multinomial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
0