BUG: Inconsistent and potentially misleading conversion of Polynomials to strings when the domain and window differ · Issue #27903 · numpy/numpy · GitHub
Describe the issue:
The string value of a Polynomial is inconsistent and potentially misleading, as it does not include any information about the domain/window of the polynomial. If the domain and window of the polynomial are not equal, str(poly) returns a string where x refers to the scaled coordinate, rather than the unscaled coordinate used when e.g. calling the polynomial. This means that the returned string can be mathematically different to the 'actual' values of the polynomial, which would be very misleading to any users who are not aware of the domain/window conversions in the Polynomial, and could potentially lead to errors in data analysis.
Furthermore, the value of str(poly) is inconsistent with the value of IPython's display(poly), adding to the potential confusion. With str(poly), x refers to the scaled variable, whereas for display(poly), x refers to the original unscaled variable.
For example, with y = 1 + 2*x + 3*x**2 (see code example), the value of str(poly) is 7601.0 + 15100.0·x + 7500.0·x², which is surprisingly different to the expected 1.0 + 2.0·x + 3.0·x². The IPython display(poly) version of the polynomial, x↦7601.0+15100.0(-1.0+0.02x)+7500.0(-1.0+0.02x)^2, is more complex, but mathematically gives the correct result.
To avoid confusion, I feel like it may make more sense for str(poly) to return a value formatted similarly to the existing display(poly) behaviour:
Current display(poly): x↦7601.0+15100.0(-1.0+0.02x)+7500.0(-1.0+0.02x)^2
Current str(poly): 7601.0 + 15100.0·x + 7500.0·x²
Proposed str(poly): 7601.0 + 15100.0·(-1.0+0.02x) + 7500.0·(-1.0+0.02x)²
This would avoid any misleading ambiguity about whether x refers to the scaled or unscaled variable, and ensure the IPython and string versions of the polynomial are consistent. It would also help to signpost and emphasise the effect of a differing domain/window for any users who are using e.g. Polynomial.fit() for the first time.
Reproduce the code example:
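The original code example is not preserved in this text. The following is a minimal sketch of a reproduction, not the reporter's code. It assumes the data was sampled on x in [0, 100], which is inferred from the -1.0+0.02x mapping quoted above; the number of sample points is an arbitrary choice.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Assumption: x spans [0, 100], so fit() picks domain=[0, 100] and the
# default window=[-1, 1], giving the mapping u = -1.0 + 0.02*x.
x = np.linspace(0, 100, 101)
y = 1 + 2*x + 3*x**2

poly = Polynomial.fit(x, y, deg=2)

print(poly.domain, poly.window)  # domain and window differ here
print(poly.mapparms())           # (offset, scale) of the x -> u mapping
print(str(poly))                 # coefficients in the scaled variable u
print(str(poly.convert()))       # coefficients in the unscaled variable x
```

Under these assumptions, the coefficients shown by str(poly) should be close to 7601, 15100 and 7500, while str(poly.convert()) should be close to the expected 1, 2 and 3, up to floating-point error from the fit.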
Error message:
No response
Python and NumPy Versions:
1.25.2
3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]
Runtime Environment:
[{'numpy_version': '1.25.2',
'python': '3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]',
'uname': uname_result(system='Linux', node='alice-login02', release='5.14.0-427.42.1.el9_4.x86_64', version='#1 SMP PREEMPT_DYNAMIC Thu Oct 31 14:01:51 UTC 2024', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Zen',
'filepath': '/alice-home/3/o/ortk2/miniconda3/envs/py311/lib/python3.11/site-packages/numpy.libs/libopenblas64_p-r0-5007b62f.3.23.dev.so',
'internal_api': 'openblas',
'num_threads': 8,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.23.dev'},
{'architecture': 'Zen',
'filepath': '/alice-home/3/o/ortk2/miniconda3/envs/py311/lib/python3.11/site-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so',
'internal_api': 'openblas',
'num_threads': 8,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.18'}]
Context for the issue:
The inconsistent and confusing nature of the conversion of a Polynomial to a string makes it more challenging to use Polynomials for quick exploration of datasets, and increases the chance of bugs/mistakes in data analysis caused by users incorrectly interpreting the value of str(poly).
I originally encountered this issue when fitting a dataset with Polynomial.fit(), then plotting the fitted polynomial with matplotlib: ax.plot(x, poly(x), label=str(poly)). This produced a legend entry that was inconsistent with the x coordinates in the graph, due to the domain/window scaling - I now use str(poly.convert()) to produce the 'correct' legend entry.
This issue, however, could easily go unnoticed if the domain and window are similar, but different (e.g. domain=[-1, 1.1], window=[-1, 1]), as it may not be immediately obvious that the x coordinate in str(poly) is different to the x coordinate in the graph.
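As a rough illustration of the near-equal case, the sketch below builds a polynomial with the domain and window mentioned above and compares calling it against reading its coefficients as if x were unscaled. The coefficients [1, 2, 3], the test point x0 and the variable names are illustrative assumptions, not taken from the original report.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.polynomial import polyval

# Illustrative coefficients; domain/window taken from the example above.
near = Polynomial([1, 2, 3], domain=[-1, 1.1], window=[-1, 1])

x0 = 1.1
actual = near(x0)               # x0 is mapped into the window first (1.1 -> 1.0)
naive = polyval(x0, near.coef)  # treats the coefficients as if x were unscaled

# Label whose x matches the plotted x coordinates.
label = str(near.convert())

print(actual, naive)
print(label)
```

The two values differ by a modest amount, which is the kind of discrepancy that is easy to overlook in a plot legend. Using str(near.convert()) for the label keeps the legend consistent with the plotted x axis.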