ENH: core: add hack enabling unpickling Py2 pickled scalars on Py3 under encoding='latin1' #4883
-1, I don't think adding more latin1 hacks is the right way, e.g. with this patch this won't work (using #aa☎#):
I think the better way is to use the pickle protocol: variant 0 always seems to use utf32 (or ucs2 for narrow builds), as it's the Python 2 protocol, so we can use that to decode correctly (just use |
If the user does not pass Moreover, I don't think the protocol affects string unpickling very much, as it seems to be mainly about adding more opcodes. |
As far as I was able to verify, the purpose of the function |
hm, right, I was testing the function itself, so I guess this is ok. What is actually writing the strings as latin1? Is it numpy or python? |
also please run
as I think this should be merged to 1.9 |
Yes, you're right that there should also be a non-latin1 unicode test case. Adding that... Nothing actually writes latin1 strings --- on Python 2, numpy hands over the raw binary data as a Python string object to pickle. Python 3 however loads the corresponding The cleanest case for us would be On Python 3, Numpy hands over the raw data as Python bytes object, so the problem does not arise with Py3 generated pickles. |
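To illustrate the point above, here is a minimal sketch using only the standard library (no numpy). It assumes a hand-written protocol-2 payload that mirrors what Python 2 would emit for a two-byte `str`; the helper name `load_py2_bytes` is hypothetical and is not part of this patch. It shows why `encoding='latin1'` is lossless for raw binary data: each byte maps to exactly one code point, so encoding the result back to latin1 recovers the original bytes.

```python
import pickle

# Hand-written protocol-2 pickle, assumed equivalent to what Python 2 emits
# for the byte string '\xff\x00' (SHORT_BINSTRING opcode 'U', length 2,
# followed by BINPUT and STOP).
PY2_PICKLE = b"\x80\x02U\x02\xff\x00q\x00."


def load_py2_bytes(payload):
    """Hypothetical helper: recover raw bytes from a Python 2 str pickle."""
    # Under encoding='latin1', Python 3 returns a str with one character
    # per original byte instead of failing on non-ASCII data.
    text = pickle.loads(payload, encoding="latin1")
    # latin1 maps code points 0-255 one-to-one onto bytes, so this
    # round trip is lossless.
    return text.encode("latin1")


raw = load_py2_bytes(PY2_PICKLE)
print(raw)
```

With the default `encoding='ASCII'`, the same payload raises `UnicodeDecodeError`, which is why the caller has to opt in to latin1. The round trip only works in this direction: a genuine unicode string containing characters outside latin1 cannot be encoded back this way, which is the case the extra test is meant to cover.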
Better tests + rebased |
needs a rebase due to the bugfix merge which touches the same test |
…der encoding='latin1' There is a similar hack in place for arrays, but scalar unpickling was not covered. Provides a workaround for gh-4879
rebased |
thanks, also pushed to 1.9 |
There is a similar dirty hack in place for arrays, but scalar unpickling was not covered.
Should provide a workaround for gh-4879