glibc detected: double free or corruption error on numpy.percentile #4836

Closed
habibrosyad opened this issue Jul 3, 2014 · 20 comments · Fixed by #4837
Comments

@habibrosyad

Hi,
I've got this error while trying to run np.percentile on a series of images in a loop:

Python 2.7.6 (default, May 29 2014, 17:32:27)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from skimage.io import ImageCollection
>>> imc = ImageCollection('/share/experiments/ahm/projections/Block_1_*.tiff')
>>> for i in xrange(len(imc)):
...  low, hi = np.percentile(imc[i],(0,98))
...
*** glibc detected *** python2.7: double free or corruption (out): 0x000000000379cf60 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3d08876166]
/lib64/libc.so.6[0x3d08878ca3]
/usr/local/lib/python2.7/site-packages/numpy/core/multiarray.so(+0x1c4fc)[0x7f9ffccb84fc]
/usr/local/lib/python2.7/site-packages/numpy/core/multiarray.so(+0x1cfdb)[0x7f9ffccb8fdb]
/usr/local/lib/libpython2.7.so.1.0(__PGOSF184_frame_dealloc+0x542)[0x7fa005221482]
/usr/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x1aa)[0x7fa005243bda]
/usr/local/lib/libpython2.7.so.1.0(__PGOSF292_function_call+0x4a)[0x7fa0051fe03a]
/usr/local/lib/libpython2.7.so.1.0(+0x115a9e)[0x7fa005184a9e]
/usr/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6b56)[0x7fa00523f046]
/usr/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x180)[0x7fa005243bb0]
/usr/local/lib/libpython2.7.so.1.0(+0x1c789c)[0x7fa00523689c]
/usr/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1a10)[0x7fa005239f00]
/usr/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x180)[0x7fa005243bb0]
/usr/local/lib/libpython2.7.so.1.0(+0x1c789c)[0x7fa00523689c]
/usr/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1a10)[0x7fa005239f00]
/usr/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x147)[0x7fa0051756f7]
/usr/local/lib/libpython2.7.so.1.0(PyRun_InteractiveOneFlags+0x2c4)[0x7fa0051a62b4]
/usr/local/lib/libpython2.7.so.1.0(PyRun_InteractiveLoopFlags+0x143)[0x7fa0051a5f93]
/usr/local/lib/libpython2.7.so.1.0(PyRun_AnyFileExFlags+0x46)[0x7fa0051a57f6]
/usr/local/lib/libpython2.7.so.1.0(Py_Main+0x613)[0x7fa0051975e3]
python2.7(main+0x2f)[0x40088f]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3d0881ed1d]
python2.7[0x400799]
======= Memory map: ========
00400000-00401000 r-xp 00000000 fd:00 2490380                            /usr/local/bin/python2.7
00600000-00601000 rw-p 00000000 fd:00 2490380                            /usr/local/bin/python2.7
020c2000-03bbd000 rw-p 00000000 00:00 0                                  [heap]
31a5a00000-31a5a83000 r-xp 00000000 fd:00 2359299                        /lib64/libm-2.12.so
31a5a83000-31a5c82000 ---p 00083000 fd:00 2359299                        /lib64/libm-2.12.so
31a5c82000-31a5c83000 r--p 00082000 fd:00 2359299                        /lib64/libm-2.12.so
31a5c83000-31a5c84000 rw-p 00083000 fd:00 2359299                        /lib64/libm-2.12.so
322fa00000-322fae8000 r-xp 00000000 fd:00 2522823                        /usr/lib64/libstdc++.so.6.0.13
322fae8000-322fce8000 ---p 000e8000 fd:00 2522823                        /usr/lib64/libstdc++.so.6.0.13
322fce8000-322fcef000 r--p 000e8000 fd:00 2522823                        /usr/lib64/libstdc++.so.6.0.13
322fcef000-322fcf1000 rw-p 000ef000 fd:00 2522823                        /usr/lib64/libstdc++.so.6.0.13
322fcf1000-322fd06000 rw-p 00000000 00:00 0
3230200000-3230225000 r-xp 00000000 fd:00 2522830                        /usr/lib64/libpng12.so.0.49.0
3230225000-3230425000 ---p 00025000 fd:00 2522830                        /usr/lib64/libpng12.so.0.49.0
3230425000-3230426000 rw-p 00025000 fd:00 2522830                        /usr/lib64/libpng12.so.0.49.0
3d08400000-3d08420000 r-xp 00000000 fd:00 2359322                        /lib64/ld-2.12.so
3d0861f000-3d08620000 r--p 0001f000 fd:00 2359322                        /lib64/ld-2.12.so
3d08620000-3d08621000 rw-p 00020000 fd:00 2359322                        /lib64/ld-2.12.so
3d08621000-3d08622000 rw-p 00000000 00:00 0
3d08800000-3d0898b000 r-xp 00000000 fd:00 2359343                        /lib64/libc-2.12.so
3d0898b000-3d08b8a000 ---p 0018b000 fd:00 2359343                        /lib64/libc-2.12.so
3d08b8a000-3d08b8e000 r--p 0018a000 fd:00 2359343                        /lib64/libc-2.12.so
3d08b8e000-3d08b8f000 rw-p 0018e000 fd:00 2359343                        /lib64/libc-2.12.so
3d08b8f000-3d08b94000 rw-p 00000000 00:00 0
3d09000000-3d09017000 r-xp 00000000 fd:00 2359420                        /lib64/libpthread-2.12.so
3d09017000-3d09217000 ---p 00017000 fd:00 2359420                        /lib64/libpthread-2.12.so
3d09217000-3d09218000 r--p 00017000 fd:00 2359420                        /lib64/libpthread-2.12.so
3d09218000-3d09219000 rw-p 00018000 fd:00 2359420                        /lib64/libpthread-2.12.so
3d09219000-3d0921d000 rw-p 00000000 00:00 0
3d09400000-3d09402000 r-xp 00000000 fd:00 2359418                        /lib64/libdl-2.12.so
3d09402000-3d09602000 ---p 00002000 fd:00 2359418                        /lib64/libdl-2.12.so
3d09602000-3d09603000 r--p 00002000 fd:00 2359418                        /lib64/libdl-2.12.so
3d09603000-3d09604000 rw-p 00003000 fd:00 2359418                        /lib64/libdl-2.12.so
3d09800000-3d09815000 r-xp 00000000 fd:00 2359437                        /lib64/libz.so.1.2.3
3d09815000-3d09a14000 ---p 00015000 fd:00 2359437                        /lib64/libz.so.1.2.3
3d09a14000-3d09a15000 r--p 00014000 fd:00 2359437                        /lib64/libz.so.1.2.3
3d09a15000-3d09a16000 rw-p 00015000 fd:00 2359437                        /lib64/libz.so.1.2.3
3d0a000000-3d0a01d000 r-xp 00000000 fd:00 2359516                        /lib64/libselinux.so.1
3d0a01d000-3d0a21c000 ---p 0001d000 fd:00 2359516                        /lib64/libselinux.so.1
3d0a21c000-3d0a21d000 r--p 0001c000 fd:00 2359516                        /lib64/libselinux.so.1
3d0a21d000-3d0a21e000 rw-p 0001d000 fd:00 2359516                        /lib64/libselinux.so.1
3d0a21e000-3d0a21f000 rw-p 00000000 00:00 0
3d0a800000-3d0a816000 r-xp 00000000 fd:00 2359426                        /lib64/libresolv-2.12.so
3d0a816000-3d0aa16000 ---p 00016000 fd:00 2359426                        /lib64/libresolv-2.12.so
3d0aa16000-3d0aa17000 r--p 00016000 fd:00 2359426                        /lib64/libresolv-2.12.so
3d0aa17000-3d0aa18000 rw-p 00017000 fd:00 2359426                        /lib64/libresolv-2.12.so
3d0aa18000-3d0aa1a000 rw-p 00000000 00:00 0
3d0b000000-3d0b016000 r-xp 00000000 fd:00 2359411                        /lib64/libgcc_s-4.4.7-20120601.so.1
3d0b016000-3d0b215000 ---p 00016000 fd:00 2359411                        /lib64/libgcc_s-4.4.7-20120601.so.1
3d0b215000-3d0b216000 rw-p 00015000 fd:00 2359411                        /lib64/libgcc_s-4.4.7-20120601.so.1
3d0bc00000-3d0bc03000 r-xp 00000000 fd:00 2359710                        /lib64/libcom_err.so.2.1
3d0bc03000-3d0be02000 ---p 00003000 fd:00 2359710                        /lib64/libcom_err.so.2.1
3d0be02000-3d0be03000 r--p 00002000 fd:00 2359710                        /lib64/libcom_err.so.2.1
3d0be03000-3d0be04000 rw-p 00003000 fd:00 2359710                        /lib64/libcom_err.so.2.1
Aborted (core dumped)

So I tried running it in different ways:

Python 2.7.6 (default, May 29 2014, 17:32:27)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from skimage.io import ImageCollection
>>> imc = ImageCollection('/share/experiments/ahm/projections/Block_1_*.tiff')
>>> for i in xrange(len(imc)):
...  low = np.percentile(imc[i],0)
...  hi = np.percentile(imc[i],98)
...
*** glibc detected *** python2.7: double free or corruption (out): 0x0000000003525120 ***

But I have no problem running it outside a loop:

Python 2.7.6 (default, May 29 2014, 17:32:27)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from skimage.io import ImageCollection
>>> imc=ImageCollection('/share/experiments/ahm/projections/Block_1_*.tiff')
>>> low, hi = np.percentile(imc[0],(0,98))
>>> low
13783.0
>>> hi
23991.0

It's only when I use np.percentile in a loop that things get messy. I used numpy 1.9.0.dev-c24cc4e in this case; my other machine with numpy 1.8.0 has no problem at all.

Best regards,
Habib

@juliantaylor
Contributor

Hm, that's bad. Can you provide the images so we can reproduce it?

@habibrosyad
Author

The full sample is 361 images. They are X-ray projections taken from angles 0-360 (in TIFF format). But 4-5 images are enough to reproduce the error, so here is the link to the 5 images I used in the trial: https://www.dropbox.com/s/jtotuirzztybujl/samples.zip

@juliantaylor
Contributor

I can't reproduce the issue. I noticed you are using the ICC compiler; do you still see the issue when you build numpy with GCC?

Can you maybe create a valgrind log of the issue? This suppression file might keep the false positives down:
https://gist.github.com/juliantaylor/dd56e32376dbf447525e

@habibrosyad
Author

Do I need to recompile my Python build with GCC so that numpy can be built with it? I found that I can't change my compiler flags, so even though I specify CC=gcc the compilation always fails. The compiler flags I use are specifically tuned for ICC. I haven't used valgrind before; maybe I'll look into that later.

I rebuilt numpy for each version from 1.8.1 up to the most recent one and found that 1.8.1 works fine, while >=1.9 has the issue. One thing I find strange is this:

Python 2.7.6 (default, May 29 2014, 17:32:27)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from skimage.io import ImageCollection
>>> imc = ImageCollection('/share/experiments/ahm/projections/Block_1_*.tiff')
>>> len(imc)
361
>>> low, hi = np.percentile(imc[0],(0,98))
>>> low
0.0
>>> hi
23991.0
>>>

In numpy 1.8.1 np.percentile returns low = 0.0, while my previous individual test above on 1.9 returned low = 13783.0 for the same image. If this is not the case on your end, then I suspect it's a compiler problem :(
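
One way to decide which of the two `low` values is right without relying on another numpy build is to compare `np.percentile` against a reference computed from a full sort. The following is only a sketch in Python 3: `percentile_reference` and `check_percentiles` are hypothetical helper names, and the reference assumes the default linear interpolation between neighbouring sorted values.

```python
import numpy as np

def percentile_reference(values, q):
    """Reference percentile from a full sort with linear interpolation."""
    flat = np.sort(np.asarray(values, dtype=np.float64).ravel())
    # Fractional position of the q-th percentile in the sorted data.
    pos = (q / 100.0) * (flat.size - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, flat.size - 1)
    return flat[lo] + (pos - lo) * (flat[hi] - flat[lo])

def check_percentiles(values, qs=(0, 98)):
    """Return (ok, got, expected) for np.percentile versus the sort-based reference."""
    got = np.percentile(values, qs)
    expected = [percentile_reference(values, q) for q in qs]
    return bool(np.allclose(got, expected)), got, expected
```

For any image, the 0th percentile should also equal `image.min()`, so a `low` that differs from the minimum points at a wrong result rather than at the image data.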

@juliantaylor
Contributor

Yes, that is the case for me too. There is a huge mistake in the pivot handling of the multiple selection algorithm :/
It assumes the pivots are sorted in ascending order, which is not the case, and this can overwrite order statistics that were already found, in this case the minimum element (index 0).
This is a pretty bad bug, possibly worth a 1.8.2 release, but I don't see how it could cause a crash.
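
The wrong-result part can be checked independently of the images. A minimal sketch, assuming the failure mode described above: request several order statistics at once from data with long runs of equal values and compare each requested position with a full sort. `partition_matches_sort` is a hypothetical helper name and the array is synthetic, not taken from the reported images.

```python
import numpy as np

def partition_matches_sort(values, kths):
    """True if every requested kth position holds the same value as in a full sort."""
    flat = np.asarray(values).ravel()
    part = np.partition(flat, kths)   # multiple selection in one call
    full = np.sort(flat)              # reference ordering
    return all(part[k] == full[k] for k in kths)

# Synthetic data with a large equal range (many 3s), so the upper kth
# falls inside a run of identical values.
data = np.array([3, 0, 3, 3, 1, 3, 3, 2, 3, 3] * 50)
kths = (0, int(0.98 * (data.size - 1)))
print(partition_matches_sort(data, kths))
```

On a build without the pivot bug this check is expected to hold for any data and any valid set of `kths`.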

juliantaylor added a commit to juliantaylor/numpy that referenced this issue Jul 4, 2014
when orders are selected where the kth element falls into an equal range
the last stored pivot was not the kth element, this leads to losing
the ordering of smaller orders as following selection steps can start at
index 0 again instead of at the offset of the last selection.
Closes numpygh-4836
@juliantaylor
Contributor

Can you try the patch in the linked PR? It should fix the wrong result, but I don't see how it could fix the crash.

@habibrosyad
Author

Yeah, it still crashes like crazy 😄

But I've checked the fix and still found an inconsistency with 1.8.1. I checked the first and second images (imc[0] and imc[1]), but on the third image the result is:

Python 2.7.6 (default, May 29 2014, 17:32:27)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from skimage.io import ImageCollection
>>> imc = ImageCollection('/share/experiments/ahm/projections/Block_1_*.tiff')
>>> len(imc)
361
>>> low, hi = np.percentile(imc[2],(0,98))
>>> low
28.0
>>> hi
15309.0
>>>

While in 1.8.1 low yields 0.0. Does this still count as a bug? All the images I tried on 1.8.1 always yield 0.0 for low.

@juliantaylor
Contributor

I get low 0 for all images you sent me.

@juliantaylor
Contributor

which version of skimage are you using?

@habibrosyad
Author

I use skimage 0.10.0. Hmm, this is weird... so I tried using scipy.misc.imread:

Python 2.7.6 (default, May 29 2014, 17:32:27)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from scipy.misc import imread
>>> im = imread('/share/experiments/ahm/projections/Block_1_0002.tiff')
>>> im.shape
(1024, 1024)
>>> low, hi = np.percentile(im,(0,98))
>>> low
28.0
>>> hi
15309.0
>>>

More trials:

Python 2.7.6 (default, May 29 2014, 17:32:27)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from scipy.misc import imread
>>> im = imread('/share/experiments/ahm/projections/Block_1_0004.tiff')
>>> im.shape
(1024, 1024)
>>> low, hi = np.percentile(im,(0,98))
>>> low
0.0
>>> hi
15879.0
>>> im = imread('/share/experiments/ahm/projections/Block_1_0005.tiff')
>>> im.shape
(1024, 1024)
>>> low, hi = np.percentile(im,(0,98))
>>> low
0.0
>>> hi
16254.0
>>> im = imread('/share/experiments/ahm/projections/Block_1_0006.tiff')
>>> low, hi = np.percentile(im,(0,98))
>>> low
0.0
>>> hi
16983.0
>>> im = imread('/share/experiments/ahm/projections/Block_1_0100.tiff')
>>> im.shape
(1024, 1024)
>>> low, hi = np.percentile(im,(0,98))
>>> low
2.0
>>> hi
18598.0

The strangest thing is that it always crashes on image Block_1_0003.tiff. This is my current pip list:

[root@h1 src]# pip list
backports.ssl-match-hostname (3.4.0.2)
Cython (0.20.1)
h5py (2.3.1)
ipython (2.1.0)
matplotlib (1.4.x)
mock (1.0.1)
mpi4py (1.3.1)
nose (1.3.3)
numexpr (2.4)
numpy (1.9.0.dev-Unknown)
Pillow (2.5.0)
pip (1.5.6)
pyparsing (2.0.2)
pyserial (2.7)
python-dateutil (2.2)
pyzmq (14.3.0)
scikit-image (0.10.0)
scikit-learn (0.14.1)
scipy (0.15.0.dev-569b1af)
setuptools (3.5.1)
six (1.7.3)
tables (3.1.1)
tornado (3.2.1)
virtualenv (1.11.6)
vLFD (2014.04.07)
wsgiref (0.1.2)

Maybe I will try to recompile Python with GCC to see if that fixes the issue :(
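
Because the abort kills the whole interpreter, it can help to process each image in its own child process so that one crash does not hide what happens with the remaining files. This is only a sketch in Python 3 (`subprocess.run` needs 3.5 or newer); `run_isolated` and `find_crashing_files` are hypothetical helper names, and the snippet template is a placeholder for whatever load-and-percentile code is being tested.

```python
import subprocess
import sys

def run_isolated(code, timeout=60):
    """Run a Python snippet in a child interpreter and return its exit code."""
    proc = subprocess.run([sys.executable, "-c", code], timeout=timeout)
    return proc.returncode

def find_crashing_files(paths, snippet_template):
    """Return (path, returncode) for every file whose snippet exits non-zero.

    snippet_template is formatted with path=..., so it should refer to the
    file as {path!r}.
    """
    failures = []
    for path in paths:
        code = snippet_template.format(path=path)
        returncode = run_isolated(code)
        if returncode != 0:
            failures.append((path, returncode))
    return failures
```

On Linux a child killed by a signal reports a negative return code, so a glibc abort would be expected to show up as -6 (SIGABRT) next to the file that triggered it.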

@juliantaylor
Contributor

I ran it with GCC's undefined behavior sanitizer and it found no problems, so I don't know what is going on.
You will have to try using valgrind or a debugger to see what's happening. I hang out in #scipy on freenode if you need any help with that.

@habibrosyad
Author

Hi Julian, I've created a valgrind log of my issue at https://gist.github.com/habibrosyad/7fbb8547350d3e2dee67, sorry for the late response. I tried Python 2.7.8, still compiled with ICC but with a lower optimization flag (-O2), and the problem is still there. From the log it looks like the problem comes from the compiler, as some lines mention paths to the compiler libs. Am I right?

@juliantaylor
Contributor

No, valgrind never reached the numpy code because it encountered an instruction it doesn't know about. You could try valgrind 3.9 and see if that works.
Also add --partial-loads-ok=yes, as ICC's string functions apparently make use of the page boundary.

fwiw I tried compiling python3.4 and numpy with the latest icc but could not reproduce any issue

@habibrosyad
Author

I've created a valgrind 3.9 log at https://gist.github.com/habibrosyad/dbd911752ae526ffd8d0. Have you tried on version 2.7.6 or 2.7.8?

@juliantaylor
Contributor

I can't get python2.7 to compile with icc, so I used 3.4.
That log also stops with an unknown instruction before it reaches the relevant code. Can you try a debugger?

@habibrosyad
Author

I've found the root of my issue. Apparently the culprit is the flags I use when building numpy, set in intelccompiler.py (in distutils) and intel.py (in fcompiler). I took those flags from this guide, https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl, for optimization. If I use the default flags then no issue arises and np.percentile also yields the correct results.

I couldn't get python2.7 to compile with icc at first either, since it has a problem with the _ctypes module, so I apply this patch https://github.com/atgreen/libffi/blob/master/src/x86/ffi64.c to ffi64.c (in Modules/_ctypes/libffi/) every time I need to recompile a new version of Python 2.7 with icc.

@juliantaylor
Contributor

Does the current numpy master test suite work with your build? You will get some errors due to subnormals, but it would be interesting to see whether the partition tests pass.

@habibrosyad
Author

This is my test result:

[root@h1 src]# /opt/python2.7.8-O3/bin/python -c 'import numpy; numpy.test()'
Running unit tests for numpy
NumPy version 1.10.0.dev-aba6adb
NumPy is installed in /opt/python2.7.8-O3/lib/python2.7/site-packages/numpy
Python version 2.7.8 (default, Jul 10 2014, 13:17:00) [GCC Intel(R) C++ gcc 4.4 mode]
nose version 1.3.3

======================================================================
ERROR: test_callback.TestF77Callback.test_string_callback
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/python2.7.8-O3/lib/python2.7/site-packages/nose/case.py", line 381, in setUp
    try_run(self.inst, ('setup', 'setUp'))
  File "/opt/python2.7.8-O3/lib/python2.7/site-packages/nose/util.py", line 470, in try_run
    return func()
  File "/opt/python2.7.8-O3/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 348, in setUp
    module_name=self.module_name)
  File "/opt/python2.7.8-O3/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 74, in wrapper
    memo[key] = func(*a, **kw)
  File "/opt/python2.7.8-O3/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 163, in build_code
    module_name=module_name)
  File "/opt/python2.7.8-O3/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 74, in wrapper
    memo[key] = func(*a, **kw)
  File "/opt/python2.7.8-O3/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 144, in build_module
    __import__(module_name)
ImportError: /tmp/tmpvhI851/_test_ext_module_5403.so: undefined symbol: fun_

----------------------------------------------------------------------
Ran 5585 tests in 45.237s

FAILED (KNOWNFAIL=5, SKIP=5, errors=1)

I got one error, but I don't know what it means.

In my previous build, which had the issue, I got at least 5 errors for version 1.9 and 0 for version 1.8 with the same icc flags.

@juliantaylor
Contributor

I can't reproduce it with the options on that page either, so I can't do more. You'll have to debug it yourself or use GCC.

@habibrosyad
Author

Though I got one error in the tests, the issue with np.percentile seems to be gone now. Btw, thanks for your support :)

juliantaylor added a commit to juliantaylor/numpy that referenced this issue Aug 4, 2014
when orders are selected where the kth element falls into an equal range
the last stored pivot was not the kth element, this leads to losing
the ordering of smaller orders as following selection steps can start at
index 0 again instead of at the offset of the last selection.
Closes numpygh-4836