CLR not working on 32 bit Linux · Issue #1210 · pythonnet/pythonnet · GitHub

Open
daniol opened this issue Aug 25, 2020 · 21 comments
@daniol
daniol commented Aug 25, 2020

Environment

  • Pythonnet version: 2.5.1 and also 3.0.0-dev
  • Python version: 3.5.3, 2.7.13 and also 3.8.5 (from Debian repositories and also if compiled locally with --enable-shared)
  • MONO Versions: 6.10, 5.16 and also 5.20
  • Operating System: Linux (Debian 9 'stretch' Live, i686)

Details

Pythonnet does not seem to work on a 32-bit Linux environment (i686) at all. I tested it on a "Debian live" (USB) system on two different computers, using Python from the Debian repositories and also building Python from source. Here are the results:

  • Using Python from the Debian repositories:
    Python crashes as soon as import clr is executed.
    See attachment: mono_crash.4383a3eb2.0.zip, trace.txt

  • Building Python from source code (with --enable-shared):

It doesn't crash, but clr throws an error when trying to load an assembly ("AttributeError: module 'clr' has no attribute 'AddReference'"). This has been reported in several places around the internet, but I don't have any other clr module installed that could cause a name collision. See output:

user@debian:~$ /opt/python/python3.5.3/bin/python3
Python 3.5.3 (default, Sep 27 2018, 17:25:39) 
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import clr
>>> dir(clr)
['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'facade']
>>> str(clr)
"<module 'clr' from '/opt/python/python3.5.3/lib/python3.5/site-packages/clr.cpython-35m-i386-linux-gnu.so'>"
>>> 
user@debian:~$ /opt/python/python3.5.3/bin/pip3 list
clang (11.0)
pip (9.0.1)
pycparser (2.20)
pythonnet (2.5.1)
setuptools (28.8.0)
wheel (0.35.1)

Notes

  • On an x64 system it works, using the same versions of Debian, Pythonnet and Python.
  • If this really is a limitation of 32-bit systems that cannot be fixed, it should be documented in the README.
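
Before digging further into a report like this, it can help to confirm that the interpreter really is a 32-bit build and to record what the imported clr module actually exposes. The following is a minimal sketch using only the standard library; the helper names interpreter_bits and describe_module are made up for illustration and are not part of pythonnet.

```python
import platform
import struct
import types


def interpreter_bits() -> int:
    """Return the pointer width of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def describe_module(module: types.ModuleType) -> dict:
    """Collect the facts relevant to the 'no attribute AddReference' symptom."""
    return {
        # Path the module was loaded from (None if it has no __file__).
        "file": getattr(module, "__file__", None),
        # pythonnet's clr module is expected to expose AddReference.
        "has_AddReference": hasattr(module, "AddReference"),
        # Non-underscore names, comparable to the dir(clr) output above.
        "public_names": sorted(n for n in dir(module) if not n.startswith("_")),
    }


print(platform.machine(), interpreter_bits())
# After a successful `import clr`, describe_module(clr) shows which file
# was loaded and whether AddReference is present.
```

On the system described above, the first line of output would be expected to report a 32-bit pointer width.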
@lostmsu
Member
lostmsu commented Aug 25, 2020

@daniol it should be working on 32 bit Linux, but we don't have CI set for it.

@daniol
Author
daniol commented Aug 25, 2020

@lostmsu maybe it should, but it doesn't 😄 you can check it with a live version of debian 9 under a 32-bit machine. I don't know if this can be also reproduced on a VM.

@jborbely
jborbely commented Nov 3, 2020

I can also confirm this issue with pythonnet 2.5.0, 2.5.1 and the master branch (84e2735). The issue does not exist in pythonnet 2.4.0.

I am not using a VM nor a live USB stick. I have 32-bit Ubuntu installed on an HDD. The crash report below is from Python 3.5.2 as shipped with the OS. I also created Python 3.7.9 and 3.8.6 virtual environments, and importing clr fails in these environments as well.

user@ubuntu:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.7 LTS
Release:	16.04
Codename:	xenial

user@ubuntu:~$ mono --version
Mono JIT compiler version 5.20.1.34 (tarball Tue Jul 16 22:57:14 UTC 2019)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           __thread
	SIGSEGV:       altstack
	Notifications: epoll
	Architecture:  x86
	Disabled:      none
	Misc:          softdebug 
	Interpreter:   yes
	LLVM:          yes(600)
	Suspend:       hybrid
	GC:            sgen (concurrent by default)

user@ubuntu:~$ python3
Python 3.5.2 (default, Oct  7 2020, 17:19:02) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import clr
=================================================================
	Native Crash Reporting
=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================
...

The complete crash report is available here

If I don't start the Python REPL to import clr but instead pass the import on the command line, I get the following:

user@ubuntu:~$ python3 -c "import clr"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
SystemError: initialization of clr raised unreported exception
Error in sys.excepthook:
SystemError: ../Objects/listobject.c:199: bad argument to internal function

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 57, in apport_excepthook
    from cStringIO import StringIO
  File "<frozen importlib._bootstrap>", line 968, in _find_and_load
SystemError: PyEval_EvalFrameEx returned a result with an error set

Original exception was:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
SystemError: initialization of clr raised unreported exception

@filmor
Member
filmor commented Nov 4, 2020

@jborbely Would you mind bisecting the issue? I don't have a 32bit installation lying around right now.

@jborbely
jborbely commented Nov 4, 2020

@filmor A summary of bisecting is available here

To decide whether a commit was good or bad, I ran the following bash script (which I named issue1210.sh) after each bisect iteration. If pytest passed, the commit was considered good.

#!/bin/bash
export LD_PRELOAD=/lib/i386-linux-gnu/libSegFault.so
export SEGFAULT_SIGNALS=all
export PYTHONUNBUFFERED=True
export BUILD_OPTS=""
export NUNIT_PATH="./packages/NUnit.ConsoleRunner.3.7.0/tools/nunit3-console.exe"
export RUN_TESTS="mono $NUNIT_PATH"
export EMBED_TESTS_PATH=""
export PERF_TESTS_PATH=""
source ~/py37env/bin/activate
python --version
pip --version
PY_LIBDIR=$(python -c 'import sysconfig; print(sysconfig.get_config_var("LIBDIR"))')
export LD_LIBRARY_PATH=$PY_LIBDIR:$LD_LIBRARY_PATH
pip install --upgrade setuptools
pip install --upgrade -r requirements.txt
pip uninstall pythonnet -y
coverage run setup.py install $BUILD_OPTS
python -m pytest

Please let me know if you'd like further information.
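
A good/bad criterion like the one above could also be automated with git bisect run, which treats exit code 0 as good, 125 as "skip this commit", and any other code from 1 to 127 as bad; codes above 127 abort the bisection. Since a segfaulting test run can produce such a code, a small wrapper that normalizes the result may be useful. The sketch below is hypothetical: the split into a separate build step and test step, and the choice to skip commits that fail to build, are assumptions and not part of the script above.

```python
import subprocess
from typing import Sequence

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_status(build_rc: int, test_rc: int) -> int:
    """Map build and test return codes to a `git bisect run` exit code."""
    if build_rc != 0:
        # Assumption: a commit that cannot be built cannot be judged.
        return SKIP
    # Any non-zero test result, including a signal such as SIGSEGV
    # (reported by subprocess as a negative return code), counts as bad.
    return GOOD if test_rc == 0 else BAD


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its return code without raising."""
    return subprocess.run(cmd).returncode


# Hypothetical usage inside a script passed to `git bisect run`:
#   build_rc = run(["python", "setup.py", "install"])
#   test_rc = run(["python", "-m", "pytest"])
#   raise SystemExit(bisect_status(build_rc, test_rc))
```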

@filmor
Member
filmor commented Nov 4, 2020

This is great, thanks. It points to PR #971, which uses the library loading to allow overriding one of Python's flags. I'll have a closer look; the actual writing looks fine, but maybe the lib loading assumes 64-bitness somewhere.

@jborbely
jborbely commented Nov 5, 2020

I've gone a little further to see if there are additional issues if the changes from PR #971 were temporarily removed.

Here's what I did:

  1. git checkout 4271e57
  2. deleted the lines that correspond to the SetNoSiteFlag() function in src/runtime/runtime.cs and src/runtime/pythonengine.cs
  3. ran my issue1210.sh script and pytest passed
  4. git checkout 770fc01
  5. deleted the lines that correspond to the SetNoSiteFlag() function in src/runtime/runtime.cs and src/runtime/pythonengine.cs
  6. ran my issue1210.sh script and got this error

770fc01 immediately follows 4271e57 so 770fc01 seems to also introduce errors on 32-bit linux.

770fc01 passed the tests on travis (ignoring that one pip issue) so it is okay on 64-bit linux.

@filmor filmor self-assigned this Nov 16, 2020
@daniol
Author
daniol commented Nov 19, 2020

I can also confirm that the issue does not exist in pythonnet 2.4.0.
However, in order to work on pythonnet 2.4.0, the version of mono MUST be 5.20.

The current mono version (6.12) throws the following error when importing the clr module in pythonnet 2.4.0:
ImportError: /usr/lib/libmonosgen-2.0.so.1: undefined symbol: __cxa_begin_catch

@filmor filmor closed this as completed Aug 8, 2021
@daniol
Author
daniol commented Aug 9, 2021

@filmor why was this closed?

@lostmsu
Member
lostmsu commented Aug 9, 2021

@daniol @jborbely can you please test whether the current master still has the issue? We reworked the import mechanism, which is where @jborbely's error happened.

@filmor
Member
filmor commented Aug 9, 2021

@daniol Sorry for the silent closure, I just skimmed the last message (was cleaning up old issues) and saw the __cxa_begin_catch which has been solved quite a while ago. I can confirm that the issue exists in the current master as well. One additional problem is that pythonnet uses the dotnet CLI for building now, which is not available for 32bit Linux. Installing from a wheel built on a 64bit machine works, though.

@filmor filmor reopened this Aug 9, 2021
@lostmsu
Member
lostmsu commented Aug 9, 2021

@filmor we could consider separating the .NET build and the Python build. Then the Python build would only need to download Python.Runtime.dll or extract it from a downloaded NuGet package.

@filmor
Member
filmor commented Aug 9, 2021

I'm not sure that's worth the hassle (and I like the current clean setup.py ;)). Building on CI and distributing wheels should be enough, and on all platforms that are officially supported to run .NET >5, dotnet build will work.

@filmor filmor added this to the 3.0.0 milestone Sep 2, 2021
@lostmsu
Member
lostmsu commented Jan 5, 2022

Removing this from the 3.0.0 milestone. @filmor, if you are actively working on it, re-add it. Because this is not a breaking change, there's no harm in releasing it with 3.0.1 or 3.1.0.

@lostmsu lostmsu removed this from the 3.0.0 milestone Jan 5, 2022
@filmor
Member
filmor commented Jan 6, 2022

@jborbely @daniol Now that the respective section was adjusted in #1659, could you please check if that already fixes the problem?

@lostmsu
Member
lostmsu commented Jan 6, 2022

Also, to officially support 32 bit we need it in CI.

@jborbely

Since the build now depends on the dotnet CLI (which is not available on 32-bit Linux) I built a wheel on 64-bit Linux (using commit 75f4398) and copied the wheel to 32-bit Linux. Unfortunately, the issue importing clr remains, although with a different error compared to the error that I got back in November 2020.

$ mono --version
Mono JIT compiler version 5.20.1.34 (tarball Tue Jul 16 22:57:14 UTC 2019)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  x86
        Disabled:      none
        Misc:          softdebug
        Interpreter:   yes
        LLVM:          yes(600)
        Suspend:       hybrid
        GC:            sgen (concurrent by default)

$ pip list
Package    Version
---------- ----------
cffi       1.15.0
clr-loader 0.1.7
pip        21.3.1
pycparser  2.21
pythonnet  3.0.0.dev1
setuptools 60.5.0
wheel      0.37.1

$ python
Python 3.7.10 (default, Feb 20 2021, 21:21:24)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import clr
Failed to initialize pythonnet: System.NullReferenceException: Object reference not set to an instance of an object.
  at Python.Runtime.NewReferenceExtensions.Borrow (Python.Runtime.NewReference& reference) [0x0000f] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Runtime.InitPyMembers () [0x0000b] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Runtime.Initialize (System.Boolean initSigs) [0x000a5] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.PythonEngine.Initialize (System.Collections.Generic.IEnumerable`1[T] args, System.Boolean setSysArgv, System.Boolean initSigs) [0x00012] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.PythonEngine.Initialize (System.Boolean setSysArgv, System.Boolean initSigs) [0x00005] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.PythonEngine.InitExt () [0x00032] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Loader.Initialize (System.IntPtr data, System.Int32 size) [0x0002f] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.NewReferenceExtensions.Borrow (Python.Runtime.NewReference& reference) [0x0000f] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Runtime.InitPyMembers () [0x0000b] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Runtime.Initialize (System.Boolean initSigs) [0x000a5] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.PythonEngine.Initialize (System.Collections.Generic.IEnumerable`1[T] args, System.Boolean setSysArgv, System.Boolean initSigs) [0x00012] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.PythonEngine.Initialize (System.Boolean setSysArgv, System.Boolean initSigs) [0x00005] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.PythonEngine.InitExt () [0x00032] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Loader.Initialize (System.IntPtr data, System.Int32 size) [0x0002f] in <64958489404d4e7ba32bf649886da953>:0 ValueError: Empty module name

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/venv/lib/python3.7/site-packages/clr.py", line 6, in <module>
    load()
  File "/home/user/venv/lib/python3.7/site-packages/pythonnet/__init__.py", line 43, in load
    if func(''.encode("utf8")) != 0:
  File "/home/user/venv/lib/python3.7/site-packages/clr_loader/wrappers.py", line 20, in __call__
    return self._callable(ffi.cast("void*", buf_arr), len(buf_arr))
  File "/home/user/venv/lib/python3.7/site-packages/clr_loader/mono.py", line 91, in __call__
    res = _MONO.mono_runtime_invoke(self._ptr, ffi.NULL, params, exception)
SystemError: <cdata 'MonoObject *(*)(MonoMethod *, void *, void * *, MonoObject * *)' 0xb6ec3610> returned a result with an error set
>>> import clr
Failed to initialize pythonnet: System.InvalidOperationException: This property must be set before runtime is initialized
  at Python.Runtime.Runtime.set_PythonDLL (System.String value) [0x00007] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Loader.Initialize (System.IntPtr data, System.Int32 size) [0x00023] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Runtime.set_PythonDLL (System.String value) [0x00007] in <64958489404d4e7ba32bf649886da953>:0
  at Python.Runtime.Loader.Initialize (System.IntPtr data, System.Int32 size) [0x00023] in <64958489404d4e7ba32bf649886da953>:0 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/venv/lib/python3.7/site-packages/clr.py", line 6, in <module>
    load()
  File "/home/user/venv/lib/python3.7/site-packages/pythonnet/__init__.py", line 44, in load
    raise RuntimeError("Failed to initialize Python.Runtime.dll")
RuntimeError: Failed to initialize Python.Runtime.dll

Please let me know if there is something else that you would like me to try or if there is additional information about my system that you would like to know.

@filmor
Member
filmor commented Jan 12, 2022

Yes, I see the same error. The failing borrow (due to a null pointer being returned) is from the builtins import. I'll try to replace it by PyEval_GetBuiltins (which we should probably use anyhow).

@filmor
Member
filmor commented Jan 13, 2022

I have replaced the PyModule_Import with PyEval_GetBuiltins, but this didn't improve things. In the debugger I can see that the address that we get for the Python functions is correct, but the return value is crap (nothing that I can immediately identify, but relatively close to our current code execution, so it could be that a value in one of the registers or on the stack is misinterpreted). It could be that this function call is the first one where the return value matters, and that's why we are seeing this problem. It could also be that this has to do with the unmanaged function pointers that we are using nowadays (we used to use P/Invoke). I'll try to create a minimal example to reproduce the issue; maybe this is a problem in Mono.

@filmor
Member
filmor commented Jan 14, 2022

I have now tested this with Python 3.9; there it already fails on PyLong_FromLongLong, which we call in Runtime.NewRun on a simple value. I have checked the stack and it looks wrong.

What I would expect: <return address (4 bytes)> <value as LE (8 bytes)> (this is also what you can see for "native" calls)
What I see: <some address (4 bytes)> <some other address (4 bytes)> <value as LE (8 bytes)>

So the calli that is generated for our unmanaged delegate call (which is correct, as said, I have verified the addresses) pushes another parameter (I think) onto the stack, which is then frankensteined with the lower half of the value into a wrong value that is returned. If it does that on all calls, it will quite quickly destroy the stack.
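
The effect described above can be modelled in a few lines. This is only an illustration of the byte layout, assuming a 32-bit calling convention in which the callee reads its 8-byte argument directly after the return address; the stray word 0xDEADBEEF is an arbitrary placeholder, not a value taken from the debugger session.

```python
import struct


def callee_view(stack_after_return_address: bytes) -> int:
    """Read the first 8 bytes as the little-endian signed 64-bit argument."""
    return struct.unpack_from("<q", stack_after_return_address, 0)[0]


value = 42
stray_word = 0xDEADBEEF  # placeholder for the unexpected extra 4-byte entry

# Expected layout: the 8-byte value directly follows the return address.
expected_stack = struct.pack("<q", value)
# Observed layout: an extra 4-byte word sits in front of the value.
observed_stack = struct.pack("<I", stray_word) + struct.pack("<q", value)

print(hex(callee_view(expected_stack)))  # 0x2a
print(hex(callee_view(observed_stack)))  # 0x2adeadbeef
```

In the second case the callee sees the stray word as the low half of its argument and the low half of the real value as the high half, which matches the "frankensteined" result described above.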

Does anyone have more insight here, or know a bit more about how Mono translates calli (or know someone who does)?

@lostmsu
Member
lostmsu commented Jan 14, 2022

@filmor have you tried with .NET Core? If the behavior is different, I would simply open a bug in Mono.
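
Another way to narrow this down, without any .NET runtime involved, might be to call the same C API function through ctypes from the 32-bit interpreter. This is only a sketch of a sanity check and was not done in this thread; the helper name roundtrip is made up. If the value round-trips correctly, that would be consistent with the native function itself being fine and the problem lying in how the managed side sets up the call.

```python
import ctypes

# Bind PyLong_FromLongLong from the running interpreter's own C API.
_from_long_long = ctypes.pythonapi.PyLong_FromLongLong
_from_long_long.argtypes = [ctypes.c_longlong]  # one 64-bit argument
_from_long_long.restype = ctypes.py_object      # returns a new Python int


def roundtrip(value: int) -> int:
    """Pass a 64-bit integer through the C API and return the resulting int."""
    return _from_long_long(value)


# A value that needs more than 32 bits, so both halves of the argument matter.
sample = (1 << 40) + 7
print(roundtrip(sample) == sample)
```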
