Segfaults on MacOS #24

Open
manticore-projects opened this issue Nov 24, 2023 · 4 comments

@manticore-projects

Greetings!

I am compiling and running this lib on a GitHub MacOS runner (macos-latest, x64).
In general everything seems to work, but occasionally the tests segfault:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00000001014c3a93, pid=1342, tid=9731
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 (11.0.21+9) (build 11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# C  [libfpng.dylib+0x4a93]  fpng::fpng_adler32(void const*, unsigned long, unsigned int)+0xc3

This happens in roughly 6 out of 10 runs. When it happens, it is always at the same call: fpng::fpng_adler32(void const*, unsigned long, unsigned int)+0xc3. Please see an example: https://github.com/manticore-projects/fpng-java/actions/runs/6976419875/job/18984960630

I also run the very same example on Windows and Linux GitHub runners, and there everything works rock solid.
Also, if my pointers or buffers were wrongly allocated, I would expect the code to fail in fpng::encode_memory instead.

Can you please provide a hint or idea why/how this fails on MacOS, but only sometimes? Test images and settings are static/stable and do not change.

Thank you already and best!

@tresf
tresf commented Nov 29, 2023

@manticore-projects I believe on the JNA mailing list, you clarified/discovered that some of the MacOS crashes were upstream and JVM related. Can you provide an update here?

@manticore-projects
Author

I have been facing two issues on MacOS.

One was about ImageIO and InputStreams and has been worked around. This was not related to FPNG itself.

The second one is the recurring crash shown above, and it is relevant here:

  1. it happens only on the MacOS GitHub runner, not on Windows or Linux. And it never happens on my Catalina VM.
  2. it does not always happen (seems random)
  3. but when it happens, it always reports the same problematic frame (not random at all)

Unfortunately I don't know what to make of this. I can't debug on the MacOS runner and I can't trigger it on my Catalina VM (the same binaries are used).

I do not know if it is caused by JNA or by the native FPNG binary itself.

@richgel999
Owner

Can you try switching the codec to not using SSE? (Or, just disabling the SSE version of Adler32.) The codec has plain C (scalar) fallbacks for everything that utilizes SSE. Does that fix it?
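For comparison, a scalar Adler-32 can be sketched as below. This is a minimal reference written from the Adler-32 definition in RFC 1950, not fpng's actual fallback code; the name `adler32_scalar` is hypothetical, and the parameter order simply mirrors the signature visible in the crash frame (data pointer, size, running checksum).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reference implementation (not taken from fpng).
// Plain scalar Adler-32 as defined in RFC 1950, using no SIMD instructions.
// `adler` is the running checksum; pass 1 to start a new one.
inline std::uint32_t adler32_scalar(const void* data, std::size_t len,
                                    std::uint32_t adler = 1) {
    constexpr std::uint32_t kMod = 65521;  // largest prime below 2^16
    constexpr std::size_t kBlock = 5552;   // bytes that can be summed before
                                           // s2 could overflow 32 bits
    const auto* p = static_cast<const unsigned char*>(data);
    std::uint32_t s1 = adler & 0xFFFFu;          // sum of bytes
    std::uint32_t s2 = (adler >> 16) & 0xFFFFu;  // sum of the s1 values

    while (len > 0) {
        std::size_t n = len < kBlock ? len : kBlock;
        len -= n;
        while (n--) {
            s1 += *p++;
            s2 += s1;
        }
        // Reduce once per block instead of once per byte.
        s1 %= kMod;
        s2 %= kMod;
    }
    return (s2 << 16) | s1;
}
```

Feeding the same buffers through a function like this and through the library's checksum on the failing runner would show whether the scalar path runs cleanly where the SSE path crashes.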

@manticore-projects
Author

Greetings and thank you for responding.

I am not much concerned about SSE or the failure itself. I would not care if it always failed (because I would simply conclude "not working on x64 MacOS", case closed).
What disturbs me is the fact that it fails only sometimes on x64 MacOS, and only on the GitHub runner, never on the Catalina VM. But when it fails, it is always at the same spot. There should not be any random error, I reckon.

That said: I am experimenting with ARM64/NEON too and thus would need to amend the Adler and CRC code (based on the NEON optimizations that ZLIB-NG provides). So I would test against ZLIB-NG's Adler and CRC to see whether this issue still occurs.
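SIGILL is the signal raised for an illegal instruction, so one way to narrow down the "sometimes on the runner, never on the VM" pattern is to log which x86 instruction-set extensions each CI machine reports. The sketch below is only a diagnostic aid under stated assumptions: that the SSE code path relies on extensions such as SSE4.1 or PCLMULQDQ, and that the hosted runners are not all backed by identical CPUs. Neither is confirmed in this thread. The names `CpuFeatures`, `query_cpu_features` and `print_cpu_features` are hypothetical; `__get_cpuid` comes from the GCC/Clang `<cpuid.h>` header and is only used on x86 builds.

```cpp
#include <cstdio>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

// Hypothetical helper type: the feature bits of interest from CPUID leaf 1.
struct CpuFeatures {
    bool queried;  // false if CPUID was unavailable (e.g. a non-x86 build)
    bool ssse3;
    bool sse41;
    bool sse42;
    bool pclmul;
};

inline CpuFeatures query_cpu_features() {
    CpuFeatures f{false, false, false, false, false};
#if HAVE_X86_CPUID
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f.queried = true;
        f.pclmul = ((ecx >> 1) & 1u) != 0;   // ECX bit 1:  PCLMULQDQ
        f.ssse3  = ((ecx >> 9) & 1u) != 0;   // ECX bit 9:  SSSE3
        f.sse41  = ((ecx >> 19) & 1u) != 0;  // ECX bit 19: SSE4.1
        f.sse42  = ((ecx >> 20) & 1u) != 0;  // ECX bit 20: SSE4.2
    }
#endif
    return f;
}

// Print one line that can be compared between passing and failing CI runs.
inline void print_cpu_features(const CpuFeatures& f) {
    if (!f.queried) {
        std::printf("cpuid: not available on this build\n");
        return;
    }
    std::printf("cpuid: ssse3=%d sse4.1=%d sse4.2=%d pclmul=%d\n",
                f.ssse3, f.sse41, f.sse42, f.pclmul);
}
```

Calling `print_cpu_features(query_cpu_features())` at the start of the test run, and comparing the output of passing and failing jobs, would show whether the crashes correlate with a missing extension.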
