8000 x/sys/windows: some syscalls are creating dangling pointers · Issue #73199 · golang/go · GitHub
[go: up one dir, main page]

Skip to content
/ go Public

x/sys/windows: some syscalls are creating dangling pointers #73199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
thepudds opened this issue Apr 7, 2025 · 16 comments
Closed

x/sys/windows: some syscalls are creating dangling pointers #73199

thepudds opened this issue Apr 7, 2025 · 16 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Milestone

Comments

@thepudds
Copy link
Contributor
thepudds commented Apr 7, 2025

CL 653856 is an optimization that increases how often make slice results can be stored on the stack.

@dmitshur noticed that the x/sys/windows.TestBuildSecurityDescriptor test started failing on Windows after that CL was merged.

As a diagnostic step, I was able to get the test to pass again by forcing some make results in sys/windows/security_windows.go to be heap allocated in (*SECURITY_DESCRIPTOR).ToAbsolute.

I also then commented on the CL that from the Windows docs, it looked like the MakeAbsoluteSD Windows API was being used in a way that was creating dangling pointers:

It looks like MakeAbsoluteSD is an example of a syscall that asks the user pass in memory that is then filled in with the results, and it looks like the results end up containing pointers to each other for some of the filled in arguments:

https://learn.microsoft.com/en-us/windows/win32/api/securitybaseapi/nf-securitybaseapi-makeabsolutesd
[...]
Previously, this would all be heap memory, but with the change in this CL, it can be stack allocated memory.

I think that might mean there is a dangling pointer in absoluteSD pointing to the ToAbsolute stack after ToAbsolute returns, which might be why the test then fails.

Keith responded that there are probably other related problems as well, including in some of the related functions:

So it is allocating an object that contains pointers using a method that marks none of its fields as pointers. (And with a size which is Windows' idea of what the size is, which may or may not match what Go thinks the size is.)

Then it calls into Windows which is writing those pointer fields without any write barrier notification to the GC.

So once this thing returns, nothing is keeping the 4 things referenced from this struct alive. The next GC (or maybe the current GC, given the missing write barriers) will collect all the referenced objects, leading to dangling pointers.

That's even before this CL. After this CL things just fall apart more quickly 😊
[...]
Some of the other functions in this area have similar problems, like (*SECURITY_DESCRIPTOR).SetDACL.

Opening this issue to help track the scope of the problem and resolution.

CC @randall77

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Apr 7, 2025
@gopherbot gopherbot added this to the Unreleased milestone Apr 7, 2025
@gopherbot
Copy link
Contributor
8000 gopherbot commented Apr 7, 2025

Change https://go.dev/cl/663355 mentions this issue: windows: fix dangling pointers in (*SECURITY_DESCRIPTOR).ToAbsolute

@gabyhelp
Copy link
gabyhelp commented Apr 7, 2025

Related Code Changes

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Apr 7, 2025
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/663575 mentions this issue: windows: fix dangling pointers in (*SECURITY_DESCRIPTOR).Set{DACL,SACL,Owner,Group}

@dmitshur dmitshur added OS-Windows NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 7, 2025
@dmitshur
Copy link
Member
dmitshur commented Apr 7, 2025

CC @golang/windows.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/663795 mentions this issue: cmd/compile/internal/escape: add debug flag to disable stack allocate variable-sized makeslice

@thepudds
Copy link
Contributor Author
thepudds commented Apr 8, 2025

Related: https://go.dev/cl/663715.

gopherbot pushed a commit that referenced this issue Apr 8, 2025
Windows' _PROC_THREAD_ATTRIBUTE_LIST can contain pointers to memory
owned by Go, but the GC is not aware of this. This can lead to the
memory being freed while the _PROC_THREAD_ATTRIBUTE_LIST is still in
use.

This CL uses the same approach as in x/sys/windows to ensure that the
attributes are not collected by the GC.

Fixes #73170.
Updates #73199.

Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest
Change-Id: I7dca8d386aed4c02fdcd4a631d0fa4dc5747a96f
Reviewed-on: https://go-review.googlesource.com/c/go/+/663715
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/663715 mentions this issue: syscall: fix dangling pointers in Windows' process attributes

@thepudds
Copy link
Contributor Author
thepudds commented Apr 8, 2025

Hi @qmuntal, does this look like it could be a related bug, where pointer to buffer is stored via NtSetInformationFile?

	bufferSize := int(unsafe.Offsetof(dummyFileRenameInfo.FileName)) + fileNameLen
	buffer := make([]byte, bufferSize)
	typedBufferPtr := (*fileRenameInformation)(unsafe.Pointer(&buffer[0]))
	typedBufferPtr.ReplaceIfExists = windows.FILE_RENAME_REPLACE_IF_EXISTS | windows.FILE_RENAME_POSIX_SEMANTICS
	typedBufferPtr.FileNameLength = uint32(fileNameLen)
	copy((*[windows.MAX_LONG_PATH]uint16)(unsafe.Pointer(&typedBufferPtr.FileName[0]))[:fileNameLen/2:fileNameLen/2], newNameUTF16)
	err = windows.NtSetInformationFile(fileHandle, &iosb, &buffer[0], uint32(bufferSize), windows.FileRenameInformation)

(From x/sys/windows/syscall_windows_test.go).

If it is a bug, it's just affecting a test I think. (I only took a quick look, so maybe not a bug. I'll post more context here in a bit).

gopherbot pushed a commit to golang/sys that referenced this issue Apr 9, 2025
Prior to this CL, a byte slice was allocated via make to use as the
absoluteSD argument passed to the Windows API MakeAbsoluteSD.

MakeAbsoluteSD then sets pointers outside the view of the GC, including
pointers within absoluteSD that point to other chunks of memory
we pass into MakeAbsoluteSD.

CL 653856 recently allowed more make results to be stack allocated,
which worsened the problems here and made it easier for those
pointers in absoluteSD to become dangling pointers, though the
core problems here existed before.

This CL instead allocates absoluteSD as a proper SECURITY_DESCRIPTOR
struct so that the GC can be aware of its pointers. We also verify the
pointers are as we expect, and then set them explicitly
in view of the GC.

Updates golang/go#73199

Change-Id: Id8038d38a887bb8ff3ffc6eae603589b97e92cdc
Reviewed-on: https://go-review.googlesource.com/c/sys/+/663355
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@thepudds
Copy link
Contributor Author
thepudds commented Apr 9, 2025

With https://go.dev/cl/663355 submitted as of a few minutes ago, the x/sys Windows builders now seem to have flipped back to green with that commit:

https://ci.chromium.org/p/golang/g/x-sys-gotip-by-go/console

@mknyszek
Copy link
Contributor
mknyszek commented Apr 9, 2025

Hey @thepudds, in triage, we're wondering whether your landed fix is enough for the issue, or there needs to be more work done here. I see you've got 2 changes sent out, one merged, one WIP. Or maybe this is a broader problem with x/sys/windows? Thanks.

@dmitshur dmitshur marked this as a duplicate of #73298 Apr 9, 2025
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 9, 2025
@thepudds
Copy link
Contributor Author

Hi @mknyszek, I would guess there are still some problems in x/sys/windows, including the possible issue ~3 comments back in #73199 (comment) that I don't know if anyone has opined on yet.

The problems that https://go.dev/cl/663575 is attempting to address I think are probably real.

Maybe nothing else pops up, but it's probably worth keeping this open for a bit longer. I was also trying out a stress test just now, though I don't know how fruitful that will be.

@qmuntal
Copy link
Member
qmuntal commented Apr 10, 2025

Hi @qmuntal, does this look like it could be a related bug, where pointer to buffer is stored via NtSetInformationFile?

Hmm, the pointer to the buffer is not retained by NtSetInformationFile, and the buffer itself doesn't contain pointers to Go objects, so that call seems safe to me.

gopherbot pushed a commit that referenced this issue Apr 16, 2025
…of variable-sized makeslice

CL 653856 enabled stack allocation of variable-sized makeslice results.

This CL adds debug hashing of that change, plus a debug flag
to control the byte threshold used.

The debug hashing machinery means we also now have a way to disable just
the CL 653856 optimization by doing -gcflags='all=-d=variablemakehash=n'
or similar, though the stderr output will then typically have many
lines of debug hash output.

Using this CL plus the bisect command, I was able to retroactively
find one of the lines of code responsible for #73199:

  $ bisect -compile=variablemake go test -skip TestListWireGuardDrivers
  [...]
  bisect: FOUND failing change set
  --- change set #1 (enabling changes causes failure)
  ./security_windows.go:1321:38 (variablemake)
  ./security_windows.go:1321:38 (variablemake)
  ---

Previously, I had tracked down those lines by diffing '-gcflags=-m=1'
output and brief code inspection, but seeing the bisect was very nice.

This CL also adds a compiler debug flag to control the threshold for
stack allocation of variably sized make results. This can help
us identify more code that is relying on certain stack allocations.
This might be a temporary flag that we delete prior to Go 1.25
(given we would not want people to rely on it), or maybe it
might make sense to keep it for some period of time beyond the release
of Go 1.25 to help the ecosystem shake out other bugs.

Using these two flags together (and picking a threshold of 64 rather
than the default of 32), it looks for example like this
x/sys/windows code might be relying on stack allocation of
a byte slice:

  $ bisect -compile=variablemake go test -gcflags=-d=variablemakethreshold=64 -skip TestListWireGuardDrivers
  [...]
  bisect: FOUND failing change set
  --- change set #1 (enabling changes causes failure)
  ./syscall_windows_test.go:1178:16 (variablemake)

Updates #73199
Fixes #73253

Change-Id: I160179a0e3c148c3ea86be5c9b6cea8a52c3e5b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/663795
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/666116 mentions this issue: cmd/compile: align stack-allocated backing stores higher than required

@randall77
Copy link
Contributor
randall77 commented Apr 17, 2025

Just to note here, the problem with NtSetInformationFile is one of alignment. It does its allocation using make([]byte, ...) but then casts it to something that needs higher alignment (a *fileRenameInformation). If the byte slice is very unaligned, accesses through that pointer will also be unaligned. Normally on amd64, not a problem. But this pointer gets passed in a syscall to a windows function that complains about alignment. It would also be a problem to use some instruction that required alignment (something in sync/atomic, or SSE instructions in assembly, ...).
CL 666116 forces pointer-alignment for these stack-allocated backing stores, which better matches what malloc used to do.
It won't be perfect, but it will probably fix a lot of potential pitfalls like this.

@randall77
Copy link
Contributor
randall77 commented Apr 17, 2025

The godoc for NtSetInformationFile describes its argument at just *byte, with no indication that it needs a larger alignment.
The Windows docs do say it must be of a certain type, depending on the final parameter. We don't have any way in the Go type system to enforce that, unfortunately.
It would be nice to have an example usage that doesn't have this alignment issue.

There might be a similar potential problem for SetFileInformationByHandle, DeviceIoControl, and GetFileInformationByHandleEx. Probably others.

It is particularly tricky for NtSetInformationFile because the size of the argument is variable length.

@dmitshur dmitshur moved this from Todo to In Progress in Go Compiler / Runtime Apr 17, 2025
@alexbrainman
Copy link
Member

Here https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170&viewFallbackFrom=vs-2017#alignment are the windows/amd64 alignment rules, if others need them.

I did not find anything interesting there.

Alex

@github-project-automation github-project-automation bot moved this from In Progress to Done in Go Compiler / Runtime May 20, 2025
Sirherobrine23 pushed a commit to Sirherobrine23/go that referenced this issue May 24, 2025
Because that's what mallocgc did and some user code came to rely on it.

Fixes golang#73199

Change-Id: I45ca00d2ea448e6729ef9ac4cec3c1eb0ceccc89
Reviewed-on: https://go-review.googlesource.com/c/go/+/666116
Reviewed-by: t hepudds <thepudds1460@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Projects
Development

No branches or pull requests

8 participants
0