Documentation improvement and minor code cleanups for the latch facility · omgwtfun/postgres@6760a4d · GitHub
Commit 6760a4d

Documentation improvement and minor code cleanups for the latch facility.
Improve the documentation around weak-memory-ordering risks, and do a pass of general editorialization on the comments in the latch code. Make the Windows latch code more like the Unix latch code where feasible; in particular provide the same Assert checks in both implementations.

Fix poorly-placed WaitLatch call in syncrep.c.

This patch resolves, for the moment, concerns around weak-memory-ordering bugs in latch-related code: we have documented the restrictions and checked that existing calls meet them. In 9.2 I hope that we will install suitable memory barrier instructions in SetLatch/ResetLatch, so that their callers don't need to be quite so careful.
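
For context, the latch coding convention that this commit's comments lean on (reset before checking, wait at the bottom of the loop, set flags before SetLatch) looks roughly like this. This is a sketch, not code from the commit: work_available and do_work() are hypothetical stand-ins for a caller's own flag and work routine, and on machines with weak memory ordering the caller must still supply its own synchronization, as the new comments explain.

    /* Waker side: update the shared flag *before* calling SetLatch(). */
    work_available = true;
    SetLatch(&otherproc->waitLatch);        /* otherproc is illustrative */

    /* Waiter side: reset first, test flags, wait at the bottom. */
    for (;;)
    {
        ResetLatch(&MyProc->waitLatch);

        if (work_available)
        {
            work_available = false;
            do_work();                      /* hypothetical work routine */
        }

        WaitLatch(&MyProc->waitLatch, 60000000L);   /* up to 60 seconds */
    }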
1 parent 028a0c5 commit 6760a4d

5 files changed: +163 −102 lines changed

src/backend/port/unix_latch.c

Lines changed: 71 additions & 78 deletions
@@ -3,60 +3,6 @@
  * unix_latch.c
  *	  Routines for inter-process latches
  *
- * A latch is a boolean variable, with operations that let you to sleep
- * until it is set. A latch can be set from another process, or a signal
- * handler within the same process.
- *
- * The latch interface is a reliable replacement for the common pattern of
- * using pg_usleep() or select() to wait until a signal arrives, where the
- * signal handler sets a global variable. Because on some platforms, an
- * incoming signal doesn't interrupt sleep, and even on platforms where it
- * does there is a race condition if the signal arrives just before
- * entering the sleep, the common pattern must periodically wake up and
- * poll the global variable. pselect() system call was invented to solve
- * the problem, but it is not portable enough. Latches are designed to
- * overcome these limitations, allowing you to sleep without polling and
- * ensuring a quick response to signals from other processes.
- *
- * There are two kinds of latches: local and shared. A local latch is
- * initialized by InitLatch, and can only be set from the same process.
- * A local latch can be used to wait for a signal to arrive, by calling
- * SetLatch in the signal handler. A shared latch resides in shared memory,
- * and must be initialized at postmaster startup by InitSharedLatch. Before
- * a shared latch can be waited on, it must be associated with a process
- * with OwnLatch. Only the process owning the latch can wait on it, but any
- * process can set it.
- *
- * There are three basic operations on a latch:
- *
- * SetLatch		- Sets the latch
- * ResetLatch	- Clears the latch, allowing it to be set again
- * WaitLatch	- Waits for the latch to become set
- *
- * The correct pattern to wait for an event is:
- *
- * for (;;)
- * {
- *	   ResetLatch();
- *	   if (work to do)
- *		   Do Stuff();
- *
- *	   WaitLatch();
- * }
- *
- * It's important to reset the latch *before* checking if there's work to
- * do. Otherwise, if someone sets the latch between the check and the
- * ResetLatch call, you will miss it and Wait will block.
- *
- * To wake up the waiter, you must first set a global flag or something
- * else that the main loop tests in the "if (work to do)" part, and call
- * SetLatch *after* that. SetLatch is designed to return quickly if the
- * latch is already set.
- *
- *
- * Implementation
- * --------------
- *
  * The Unix implementation uses the so-called self-pipe trick to overcome
  * the race condition involved with select() and setting a global flag
  * in the signal handler. When a latch is set and the current process
@@ -65,8 +11,8 @@
  * interrupt select() on all platforms, and even on platforms where it
  * does, a signal that arrives just before the select() call does not
  * prevent the select() from entering sleep. An incoming byte on a pipe
- * however reliably interrupts the sleep, and makes select() to return
- * immediately if the signal arrives just before select() begins.
+ * however reliably interrupts the sleep, and causes select() to return
+ * immediately even if the signal arrives before select() begins.
  *
  * When SetLatch is called from the same process that owns the latch,
  * SetLatch writes the byte directly to the pipe. If it's owned by another
@@ -99,7 +45,7 @@
 /* Are we currently in WaitLatch? The signal handler would like to know. */
 static volatile sig_atomic_t waiting = false;
 
-/* Read and write end of the self-pipe */
+/* Read and write ends of the self-pipe */
 static int	selfpipe_readfd = -1;
 static int	selfpipe_writefd = -1;
 
@@ -115,7 +61,7 @@ static void sendSelfPipeByte(void);
 void
 InitLatch(volatile Latch *latch)
 {
-	/* Initialize the self pipe if this is our first latch in the process */
+	/* Initialize the self-pipe if this is our first latch in the process */
 	if (selfpipe_readfd == -1)
 		initSelfPipe();
 
@@ -126,13 +72,14 @@ InitLatch(volatile Latch *latch)
 
 /*
  * Initialize a shared latch that can be set from other processes. The latch
- * is initially owned by no-one, use OwnLatch to associate it with the
+ * is initially owned by no-one; use OwnLatch to associate it with the
  * current process.
  *
  * InitSharedLatch needs to be called in postmaster before forking child
  * processes, usually right after allocating the shared memory block
- * containing the latch with ShmemInitStruct. The Unix implementation
- * doesn't actually require that, but the Windows one does.
+ * containing the latch with ShmemInitStruct. (The Unix implementation
+ * doesn't actually require that, but the Windows one does.) Because of
+ * this restriction, we have no concurrency issues to worry about here.
 */
 void
 InitSharedLatch(volatile Latch *latch)
@@ -144,23 +91,30 @@ InitSharedLatch(volatile Latch *latch)
 
 /*
  * Associate a shared latch with the current process, allowing it to
- * wait on it.
+ * wait on the latch.
  *
- * Make sure that latch_sigusr1_handler() is called from the SIGUSR1 signal
- * handler, as shared latches use SIGUSR1 to for inter-process communication.
+ * Although there is a sanity check for latch-already-owned, we don't do
+ * any sort of locking here, meaning that we could fail to detect the error
+ * if two processes try to own the same latch at about the same time. If
+ * there is any risk of that, caller must provide an interlock to prevent it.
+ *
+ * In any process that calls OwnLatch(), make sure that
+ * latch_sigusr1_handler() is called from the SIGUSR1 signal handler,
+ * as shared latches use SIGUSR1 for inter-process communication.
 */
 void
 OwnLatch(volatile Latch *latch)
 {
 	Assert(latch->is_shared);
 
-	/* Initialize the self pipe if this is our first latch in the process */
+	/* Initialize the self-pipe if this is our first latch in this process */
 	if (selfpipe_readfd == -1)
 		initSelfPipe();
 
 	/* sanity check */
 	if (latch->owner_pid != 0)
 		elog(ERROR, "latch already owned");
+
 	latch->owner_pid = MyProcPid;
 }
 
@@ -172,6 +126,7 @@ DisownLatch(volatile Latch *latch)
 {
 	Assert(latch->is_shared);
 	Assert(latch->owner_pid == MyProcPid);
+
 	latch->owner_pid = 0;
 }
 
@@ -229,29 +184,38 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 		int			hifd;
 
 		/*
-		 * Clear the pipe, and check if the latch is set already. If someone
+		 * Clear the pipe, then check if the latch is set already. If someone
 		 * sets the latch between this and the select() below, the setter will
 		 * write a byte to the pipe (or signal us and the signal handler will
 		 * do that), and the select() will return immediately.
+		 *
+		 * Note: we assume that the kernel calls involved in drainSelfPipe()
+		 * and SetLatch() will provide adequate synchronization on machines
+		 * with weak memory ordering, so that we cannot miss seeing is_set
+		 * if the signal byte is already in the pipe when we drain it.
 		 */
 		drainSelfPipe();
+
 		if (latch->is_set)
 		{
 			result = 1;
 			break;
 		}
 
+		/* Must wait ... set up the event masks for select() */
 		FD_ZERO(&input_mask);
+		FD_ZERO(&output_mask);
+
 		FD_SET(selfpipe_readfd, &input_mask);
 		hifd = selfpipe_readfd;
+
 		if (sock != PGINVALID_SOCKET && forRead)
 		{
 			FD_SET(sock, &input_mask);
 			if (sock > hifd)
 				hifd = sock;
 		}
 
-		FD_ZERO(&output_mask);
 		if (sock != PGINVALID_SOCKET && forWrite)
 		{
 			FD_SET(sock, &output_mask);
@@ -288,14 +252,23 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 }
 
 /*
- * Sets a latch and wakes up anyone waiting on it. Returns quickly if the
- * latch is already set.
+ * Sets a latch and wakes up anyone waiting on it.
+ *
+ * This is cheap if the latch is already set, otherwise not so much.
 */
 void
 SetLatch(volatile Latch *latch)
 {
 	pid_t		owner_pid;
 
+	/*
+	 * XXX there really ought to be a memory barrier operation right here,
+	 * to ensure that any flag variables we might have changed get flushed
+	 * to main memory before we check/set is_set. Without that, we have to
+	 * require that callers provide their own synchronization for machines
+	 * with weak memory ordering (see latch.h).
+	 */
+
 	/* Quick exit if already set */
 	if (latch->is_set)
 		return;
@@ -307,13 +280,21 @@ SetLatch(volatile Latch *latch)
 	 * we're in a signal handler. We use the self-pipe to wake up the select()
 	 * in that case. If it's another process, send a signal.
 	 *
-	 * Fetch owner_pid only once, in case the owner simultaneously disowns the
-	 * latch and clears owner_pid. XXX: This assumes that pid_t is atomic,
-	 * which isn't guaranteed to be true! In practice, the effective range of
-	 * pid_t fits in a 32 bit integer, and so should be atomic. In the worst
-	 * case, we might end up signaling wrong process if the right one disowns
-	 * the latch just as we fetch owner_pid. Even then, you're very unlucky if
-	 * a process with that bogus pid exists.
+	 * Fetch owner_pid only once, in case the latch is concurrently getting
+	 * owned or disowned. XXX: This assumes that pid_t is atomic, which isn't
+	 * guaranteed to be true! In practice, the effective range of pid_t fits
+	 * in a 32 bit integer, and so should be atomic. In the worst case, we
+	 * might end up signaling the wrong process. Even then, you're very
+	 * unlucky if a process with that bogus pid exists and belongs to
+	 * Postgres; and PG database processes should handle excess SIGUSR1
+	 * interrupts without a problem anyhow.
+	 *
+	 * Another sort of race condition that's possible here is for a new process
+	 * to own the latch immediately after we look, so we don't signal it.
+	 * This is okay so long as all callers of ResetLatch/WaitLatch follow the
+	 * standard coding convention of waiting at the bottom of their loops,
+	 * not the top, so that they'll correctly process latch-setting events that
+	 * happen before they enter the loop.
 	 */
 	owner_pid = latch->owner_pid;
 	if (owner_pid == 0)
@@ -335,11 +316,23 @@ ResetLatch(volatile Latch *latch)
 	Assert(latch->owner_pid == MyProcPid);
 
 	latch->is_set = false;
+
+	/*
+	 * XXX there really ought to be a memory barrier operation right here, to
+	 * ensure that the write to is_set gets flushed to main memory before we
+	 * examine any flag variables. Otherwise a concurrent SetLatch might
+	 * falsely conclude that it needn't signal us, even though we have missed
+	 * seeing some flag updates that SetLatch was supposed to inform us of.
+	 * For the moment, callers must supply their own synchronization of flag
+	 * variables (see latch.h).
+	 */
 }
 
 /*
- * SetLatch uses SIGUSR1 to wake up the process waiting on the latch. Wake
- * up WaitLatch.
+ * SetLatch uses SIGUSR1 to wake up the process waiting on the latch.
+ *
+ * Wake up WaitLatch, if we're waiting. (We might not be, since SIGUSR1 is
+ * overloaded for multiple purposes.)
 */
 void
 latch_sigusr1_handler(void)
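
To make the self-pipe trick concrete, here is a minimal standalone C demo of the idea. It is not PostgreSQL code, just an illustration under the same assumptions the comments describe: a nonblocking pipe whose write end is poked from a signal handler, so that a signal arriving just before select() still leaves a readable byte behind.

#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

static int  pipefds[2];

static void
wakeup_handler(int signo)
{
    char    c = 0;

    /* write() is async-signal-safe; the byte makes select() readable */
    (void) write(pipefds[1], &c, 1);
}

int
main(void)
{
    fd_set  readfds;
    char    buf[16];

    if (pipe(pipefds) != 0)
        return 1;
    fcntl(pipefds[0], F_SETFL, O_NONBLOCK);
    fcntl(pipefds[1], F_SETFL, O_NONBLOCK);
    signal(SIGUSR1, wakeup_handler);

    printf("waiting; send SIGUSR1 to pid %d\n", (int) getpid());

    FD_ZERO(&readfds);
    FD_SET(pipefds[0], &readfds);

    /*
     * Even if the signal arrives just before this call, the byte is already
     * sitting in the pipe, so select() returns at once -- no lost wakeup.
     * (A real loop would also retry on EINTR, as unix_latch.c does.)
     */
    select(pipefds[0] + 1, &readfds, NULL, NULL, NULL);

    /* drain the pipe, as drainSelfPipe() does */
    while (read(pipefds[0], buf, sizeof(buf)) > 0)
        ;
    printf("woken up\n");
    return 0;
}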

src/backend/port/win32_latch.c

Lines changed: 13 additions & 12 deletions
@@ -1,9 +1,10 @@
 /*-------------------------------------------------------------------------
  *
  * win32_latch.c
- *	  Windows implementation of latches.
+ *	  Routines for inter-process latches
  *
- * See unix_latch.c for information on usage.
+ * See unix_latch.c for header comments for the exported functions;
+ * the API presented here is supposed to be the same as there.
  *
  * The Windows implementation uses Windows events that are inherited by
  * all postmaster child processes.
@@ -23,7 +24,6 @@
 #include <unistd.h>
 
 #include "miscadmin.h"
-#include "replication/walsender.h"
 #include "storage/latch.h"
 #include "storage/shmem.h"
 
@@ -88,7 +88,7 @@ WaitLatch(volatile Latch *latch, long timeout)
 }
 
 int
-WaitLatchOrSocket(volatile Latch *latch, SOCKET sock, bool forRead,
+WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 				  bool forWrite, long timeout)
 {
 	DWORD		rc;
@@ -98,6 +98,9 @@ WaitLatchOrSocket(volatile Latch *latch, SOCKET sock, bool forRead,
 	int			numevents;
 	int			result = 0;
 
+	if (latch->owner_pid != MyProcPid)
+		elog(ERROR, "cannot wait on a latch owned by another process");
+
 	latchevent = latch->event;
 
 	events[0] = latchevent;
@@ -187,15 +190,10 @@ SetLatch(volatile Latch *latch)
 
 	/*
 	 * See if anyone's waiting for the latch. It can be the current process if
-	 * we're in a signal handler. Use a local variable here in case the latch
-	 * is just disowned between the test and the SetEvent call, and event
-	 * field set to NULL.
+	 * we're in a signal handler.
 	 *
-	 * Fetch handle field only once, in case the owner simultaneously disowns
-	 * the latch and clears handle. This assumes that HANDLE is atomic, which
-	 * isn't guaranteed to be true! In practice, it should be, and in the
-	 * worst case we end up calling SetEvent with a bogus handle, and SetEvent
-	 * will return an error with no harm done.
+	 * Use a local variable here just in case somebody changes the event field
+	 * concurrently (which really should not happen).
 	 */
 	handle = latch->event;
 	if (handle)
@@ -212,5 +210,8 @@ SetLatch(volatile Latch *latch)
 void
 ResetLatch(volatile Latch *latch)
 {
+	/* Only the owner should reset the latch */
+	Assert(latch->owner_pid == MyProcPid);
+
 	latch->is_set = false;
 }

src/backend/replication/syncrep.c

Lines changed: 13 additions & 7 deletions
@@ -166,13 +166,6 @@ SyncRepWaitForLSN(XLogRecPtr XactCommitLSN)
 	{
 		int			syncRepState;
 
-		/*
-		 * Wait on latch for up to 60 seconds. This allows us to check for
-		 * postmaster death regularly while waiting. Note that timeout here
-		 * does not necessarily release from loop.
-		 */
-		WaitLatch(&MyProc->waitLatch, 60000000L);
-
 		/* Must reset the latch before testing state. */
 		ResetLatch(&MyProc->waitLatch);
 
@@ -184,6 +177,12 @@ SyncRepWaitForLSN(XLogRecPtr XactCommitLSN)
 		 * walsender changes the state to SYNC_REP_WAIT_COMPLETE, it will
 		 * never update it again, so we can't be seeing a stale value in that
 		 * case.
+		 *
+		 * Note: on machines with weak memory ordering, the acquisition of
+		 * the lock is essential to avoid race conditions: we cannot be sure
+		 * the sender's state update has reached main memory until we acquire
+		 * the lock. We could get rid of this dance if SetLatch/ResetLatch
+		 * contained memory barriers.
 		 */
 		syncRepState = MyProc->syncRepState;
 		if (syncRepState == SYNC_REP_WAITING)
@@ -246,6 +245,13 @@ SyncRepWaitForLSN(XLogRecPtr XactCommitLSN)
 			SyncRepCancelWait();
 			break;
 		}
+
+		/*
+		 * Wait on latch for up to 60 seconds. This allows us to check for
+		 * cancel/die signal or postmaster death regularly while waiting. Note
+		 * that timeout here does not necessarily release from loop.
+		 */
+		WaitLatch(&MyProc->waitLatch, 60000000L);
 	}
 
 /*
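
Schematically, the corrected SyncRepWaitForLSN loop now waits at the bottom, which is exactly the convention the new SetLatch comments rely on. This is a simplified sketch, not the exact code; the locked state test and cancel/die handling are abbreviated:

    for (;;)
    {
        /* reset the latch before testing state, to avoid a lost wakeup */
        ResetLatch(&MyProc->waitLatch);

        /* test whether the walsender has released us (lock details elided) */
        if (MyProc->syncRepState == SYNC_REP_WAIT_COMPLETE)
            break;

        /* ... cancel/die and postmaster-death checks go here ... */

        /*
         * Wait at the bottom of the loop, not the top: a SetLatch that fired
         * before we entered the loop is handled by the state test above
         * instead of being lost.
         */
        WaitLatch(&MyProc->waitLatch, 60000000L);
    }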

src/backend/storage/lmgr/proc.c

Lines changed: 1 addition & 1 deletion
@@ -338,7 +338,7 @@ InitProcess(void)
 	MyProc->waitLSN.xrecoff = 0;
 	MyProc->syncRepState = SYNC_REP_NOT_WAITING;
 	SHMQueueElemInit(&(MyProc->syncRepLinks));
-	OwnLatch((Latch *) &MyProc->waitLatch);
+	OwnLatch(&MyProc->waitLatch);
 
 	/*
 	 * We might be reusing a semaphore that belonged to a failed process.  So
