Make parallel worker shutdown complete entirely via before_shmem_exit(). · postgres/postgres@fa91d4c · GitHub

    Make parallel worker shutdown complete entirely via before_shmem_exit().
    This is a step toward storing stats in dynamic shared memory. Dynamic shared memory segments are detached just after before_shmem_exit() callbacks are processed, but before on_shmem_exit() callbacks are, so no stats can be collected after before_shmem_exit() callbacks have been processed.

    Parallel worker shutdown can cause stats to be emitted during DSM detach callbacks, e.g. for SharedFileSet (which closes its files, which can cause fd.c to emit stats about temporary files). Therefore parallel worker shutdown needs to complete during the processing of before_shmem_exit callbacks.

    One might think this problem could instead be solved by carefully ordering the attaching to DSM segments, so that the pgstats segments get detached from later than the parallel query ones. That turns out to not work because the stats hash might need to grow, which can cause new segments to be allocated, which then will be detached from earlier.

    There are two code changes:

    First, call ParallelWorkerShutdown() via before_shmem_exit. That's a good idea on its own, because other shutdown callbacks like ShutdownPostgres and ShutdownAuxiliaryProcess are called via before_*.

    Second, explicitly detach from the parallel query DSM segment, thereby ensuring all stats are emitted during ParallelWorkerShutdown().

    There are nicer solutions to these problems, but it's not obvious which of those solutions is the correct one. As the shared memory stats work already is a huge amount of work...

    Author: Andres Freund <andres@anarazel.de>
    Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
    Discussion: https://postgr.es/m/20210803023612.iziacxk5syn2r4ut@alap3.anarazel.de
    1 parent ee3f8d3 commit fa91d4c
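
The exit ordering described in the commit message can be illustrated with a small standalone model. This is only a sketch: every `toy_*` name is hypothetical and none of it is PostgreSQL code. It assumes three phases run in order (before-exit callbacks, segment detach callbacks, on-exit callbacks), that each phase runs its callbacks in reverse registration order, that stats shutdown is itself a before-exit callback, and that the worker registers its shutdown callback after it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
#define TOY_MAX_CALLBACKS 8

typedef void (*toy_callback) (void);

static toy_callback before_list[TOY_MAX_CALLBACKS];
static toy_callback detach_list[TOY_MAX_CALLBACKS];
static toy_callback on_exit_list[TOY_MAX_CALLBACKS];
static int before_count, detach_count, on_exit_count;

static int stats_available;     /* cleared once stats are shut down */
static int fileset_attached;    /* cleared once the file set is released */
static char toy_log[128];

/* Append a callback to one phase's list. */
static void
toy_register(toy_callback *list, int *count, toy_callback cb)
{
    assert(*count < TOY_MAX_CALLBACKS);
    list[(*count)++] = cb;
}

/* Record whether a stat arrived while the stats subsystem was still usable. */
static void
toy_emit_stat(const char *who)
{
    size_t used = strlen(toy_log);

    snprintf(toy_log + used, sizeof(toy_log) - used, "%s:%s;",
             who, stats_available ? "recorded" : "lost");
}

/* Stands in for the stats subsystem shutting down in the before-exit phase. */
static void
toy_stats_shutdown(void)
{
    stats_available = 0;
}

/* Stands in for a detach callback that closes temp files and emits stats. */
static void
toy_fileset_detach(void)
{
    if (!fileset_attached)
        return;
    fileset_attached = 0;
    toy_emit_stat("tempfile");
}

/* New behavior: the worker shutdown callback detaches explicitly. */
static void
toy_worker_shutdown_explicit(void)
{
    toy_fileset_detach();
}

/* Old behavior: the worker shutdown callback only signals the leader. */
static void
toy_worker_shutdown_signal_only(void)
{
}

/* Run one phase's callbacks in reverse registration order. */
static void
toy_run_reverse(toy_callback *list, int n)
{
    while (n > 0)
        list[--n] ();
}

/* Simulate one process exit and return the resulting log. */
const char *
toy_run(int explicit_detach)
{
    before_count = detach_count = on_exit_count = 0;
    stats_available = 1;
    fileset_attached = 1;
    toy_log[0] = '\0';

    toy_register(before_list, &before_count, toy_stats_shutdown);
    toy_register(detach_list, &detach_count, toy_fileset_detach);
    if (explicit_detach)
        toy_register(before_list, &before_count, toy_worker_shutdown_explicit);
    else
        toy_register(on_exit_list, &on_exit_count, toy_worker_shutdown_signal_only);

    toy_run_reverse(before_list, before_count);     /* phase 1 */
    toy_run_reverse(detach_list, detach_count);     /* phase 2 */
    toy_run_reverse(on_exit_list, on_exit_count);   /* phase 3 */
    return toy_log;
}
```

In this model, `toy_run(0)` corresponds to the old arrangement, where the temp file stat is only emitted in the detach phase after stats are gone, and `toy_run(1)` corresponds to the arrangement this commit introduces, where the explicit detach emits the stat while the stats subsystem is still usable.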

    1 file changed, 13 additions and 1 deletion

    src/backend/access/transam/parallel.c

    @@ -1305,7 +1305,7 @@ ParallelWorkerMain(Datum main_arg)
         /* Arrange to signal the leader if we exit. */
         ParallelLeaderPid = fps->parallel_leader_pid;
         ParallelLeaderBackendId = fps->parallel_leader_backend_id;
    -    on_shmem_exit(ParallelWorkerShutdown, (Datum) 0);
    +    before_shmem_exit(ParallelWorkerShutdown, PointerGetDatum(seg));
     
         /*
          * Now we can find and attach to the error queue provided for us. That's
    @@ -1507,13 +1507,25 @@ ParallelWorkerReportLastRecEnd(XLogRecPtr last_xlog_end)
      * This guards against the case where we exit uncleanly without sending an
      * ErrorResponse to the leader, for example because some code calls proc_exit
      * directly.
    + *
    + * Also explicitly detach from dsm segment so that subsystems using
    + * on_dsm_detach() have a chance to send stats before the stats subsystem is
    + * shut down as as part of a before_shmem_exit() hook.
    + *
    + * One might think this could instead be solved by carefully ordering the
    + * attaching to dsm segments, so that the pgstats segments get detached from
    + * later than the parallel query one. That turns out to not work because the
    + * stats hash might need to grow which can cause new segments to be allocated,
    + * which then will be detached from earlier.
      */
     static void
     ParallelWorkerShutdown(int code, Datum arg)
     {
         SendProcSignal(ParallelLeaderPid,
                        PROCSIG_PARALLEL_MESSAGE,
                        ParallelLeaderBackendId);
    +
    +    dsm_detach((dsm_segment *) DatumGetPointer(arg));
     }
     
     /*
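
The comment's argument against "carefully ordering the attaching" can be sketched with another standalone model. Again, all `toy_*` names are hypothetical and this is not PostgreSQL code. The only assumption, taken from the comment itself, is that segments attached later are detached earlier at exit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
#define TOY_MAX_SEGMENTS 8

static const char *toy_segments[TOY_MAX_SEGMENTS];
static int toy_segment_count;

/* Remember a newly attached segment; later attaches sit on top. */
static void
toy_attach(const char *name)
{
    assert(toy_segment_count < TOY_MAX_SEGMENTS);
    toy_segments[toy_segment_count++] = name;
}

/* Detach everything, most recently attached first, and report the order. */
static const char *
toy_detach_all(void)
{
    static char order[128];

    order[0] = '\0';
    while (toy_segment_count > 0)
    {
        const char *name = toy_segments[--toy_segment_count];
        size_t      used = strlen(order);

        snprintf(order + used, sizeof(order) - used, "%s%s",
                 used ? "," : "", name);
    }
    return order;
}

/*
 * Attach the stats segment first so that it is detached last, then the
 * parallel query segment. If the stats hash grows afterwards, the extra
 * stats segment is attached last and therefore detached first.
 */
const char *
toy_careful_ordering(int stats_hash_grows)
{
    toy_segment_count = 0;
    toy_attach("stats");
    toy_attach("parallel_query");
    if (stats_hash_grows)
        toy_attach("stats_growth");
    return toy_detach_all();
}
```

Without growth the careful ordering holds: the parallel query segment goes away before the stats segment. With growth, part of the stats storage is detached before the parallel query segment's detach callbacks get a chance to run, which is why the commit detaches explicitly inside ParallelWorkerShutdown() instead of relying on attach order.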
