Replace checkpoint_segments with min_wal_size and max_wal_size. · postgres/postgres@88e9823 · GitHub

Commit 88e9823

Replace checkpoint_segments with min_wal_size and max_wal_size.
Instead of having a single knob (checkpoint_segments) that both triggers checkpoints and determines how many old WAL segments to recycle, these are now separate concerns. There is still an internal variable called CheckpointSegments, which triggers checkpoints, but it no longer determines how many segments to recycle at a checkpoint. That is now auto-tuned by keeping a moving average of the distance between checkpoints (in bytes) and trying to keep that many segments in reserve. The advantage of this is that you can set max_wal_size very high, but the system won't actually consume that much space if there isn't any need for it. min_wal_size sets a floor for that; you can effectively disable the auto-tuning behavior by setting min_wal_size equal to max_wal_size.

The max_wal_size setting is now the actual target size of WAL at which a new checkpoint is triggered, instead of the distance between checkpoints. Previously, you could calculate the actual WAL usage with the formula "(2 + checkpoint_completion_target) * checkpoint_segments + 1". With this patch, you set the desired WAL usage with max_wal_size, and the system calculates the appropriate CheckpointSegments with the reverse of that formula. That's a lot more intuitive for administrators to set.

Reviewed by Amit Kapila and Venkata Balaji N.
1 parent 0fec000 commit 88e9823
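
To make the commit message's reverse formula concrete, here is a minimal C sketch of the calculation it describes. It is illustrative only, not code from this commit: the function name, the assumption that max_wal_size is expressed in megabytes, and the use of the normal 16 MB segment size are stand-ins for whatever the patch does internally.

    #include <stdio.h>

    /*
     * Illustrative sketch, not the patch itself: derive the internal
     * checkpoint trigger (in segments) from max_wal_size by inverting the
     * old formula (2 + checkpoint_completion_target) * checkpoint_segments.
     * Assumes max_wal_size is given in megabytes and a WAL segment is 16 MB.
     */
    static int
    calculate_checkpoint_segments(int max_wal_size_mb, double completion_target)
    {
        int max_segs = max_wal_size_mb / 16;              /* MB -> 16 MB segments */
        int segs = (int) (max_segs / (2.0 + completion_target));

        return segs < 1 ? 1 : segs;                       /* never drop below one */
    }

    int
    main(void)
    {
        /* Defaults described in the patch: max_wal_size = 128 MB, target = 0.5 */
        printf("CheckpointSegments = %d\n",
               calculate_checkpoint_segments(128, 0.5));  /* prints 3 */
        return 0;
    }

With the defaults mentioned in the documentation below (max_wal_size = 128 MB, checkpoint_completion_target = 0.5), this sketch yields 3 segments, which lines up with the old checkpoint_segments default of three.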

File tree

9 files changed: +327 / -108 lines changed

doc/src/sgml/config.sgml

Lines changed: 30 additions & 10 deletions

@@ -1325,7 +1325,7 @@ include_dir 'conf.d'
     40% of RAM to <varname>shared_buffers</varname> will work better than a
     smaller amount.  Larger settings for <varname>shared_buffers</varname>
     usually require a corresponding increase in
-    <varname>checkpoint_segments</varname>, in order to spread out the
+    <varname>max_wal_size</varname>, in order to spread out the
     process of writing large quantities of new or changed data over a
     longer period of time.
    </para>
@@ -2394,18 +2394,20 @@ include_dir 'conf.d'
     <title>Checkpoints</title>

     <variablelist>
-     <varlistentry id="guc-checkpoint-segments" xreflabel="checkpoint_segments">
-      <term><varname>checkpoint_segments</varname> (<type>integer</type>)
+     <varlistentry id="guc-max-wal-size" xreflabel="max_wal_size">
+      <term><varname>max_wal_size</varname> (<type>integer</type>)</term>
       <indexterm>
-       <primary><varname>checkpoint_segments</> configuration parameter</primary>
+       <primary><varname>max_wal_size</> configuration parameter</primary>
       </indexterm>
-      </term>
      <listitem>
       <para>
-        Maximum number of log file segments between automatic WAL
-        checkpoints (each segment is normally 16 megabytes).  The default
-        is three segments.  Increasing this parameter can increase the
-        amount of time needed for crash recovery.
+        Maximum size to let the WAL grow to between automatic WAL
+        checkpoints. This is a soft limit; WAL size can exceed
+        <varname>max_wal_size</> under special circumstances, like
+        under heavy load, a failing <varname>archive_command</>, or a high
+        <varname>wal_keep_segments</> setting. The default is 128 MB.
+        Increasing this parameter can increase the amount of time needed for
+        crash recovery.
        This parameter can only be set in the <filename>postgresql.conf</>
        file or on the server command line.
       </para>
@@ -2458,7 +2460,7 @@ include_dir 'conf.d'
        Write a message to the server log if checkpoints caused by
        the filling of checkpoint segment files happen closer together
        than this many seconds (which suggests that
-       <varname>checkpoint_segments</> ought to be raised).  The default is
+       <varname>max_wal_size</> ought to be raised).  The default is
        30 seconds (<literal>30s</>).  Zero disables the warning.
        No warnings will be generated if <varname>checkpoint_timeout</varname>
        is less than <varname>checkpoint_warning</varname>.
@@ -2468,6 +2470,24 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-min-wal-size" xreflabel="min_wal_size">
+      <term><varname>min_wal_size</varname> (<type>integer</type>)</term>
+      <indexterm>
+       <primary><varname>min_wal_size</> configuration parameter</primary>
+      </indexterm>
+      <listitem>
+       <para>
+        As long as WAL disk usage stays below this setting, old WAL files are
+        always recycled for future use at a checkpoint, rather than removed.
+        This can be used to ensure that enough WAL space is reserved to
+        handle spikes in WAL usage, for example when running large batch
+        jobs. The default is 80 MB.
+        This parameter can only be set in the <filename>postgresql.conf</>
+        file or on the server command line.
+       </para>
+      </listitem>
+     </varlistentry>
+
     </variablelist>
    </sect2>
    <sect2 id="runtime-config-wal-archiving">
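
Taken together, the two entries above give max_wal_size the role of a soft ceiling and min_wal_size the role of a floor below which old segments are always recycled. The following is a hedged sketch of that recycle-or-remove decision; the function and parameter names are invented for illustration, not taken from this commit, and everything is assumed to be measured in megabytes.

    #include <stdbool.h>
    #include <stdio.h>

    /*
     * Illustrative sketch only: at checkpoint time, decide whether an old WAL
     * segment should be recycled (renamed for future use) or removed,
     * following the documented rules: always recycle below min_wal_size,
     * never recycle past max_wal_size, and in between keep roughly what
     * recent checkpoint cycles needed.
     */
    static bool
    should_recycle_segment(double wal_size_mb,        /* WAL currently on disk */
                           double estimated_need_mb,  /* moving-average estimate */
                           double min_wal_size_mb,
                           double max_wal_size_mb)
    {
        if (wal_size_mb < min_wal_size_mb)
            return true;                        /* below the floor: always recycle */
        if (wal_size_mb >= max_wal_size_mb)
            return false;                       /* over the soft limit: remove */
        return wal_size_mb < estimated_need_mb; /* keep enough for the estimate */
    }

    int
    main(void)
    {
        /* 100 MB on disk, 90 MB estimated need, defaults min = 80 MB, max = 128 MB */
        printf("%s\n",
               should_recycle_segment(100, 90, 80, 128) ? "recycle" : "remove");
        return 0;
    }

Setting min_wal_size equal to max_wal_size collapses the middle case, which matches the commit message's note that doing so effectively disables the auto-tuning.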

doc/src/sgml/perform.sgml

Lines changed: 8 additions & 8 deletions

@@ -1328,19 +1328,19 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
    </para>
   </sect2>

-  <sect2 id="populate-checkpoint-segments">
-   <title>Increase <varname>checkpoint_segments</varname></title>
+  <sect2 id="populate-max-wal-size">
+   <title>Increase <varname>max_wal_size</varname></title>

   <para>
-    Temporarily increasing the <xref
-    linkend="guc-checkpoint-segments"> configuration variable can also
+    Temporarily increasing the <xref linkend="guc-max-wal-size">
+    configuration variable can also
    make large data loads faster.  This is because loading a large
    amount of data into <productname>PostgreSQL</productname> will
    cause checkpoints to occur more often than the normal checkpoint
    frequency (specified by the <varname>checkpoint_timeout</varname>
    configuration variable).  Whenever a checkpoint occurs, all dirty
    pages must be flushed to disk.  By increasing
-    <varname>checkpoint_segments</varname> temporarily during bulk
+    <varname>max_wal_size</varname> temporarily during bulk
    data loads, the number of checkpoints that are required can be
    reduced.
   </para>
@@ -1445,7 +1445,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
    <para>
     Set appropriate (i.e., larger than normal) values for
     <varname>maintenance_work_mem</varname> and
-     <varname>checkpoint_segments</varname>.
+     <varname>max_wal_size</varname>.
    </para>
   </listitem>
   <listitem>
@@ -1512,7 +1512,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;

   So when loading a data-only dump, it is up to you to drop and recreate
   indexes and foreign keys if you wish to use those techniques.
-  It's still useful to increase <varname>checkpoint_segments</varname>
+  It's still useful to increase <varname>max_wal_size</varname>
   while loading the data, but don't bother increasing
   <varname>maintenance_work_mem</varname>; rather, you'd do that while
   manually recreating indexes and foreign keys afterwards.
@@ -1577,7 +1577,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;

   <listitem>
    <para>
-     Increase <xref linkend="guc-checkpoint-segments"> and <xref
+     Increase <xref linkend="guc-max-wal-size"> and <xref
     linkend="guc-checkpoint-timeout"> ; this reduces the frequency
     of checkpoints, but increases the storage requirements of
     <filename>/pg_xlog</>.

doc/src/sgml/wal.sgml

Lines changed: 43 additions & 26 deletions

@@ -472,9 +472,10 @@
   <para>
    The server's checkpointer process automatically performs
    a checkpoint every so often.  A checkpoint is begun every <xref
-   linkend="guc-checkpoint-segments"> log segments, or every <xref
-   linkend="guc-checkpoint-timeout"> seconds, whichever comes first.
-   The default settings are 3 segments and 300 seconds (5 minutes), respectively.
+   linkend="guc-checkpoint-timeout"> seconds, or if
+   <xref linkend="guc-max-wal-size"> is about to be exceeded,
+   whichever comes first.
+   The default settings are 5 minutes and 128 MB, respectively.
   If no WAL has been written since the previous checkpoint, new checkpoints
   will be skipped even if <varname>checkpoint_timeout</> has passed.
   (If WAL archiving is being used and you want to put a lower limit on how
@@ -486,8 +487,8 @@
   </para>

  <para>
-   Reducing <varname>checkpoint_segments</varname> and/or
-   <varname>checkpoint_timeout</varname> causes checkpoints to occur
+   Reducing <varname>checkpoint_timeout</varname> and/or
+   <varname>max_wal_size</varname> causes checkpoints to occur
   more often.  This allows faster after-crash recovery, since less work
   will need to be redone.  However, one must balance this against the
   increased cost of flushing dirty data pages more often.  If
@@ -510,11 +511,11 @@
   parameter.  If checkpoints happen closer together than
   <varname>checkpoint_warning</> seconds,
   a message will be output to the server log recommending increasing
-   <varname>checkpoint_segments</varname>.  Occasional appearance of such
+   <varname>max_wal_size</varname>.  Occasional appearance of such
   a message is not cause for alarm, but if it appears often then the
   checkpoint control parameters should be increased.  Bulk operations such
   as large <command>COPY</> transfers might cause a number of such warnings
-   to appear if you have not set <varname>checkpoint_segments</> high
+   to appear if you have not set <varname>max_wal_size</> high
   enough.
  </para>

@@ -525,10 +526,10 @@
   <xref linkend="guc-checkpoint-completion-target">, which is
   given as a fraction of the checkpoint interval.
   The I/O rate is adjusted so that the checkpoint finishes when the
-   given fraction of <varname>checkpoint_segments</varname> WAL segments
-   have been consumed since checkpoint start, or the given fraction of
-   <varname>checkpoint_timeout</varname> seconds have elapsed,
-   whichever is sooner.  With the default value of 0.5,
+   given fraction of
+   <varname>checkpoint_timeout</varname> seconds have elapsed, or before
+   <varname>max_wal_size</varname> is exceeded, whichever is sooner.
+   With the default value of 0.5,
   <productname>PostgreSQL</> can be expected to complete each checkpoint
   in about half the time before the next checkpoint starts.  On a system
   that's very close to maximum I/O throughput during normal operation,
@@ -545,18 +546,35 @@
  </para>

  <para>
-   There will always be at least one WAL segment file, and will normally
-   not be more than (2 + <varname>checkpoint_completion_target</varname>) * <varname>checkpoint_segments</varname> + 1
-   or <varname>checkpoint_segments</> + <xref linkend="guc-wal-keep-segments"> + 1
-   files.  Each segment file is normally 16 MB (though this size can be
-   altered when building the server).  You can use this to estimate space
-   requirements for <acronym>WAL</acronym>.
-   Ordinarily, when old log segment files are no longer needed, they
-   are recycled (that is, renamed to become future segments in the numbered
-   sequence).  If, due to a short-term peak of log output rate, there
-   are more than 3 * <varname>checkpoint_segments</varname> + 1
-   segment files, the unneeded segment files will be deleted instead
-   of recycled until the system gets back under this limit.
+   The number of WAL segment files in the <filename>pg_xlog</> directory depends on
+   <varname>min_wal_size</>, <varname>max_wal_size</> and
+   the amount of WAL generated in previous checkpoint cycles.  When old log
+   segment files are no longer needed, they are removed or recycled (that is,
+   renamed to become future segments in the numbered sequence).  If, due to a
+   short-term peak of log output rate, <varname>max_wal_size</> is
+   exceeded, the unneeded segment files will be removed until the system
+   gets back under this limit.  Below that limit, the system recycles enough
+   WAL files to cover the estimated need until the next checkpoint, and
+   removes the rest.  The estimate is based on a moving average of the number
+   of WAL files used in previous checkpoint cycles.  The moving average
+   is increased immediately if the actual usage exceeds the estimate, so it
+   accommodates peak usage rather than average usage to some extent.
+   <varname>min_wal_size</> puts a minimum on the amount of WAL files
+   recycled for future usage; that much WAL is always recycled for future use,
+   even if the system is idle and the WAL usage estimate suggests that little
+   WAL is needed.
+  </para>
+
+  <para>
+   Independently of <varname>max_wal_size</varname>,
+   <xref linkend="guc-wal-keep-segments"> + 1 most recent WAL files are
+   kept at all times.  Also, if WAL archiving is used, old segments can not be
+   removed or recycled until they are archived.  If WAL archiving cannot keep up
+   with the pace that WAL is generated, or if <varname>archive_command</varname>
+   fails repeatedly, old WAL files will accumulate in <filename>pg_xlog</>
+   until the situation is resolved.  A slow or failed standby server that
+   uses a replication slot will have the same effect (see
+   <xref linkend="streaming-replication-slots">).
  </para>

  <para>
@@ -571,9 +589,8 @@
   master because restartpoints can only be performed at checkpoint records.
   A restartpoint is triggered when a checkpoint record is reached if at
   least <varname>checkpoint_timeout</> seconds have passed since the last
-   restartpoint. In standby mode, a restartpoint is also triggered if at
-   least <varname>checkpoint_segments</> log segments have been replayed
-   since the last restartpoint.
+   restartpoint, or if WAL size is about to exceed
+   <varname>max_wal_size</>.
  </para>

  <para>
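
The paragraph added to wal.sgml above bases the recycling estimate on a moving average that is bumped up immediately whenever a checkpoint cycle uses more WAL than expected. Here is a small, self-contained C sketch of such an update rule; the 0.9/0.1 smoothing weights and all names are assumptions for illustration, not taken from the patch.

    #include <stdio.h>

    /*
     * Illustrative sketch of the auto-tuning estimate described above: keep a
     * moving average of WAL bytes written per checkpoint cycle, but raise the
     * estimate immediately when a cycle uses more than expected, so the
     * reserve caters for peaks rather than the plain average.
     */
    static double checkpoint_distance_estimate = 0.0;  /* bytes per cycle */

    static void
    update_checkpoint_distance_estimate(double bytes_this_cycle)
    {
        if (bytes_this_cycle > checkpoint_distance_estimate)
            checkpoint_distance_estimate = bytes_this_cycle;   /* bump up at once */
        else
            checkpoint_distance_estimate =
                0.9 * checkpoint_distance_estimate + 0.1 * bytes_this_cycle;
    }

    int
    main(void)
    {
        double cycles[] = { 40e6, 45e6, 200e6, 50e6, 50e6 };   /* bytes per cycle */

        for (int i = 0; i < 5; i++)
        {
            update_checkpoint_distance_estimate(cycles[i]);
            printf("after cycle %d: estimate = %.0f MB\n",
                   i + 1, checkpoint_distance_estimate / 1e6);
        }
        return 0;
    }

In this toy run the spike in the third cycle raises the estimate at once, and it then decays only gradually, which is the "accommodates peak usage rather than average usage" behavior described above.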

    0 commit comments
