Replace max_standby_delay with two parameters, max_standby_archive_de… · postgrespro/postgres_cluster@e76c1a0 · GitHub

Commit e76c1a0

Replace max_standby_delay with two parameters, max_standby_archive_delay and
max_standby_streaming_delay, and revise the implementation to avoid assuming
that timestamps found in WAL records can meaningfully be compared to clock
time on the standby server. Instead, the delay limits are compared to the
elapsed time since we last obtained a new WAL segment from archive or since
we were last "caught up" to WAL data arriving via streaming replication.
This avoids problems with clock skew between primary and standby, as well
as other corner cases that the original coding would misbehave in, such
as the primary server having significant idle time between transactions.
Per my complaint some time ago and considerable ensuing discussion.

Do some desultory editing on the hot standby documentation, too.
1 parent e6a7416 commit e76c1a0
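
The rule described in the commit message can be illustrated with a small model. This is a minimal sketch, not the code from this commit: the function name `standby_deadline` and its arguments are hypothetical. It assumes the standby records, on its own clock, when it last obtained a WAL segment from archive or was last caught up with streamed WAL. Only the two parameter names, the 30-second default, and the meaning of -1 come from the commit.

```python
from typing import Optional


def standby_deadline(
    receipt_time_s: float,
    from_stream: bool,
    max_standby_archive_delay_ms: int = 30_000,
    max_standby_streaming_delay_ms: int = 30_000,
) -> Optional[float]:
    """Return the standby-clock time after which conflicting queries may be
    canceled, or None if the standby should wait forever.

    receipt_time_s is a hypothetical standby-local timestamp: when the
    current WAL segment was fetched from archive, or when the standby was
    last caught up with streamed WAL. No primary-side timestamp is used,
    so clock skew between the two servers does not matter.
    """
    # Pick the limit that matches the source of the WAL data.
    if from_stream:
        delay_ms = max_standby_streaming_delay_ms
    else:
        delay_ms = max_standby_archive_delay_ms

    # A value of -1 means "wait forever for conflicting queries".
    if delay_ms < 0:
        return None

    return receipt_time_s + delay_ms / 1000.0


# Streamed WAL received at t=100 s with the default limit.
print(standby_deadline(100.0, from_stream=True))  # 130.0
# Archived WAL with the limit disabled.
print(standby_deadline(100.0, from_stream=False,
                       max_standby_archive_delay_ms=-1))  # None
```

Because both inputs come from the standby's own clock, idle time on the primary or a skewed primary clock cannot shorten or lengthen the limit in this model.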

File tree

12 files changed, +498 -353 lines changed

    doc/src/sgml/config.sgml

    Lines changed: 73 additions & 38 deletions
    @@ -1,4 +1,4 @@
    -<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.288 2010/06/30 02:43:10 momjian Exp $ -->
    +<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.289 2010/07/03 20:43:57 tgl Exp $ -->
     
     <chapter Id="runtime-config">
     <title>Server Configuration</title>
    @@ -1841,6 +1841,8 @@ SET ENABLE_SEQSCAN TO OFF;
     <para>
     These settings control the behavior of the built-in
     <firstterm>streaming replication</> feature.
    +These parameters would be set on the primary server that is
    +to send replication data to one or more standby servers.
     </para>
     
     <variablelist>
    @@ -1866,7 +1868,7 @@ SET ENABLE_SEQSCAN TO OFF;
     </indexterm>
     <listitem>
     <para>
    -Specifies the delay between activity rounds for the WAL sender.
    +Specifies the delay between activity rounds for WAL sender processes.
     In each round the WAL sender sends any WAL accumulated since the last
     round to the standby server. It then sleeps for
     <varname>wal_sender_delay</> milliseconds, and repeats. The default
    @@ -1887,34 +1889,42 @@ SET ENABLE_SEQSCAN TO OFF;
     </indexterm>
     <listitem>
     <para>
    -Specifies the number of past log file segments kept in the
    +Specifies the minimum number of past log file segments kept in the
     <filename>pg_xlog</>
     directory, in case a standby server needs to fetch them for streaming
     replication. Each segment is normally 16 megabytes. If a standby
     server connected to the primary falls behind by more than
     <varname>wal_keep_segments</> segments, the primary might remove
     a WAL segment still needed by the standby, in which case the
    -replication connection will be terminated.
    +replication connection will be terminated. (However, the standby
    +server can recover by fetching the segment from archive, if WAL
    +archiving is in use.)
     </para>
     
     <para>
    -This sets only the minimum number of segments retained for standby
    -purposes; the system might need to retain more segments for WAL
    -archival or to recover from a checkpoint. If <varname>wal_keep_segments</>
    -is zero (the default), the system doesn't keep any extra segments
    -for standby purposes, and the number of old WAL segments available
    -for standbys is determined based only on the location of the previous
    -checkpoint and status of WAL archiving.
    -This parameter can only be set in the <filename>postgresql.conf</>
    -file or on the server command line.
    +This sets only the minimum number of segments retained in
    +<filename>pg_xlog</>; the system might need to retain more segments
    +for WAL archival or to recover from a checkpoint. If
    +<varname>wal_keep_segments</> is zero (the default), the system
    +doesn't keep any extra segments for standby purposes, and the number
    +of old WAL segments available to standby servers is a function of
    +the location of the previous checkpoint and status of WAL
    +archiving. This parameter can only be set in the
    +<filename>postgresql.conf</> file or on the server command line.
     </para>
     </listitem>
     </varlistentry>
     </variablelist>
     </sect2>
    +
     <sect2 id="runtime-config-standby">
     <title>Standby Servers</title>
     
    +<para>
    +These settings control the behavior of a standby server that is
    +to receive replication data.
    +</para>
    +
     <variablelist>
     
     <varlistentry id="guc-hot-standby" xreflabel="hot_standby">
    @@ -1933,39 +1943,64 @@ SET ENABLE_SEQSCAN TO OFF;
     </listitem>
     </varlistentry>
     
    -<varlistentry id="guc-max-standby-delay" xreflabel="max_standby_delay">
    -<term><varname>max_standby_delay</varname> (<type>integer</type>)</term>
    +<varlistentry id="guc-max-standby-archive-delay" xreflabel="max_standby_archive_delay">
    +<term><varname>max_standby_archive_delay</varname> (<type>integer</type>)</term>
     <indexterm>
    -<primary><varname>max_standby_delay</> configuration parameter</primary>
    +<primary><varname>max_standby_archive_delay</> configuration parameter</primary>
     </indexterm>
     <listitem>
     <para>
    -When Hot Standby is active, this parameter specifies a wait policy
    -for applying WAL entries that conflict with active queries.
    -If a conflict should occur the server will delay up to this long
    -before it cancels conflicting queries, as
    -described in <xref linkend="hot-standby-conflict">.
    -The default is 30 seconds (30 s). Units are milliseconds.
    -A value of -1 causes the standby to wait forever for a conflicting
    -query to complete.
    +When Hot Standby is active, this parameter determines how long the
    +standby server should wait before canceling standby queries that
    +conflict with about-to-be-applied WAL entries, as described in
    +<xref linkend="hot-standby-conflict">.
    +<varname>max_standby_archive_delay</> applies when WAL data is
    +being read from WAL archive (and is therefore not current).
    +The default is 30 seconds. Units are milliseconds if not specified.
    +A value of -1 allows the standby to wait forever for conflicting
    +queries to complete.
     This parameter can only be set in the <filename>postgresql.conf</>
     file or on the server command line.
     </para>
     <para>
    -A high value makes query cancel less likely.
    -Increasing this parameter or setting it to -1 might delay master server
    -changes from appearing on the standby.
    -</para>
    -<para>
    -While it is tempting to believe that <varname>max_standby_delay</>
    -is the maximum length of time a query can run before
    -cancellation is possible, this is not true. When a long-running
    -query ends, there is a finite time required to apply backlogged
    -WAL logs. If a second long-running query appears before the
    -WAL has caught up, the snapshot taken by the second query will
    -allow significantly less than <varname>max_standby_delay</> seconds
    -before query cancellation is possible.
    -</para>
    +Note that <varname>max_standby_archive_delay</> is not the same as the
    +maximum length of time a query can run before cancellation; rather it
    +is the maximum total time allowed to apply any one WAL segment's data.
    +Thus, if one query has resulted in significant delay earlier in the
    +WAL segment, subsequent conflicting queries will have much less grace
    +time.
    +</para>
    +</listitem>
    +</varlistentry>
    +
    +<varlistentry id="guc-max-standby-streaming-delay" xreflabel="max_standby_streaming_delay">
    +<term><varname>max_standby_streaming_delay</varname> (<type>integer</type>)</term>
    +<indexterm>
    +<primary><varname>max_standby_streaming_delay</> configuration parameter</primary>
    +</indexterm>
    +<listitem>
    +<para>
    +When Hot Standby is active, this parameter determines how long the
    +standby server should wait before canceling standby queries that
    +conflict with about-to-be-applied WAL entries, as described in
    +<xref linkend="hot-standby-conflict">.
    +<varname>max_standby_streaming_delay</> applies when WAL data is
    +being received via streaming replication.
    +The default is 30 seconds. Units are milliseconds if not specified.
    +A value of -1 allows the standby to wait forever for conflicting
    +queries to complete.
    +This parameter can only be set in the <filename>postgresql.conf</>
    +file or on the server command line.
    +</para>
    +<para>
    +Note that <varname>max_standby_streaming_delay</> is not the same as
    +the maximum length of time a query can run before cancellation; rather
    +it is the maximum total time allowed to apply WAL data once it has
    +been received from the primary server. Thus, if one query has
    +resulted in significant delay, subsequent conflicting queries will
    +have much less grace time until the standby server has caught up
    +again.
    +</para>
     </listitem>
     </varlistentry>
     
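
The "Note that ..." paragraphs added in the last hunk say the delay is one budget for applying the received WAL, not a per-query allowance. The arithmetic below is a rough sketch of that point. The helper `grace_times` and its input are hypothetical: a list of how many seconds have passed since the WAL data was received when each conflict is detected.

```python
def grace_times(conflict_offsets_s, max_delay_ms=30_000):
    """For each conflict, return how many seconds the conflicting query may
    keep running before it becomes eligible for cancellation.

    conflict_offsets_s: seconds elapsed since the WAL data was received
    (archive segment fetched, or standby last caught up) at the moment
    each conflict is detected. Hypothetical input for illustration.
    """
    # -1 means the standby waits forever, so grace time is unbounded.
    if max_delay_ms < 0:
        return [float("inf")] * len(conflict_offsets_s)

    budget_s = max_delay_ms / 1000.0
    # Every conflict draws on the same budget, so later ones get less.
    return [max(0.0, budget_s - elapsed) for elapsed in conflict_offsets_s]


# A first conflict right away, a second after 25 s of delay, a third after 40 s.
print(grace_times([0.0, 25.0, 40.0]))  # [30.0, 5.0, 0.0]
```

With the default of 30 seconds, a query that conflicts after 25 seconds of accumulated delay has only 5 seconds left, which matches the "much less grace time" wording in the added documentation.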