8000 Don't require dynamic timezone abbreviations to match underlying time… · koderP/postgres@380dad2 · GitHub
[go: up one dir, main page]

Skip to content

Commit 380dad2

Browse files
committed
Don't require dynamic timezone abbreviations to match underlying time zone.
Previously, we threw an error if a dynamic timezone abbreviation did not match any abbreviation recorded in the referenced IANA time zone entry. That seemed like a good consistency check at the time, but it turns out that a number of the abbreviations in the IANA database are things that Olson and crew made up out of whole cloth. Their current policy is to remove such names in favor of using simple numeric offsets. Perhaps unsurprisingly, a lot of these made-up abbreviations have varied in meaning over time, which meant that our commit b2cbced and later changes made them into dynamic abbreviations. So with newer IANA database versions that don't mention these abbreviations at all, we fail, as reported in bug #14307 from Neil Anderson. It's worse than just a few unused-in-the-wild abbreviations not working, because the pg_timezone_abbrevs view stops working altogether (since its underlying function tries to compute the whole view result in one call). We considered deleting these abbreviations from our abbreviations list, but the problem with that is that we can't stay ahead of possible future IANA changes. Instead, let's leave the abbreviations list alone, and treat any "orphaned" dynamic abbreviation as just meaning the referenced time zone. It will behave a bit differently than it used to, in that you can't any longer override the zone's standard vs. daylight rule by using the "wrong" abbreviation of a pair, but that's better than failing entirely. (Also, this solution can be interpreted as adding a small new feature, which is that any abbreviation a user wants can be defined as referencing a time zone name.) Back-patch to all supported branches, since this problem affects all of them when using tzdata 2016f or newer. Report: <20160902031551.15674.67337@wrigleys.postgresql.org> Discussion: <6189.1472820913@sss.pgh.pa.us>
1 parent e3439a4 commit 380dad2

File tree

5 files changed

+132
-32
lines changed
  • sql
  • 5 files changed

    +132
    -32
    lines changed

    doc/src/sgml/catalogs.sgml

    Lines changed: 7 additions & 0 deletions
    Original file line numberDiff line numberDiff line change
    @@ -8225,6 +8225,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
    82258225
    </tgroup>
    82268226
    </table>
    82278227

    8228+
    <para>
    8229+
    While most timezone abbreviations represent fixed offsets from UTC,
    8230+
    there are some that have historically varied in value
    8231+
    (see <xref linkend="datetime-config-files"> for more information).
    8232+
    In such cases this view presents their current meaning.
    8233+
    </para>
    8234+
    82288235
    </sect1>
    82298236

    82308237
    <sect1 id="view-pg-timezone-names">

    doc/src/sgml/datetime.sgml

    Lines changed: 30 additions & 11 deletions
    Original file line numberDiff line numberDiff line change
    @@ -384,19 +384,38 @@
    384384

    385385
    <para>
    386386
    A <replaceable>zone_abbreviation</replaceable> is just the abbreviation
    387-
    being defined. The <replaceable>offset</replaceable> is the equivalent
    388-
    offset in seconds from UTC, positive being east from Greenwich and
    389-
    negative being west. For example, -18000 would be five hours west
    390-
    of Greenwich, or North American east coast standard time. <literal>D</>
    391-
    indicates that the zone name represents local daylight-savings time rather
    392-
    than standard time. Alternatively, a <replaceable>time_zone_name</> can
    393-
    be given, in which case that time zone definition is consulted, and the
    394-
    abbreviation's meaning in that zone is used. This alternative is
    395-
    recommended only for abbreviations whose meaning has historically varied,
    396-
    as looking up the meaning is noticeably more expensive than just using
    397-
    a fixed integer value.
    387+
    being defined. An <replaceable>offset</replaceable> is an integer giving
    388+
    the equivalent offset in seconds from UTC, positive being east from
    389+
    Greenwich and negative being west. For example, -18000 would be five
    390+
    hours west of Greenwich, or North American east coast standard time.
    391+
    <literal>D</> indicates that the zone name represents local
    392+
    daylight-savings time rather than standard time.
    398393
    </para>
    399394

    395+
    <para>
    396+
    Alternatively, a <replaceable>time_zone_name</> can be given, referencing
    397+
    a zone name defined in the IANA timezone database. The zone's definition
    398+
    is consulted to see whether the abbreviation is or has been in use in
    399+
    that zone, and if so, the appropriate meaning is used &mdash; that is,
    400+
    the meaning that was currently in use at the timestamp whose value is
    401+
    being determined, or the meaning in use immediately before that if it
    402+
    wasn't current at that time, or the oldest meaning if it was used only
    403+
    after that time. This behavior is essential for dealing with
    404+
    abbreviations whose meaning has historically varied. It is also allowed
    405+
    to define an abbreviation in terms of a zone name in which that
    406+
    abbreviation does not appear; then using the abbreviation is just
    407+
    equivalent to writing out the zone name.
    408+
    </para>
    409+
    410+
    <tip>
    411+
    <para>
    412+
    Using a simple integer <replaceable>offset</replaceable> is preferred
    413+
    when defining an abbreviation whose offset from UTC has never changed,
    414+
    as such abbreviations are much cheaper to process than those that
    415+
    require consulting a time zone definition.
    416+
    </para>
    417+
    </tip>
    418+
    400419
    <para>
    401420
    The <literal>@INCLUDE</> syntax allows inclusion of another file in the
    402421
    <filename>.../share/timezonesets/</> directory. Inclusion can be nested,

    src/backend/utils/adt/datetime.c

    Lines changed: 64 additions & 21 deletions
    Original file line numberDiff line numberDiff line change
    @@ -53,8 +53,9 @@ static void AdjustFractDays(double frac, struct pg_tm * tm, fsec_t *fsec,
    5353
    int scale);
    5454
    static int DetermineTimeZoneOffsetInternal(struct pg_tm * tm, pg_tz *tzp,
    5555
    pg_time_t *tp);
    56-
    static int DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t, const char *abbr,
    57-
    pg_tz *tzp, int *isdst);
    56+
    static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
    57+
    const char *abbr, pg_tz *tzp,
    58+
    int *offset, int *isdst);
    5859
    static pg_tz *FetchDynamicTimeZone(TimeZoneAbbrevTable *tbl, const datetkn *tp);
    5960

    6061

    @@ -1602,19 +1603,40 @@ DetermineTimeZoneOffsetInternal(struct pg_tm * tm, pg_tz *tzp, pg_time_t *tp)
    16021603
    * This differs from the behavior of DetermineTimeZoneOffset() in that a
    16031604
    * standard-time or daylight-time abbreviation forces use of the corresponding
    16041605
    * GMT offset even when the zone was then in DS or standard time respectively.
    1606+
    * (However, that happens only if we can match the given abbreviation to some
    1607+
    * abbreviation that appears in the IANA timezone data. Otherwise, we fall
    1608+
    * back to doing DetermineTimeZoneOffset().)
    16051609
    */
    16061610
    int
    16071611
    DetermineTimeZoneAbbrevOffset(struct pg_tm * tm, const char *abbr, pg_tz *tzp)
    16081612
    {
    16091613
    pg_time_t t;
    1614+
    int zone_offset;
    1615+
    int abbr_offset;
    1616+
    int abbr_isdst;
    16101617

    16111618
    /*
    16121619
    * Compute the UTC time we want to probe at. (In event of overflow, we'll
    16131620
    * probe at the epoch, which is a bit random but probably doesn't matter.)
    16141621
    */
    1615-
    (void) DetermineTimeZoneOffsetInternal(tm, tzp, &t);
    1622+
    zone_offset = DetermineTimeZoneOffsetInternal(tm, tzp, &t);
    16161623

    1617-
    return DetermineTimeZoneAbbrevOffsetInternal(t, abbr, tzp, &tm->tm_isdst);
    1624+
    /*
    1625+
    * Try to match the abbreviation to something in the zone definition.
    1626+
    */
    1627+
    if (DetermineTimeZoneAbbrevOffsetInternal(t, abbr, tzp,
    1628+
    &abbr_offset, &abbr_isdst))
    1629+
    {
    1630+
    /* Success, so use the abbrev-specific answers. */
    1631+
    tm->tm_isdst = abbr_isdst;
    1632+
    return abbr_offset;
    1633+
    }
    1634+
    1635+
    /*
    1636+
    * No match, so use the answers we already got from
    1637+
    * DetermineTimeZoneOffsetInternal.
    1638+
    */
    1639+
    return zone_offset;
    16181640
    }
    16191641

    16201642

    @@ -1628,19 +1650,41 @@ DetermineTimeZoneAbbrevOffsetTS(TimestampTz ts, const char *abbr,
    16281650
    pg_tz *tzp, int *isdst)
    16291651
    {
    16301652
    pg_time_t t = timestamptz_to_time_t(ts);
    1653+
    int zone_offset;
    1654+
    int abbr_offset;
    1655+
    int tz;
    1656+
    struct pg_tm tm;
    1657+
    fsec_t fsec;
    16311658

    1632-
    return DetermineTimeZoneAbbrevOffsetInternal(t, abbr, tzp, isdst);
    1659+
    /*
    1660+
    * If the abbrev matches anything in the zone data, this is pretty easy.
    1661+
    */
    1662+
    if (DetermineTimeZoneAbbrevOffsetInternal(t, abbr, tzp,
    1663+
    &abbr_offset, isdst))
    1664+
    return abbr_offset;
    1665+
    1666+
    /*
    1667+
    * Else, break down the timestamp so we can use DetermineTimeZoneOffset.
    1668+
    */
    1669+
    if (timestamp2tm(ts, &tz, &tm, &fsec, NULL, tzp) != 0)
    1670+
    ereport(ERROR,
    1671+
    (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
    1672+
    errmsg("timestamp out of range")));
    1673+
    1674+
    zone_offset = DetermineTimeZoneOffset(&tm, tzp);
    1675+
    *isdst = tm.tm_isdst;
    1676+
    return zone_offset;
    16331677
    }
    16341678

    16351679

    16361680
    /* DetermineTimeZoneAbbrevOffsetInternal()
    16371681
    *
    16381682
    * Workhorse for above two functions: work from a pg_time_t probe instant.
    1639-
    * DST status is returned into *isdst.
    1683+
    * On success, return GMT offset and DST status into *offset and *isdst.
    16401684
    */
    1641-
    static int
    1642-
    DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t, const char *abbr,
    1643-
    pg_tz *tzp, int *isdst)
    1685+
    static bool
    1686+
    DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t, const char *abbr, pg_tz *tzp,
    1687+
    int *offset, int *isdst)
    16441688
    {
    16451689
    char upabbr[TZ_STRLEN_MAX + 1];
    16461690
    unsigned char *p;
    @@ -1652,18 +1696,17 @@ DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t, const char *abbr,
    16521696
    *p = pg_toupper(*p);
    16531697

    16541698
    /* Look up the abbrev's meaning at this time in this zone */
    1655-
    if (!pg_interpret_timezone_abbrev(upabbr,
    1656-
    &t,
    1657-
    &gmtoff,
    1658-
    isdst,
    1659-
    tzp))
    1660-
    ereport(ERROR,
    1661-
    (errcode(ERRCODE_CONFIG_FILE_ERROR),
    1662-
    errmsg("time zone abbreviation \"%s\" is not used in time zone \"%s\"",
    1663-
    abbr, pg_get_timezone_name(tzp))));
    1664-
    1665-
    /* Change sign to agree with DetermineTimeZoneOffset() */
    1666-
    return (int) -gmtoff;
    1699+
    if (pg_interpret_timezone_abbrev(upabbr,
    1700+
    &t,
    1701+
    &gmtoff,
    1702+
    isdst,
    1703+
    tzp))
    1704+
    {
    1705+
    /* Change sign to agree with DetermineTimeZoneOffset() */
    1706+
    *offset = (int) -gmtoff;
    1707+
    return true;
    1708+
    }
    1709+
    return false;
    16671710
    }
    16681711

    16691712

    src/test/regress/expected/timestamptz.out

    Lines changed: 20 additions & 0 deletions
    Original file line numberDiff line numberDiff line change
    @@ -2345,3 +2345,23 @@ SELECT '2007-12-09 07:30:00 UTC'::timestamptz AT TIME ZONE 'VET';
    23452345
    Sun Dec 09 03:00:00 2007
    23462346
    (1 row)
    23472347

    2348+
    --
    2349+
    -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
    2350+
    -- more-or-less working. We can't test their contents in any great detail
    2351+
    -- without the outputs changing anytime IANA updates the underlying data,
    2352+
    -- but it seems reasonable to expect at least one entry per major meridian.
    2353+
    -- (At the time of writing, the actual counts are around 38 because of
    2354+
    -- zones using fractional GMT offsets, so this is a pretty loose test.)
    2355+
    --
    2356+
    select count(distinct utc_offset) >= 24 as ok from pg_timezone_names;
    2357+
    ok
    2358+
    ----
    2359+
    t
    2360+
    (1 row)
    2361+
    2362+
    select count(distinct utc_offset) >= 24 as ok from pg_timezone_abbrevs;
    2363+
    ok
    2364+
    ----
    2365+
    t
    2366+
    (1 row)
    2367+

    src/test/regress/sql/timestamptz.sql

    Lines changed: 11 additions & 0 deletions
    Original file line numberDiff line numberDiff line change
    @@ -379,3 +379,14 @@ SELECT '2007-12-09 07:00:00 UTC'::timestamptz AT TIME ZONE 'VET';
    379379
    SELECT '2007-12-09 07:00:01 UTC'::timestamptz AT TIME ZONE 'VET';
    380380
    SELECT '2007-12-09 07:29:59 UTC'::timestamptz AT TIME ZONE 'VET';
    381381
    SELECT '2007-12-09 07:30:00 UTC'::timestamptz AT TIME ZONE 'VET';
    382+
    383+
    --
    384+
    -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
    385+
    -- more-or-less working. We can't test their contents in any great detail
    386+
    -- without the outputs changing anytime IANA updates the underlying data,
    387+
    -- but it seems reasonable to expect at least one entry per major meridian.
    388+
    -- (At the time of writing, the actual counts are around 38 because of
    389+
    -- zones using fractional GMT offsets, so this is a pretty loose test.)
    390+
    --
    391+
    select count(distinct utc_offset) >= 24 as ok from pg_timezone_names;
    392+
    select count(distinct utc_offset) >= 24 as ok from pg_timezone_abbrevs;

    0 commit comments

    Comments
     (0)
    0