8000 Support timezone abbreviations that sometimes change. · c2j/postgres@2784b68 · GitHub
[go: up one dir, main page]

Skip to content

Commit 2784b68

Browse files
committed
Support timezone abbreviations that sometimes change.
Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06d et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
1 parent b4d45f0 commit 2784b68

File tree

28 files changed

+2003
-649
lines changed

28 files changed

+2003
-649
lines changed

contrib/btree_gist/btree_ts.c

Lines changed: 3 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -210,27 +210,11 @@ tstz_dist(PG_FUNCTION_ARGS)
210210
**************************************************/
211211

212212

213-
static Timestamp
213+
static inline Timestamp
214214
tstz_to_ts_gmt(TimestampTz ts)
215215
{
216-
Timestamp gmt;
217-
int val,
218-
tz;
219-
220-
gmt = ts;
221-
DecodeSpecial(0, "gmt", &val);
222-
223-
if (ts < DT_NOEND && ts > DT_NOBEGIN)
224-
{
225-
tz = val * 60;
226-
227-
#ifdef HAVE_INT64_TIMESTAMP
228-
gmt -= (tz * INT64CONST(1000000));
229-
#else
230-
gmt -= tz;
231-
#endif
232-
}
233-
return gmt;
216+
/* No timezone correction is needed, since GMT is offset 0 by definition */
217+
return (Timestamp) ts;
234218
}
235219

236220

doc/src/sgml/config.sgml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -5001,9 +5001,9 @@ SET XML OPTION { DOCUMENT | CONTENT };
50015001
Sets the collection of time zone abbreviations that will be accepted
50025002
by the server for datetime input. The default is <literal>'Default'</>,
50035003
which is a collection that works in most of the world; there are
5004-
also <literal>'Australia'</literal> and <literal>'India'</literal>, and other collections can be defined
5005-
for a particular installation. See <xref
5006-
linkend="datetime-appendix"> for more information.
5004+
also <literal>'Australia'</literal> and <literal>'India'</literal>,
5005+
and other collections can be defined for a particular installation.
5006+
See <xref linkend="datetime-config-files"> for more information.
50075007
</para>
50085008
</listitem>
50095009
</varlistentry>

doc/src/sgml/datatype.sgml

Lines changed: 28 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -2288,7 +2288,7 @@ January 8 04:05:06 1999 PST
22882288
but continue to be prone to arbitrary changes, particularly with
22892289
respect to daylight-savings rules.
22902290
<productname>PostgreSQL</productname> uses the widely-used
2291-
<literal>zoneinfo</> time zone database for information about
2291+
IANA (Olson) time zone database for information about
22922292
historical time zone rules. For times in the future, the assumption
22932293
is that the latest known rules for a given time zone will
22942294
continue to be observed indefinitely far into the future.
@@ -2353,8 +2353,8 @@ January 8 04:05:06 1999 PST
23532353
The recognized time zone names are listed in the
23542354
<literal>pg_timezone_names</literal> view (see <xref
23552355
linkend="view-pg-timezone-names">).
2356-
<productname>PostgreSQL</productname> uses the widely-used
2357-
<literal>zoneinfo</> time zone data for this purpose, so the same
2356+
<productname>PostgreSQL</productname> uses the widely-used IANA
2357+
time zone data for this purpose, so the same time zone
23582358
names are also recognized by much other software.
23592359
</para>
23602360
</listitem>
@@ -2387,7 +2387,7 @@ January 8 04:05:06 1999 PST
23872387
be functionally equivalent to United States East Coast time. When a
23882388
daylight-savings zone name is present, it is assumed to be used
23892389
according to the same daylight-savings transition rules used in the
2390-
<literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
2390+
IANA time zone database's <filename>posixrules</> entry.
23912391
In a standard <productname>PostgreSQL</productname> installation,
23922392
<filename>posixrules</> is the same as <literal>US/Eastern</>, so
23932393
that POSIX-style time zone specifications follow USA daylight-savings
@@ -2398,9 +2398,25 @@ January 8 04:05:06 1999 PST
23982398
</itemizedlist>
23992399

24002400
In short, this is the difference between abbreviations
2401-
and full names: abbreviations always represent a fixed offset from
2402-
UTC, whereas most of the full names imply a local daylight-savings time
2403-
rule, and so have two possible UTC offsets.
2401+
and full names: abbreviations represent a specific offset from UTC,
2402+
whereas many of the full names imply a local daylight-savings time
2403+
rule, and so have two possible UTC offsets. As an example,
2404+
<literal>2014-06-04 12:00 America/New_York</> represents noon local
2405+
time in New York, which for this particular date was Eastern Daylight
2406+
Time (UTC-4). So <literal>2014-06-04 12:00 EDT</> specifies that
2407+
same time instant. But <literal>2014-06-04 12:00 EST</> specifies
2408+
noon Eastern Standard Time (UTC-5), regardless of whether daylight
2409+
savings was nominally in effect on that date.
2410+
</para>
2411+
2412+
<para>
2413+
To complicate matters, some jurisdictions have used the same timezone
2414+
abbreviation to mean different UTC offsets at different times; for
2415+
example, in Moscow <literal>MSK</> has meant UTC+3 in some years and
2416+
UTC+4 in others. <application>PostgreSQL</> interprets such
2417+
abbreviations according to whatever they meant (or had most recently
2418+
meant) on the specified date; but, as with the <literal>EST</> example
2419+
above, this is not necessarily the same as local civil time on that date.
24042420
</para>
24052421

24062422
<para>
@@ -2417,13 +2433,14 @@ January 8 04:05:06 1999 PST
24172433
</para>
24182434

24192435
<para>
2420-
In all cases, timezone names are recognized case-insensitively.
2421-
(This is a change from <productname>PostgreSQL</productname> versions
2422-
prior to 8.2, which were case-sensitive in some contexts but not others.)
2436+
In all cases, timezone names and abbreviations are recognized
2437+
case-insensitively. (This is a change from <productname>PostgreSQL</>
2438+
versions prior to 8.2, which were case-sensitive in some contexts but
2439+
not others.)
24232440
</para>
24242441

24252442
<para>
2426-
Neither full names nor abbreviations are hard-wired into the server;
2443+
Neither timezone names nor abbreviations are hard-wired into the server;
24272444
they are obtained from configuration files stored under
24282445
<filename>.../share/timezone/</> and <filename>.../share/timezonesets/</>
24292446
of the installation directory

doc/src/sgml/datetime.sgml

Lines changed: 20 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -374,22 +374,27 @@
374374
these formats:
375375

376376
<synopsis>
377-
<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable>
378-
<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable> D
377+
<replaceable>zone_abbreviation</replaceable> <replaceable>offset</replaceable>
378+
<replaceable>zone_abbreviation</replaceable> <replaceable>offset</replaceable> D
379+
<replaceable>zone_abbreviation</replaceable> <replaceable>time_zone_name</replaceable>
379380
@INCLUDE <replaceable>file_name</replaceable>
380381
@OVERRIDE
381382
</synopsis>
382383
</para>
383384

384385
<para>
385-
A <replaceable>time_zone_name</replaceable> is just the abbreviation
386-
being defined. The <replaceable>offset</replaceable> is the zone's
386+
A <replaceable>zone_abbreviation</replaceable> is just the abbreviation
387+
being defined. The <replaceable>offset</replaceable> is the equivalent
387388
offset in seconds from UTC, positive being east from Greenwich and
388389
negative being west. For example, -18000 would be five hours west
389390
of Greenwich, or North American east coast standard time. <literal>D</>
390-
indicates that the zone name represents local daylight-savings time
391-
rather than standard time. Since all known time zone offsets are on
392-
15 minute boundaries, the number of seconds has to be a multiple of 900.
391+
indicates that the zone name represents local daylight-savings time rather
392+
than standard time. Alternatively, a <replaceable>time_zone_name</> can
393+
be given, in which case that time zone definition is consulted, and the
394+
abbreviation's meaning in that zone is used. This alternative is
395+
recommended only for abbreviations whose meaning has historically varied,
396+
as looking up the meaning is noticeably more expensive than just using
397+
a fixed integer value.
393398
</para>
394399

395400
<para>
@@ -400,24 +405,24 @@
400405

401406
<para>
402407
The <literal>@OVERRIDE</> syntax indicates that subsequent entries in the
403-
file can override previous entries (i.e., entries obtained from included
404-
files). Without this, conflicting definitions of the same timezone
405-
abbreviation are considered an error.
408+
file can override previous entries (typically, entries obtained from
409+
included files). Without this, conflicting definitions of the same
410+
timezone abbreviation are considered an error.
406411
</para>
407412

408413
<para>
409414
In an unmodified installation, the file <filename>Default</> contains
410415
all the non-conflicting time zone abbreviations for most of the world.
411416
Additional files <filename>Australia</> and <filename>India</> are
412417
provided for those regions: these files first include the
413-
<literal>Default</> file and then add or modify timezones as needed.
418+
<literal>Default</> file and then add or modify abbreviations as needed.
414419
</para>
415420

416421
<para>
417422
For reference purposes, a standard installation also contains files
418423
<filename>Africa.txt</>, <filename>America.txt</>, etc, containing
419424
information about every time zone abbreviation known to be in use
420-
according to the <literal>zoneinfo</> timezone database. The zone name
425+
according to the IANA timezone database. The zone name
421426
definitions found in these files can be copied and pasted into a custom
422427
configuration file as needed. Note that these files cannot be directly
423428
referenced as <varname>timezone_abbreviations</> settings, because of
@@ -426,9 +431,9 @@
426431

427432
<note>
428433
<para>
429-
If an error occurs while reading the time zone data sets, no new value is
430-
applied but the old set is kept. If the error occurs while starting the
431-
database, startup fails.
434+
If an error occurs while reading the time zone abbreviation set, no new
435+
value is applied and the old set is kept. If the error occurs while
436+
starting the database, startup fails.
432437
</para>
433438
</note>
434439

doc/src/sgml/installation.sgml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1066,7 +1066,7 @@ su - postgres
10661066
<para>
10671067
<productname>PostgreSQL</> includes its own time zone database,
10681068
which it requires for date and time operations. This time zone
1069-
database is in fact compatible with the <quote>zoneinfo</> time zone
1069+
database is in fact compatible with the IANA time zone
10701070
database provided by many operating systems such as FreeBSD,
10711071
Linux, and Solaris, so it would be redundant to install it again.
10721072
When this option is used, the system-supplied time zone database

src/backend/utils/adt/date.c

Lines changed: 23 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -2596,24 +2596,39 @@ timetz_zone(PG_FUNCTION_ARGS)
25962596
pg_tz *tzp;
25972597

25982598
/*
2599-
* Look up the requested timezone. First we look in the date token table
2600-
* (to handle cases like "EST"), and if that fails, we look in the
2601-
* timezone database (to handle cases like "America/New_York"). (This
2602-
* matches the order in which timestamp input checks the cases; it's
2603-
* important because the timezone database unwisely uses a few zone names
2604-
* that are identical to offset abbreviations.)
2599+
* Look up the requested timezone. First we look in the timezone
2600+
* abbreviation table (to handle cases like "EST"), and if that fails, we
2601+
* look in the timezone database (to handle cases like
2602+
* "America/New_York"). (This matches the order in which timestamp input
2603+
* checks the cases; it's important because the timezone database unwisely
2604+
* uses a few zone names that are identical to offset abbreviations.)
26052605
*/
26062606
text_to_cstring_buffer(zone, tzname, sizeof(tzname));
2607+
2608+
/* DecodeTimezoneAbbrev requires lowercase input */
26072609
lowzone = downcase_truncate_identifier(tzname,
26082610
strlen(tzname),
26092611
false);
26102612

2611-
type = DecodeSpecial(0, lowzone, &val);
2613+
type = DecodeTimezoneAbbrev(0, lowzone, &val, &tzp);
26122614

26132615
if (type == TZ || type == DTZ)
2614-
tz = val * MINS_PER_HOUR;
2616+
{
2617+
/* fixed-offset abbreviation */
2618+
tz = -val;
2619+
}
2620+
else if (type == DYNTZ)
2621+
{
2622+
/* dynamic-offset abbreviation, resolve using current time */
2623+
pg_time_t now = (pg_time_t) time(NULL);
2624+
struct pg_tm *tm;
2625+
2626+
tm = pg_localtime(&now, tzp);
2627+
tz = DetermineTimeZoneAbbrevOffset(tm, tzname, tzp);
2628+
}
26152629
else
26162630
{
2631+
/* try it as a full zone name */
26172632
tzp = pg_tzset(tzname);
26182633
if (tzp)
26192634
{

0 commit comments

Comments
 (0)
0