Chapter 14
pandas provides sophisticated tools for creating and manipulating dates such as the Timestamp object, and is
the preferred method for working with dates. Section 16.3.1 builds on the content of this chapter and shows
how pandas is used with dates.
Date and time manipulation is provided by a built-in Python module datetime. This chapter assumes that
datetime has been imported using import datetime as dt.
Dates created using date do not allow times. Dates which require a time stamp can be created using
datetime, which combines the inputs from date and time, in the same order.
>>> dt.datetime(yr, mo, dd, hr, mm, ss, ms)
datetime.datetime(2012, 12, 21, 12, 21, 12, 21)
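The construction above can be sketched in a self-contained form. The names yr, mo, dd, hr, mm, ss and ms follow the example; the values assigned to them here are assumptions chosen to reproduce the output shown. Note that the final input to time and datetime is measured in microseconds.

```python
import datetime as dt

# Assumed values, chosen to match the output shown above
yr, mo, dd = 2012, 12, 21
hr, mm, ss, ms = 12, 21, 12, 21  # the last input is in microseconds

d = dt.date(yr, mo, dd)          # date only, no time component
t = dt.time(hr, mm, ss, ms)      # time only
d1 = dt.datetime(yr, mo, dd, hr, mm, ss, ms)

# datetime takes the date inputs first, then the time inputs
same = d1 == dt.datetime.combine(d, t)
print(repr(d1), same)
```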
>>> d2 + dt.timedelta(30,0,0)
datetime.datetime(2014, 1, 20, 12, 21, 12, 20)
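A minimal sketch of date arithmetic with timedelta. The starting value of d2 is an assumption, chosen so that adding 30 days reproduces the output shown above. The positional inputs of timedelta are days, seconds and microseconds, in that order.

```python
import datetime as dt

# Assumed starting value, chosen so that adding 30 days gives the output above
d2 = dt.datetime(2013, 12, 21, 12, 21, 12, 20)

# timedelta's positional inputs are (days, seconds, microseconds)
step = dt.timedelta(30, 0, 0)
later = d2 + step

# Differencing two datetimes returns a timedelta
gap = later - d2
print(later, gap.days, gap.total_seconds())
```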
If time stamps are important, date types can be promoted to datetime using combine and a time.
>>> d3 = dt.date(2012,12,21)
>>> dt.datetime.combine(d3, dt.time(0))
datetime.datetime(2012, 12, 21, 0, 0)
Values in dates, times and datetimes can be modified using replace through keyword arguments.
>>> d3 = dt.datetime(2012,12,21,12,21,12,21)
>>> d3.replace(month=11,day=10,hour=9,minute=8,second=7,microsecond=6)
datetime.datetime(2012, 11, 10, 9, 8, 7, 6)
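Two properties of replace are worth checking with a short sketch: it returns a new object rather than modifying the original, and it raises ValueError if the result is not a valid date. The name d4 and the flag invalid_rejected are illustrative only.

```python
import datetime as dt

d3 = dt.datetime(2012, 12, 21, 12, 21, 12, 21)
d4 = d3.replace(month=11, day=10)  # returns a new datetime

# d3 is unchanged because datetime objects are immutable
print(d3.month, d4.month)

# replace validates the result: November has no 31st day
try:
    dt.date(2012, 12, 31).replace(month=11)
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
print(invalid_rejected)
```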
>>> d.strftime("%c")
'Fri Dec 31 23:59:59 1999'
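The %c code used above produces a locale-dependent representation, so its output can differ across systems. A small sketch using only numeric format codes, which do not depend on the locale, together with strptime to reverse the conversion. The value of d is an assumption inferred from the output shown above.

```python
import datetime as dt

# Assumed value of d, inferred from the output shown above
d = dt.datetime(1999, 12, 31, 23, 59, 59)

# Numeric format codes do not depend on the locale, unlike %c
s = d.strftime("%Y-%m-%d %H:%M:%S")
print(s)

# strptime parses a string back into a datetime using the same codes
parsed = dt.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
print(parsed == d)
```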
>>> d.strftime("Prince's favorite day is %x, and his favorite year is %Y")
"Prince's favorite day is 12/31/99, and his favorite year is 1999"
Date Unit  Common Name  Range                 Time Unit  Common Name  Range
Y          Year         ±9.2 × 10^18 years    h          Hour         ±1.0 × 10^15 years
M          Month        ±7.6 × 10^17 years    m          Minute       ±1.7 × 10^13 years
W          Week         ±1.7 × 10^17 years    s          Second       ±2.9 × 10^11 years
D          Day          ±2.5 × 10^16 years    ms         Millisecond  ±2.9 × 10^8 years
                                              us         Microsecond  ±2.9 × 10^5 years
                                              ns         Nanosecond   ±292 years
                                              ps         Picosecond   ±106 days
                                              fs         Femtosecond  ±2.6 hours
                                              as         Attosecond   ±9.2 seconds
Table 14.1: NumPy datetime64 range. The absolute range is January 1, 1970 plus the range.
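The ranges in the table follow from storing the value as a signed 64-bit integer count of the chosen unit, which allows roughly 2^63 steps on either side of January 1, 1970. A rough check of three entries, assuming an average year of 365.25 days (the constant names are illustrative):

```python
# Number of values on each side of zero for a signed 64-bit integer
LIMIT = 2 ** 63

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year length, an approximation

ns_years = LIMIT * 1e-9 / SECONDS_PER_YEAR   # nanosecond unit, in years
day_years = LIMIT / 365.25                   # day unit, in years
as_seconds = LIMIT * 1e-18                   # attosecond unit, in seconds

print(round(ns_years), f"{day_years:.1e}", round(as_seconds, 1))
```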
14.4 NumPy
pandas provides a closely related format for dates and times known as a Timestamp, which should be preferred
in most cases to direct use of NumPy’s datetime64. See Section 16.3.1 for more information.
Version 1.7.0 of NumPy introduced a NumPy native date and time type known as datetime64 (to distinguish
it from the Python-provided datetime type). The NumPy date and time type is still maturing and is not
always fully supported in the scientific Python stack at the time of writing these notes. This said, it is already
widely used and should see complete support in the near future. Additionally, the native NumPy data type
is generally better suited to data storage and analysis and extends the Python date and time with additional
features such as business day functionality.
NumPy contains both date and time (datetime64) and time-difference (timedelta64) objects. These differ
from the standard Python datetime since they always store the date and time or time difference using a 64-bit
integer plus a date or time unit. The choice of the date/time unit affects both the resolution of the datetime64
as well as the permissible range. The unit directly determines the resolution - using a date unit of a day ('D')
limits the resolution to days. Using a date unit of a week ('W') will allow a minimum of 1 week difference.
Similarly, using a time unit of a second ('s') will allow resolution up to the second (but not millisecond). The
set of date and time units, and their ranges, are presented in Table 14.1.
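The effect of the unit on resolution can be illustrated with a short sketch. It assumes NumPy has been imported as np and uses astype to convert an existing value to a coarser unit. The variable names are illustrative.

```python
import numpy as np

stamp = np.datetime64('2013-09-01T12:34:56')  # stored with unit 's'

# Converting to a coarser unit discards the finer information
as_day = stamp.astype('datetime64[D]')
as_hour = stamp.astype('datetime64[h]')
print(as_day, as_hour)

# The smallest possible step is one unit, here one day
next_day = as_day + np.timedelta64(1, 'D')
print(next_day)
```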
NumPy datetime64s can be initialized using either human readable strings or using numeric values. The
string initialization is simple and datetime64s can be initialized using year only, year and month, the complete
date or the complete date including a time. When no unit is given, the unit is inferred from the form of the
string, and T is used to separate the time from the date.
>>> datetime64('2013')
numpy.datetime64('2013')
>>> datetime64('2013-09')
numpy.datetime64('2013-09')
>>> datetime64('2013-09-01')
numpy.datetime64('2013-09-01')
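The unit attached to each value can be inspected with np.datetime_data. A brief sketch, assuming NumPy is imported as np; the units listed are those reported by recent NumPy releases when the unit is left to be inferred from the string.

```python
import numpy as np

texts = ['2013', '2013-09', '2013-09-01',
         '2013-09-01T12:00', '2013-09-01T12:00:30']

units = []
for text in texts:
    value = np.datetime64(text)
    # datetime_data returns the unit and the number of units per step
    unit, count = np.datetime_data(value.dtype)
    units.append(unit)
    print(text, unit)
```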
Date or time units can be explicitly included as the second input. The final example shows that rounding can
occur if the date input is not exactly representable using the date unit chosen.
>>> datetime64('2013-01-01T00','h')
numpy.datetime64('2013-01-01T00:00+0000','h')
>>> datetime64('2013-01-01T00','s')
numpy.datetime64('2013-01-01T00:00:00+0000')
>>> datetime64('2013-01-01T00','ms')
numpy.datetime64('2013-01-01T00:00:00.000+0000')
>>> datetime64('2013-01-01','W')
numpy.datetime64('2012-12-27')
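The rounding in the week example arises because weeks are counted in 7-day blocks from January 1, 1970, which was a Thursday, so every value with unit 'W' falls on a Thursday. A sketch that reproduces the result using astype (assuming NumPy is imported as np; the variable names are illustrative):

```python
import numpy as np

day = np.datetime64('2013-01-01')          # a Tuesday
week = day.astype('datetime64[W]')         # rounds down to the start of the week unit
week_start = week.astype('datetime64[D]')  # same instant, expressed in days

print(week_start)        # the preceding Thursday
lost = day - week_start  # days discarded by the rounding
print(lost)
```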
>>> dates[0]
numpy.datetime64('2013-09-01')
Note that datetime64 is not timezone aware. For timezone support use pandas Timestamp.
Dates which are initialized using one of the shorter forms are initialized at the earliest date (and time) in
the period.
>>> datetime64('2013')==datetime64('2013-01-01')
True
>>> datetime64('2013-09')==datetime64('2013-09-01')
True
A corresponding time difference class, similarly named timedelta64, is created when dates are differenced.
>>> datetime64('2013-09-02') - datetime64('2013-09-01')
numpy.timedelta64(1,'D')
timedelta64 types contain two pieces of information, a number indicating the number of steps between the
two dates and the size of the step.
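Both pieces of information can be recovered from a timedelta64. In this sketch, which assumes NumPy is imported as np, the number of steps is obtained by converting to an integer and the step size is read from the dtype using np.datetime_data.

```python
import numpy as np

delta = np.datetime64('2013-09-02') - np.datetime64('2013-09-01')

steps = int(delta.astype(np.int64))          # number of steps
unit, count = np.datetime_data(delta.dtype)  # size of each step
print(steps, unit, count)

# Dividing by another timedelta64 re-expresses the difference in other units
hours = delta / np.timedelta64(1, 'h')
print(hours)
```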
14.5 Exercises
1. Use datetime.now to get the current time.
2. Construct the date 1 day, 1 week and 1 year from the time you get in 1 without directly entering it as a
datetime object.
3. Repeat 2 for initial dates February 28, 1999 and March 1, 1999.
(b) yyyy/mm/dd
i. HH:MM:SS LongMonthName dd, yyyy