8000 bpo-29654 : Support If-Modified-Since HTTP header (browser cache) by PierreQuentel · Pull Request #298 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

bpo-29654 : Support If-Modified-Since HTTP header (browser cache) #298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Apr 2, 2017
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
20cc898
Support If-Modified-Since HTTP headers, return 304 response if file w…
Feb 25, 2017
f1d8a48
Fix bug in datetime comparisons. Ignore If-Modified-Since if If-None-…
Feb 26, 2017
c4403ec
Update Misc/NEWS and What's New
Feb 26, 2017
3225080
Fix typo
Feb 26, 2017
415e4af
Use parsedate_to_datetime to extract datetime from If-Modified-Since,…
Feb 26, 2017
8595735
Change computing of dates used to test browser cache
Feb 26, 2017
2be2f8c
Remove microseconds from time of last modification with .replace(micr…
Feb 26, 2017
c3337f4
Put os.fstat() call inside the try block ; handle obsolete HTTP date …
Feb 26, 2017
143e1e9
Restore alphabetical order in section Improved Modules
Feb 26, 2017
42edfe3
Store last modification date in setUp ; split browser cache tests in …
Feb 26, 2017
1e68be9
Specify the exceptions to catch in parsedate_to_datetime
PierreQuentel Feb 27, 2017
fc20596
Use except/else for parsedate_to_datetime exception handling
PierreQuentel Feb 27, 2017
9436e45
Restore blank line
Feb 28, 2017
708ff03
Put imports in alphabetical order. Change version to 0.7.
Mar 1, 2017
6a58894
Presentation changes.
Mar 1, 2017
db46671
Change order of imports.
Mar 1, 2017
41f74ef
Update Misc/NEWS (conflict)
Mar 2, 2017
204c503
Merge branch 'master' into master
PierreQuentel Mar 2, 2017
e2fee49
Merge branch 'master' into master
PierreQuentel Mar 4, 2017
6dadf6e
Restore version number 0.6
Mar 7, 2017
c389bf6
Replace "datetime" by "time"
Mar 7, 2017
3cc4f6d
Merge branch 'master' of https://github.com/PierreQuentel/cpython
Mar 7, 2017
294e164
Proposal for an update of the http.server module documentation
Mar 7, 2017
6c5ba39
Merge branch 'master' into master
PierreQuentel Mar 8, 2017
1e0f3e3
Merge branch 'master' into master
PierreQuentel Mar 15, 2017
2503add
Merge branch 'master' into master
serhiy-storchaka Mar 20, 2017
36a8e2a
Merge branch 'master' into master
PierreQuentel Mar 23, 2017
c8b108f
Add a "Changed in version 3.7" comment
Mar 24, 2017
bd91346
Merge branch 'master' of https://github.com/PierreQuentel/cpython
Mar 24, 2017
d3238d2
Merge branch 'master' into master
PierreQuentel Apr 1, 2017
16f9e11
Remove trailing whitespaces
Apr 1, 2017
c669909
Merge branch 'master' of https://github.com/PierreQuentel/cpython
Apr 1, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Fix bug in datetime comparisons. Ignore If-Modified-Since if If-None-…
…Match is present. Add tests.
  • Loading branch information
Pierre Quentel committed Feb 26, 2017
commit f1d8a484dfa2812f57decb259f6bbcde72d59c05
22 changes: 13 additions & 9 deletions Lib/http/server.py
10000
Original file line number Diff line number Diff line change
Expand Up @@ -687,25 +687,30 @@ def send_head(self):
except OSError:
self.send_error(HTTPStatus.NOT_FOUND, "File not found")
return None

fs = os.fstat(f.fileno())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f is leaked if os.fstat() raises an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


try:
# Use browser cache if possible
if "If-Modified-Since" in self.headers:
if "If-Modified-Since" in self.headers \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is more preferable to use parenthesis rather than line continuation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

and not "If-None-Match" in self.headers:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done too

# compare If-Modified-Since and date of last file modification
fs = os.stat(path)
ims = email.utils.parsedate(self.headers["If-Modified-Since"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This email function accepts a timezone (e.g. from the documentation: -0500), and seems to ignore it. But it may make more sense for a HTTP server to either adjust the date according to the timezone (e.g. add 5 hours), or treat it as an error (required by RFC 7232).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks a problem to me that email.utils.parsedate() silently ignores timezone. Maybe use email.utils.parsedate_to_datetime() instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the latest commit uses parsedate_to_datetime() and checks that it has a timezone set to UTC, which should be the case for the header (cf RFC 7231). To compare with last modification time, both must be timezone-aware datetimes.

if ims is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can email.utils.parsedate_to_datetime() return None? If it can't, just use else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

email.utils.parsedate_to_datetime() either returns a datetime.datetime() or raises an exception, in which case ims is set to None.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just else:

except (TypeError, IndexError, OverflowError, ValueError):
    # ignore ill-formed values
    pass
else:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in commit fc20596

ims_datetime = datetime.datetime(*ims[:7])
ims_dtstring = ims_datetime.strftime("%d %b %Y %H:%M:%S")
last_modif = datetime.datetime.utcfromtimestamp(
fs.st_mtime).strftime("%d %b %Y %H:%M:%S")
if last_modif <= ims_dtstring:
# If-Modified-Since is UTC, rounded to the second
tzinfo = datetime.timezone(datetime.timedelta(hours=0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datetime.timezone.utc

But maybe use naive datetime objects?

ims_datetime = datetime.datetime(*ims[:7], tzinfo=tzinfo)
# compare to UTC datetime of last modification, also
# rounded to the second
mtime = int(fs.st_mtime)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“Int” rounds towards 1970. The actual rounding that “date_time_string” uses seems to be a combination of rounding half to even microseconds (thanks to datetime), and then rounding down (always backwards in time) to seconds.

It might be more sensible to call datetime.replace(microseconds=0).

last_modif = datetime.datetime.fromtimestamp(mtime, tzinfo)
if last_modif <= ims_datetime:
self.send_response(HTTPStatus.NOT_MODIFIED)
self.end_headers()
f.close()
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return None. If there are return statements with value, all None values should be returned explicitly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

self.send_response(HTTPStatus.OK)
self.send_header("Content-type", ctype)
fs = os.fstat(f.fileno())
self.send_header("Content-Length", str(fs[6]))
self.send_header("Last-Modified", self.date_time_string(fs.st_mtime))
self.end_headers()
Expand Down Expand Up @@ -1229,4 +1234,3 @@ def test(HandlerClass=BaseHTTPRequestHandler,
handler_class = SimpleHTTPRequestHandler
test(HandlerClass=handler_class, port=args.port, bind=args.bind)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't add this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit fc20596


30 changes: 28 additions & 2 deletions Lib/test/test_httpservers.py
Original file line number Diff line number Diff line change
Expand Up @@ -448,16 +448,42 @@ def test_browser_cache(self):
#constructs the path relative to the root directory of the HTTPServer
response = self.request(self.base_url + '/test')
self.check_status_and_reason(response, HTTPStatus.OK, data=self.data)
# get Last-Modified response heaer
last_modif = response.headers['Last-modified']

# send new request to the same url with request header
# If-Modified-Since
# If-Modified-Since set to Last-Modified : must return 304
from email.message import Message
headers = Message()
headers['If-Modified-Since'] = last_modif
response = self.request(self.base_url + '/test', headers=headers)
self.check_status_and_reason(response, HTTPStatus.NOT_MODIFIED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please repeat the test with an earlier date, verifying that a response body is returned.


# if If-None-Match header is present, ignore If-Modified-Since
headers['If-None-Match'] = "*"
response = self.request(self.base_url + '/test', headers=headers)
self.check_status_and_reason(response, HTTPStatus.OK)

# with If-Modified-Since earlier than Last-Modified, must return 200
import datetime
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import at module level. Importing inside functions is used in exceptional cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in commit 42edfe3

import email.utils
dt = email.utils.parsedate(last_modif)
# build datetime object : one year before last modification
old_dt = [dt[0] - 1] + list(dt[1:7])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the date reflects when the test suite is run, this will fail on a leap day (e.g. 2020-02-29). Perhaps try something like “datetime(*old_dt) - timedelta(days=365)”.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, I have modified the test.

old_dt = datetime.datetime(*old_dt)
headers = Message()
headers['If-Modified-Since'] = email.utils.format_datetime(old_dt)
response = self.request(self.base_url + '/test', headers=headers)
self.check_status_and_reason(response, HTTPStatus.OK)

# build datetime object : one hours after last modification
new_dt = [dt[0]] + [dt[1] + 1] + list(dt[2:7])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dt[1] is attribute tm_mon, but the comment suggests you want to adjust by one hour.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the implementation with timedelta(hours=1) instead

new_dt = datetime.datetime(*new_dt)
headers = Message()
headers['If-Modified-Since'] = email.utils.format_datetime(new_dt)
response = self.request(self.base_url + '/test', headers=headers)
self.check_status_and_reason(response, HTTPStatus.NOT_MODIFIED)


def test_invalid_requests(self):
response = self.request('/', method='FOO')
self.check_status_and_reason(response, HTTPStatus.NOT_IMPLEMENTED)
Expand Down
0