8000 Verify bytes written in Filesystem::dumpFile() by lstrojny · Pull Request #36952 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

Verify bytes written in Filesystem::dumpFile() #36952

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Verify bytes written in Filesystem::dumpFile() #36952

wants to merge 4 commits into from

Conversation

lstrojny
Copy link
Contributor
Q A
Branch? 3.4
Bug fix? yes
New feature? no
Deprecations? no
Tickets -
License MIT
Doc PR

This PR fixes a problem where dumpFile() can write some bytes of a file but not all.

@nicolas-grekas nicolas-grekas added this to the 3.4 milestone May 28, 2020
if (\is_resource($content)) {
$expectedSize = fstat($content)['size'] - ftell($content);
} else {
$expectedSize = array_sum(array_map('strlen', (array) $content));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't want to add support for arrays - they worked "by chance" but that's not supported

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are not dealing with arrays in 3.4 it would break the current behavior – the one that is only explicitly deprecated in 4.0. That doesn’t seem the right thing to do.

if (false === @file_put_contents($tmpFile, $content)) {
throw new IOException(sprintf('Failed to write file "%s".', $filename), 0, null, $filename);
if (\is_resource($content)) {
$expectedSize = fstat($content)['size'] - ftell($content);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really portable? I think some streams cannot be stat'ed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular you have in mind so we can add test cases? The stream interface requires implementing stat, don't know if there are some that don’t do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think e.g. the http stream wrapper when the remote server uses chunked encoding and doesn't return a content-length. Sure, stats would return a size, but a wrong one for the use case here.
The only way would be to buffer the full response in memory then write it, but that's break streaming of resources...

@nicolas-grekas
Copy link
Member

I fear we might end up in a dead-end here, see my comments. Did you encounter any issue related to this or is it a pure safety measure?
What about returning the number of written bytes, if one cares about it?

@lstrojny
Copy link
Contributor Author
lstrojny commented Jun 8, 2020

I fear we might end up in a dead-end here, see my comments. Did you encounter any issue related to this or is it a pure safety measure?

This happens whenever you hit a quota limit or a disk is full. As soon as that happens, Symfony’s FS will write incomplete files. That’s why projects like https://github.com/webimpress/safe-writer exist, as it’s unfortunately non-trivial to correctly write a file and care for all the edge cases.

If we cannot get the stream support correct, I wonder if it wouldn't be better to deprecate stream support over doubling down on incorrect writes. Basically following the precedence with the accidental array support.

@nicolas-grekas
Copy link
Member
nicolas-grekas commented Jun 8, 2020

What's the behavior of file_put_contents() when this situation happens (and a stream is provided?) Could it return false instead of an int?
We can always bypass file_put_contents() and do fread+fwrite in a loop.

@lstrojny
Copy link
Contributor Author
lstrojny commented Jun 9, 2020

What's the behavior of file_put_contents() when this situation happens (and a stream is provided?) Could it return false instead of an int?
We can always bypass file_put_contents() and do fread+fwrite in a loop.

Added a test case + the necessary fixes to handle https streams. If stat does not work, we just expect !== false for file_put_contents() now.


if (\is_resource($content)) {
$stat = fstat($content);
if (false !== $stat) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that's not generic enough: a userland stream wrapper could very well decode to return 0 ou -1 here. The only safe way would be to stream "manually".
BUT I would really like to know how PHP behaves in out of space situations here.
Could you give it a try? Mounting a loopback filesystem could be a way to try maybe? (I don't know how better than Google sorry).

Copy link
Contributor Author
@lstrojny lstrojny Jun 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to check again! Here is an example using a loopback device:

Warning: file_put_contents(): Only 401408 of 1048576 bytes written, possibly out of free disk space in Command line code on line 1
bool(false)

It even leaves behind a file:

 ls -lh another-file
-rw-r--r-- 1 vagrant vagrant 392K Jun  9 19:14 another-file

If this is indeed cross-platform behavior than I misdiagnosed the issue here – in fact there wouldn't be an issue. The tempfile couldn’t be fully written and file_put_contents() returns false, the exception is thrown, everything is fine.

Setup for the loopback device:

# Create loopback image file with 100x 10K blocks
$ dd if=/dev/zero of=limited.img bs=10K count=100
# Initiate the image as a loopback device
$ losetup -fP limited.img
# Create ext4 filesystem on the loopback image
$ mkfs.ext4 limited.img
# Create mountpoint
$ mkdir limited-mountpoint
# Mount the loopback device. Check that loop0 is indeed your device with losetup --list
$ sudo mount -o loop /dev/loop0 limited-mountpoint
# Fill the disk
dd if=/dev/zero of=limited-mountpoint/file bs=10K count=50
# Try file_put_contents()
php -ddisplay_errors=1 -r 'var_dump(file_put_contents("limited-mountpoint/another-file", str_repeat("a", 1024 * 1024)));'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for double-checking. This means we can close as everything is fine: an error will happen as expected, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that behavior is consistent across platform, yes. I don’t have access to Windows nor would I know how to simulate this. Do you have anybody in mind?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense that this behaves the same on Windows.
I'm not able to verify. But let's close until someone proves there is something to do.
Thanks for having a look!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0