[Serializer] Prevent `Cannot traverse an already closed generator` error by materializing Traversable input by santysisi · Pull Request #60260 · symfony/symfony · GitHub

[Serializer] Prevent Cannot traverse an already closed generator error by materializing Traversable input #60260

Merged · 1 commit into symfony:6.4 on May 12, 2025

Conversation

santysisi (Contributor)

Q              A
-------------  ----------
Branch?        6.4
Bug fix?       yes
New feature?   no
Deprecations?  no
Issues         Fix #60141
License        MIT

✅ Pull Request description:

This PR addresses the issue reported in the linked ticket, specifically the error `Cannot traverse an already closed generator`.

The fix involves converting Traversable input into an array using iterator_to_array($data), in order to allow multiple traversals of generator-based inputs.
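
As a minimal sketch of the failure mode and of the fix pattern (the row data below is illustrative, not taken from the PR):

// A generator can be traversed only once; a second foreach over it throws
// "Cannot traverse an already closed generator".
$rows = (static function (): \Generator {
    yield ['name' => 'Alice', 'age' => 30];
    yield ['name' => 'Bob', 'age' => 25];
})();

// The fix pattern: materialize Traversable input into an array up front.
if ($rows instanceof \Traversable) {
    $rows = iterator_to_array($rows);
}

// Both passes now succeed; without the materialization the second one would throw.
foreach ($rows as $row) { /* e.g. detect the collection shape */ }
foreach ($rows as $row) { /* e.g. flatten and write each row */ }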

At first glance, it might seem that this approach significantly increases memory usage, since all generator values are stored in memory. However, this is not the case. In the current logic, the following loop already forces all values from the generator into memory:

// Flattening loop in the CSV encoder's encode() method: each row is flattened
// and written back into $data, so every value ends up held in memory here
// regardless of the fix.
foreach ($data as &$value) {
    $flattened = [];
    $this->flatten($value, $flattened, $keySeparator, '', $escapeFormulas);
    $value = $flattened;
}

Therefore, materializing the generator with iterator_to_array() does not increase peak memory usage in any meaningful way.
As an example, here's the comparison of peak memory usage (measured with memory_get_peak_usage) when processing an array of 1 million integers:

  • With the fix: 90,044,272 bytes
  • Without the fix: 89,936,680 bytes

The difference is negligible, confirming that the fix does not introduce a meaningful memory cost.
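
A comparable measurement can be reproduced with a small harness along these lines (the row shape and the exact figures are assumptions and will vary with the PHP version and the data):

use Symfony\Component\Serializer\Encoder\CsvEncoder;

// One million rows produced lazily by a generator.
$rows = (static function (): \Generator {
    for ($i = 0; $i < 1_000_000; ++$i) {
        yield ['value' => $i];
    }
})();

(new CsvEncoder())->encode($rows, 'csv');

echo memory_get_peak_usage(), " bytes\n";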

@santysisi requested a review from dunglas as a code owner (Apr 23, 2025 22:07)
@carsonbot added this to the 6.4 milestone (Apr 23, 2025)
@carsonbot changed the title to [Serializer] Fix: prevent "Cannot traverse an already closed generator" error by materializing Traversable input (Apr 24, 2025)
@nicolas-grekas changed the title to [Serializer] Prevent "Cannot traverse an already closed generator" error by materializing Traversable input (Apr 24, 2025)
@mtarld (Contributor) commented Apr 30, 2025

Hey @santysisi, thanks for this PR!

Indeed, I'm a bit worried about the memory usage. Could you give it a try with more complex, fully populated rows?

At least, I think this behavior should be opt-in (using a context option), so that applications that need to be really memory efficient keep working properly.
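
For illustration only, such an opt-in could be exposed as an encoder context option; the key name below is invented and is not an actual CsvEncoder option:

use Symfony\Component\Serializer\Encoder\CsvEncoder;

$rows = (static function (): \Generator {
    yield ['foo' => 1];
})();

// 'materialize_traversable' is a hypothetical context key, shown only to
// illustrate the opt-in idea discussed above; the merged fix does not add it.
$csv = (new CsvEncoder())->encode($rows, 'csv', [
    'materialize_traversable' => true,
]);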

@OskarStark changed the title to [Serializer] Prevent `Cannot traverse an already closed generator` error by materializing Traversable input (Apr 30, 2025)
@santysisi (Contributor, Author)

Hi @mtarld, thanks for your comment 😄, I really appreciate it!

I agree with your concern, and I think it's a really good idea. Yes, I can give it a try with more complex, fully populated rows; I'd be happy to do that if you want! 😄
That said, I believe such a refactor should go into the 7.4 branch, since this PR is just a bug fix. Also, from what I can tell, the current fix doesn't significantly impact memory usage, but maybe I'm wrong.

@GromNaN (Member) left a comment

LGTM after some minor changes.

Quoting @mtarld:
> Indeed, I'm a bit worried about the memory usage. Could you give it a try with more complex, fully populated rows?
> At least, I think this behavior should be opt-in (using a context option), so that applications that need to be really memory efficient keep working properly.

@mtarld the $data is iterated recursively multiple times during encoding, so all the data must be in memory to find the column names.
If the iterator were able to re-generate each item on demand, memory usage could be lower, but the CPU or network would be stressed more.
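
A simplified illustration of that two-pass requirement (plain PHP, not the actual CsvEncoder code): the first pass collects the union of column names for the header, the second writes the rows, and it is the second traversal that breaks on an exhausted generator.

$rows = [
    ['name' => 'Alice', 'age' => 30],
    ['name' => 'Bob', 'city' => 'Paris'],
];

// First pass: discover every column name used across all rows.
$headers = [];
foreach ($rows as $row) {
    $headers += array_flip(array_keys($row));
}
$headers = array_keys($headers); // ['name', 'age', 'city']

// Second pass: write the header line and each row.
$handle = fopen('php://temp', 'r+');
fputcsv($handle, $headers);
foreach ($rows as $row) {
    fputcsv($handle, array_map(static fn ($h) => $row[$h] ?? '', $headers));
}
rewind($handle);
echo stream_get_contents($handle);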

@santysisi force-pushed the fix-csv-encoder-generator-traversal branch from c6d056d to a159cbd (May 11, 2025 22:13)
@mtarld (Contributor) left a comment

@GromNaN, indeed you're right, I had missed the column-name requirement. LGTM then, after a minor suggestion.

Review thread on the encoder test (diff context):

CSV, $this->encoder->encode($data, 'csv'));
}

public function provideIterable()
Suggested change:

- public function provideIterable()
+ public static function provideIterable(): iterable
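
With that suggestion applied, the provider and a consuming test could look roughly like this (class and method names are illustrative, not the exact test from the PR):

use PHPUnit\Framework\TestCase;
use Symfony\Component\Serializer\Encoder\CsvEncoder;

class CsvEncoderIterableTest extends TestCase
{
    /**
     * @dataProvider provideIterable
     */
    public function testEncodeIterable(iterable $data)
    {
        $encoded = (new CsvEncoder())->encode($data, 'csv');

        // The header line shows that both traversals (column discovery and
        // row writing) succeeded, even when $data is a generator.
        $this->assertStringContainsString('foo,bar', $encoded);
    }

    public static function provideIterable(): iterable
    {
        $rows = [
            ['foo' => 'hello', 'bar' => 'hey ho'],
            ['foo' => 'hi', 'bar' => 'let\'s go'],
        ];

        yield 'array' => [$rows];
        yield 'generator' => [(static function () use ($rows): \Generator {
            yield from $rows;
        })()];
    }
}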

@fabpot force-pushed the fix-csv-encoder-generator-traversal branch from a159cbd to c7206a9 (May 12, 2025 08:02)
@fabpot (Member) commented May 12, 2025

Thank you @santysisi.

@fabpot merged commit 1fb9d17 into symfony:6.4 on May 12, 2025
9 of 11 checks passed