E524 Fix for ForEach-Object -Parallel perf problem with many runspaces by PaulHigin · Pull Request #10455 · PowerShell/PowerShell · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@PaulHigin
Copy link
Contributor

PR Summary

This is a fix for the ForEach-Object perf Issue: #10450.

measure-command { 1..254 | foreach -Parallel {ping "192.168.0.$_" -n 1 | where {$_ -match "ttl="}} -ThrottleLimit 300 }

# Time for ThreadJob and ForEach -Parallel -AsJob is around 8 seconds
# Time for ForEach -Parallel is around 60 seconds

PR Context

The problem was that the number of concurrent runspaces was being limited to around 40-50, even though the ThrottleLimit was set to 300. It turns out it was the Cmdlet input processing thread that was causing the bottleneck. I don't know what the conflict is while spinning up runspaces, but the fix is to now do this on a dedicated thread similar to how it currently works for -AsJob.

PR Checklist

@adityapatwardhan
Copy link
Member

@PaulHigin

The change is causing a lot memory allocations as compared to before.

Scenario: measure-command { 1..10kb | ForEach-Object -Parallel { }}

Before change: 105 MB in 2.6 minutes
After change: 10,200 MB in 2.6 minutes

@daxian-dbw daxian-dbw dismissed adityapatwardhan’s stale review August 30, 2019 23:31

Dismiss Aditya's review given his new finding regarding the PR.

@PaulHigin
Copy link
Contributor Author

@adityapatwardhan
I am not too surprised at this, as this change increases the processing rate of creating/running scripts. Even though the default Throttle limit is 5, since the script block is empty the code is basically creating and disposing threads/runspaces as fast as it can, 10000 times. So I am not surprised that memory consumption is large.

As long as the CLR eventually collects the memory, I think we are Ok.

I have run tests that loop for hours, detecting handle and working set usage. So far I see the usage plateau after a while so there seems to be no resource leak. However, the CLR and GC is getting a lot of exercise!

@vexx32
Copy link
Collaborator
vexx32 commented Sep 4, 2019

Speed is awesome, but I don't think most folks will agree that... what, 10 gigabytes(?) of RAM usage for a task that is literally doing no work is something they'll want to work with regularly if at all.

Are currently available alternatives this excessive in terms of memory usage? (e.g., PoshRSJobs?)

@PaulHigin
Copy link
Contributor Author

@vexx32

measure-command { 1..10kb | ForEach-Object -Parallel { }}

The code is useless and is essentially creating 100s of thousands of managed objects and releasing them as fast as possible. It is no surprise that memory usage peaks until GC can deal with it. This is more about managed code and the CLR than about PowerShell. You can do the same thing by writing silly C# code.

@adityapatwardhan adityapatwardhan added the CL-Performance Indicates that a PR should be marked as a performance improvement in the Change Log label Sep 5, 2019
@adityapatwardhan adityapatwardhan merged commit cfdbd71 into PowerShell:master Sep 5, 2019
@TravisEz13 TravisEz13 added this to the 7.0.0-preview.4 milestone Sep 5, 2019
TravisEz13 pushed a commit to TravisEz13/PowerShell that referenced this pull request Sep 14, 2019
@ghost
Copy link
ghost commented Sep 19, 2019

🎉v7.0.0-preview.4 has been released which incorporates this pull request.:tada:

Handy links:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CL-Performance Indicates that a PR should be marked as a performance improvement in the Change Log

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

0