8000 Optimize console output in ConsoleHost by iSazonov · Pull Request #6882 · PowerShell/PowerShell · GitHub
[go: up one dir, main page]

Skip to content

Optimize con 8000 sole output in ConsoleHost#6882

Merged
daxian-dbw merged 18 commits intoPowerShell:masterfrom
iSazonov:optimize-console-output
Apr 17, 2019
Merged

Optimize console output in ConsoleHost#6882
daxian-dbw merged 18 commits intoPowerShell:masterfrom
iSazonov:optimize-console-output

Conversation

@iSazonov
Copy link
Collaborator
@iSazonov iSazonov commented May 16, 2018

PR Summary

  • Implement WriteLine method to write to a host. Win is that we exclude extra string allocations like value + Crlf and extra locks and kernel calls.
  • Exclude extra allocation for every char in console input (keyInfo.Character.ToString()) in ReadKey(ReadKeyOptions options).
  • Exclude extra allocations in the console host by implementation of WriteLine-like methods to replace Write(strValue + Crlf) with WriteLine(strValue).
  • Remove unneeded locks because of replacing Write(strValue); Write(Crlf) with WriteLine(strValue).
  • Add new lock in WriteToConsole(ConsoleColor foregroundColor, ConsoleColor backgroundColor, string text, bool newLine = false).
  • Use CoreFX pattern for P/Invoke to support ReadOnlySpan.

PR Checklist


This change is Reviewable

@iSazonov
Copy link
Collaborator Author

CI Appveyor temporary failed on Update-Help tests.

@iSazonov iSazonov closed this May 22, 2018
@iSazonov iSazonov reopened this May 22, 2018
@iSazonov
Copy link
Collaborator Author

Reopen the PR to restart CIs.

fixed (char* bufferPtr = &MemoryMarshal.GetReference(buffer))
{
return WriteConsole(consoleOutput, bufferPtr, numberOfCharsToWrite, out numberOfCharsWritten, reserved);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This obviously complexes the code. Actually, all changes here make code more complex. Do we have measurable perf improvement to justify the changes? I don't see a large amount of string manipulation here (please point it out if I miss them). Note that many string APIs and StringBuilder already received perf improvements in 2.1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like complement - I'm not sure I can create a complex code 😄

Exactly the pattern I found in CoreFX. This is due to C# language limitations.

Really the code isn't getting any more complicated. Rather on the contrary it becomes more complete.
By analyzing the code to clean up psl I found that in this place we make a lot of kernel calls and extra locks because of the multitude of:

WriteToConsole(text, true);
WriteToConsole(Crlf, true);

I was very surprised that we don't have a WriteLine() method to write to a host and use many allocations like value + Crlf - so I just implemented the method.
The only complicity of the code is that I had to modify three levels of interfaces. So really the PR is very simple.

Note that many string APIs and StringBuilder already received perf improvements in 2.1.

Yes, I saw up to 1100000 allocations in VS Profiler at pwsh.exe 6.0 startup and up to 850000 allocation at 6.1 (already on .Net 2.1) startup. Nevertheless, I still see a lot of allocations. The profile show that up to 36% allocations is chars and strings.
This shows that CoreCLR does not solve all allocation problems and we should continue to consume resources cautiously and to optimize our code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced whether the startup is the right scenario to measure for changes in this PR, but still, for the startup scenario, what's the improvement in object allocation with your changes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was only sample that 2.1 reduce allocations but not remove its at all. I don't know how measure wins in the PR - VS Profiler is crashed after first run. I look https://benchmarkdotnet.org/ but it takes a lot of effort. At least in the first steps.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attached a file with simple perf tests. I saw win ~2 second on 100000 outputs.
measure_conhost.txt

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need any of this code anymore? It was written back in V1 when the Console API was very limited. I suspect that we could remove most of the code with just a thin wrapper around the .NET APIs. (Which, in fact, were somewhat inspired by this code.)

Copy link
Collaborator Author
@iSazonov iSazonov Jul 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this code would have to be cleaned more thoroughly. The problem is that we do not have tests for the console code. Therefore, I would prefer to make small changes step by step so as not to break something. I intend to continue working in this direction (if only you do not chop it in one your PR).

@iSazonov iSazonov changed the title Exclude extra allocations in ConsoleHost Optimize console output in ConsoleHost Jun 6, 2018
@iSazonov
Copy link
Collaborator Author
iSazonov commented Jun 6, 2018

@daxian-dbw I changed PR title - seems old one confuse you that is main win in the PR.

@iSazonov
Copy link
Collaborator Author
iSazonov commented Jun 6, 2018

We could use https://benchmarkdotnet.org/ for perf analyses.

@daxian-dbw
Copy link
Member

@iSazonov I like the perf analysis idea. We definitely need it for benefits in the long run. I will look into it.

@iSazonov
Copy link
Collaborator Author
iSazonov commented Jun 8, 2018

@daxian-dbw Not related the PR - I passed 200 MB file through ConvertFrom-Csv and get alarming results.
On Windows PowerShell - 28 sec for every run.
On PowerShell Core 6.1 (crossgened) - from 33 to 55 sec! If I run and then wait some time - all results is ~33 sec. But if I run the test sequentially without pauses, the time starts with 33 and increase up to 55 sec.
I have two fears. (1) .Net Core 2.1 has a problem with a caching. (2) why we are slower on PowerShell Core.

@stale
Copy link
stale bot commented Jul 8, 2018

This PR has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs within 10 days.
Thank you for your contributions.
Community members are welcome to grab these works.

@stale stale bot added the Stale label Jul 8, 2018
@iSazonov
Copy link
Collaborator Author
iSazonov commented Jul 9, 2018

@daxian-dbw @anmenaga @SteveL-MSFT Please review the PR.

@stale stale bot removed the Stale label Jul 9, 2018
@iSazonov
Copy link
Collaborator Author

@daxian-dbw Could you please continue the PR review?

@daxian-dbw
Copy link
Member

@iSazonov I'm busy on some investigations and 6.1-must-have issues recently (RC will be next week and need to get those in). Will resume reviewing this PR in a few days.

@daxian-dbw
Copy link
Member

A ping for myself so that the stale bot doesn't kick in :)

@iSazonov
Copy link
Collaborator Author

ping -l 1496
:-)

@iSazonov iSazonov force-pushed the optimize-console-output branch from 8fa07db to 7d5c1ec Compare September 19, 2018 14:02
@iSazonov
Copy link
Collaborator Author

Rebased to update CIs.

@anmenaga
Copy link

@daxian-dbw friendly ping for review.

@iSazonov iSazonov force-pushed the optimize-console-output branch from 7d5c1ec to 3d87f96 Compare October 12, 2018 09:20
@iSazonov
Copy link
Collaborator Author
iSazonov commented Nov 6, 2018

@SteveL-MSFT Have MSFT team special thoughts about the PR? Can we continue or close?

@SteveL-MSFT
Copy link
Member

@iSazonov @daxian-dbw is currently on vacation and will be back next week.

@anmenaga
Copy link

@daxian-dbw friendly ping for review.

@daxian-dbw
Copy link
Member

I think @adityapatwardhan made some changes so that now Linux/macOS CIs always run feature tests. It looks to me your changes cause some feature tests to fail.

@iSazonov
Copy link
Collaborator Author
iSazonov commented Apr 9, 2019

@daxian-dbw Thanks! I I looked through the CIs history - feature tests was turned on but not reported in right way then this was fixed and now we see the CIs fail.
I reviewed my changes and not found cause of the failure. I'm going to create a new git branch to find a problem step by step if you don't have a better idea.

@iSazonov iSazonov force-pushed the optimize-console-output branch from 1a181ac to dee0dfd Compare April 9, 2019 08:58
@PoshChan

This comment has been minimized.

@PoshChan

This comment has been minimized.

@iSazonov iSazonov force-pushed the optimize-console-output branch from dee0dfd to c97670d Compare April 9, 2019 11:34
@iSazonov
Copy link
Collaborator Author
iSazonov commented Apr 9, 2019

I discover that input echoing to output in the failed tests but I can not find the cause.

@PoshChan

This comment has been minimized.

@PoshChan

This comment has been minimized.

@iSazonov
Copy link
Collaborator Author
iSazonov commented Apr 9, 2019

Last commit fix the problem but I wonder and don't understand how it works :-)

[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal void WriteToConsole(char c, bool transcribeResult)
{
Span<char> value = stackalloc char[1] { c };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in #6882 (comment), please use ReadonlySpan directly to avoid the unneeded conversion from Span to ReadonlySpan

Write(char c)
{
this.Write(new String(c, 1));
Span<char> c1 = stackalloc char[1] { c };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, use ReadonlySpan directly.

{
if (newLine)
{
WriteConsole(consoleHandle, ConsoleHostUserInterface.Crlf.AsSpan());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you don't need the AsSpan part.

@daxian-dbw
Copy link
Member

@iSazonov I left 3 comments, but pushed 2 commits to address them and do some tiny refactoring. The second commit's message describes itself:
Rename 'WriteHelper' to 'WriteImpl'; refactor 'WriteLine' and 'WriteLineToConsole'

Please take a look to see if the changes make sense. Thanks.

@iSazonov
Copy link
Collaborator Author

@daxian-dbw Sorry, I missed some fixes again. I only have short periods of time to do something useful and for forced pauses I lose context :-(
Thanks for pushing additional commits!

@lzybkr lzybkr removed their request for review April 16, 2019 04:20
@daxian-dbw daxian-dbw merged commit 2174dd8 into PowerShell:master Apr 17, 2019
@iSazonov
Copy link
Collaborator Author

@daxian-dbw Thanks for review and help!

I do not remember what I planned in the continuation of this work. I will look again :-)

The other day I was playing with a small application and added output to the console - this application catastrophically slowed down (up to 10 times). Now I am concerned that our intentions to replace our code with Console.* can lead to negative consequences.

@iSazonov iSazonov deleted the optimize-console-output branch April 18, 2019 03:33
@daxian-dbw
Copy link
Member

The plan was to replace P/Invoke calls with dotnet Console APIs.

added output to the console - this application catastrophically slowed down (up to 10 times)

I double it's because of the Console API. It could be the fact of writing to console itself caused this difference, no matter you were using P/Invoke or the managed Console API.

@iSazonov
Copy link
Collaborator Author
iSazonov commented Apr 18, 2019

The plan was to replace P/Invoke calls with dotnet Console APIs.

The dotnet API (Readline()) is not symmetric for all platforms. It seems we'll have to use ReadKey() method like we do for Unix currently. Is it valid for Windows?

I double it's because of the Console API.

Yes, but there is still quite a thick wrap in dotnet.

@TravisEz13 TravisEz13 added this to the 7.0.0-preview.1 milestone Apr 18, 2019
@TravisEz13 TravisEz13 added the CL-Engine Indicates that a PR should be marked as an engine change in the Change Log label Apr 18, 2019
@daxian-dbw
Copy link
Member

The dotnet API (Readline()) is not symmetric for all platforms.

Let's first focus on replacing the write P/Invoke. We can get back to the read operations later. I would like small PRs :)

Yes, but there is still quite a thick wrap in dotnet.

PSReadline is refactored to move away from P/Invoke to make it more portable. I think perf impact won't be a concern unless we can prove it's noticeable.

@iSazonov
Copy link
Collaborator Author

Let's first focus on replacing the write P/Invoke. We can get back to the read operations later. I would like small PRs :)

My current understanding that we can't even do it for Windows - It seems there is no "input completion by tab" in Readline().
https://source.dot.net/#System.Private.CoreLib/shared/System/IO/TextReader.cs,182

@daxian-dbw
Copy link
Member

Readline() is the reading part, right? We can focus on the writing part for now, and leave the reading related code untouched.

@iSazonov
Copy link
Collaborator Author

I can try it. Until I am sure to turn out to completely remove P/Invokes for writing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CL-Engine Indicates that a PR should be marked as an engine change in the Change Log

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants

0