csv parser used in import alias refactor#9091
csv parser used in import alias refactor#9091SytzeAndr wants to merge 95 commits intoPowerShell:masterfrom
Conversation
|
@SytzeAndr the |
reuse of the old implementation
Thanks for the comment, I'll then reuse the part from the old implementation for when to return an empty collection. |
|
@SytzeAndr Thanks for your contribution! We are working on enhancing the cmdlet and have plans to improve performance. |
|
@iSazonov I haven't tested it yet, but performance is one of the reasons I think refactoring this code would make sense. I haven't set up the testing environment locally (so therefore WIP), but when I do and the tests are passing I'll do a measure in terms of performance against the old implementation aswell. |
|
Are you aware that this method is only called from the The implementation of |
|
@SytzeAndr If you want to improve performance, I would suggest a different approach. Start with a scenario you know is slow. Then run it under a profiler, and try to understand if there are any bottlenecks. 8000Then try to understand the code around the bottleneck - think about it, why is it written the way it is, how could it be done instead etc. Keep recordings of the original scenario. Try out changes, and measure them, while at the same time making sure that the tests that covers your changes still passes. The area you have selected here is very hard to improve significantly upon, especially since it is not in the hot path. |
removed old commented code and implemented a version using the System.IO.Stringreader
|
some updates on this PR: I've implemented the method by using the StringReader and StringBuilder. It currently passes all CI tests, so I assume my implementation is fine. I assume use of the Stringbuilder has a positive impact on performance since the I've reduced the complexity of this method according to CodeFactor (and is regarded as being a fixed issue to the repo). Although I still think it can be improved by splitting some things and might do so. @powercode I used this PR aswell to get a bit used to contributing to powershell, and I chose to tackle an issue which could be fixed with some refactoring. Therefore, I didn't start with analyzing the bottlenecks since I had to set up things yet and wanted to start small by just refactoring some code. I agree with you in the sense that it would make more sense to analyze from a bit higher perspective to find the true bottlenecks. So the next step for me is to test it for performance against the old implementation, after that I'll remove the WIP tag (assuming performance is not worse). |
src/Microsoft.PowerShell.Commands.Utility/commands/utility/Csv.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.PowerShell.Commands.Utility/commands/utility/Csv.cs
Outdated
Show resolved
Hide resolved
|
@Geweldig makes sense, thanks for the suggestions! |
… values somehow. That throws an error instead of skipping the line
|
Sorry for the delay. I managed to get rid of the |
src/Microsoft.PowerShell.Commands.Utility/commands/utility/ImportAliasCommand.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.PowerShell.Commands.Utility/commands/utility/ImportAliasCommand.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.PowerShell.Commands.Utility/commands/utility/ImportAliasCommand.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.PowerShell.Commands.Utility/commands/utility/ImportAliasCommand.cs
Outdated
Show resolved
Hide resolved
test/powershell/Modules/Microsoft.PowerShell.Utility/Import-Alias.Tests.ps1
Outdated
Show resolved
Hide resolved
test/powershell/Modules/Microsoft.PowerShell.Utility/Import-Alias.Tests.ps1
Outdated
Show resolved
Hide resolved
test/powershell/Modules/Microsoft.PowerShell.Utility/Import-Alias.Tests.ps1
Outdated
Show resolved
Hide resolved
…ortAliasCommand.cs Co-Authored-By: Ilya <darpa@yandex.ru>
|
@iSazonov @daxian-dbw do you see any pending issues with this PR? |
src/Microsoft.PowerShell.Commands.Utility/commands/utility/ImportAliasCommand.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.PowerShell.Commands.Utility/commands/utility/ImportAliasCommand.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
@SytzeAndr Please fix Codacy issues in the new code: specifically about converting 'CreateItemOptions','ConstructAlias','IsValidParsedLine' to be static methods. Thank you.
…ortAliasCommand.cs Co-Authored-By: Ilya <darpa@yandex.ru>
|
Hi, sorry for the radio silence. I fixed the
Regarding static methods, I didn't made them static due to the following reasons. I'm not sure why codefactor doesn't pick these up. Let me know if you agree, if so I suspect it is fine for merging.
|
| while (i < csvTrimmed.Length && inQuotes) | ||
| { | ||
| i++; | ||
| nextChar = csvTrimmed[i]; |
There was a problem hiding this comment.
Boundary after i++ can be = csvTrimmed.Length:
while (++i < csvTrimmed.Length && inQuotes)There was a problem hiding this comment.
Wouldn't that introduce a bug? Replacing i with ++i would increment i when the conditional statement is checked, regardless whether inQuotes is true or false. In other words, when i is at the index of the second " (the escape quote), it will increment i one additional time, skipping the first character after we escape the quote sequence.
For example, a"b"cd would then be parsed to abd, which is incorrect as it should be parsed to abcd.
There was a problem hiding this comment.
@iSazonov , can you please looks at this again? thank you.
There was a problem hiding this comment.
My comment is only about that there can be boundary overflow because of i++ without follow check.
There was a problem hiding this comment.
That is true but would require some different solution, which is replacing line 147 with i + 1 < csvTrimmed.Length && inQuotes.
I just commited and pushed this change to this branch
|
@daxian-dbw Can you please review this again? thank you. |
|
|
||
| private static Collection<string> ParseCsvLine(string csv) | ||
| { | ||
| ReadOnlySpan<char> csvTrimmed = csv.Trim(); |
There was a problem hiding this comment.
csv should be ReadOnlySpan to avoid allocation here.
|
This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days. |
|
Since there is no activity for a long time and this is too ed 4E22 ge case, we can close the PR. |
PR Summary
The old implementation used a for loop with lots of if statements to parse a csv string, which is used in the
import-aliascmdlet. I think the old implementation is bug sensitive and hard to read. A new implementation can have a reduced complexity and uses theStringReaderandStringBuilderto convert a string in csv format to a collection of elements, while keeping functionality intact. Experimental analysis shows this new implementation performs better, especially for large Strings.PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.[feature]to your commit messages if the change is significant or affects feature tests