8000 Make ConvertFrom-Json deserialize an array of objects with multiple lines. by Francisco-Gamino · Pull Request #3823 · PowerShell/PowerShell · GitHub
[go: up one dir, main page]

Skip to content

Make ConvertFrom-Json deserialize an array of objects with multiple lines. #3823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 25, 2017
Merged

Make ConvertFrom-Json deserialize an array of objects with multiple lines. #3823

merged 2 commits into from
May 25, 2017

Conversation

Francisco-Gamino
Copy link
Contributor
@Francisco-Gamino Francisco-Gamino commented May 19, 2017

Addressing issue #3284: ConvertFrom-JSON fails to parse syntactically-correct JSON array

For PowerShell on CoreCLR, we use Json.Net. However, this deserialzer does not throw an exception if an invalid array is passed, so we need to add extra logic to handle this. An invalid array can occurred when the user reads a file via get-content as a collection of PSObjects. E.g.,

# test.json has the following content:
PS D:\> Get-Content .\test.json
[
    1,
    2
]

# The output of Get-Content is a collection of Object[]
PS D:\> (Get-Content .\test.json).GetType()

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Object[]                                 System.Array

# When we pipe the output of Get-Content to ConvertFrom-Json, the logic in the cmdlet
# sends the first object to be deserialized, and if an Argument exception 
# is thrown, then it sends the whole collection of objects as one single Json content. 
# For CoreCLR PowerShell, we use Json.Net which does not throw an error when an 
# invalid array is deserialize, so this PR addresses this issue. 

# Here is the output of ConvertFrom-Json after this code change:
PS D:\> Get-Content .\test.json | ConvertFrom-Json
1
2

Added tests for the following:

  • Deserialize a Json array of strings (in multiple lines)
  • Deserialize a Json array of PSObjects (in multiple lines)

error = null;
#if CORECLR
object obj = null;
try
{
// JsonConvert.DeserializeObject does not throw an exception when an invalid array is passed,
// so we need to do extra logic to identify this scenario using JArray.Parse(input).
if ((input.IndexOf("[", StringComparison.OrdinalIgnoreCase) >= 0) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a && so we check for existence of [ and ] ? If any of those is missing it is an invalid array anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this should be an '||' because we are trying to detect an invalid array. These are some examples:

  1. '["a"'
  2. '['
  3. '"a"]'

(input.IndexOf("]", StringComparison.OrdinalIgnoreCase) >= 0))
{
// Remove carriage return, new-line and empty space from the string.
input = input.Trim(System.Environment.NewLine.ToCharArray());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trim will only remove newline characters at the start and end. To remove all newline characters use Replace instead. And .ToCharArray() is unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// Remove carriage return, new-line and empty space from the string.
input = input.Trim(System.Environment.NewLine.ToCharArray());
input = input.Replace(" ", "");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a String.Empty instead of "". More efficient as it does not create a new zero length string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// The input from the pipeline is can be represented in the following two ways:
// 1. Each input to the buffer is a complete Json content. There can be multiple inputs of this format.
// 2. The complete buffer input collectively represent a single JSon format. This is typically the majority of the case.
if (input.StartsWith("[", StringComparison.OrdinalIgnoreCase) || input.EndsWith("]", StringComparison.OrdinalIgnoreCase))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a && here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment above.

3]
#>
$filePath = Join-Path $TESTDRIVE test2.json
"[1,`r`n2,`r`n3]" | Out-File $filePath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use [Environment]::NewLine instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test code updated as per Jim's recommendation below.

@@ -1455,4 +1454,43 @@ Describe "Json Bug fixes" -Tags "Feature" {
{
RunJsonTest $testCase
}

It "ConvertFrom-Json can parse an array of PSObjects as a single string." {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the variations be converted to us 10000 e -TestCase for It? So if one of the Should fails the others are still run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented two different test cases instead.

Add-Member -InputObject $obj2 -NotePropertyName "objectName" -NotePropertyValue "object2Name"
Add-Member -InputObject $obj2 -NotePropertyName "objectValue" -NotePropertyValue "object2Value"

# Add them to an array
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should just be:

$array = [pscustomobject]@{ objectName = "object1Name"; objectValue = "object1Value" },
         [pscustomobject]@{ objectName = "object2Name"; objectValue = "object2Value" }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion. I will update the test code.

2,
3]
#>
$filePath = Join-Path $TESTDRIVE test2.json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this (or the above) needs to be done with a file at all. Does the following cover the case?

$result = "[1,","2,","3]" | convertfrom-json
$result.count | Should be 3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test code updated.

Copy link
Member
@adityapatwardhan adityapatwardhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@JamesWTruher
Copy link
Collaborator

tests are much cleaner now - LGTM

@SteveL-MSFT
Copy link
Member

@JamesWTruher can you mark your review as Approved?

@Francisco-Gamino
Copy link
Contributor Author

@PowerShell/powershell-maintainers : This is ready to be merged, please let me know if you need anything else. Thanks.

@TravisEz13 TravisEz13 self-assigned this May 19, 2017
// JsonConvert.DeserializeObject does not throw an exception when an invalid array is passed,
// so we need to do extra logic to identify this scenario using JArray.Parse(input).

// If input contains '[' or ']', we would like to analyze it further as these markers for a Json data array.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says contains, but the code says starts with/ends with.

Besides that, it just looks more complicated than it needs to be, e.g. 2 calls to IndexOf, then strip newlines then spaces. Why not just use a regex, like:

if (Regex.Match(input, "(^\s*\[)|(\s*\]$)"))
{
    JArray.Parse(input);
}

I'd also like to see some sort of note about whether or not this is considered a NewtonSoft bug - ideally a link to an issue so we can track it.

// JsonConvert.DeserializeObject does not throw an exception when an invalid array is passed,
// so we need to do extra logic to identify this scenario using JArray.Parse(input).

// If input contains '[' or ']', we would like to analyze it further as these markers for a Json data array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as these markers for

Do you mean these markers are for?

(input.IndexOf("]", StringComparison.OrdinalIgnoreCase) >= 0))
{
// Remove carriage return, new-line and empty space from the string.
string stringToValidate = input.Replace(System.Environment.NewLine, string.Empty);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment doesn't match the code. Carriage return is \r, but Environemnt.NewLine is \r\n on windows and \n on Unix. And what if the json string is has \r\n but we are calling ConvertFrom-Json on Unix?

{
// Remove carriage return, new-line and empty space from the string.
string stringToValidate = input.Replace(System.Environment.NewLine, string.Empty);
stringToValidate = stringToValidate.Replace(" ", string.Empty);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care about other white spaces? For example what if there is \t?
By the way, new lines are white spaces too.

string stringToValidate = input.Replace(System.Environment.NewLine, string.Empty);
stringToValidate = stringToValidate.Replace(" ", string.Empty);

// After removing all white space and new-lines, we are only interested in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

white spaces include new lines.

[char]::IsWhiteSpace([environment]::NewLine, 0)
True

// Use JArray.Parse to see if this is a valid array. When an exception is thrown, this means that
// we are potentially processing partial content -- this would be the case for number 2 below.

// A Json input from the pipeline is can be represented in the following two ways:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is can be

Typo


// A Json input from the pipeline is can be represented in the following two ways:
// 1. Each input to the buffer is a complete Json content. There can be multiple inputs of this format.
// 2. The complete buffer input collectively represent a single JSon format. This is typically the majority of the case.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments here are confusing to me -- what is the buffer here? Could you please make the comments easier to understand?

@daxian-dbw
Copy link
Member

@Francisco-Gamino One more question -- why do you have to remove all white spaces before calling JArray.Parse? Won't it be enough to just trim the leading and trailing white spaces?

@TravisEz13
Copy link
Member
TravisEz13 commented May 19, 2017

Please update the title and description based on the instructions in Contributing - Pull Request Submission

  • Remove the issues number from the title and into the description (it's already in the description, so, just remove it from the title.)
  • Update the title to use "the present tense and imperative mood" to describe the change, not the issue you are fixing.

@daxian-dbw
Copy link
Member

Chatted with @Francisco-Gamino offline, and it turns out the JsonException thrown by JArray.Parse will be captured and re-thrown as an ArgumentException, and then the enclosing code will capture the ArgumentException and assume that it's caused by partial input and hen keep accumulating pipeline input. Francisco will add more comments to explain this process.

@Francisco-Gamino
Copy link
Contributor Author

@lzybkr: Thank you Jason for your suggestion. Using regular expressions makes the code much simpler.
@lzybkr, @daxian-dbw, and @TravisEz13 : Could you please take another look?

@Francisco-Gamino Francisco-Gamino changed the title Fixing #3284 ConvertFrom-JSON fails to parse syntactically-correct JSON array ConvertFrom-Json deserializes an array of objects in multiple lines. May 24, 2017
@Francisco-Gamino Francisco-Gamino changed the title ConvertFrom-Json deserializes an array of objects in multiple lines. Make ConvertFrom-Json deserialize an array of objects with multiple lines. May 25, 2017
@TravisEz13 TravisEz13 merged commit 7aa7f38 into PowerShell:master May 25, 2017
// To work around this, we need to identify when input is a Json array, and then try to parse it via JArray.Parse().

// If input starts with '[' or ends with ']' (ignoring white spaces).
if ((Regex.Match(input, @"(^\s*\[)|(\s*\]$)")).Success)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reg expression used here seems not right ... @Francisco-Gamino let's discuss it tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants
0