From 72be2691dae0046bf4b863a48f33aac7618a44d2 Mon Sep 17 00:00:00 2001
From: Mark Davis
Date: Thu, 20 Feb 2025 18:10:53 -0800
Subject: [PATCH 1/3] Literal keys contents / values

---
 spec/syntax.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/spec/syntax.md b/spec/syntax.md
index 3a18886b90..09fdbd0683 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -369,7 +369,7 @@ otherwise, a corresponding _Data Model Error_ will be produced during processing
 - _Duplicate Variant_:
   Each _variant_ MUST use a list of _keys_ that is unique from that
   of all other _variants_ in the _message_.
-  _Literal_ _keys_ are compared by their contents, not their syntactical appearance.
+  _Literal_ _keys_ are compared by their values, not their syntactical appearance.
 
 ```abnf
 matcher = match-statement s variant *(o variant)
@@ -470,7 +470,7 @@ that matches all values for a given _selector_.
 
 The value of each _literal_ _key_ MUST be treated as if it were in
 [Unicode Normalization Form C](https://unicode.org/reports/tr15/) ("NFC").
-Two _literal_ _keys_ are considered equal if they are canonically equivalent strings,
+Two _literal_ _keys_ are considered equal if their values are canonically equivalent strings,
 that is, if they consist of the same sequence of Unicode code points after
 Unicode Normalization Form C has been applied to both.
 
@@ -742,8 +742,9 @@ escaped as `\\` and `\|`.
 
 An **_unquoted literal_** is a _literal_ that does not require the `|`
 quotes around it to be distinct from the rest of the _message_ syntax.
-An _unquoted literal_ MAY be used when the content of the _literal_
-contains no whitespace and otherwise matches the `unquoted-literal` production.
+An _unquoted literal_ MAY be used when the value of the _literal_
+matches the `unquoted-literal` production.
+It will thus contain no whitespace (nor certain other characters).
 Implementations MUST NOT distinguish between _quoted literals_ and _unquoted literals_
 that have the same sequence of code points.
 
@@ -755,6 +756,8 @@ literal = quoted-literal / unquoted-literal
 quoted-literal = "|" *(quoted-char / escaped-char) "|"
 unquoted-literal = 1*name-char
 ```
+The **_value of a literal_** is the text after removing the terminating | characters if the literal is quoted,
+then unescaping any escaped characters.
 
 #### Names and Identifiers
 

From 777dd31f7c5f54dc7ef5a867dcecccb769647214 Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Fri, 21 Feb 2025 11:27:03 -0800
Subject: [PATCH 2/3] Apply suggestions from code review

Merge @eemeli's suggestions plus the "forked" suggestion from Mark

Co-authored-by: Eemeli Aro
---
 spec/syntax.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/spec/syntax.md b/spec/syntax.md
index 09fdbd0683..1053235bd1 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -369,7 +369,7 @@ otherwise, a corresponding _Data Model Error_ will be produced during processing
 - _Duplicate Variant_:
   Each _variant_ MUST use a list of _keys_ that is unique from that
   of all other _variants_ in the _message_.
-  _Literal_ _keys_ are compared by their values, not their syntactical appearance.
+  _Literal_ _keys_ are compared by their _string values_, not their syntactical appearance.
 
 ```abnf
 matcher = match-statement s variant *(o variant)
@@ -470,7 +470,7 @@ that matches all values for a given _selector_.
 
 The value of each _literal_ _key_ MUST be treated as if it were in
 [Unicode Normalization Form C](https://unicode.org/reports/tr15/) ("NFC").
-Two _literal_ _keys_ are considered equal if their values are canonically equivalent strings,
+Two _literal_ _keys_ are considered equal if their _string values_ are canonically equivalent strings,
 that is, if they consist of the same sequence of Unicode code points after
 Unicode Normalization Form C has been applied to both.
 
@@ -742,7 +742,7 @@ escaped as `\\` and `\|`.
 
 An **_unquoted literal_** is a _literal_ that does not require the `|`
 quotes around it to be distinct from the rest of the _message_ syntax.
-An _unquoted literal_ MAY be used when the value of the _literal_
+An _unquoted literal_ MAY be used when the _string value_ of the _literal_
 matches the `unquoted-literal` production.
 It will thus contain no whitespace (nor certain other characters).
 Implementations MUST NOT distinguish between _quoted literals_ and _unquoted literals_
@@ -756,7 +756,11 @@ literal = quoted-literal / unquoted-literal
 quoted-literal = "|" *(quoted-char / escaped-char) "|"
 unquoted-literal = 1*name-char
 ```
-The **_value of a literal_** is the text after removing the terminating | characters if the literal is quoted,
+The **_string value_** of a _literal_
+for _unquoted literals_ is the text content of that _literal_;
+or for _quoted literals_, the text content of that _literal_
+after removing the enclosing `|` characters
+and unescaping any escaped characters
 then unescaping any escaped characters.
 
 #### Names and Identifiers

From df1055712f7f93a8d089aaa75c8fcab1a88bf713 Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Fri, 21 Feb 2025 11:28:00 -0800
Subject: [PATCH 3/3] fix merge fiasco

---
 spec/syntax.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/spec/syntax.md b/spec/syntax.md
index 1053235bd1..a6dbe0739f 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -760,7 +760,6 @@ The **_string value_** of a _literal_
 for _unquoted literals_ is the text content of that _literal_;
 or for _quoted literals_, the text content of that _literal_
 after removing the enclosing `|` characters
-and unescaping any escaped characters
 then unescaping any escaped characters.
 
 #### Names and Identifiers
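
Reviewer note: the _string value_ definition and the NFC-based key comparison that this series touches can be illustrated with a minimal sketch. It is not taken from the spec or from any implementation. The function names `string_value` and `keys_equal` are hypothetical, the input is assumed to be the already-parsed, well-formed source text of a single _literal_, and the unescaping step simply drops the backslash in front of any character without checking which escapes the grammar permits.

```python
import unicodedata


def string_value(literal: str) -> str:
    """Return the string value of a literal given its source text.

    Hypothetical helper: assumes `literal` is well-formed and performs
    no validation against the ABNF.
    """
    is_quoted = len(literal) >= 2 and literal.startswith("|") and literal.endswith("|")
    if not is_quoted:
        # Unquoted literal: the string value is the text content itself.
        return literal

    body = literal[1:-1]  # remove the enclosing `|` characters
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            # Escaped character: drop the backslash, keep what follows.
            out.append(body[i + 1])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def keys_equal(a: str, b: str) -> bool:
    """Compare two literal keys by string value after applying NFC to both."""
    nfc_a = unicodedata.normalize("NFC", string_value(a))
    nfc_b = unicodedata.normalize("NFC", string_value(b))
    return nfc_a == nfc_b
```

Under these assumptions, `keys_equal("|1|", "1")` is true, so two _variants_ keyed `|1|` and `1` would count as a _Duplicate Variant_; the same holds for a precomposed `é` versus `e` followed by U+0301.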