Define optionality separately for each u: option by eemeli · Pull Request #1012 · unicode-org/message-format-wg · GitHub

Define optionality separately for each u: option #1012


Merged
merged 4 commits into from
Feb 17, 2025
25 changes: 22 additions & 3 deletions spec/u-namespace.md
@@ -1,4 +1,4 @@
# MessageFormat 2.0 Unicode Namespace
# MessageFormat Unicode Namespace

The `u:` _namespace_ is reserved for the definition of _options_
which affect the _function context_ of the specific _expressions_
@@ -12,15 +12,26 @@ manages the specification for this namespace, hence the _namespace_ `u:`.

## Options

This section describes common **_<dfn>`u:` options</dfn>_** which each implementation SHOULD support
for all _functions_ and _markup_.
This section describes **_<dfn>`u:` options</dfn>_**.
When implemented, they apply to all _functions_ and _markup_,
including user-defined _functions_ in that implementation.

### `u:id`

Implementations providing a formatting target other than a concatenated string
SHOULD support this option.

A string value that is included as an `id` or other suitable value
in the formatted parts for the _placeholder_,
or any other structured formatted results.

> For example, `u:id` could be used to distinguish
> two otherwise matching placeholders from each other:
>
> ```
> The first number was {$a :number u:id=first} and the second {$b :number u:id=second}.
> ```

Ignored when formatting a message to a string.
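A minimal sketch of how `u:id` might surface in a formatted-parts result for the example message above. The `Part` class, its field names, and both helper functions are hypothetical and only illustrate the idea; the spec does not define this shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Part:
    """Hypothetical formatted part; the field names are assumptions."""
    type: str
    value: str
    id: Optional[str] = None  # carries the u:id value, if one was given


def format_to_parts(a: int, b: int) -> list[Part]:
    # Hand-built result for the example message; a real formatter
    # would derive this from the parsed message and its options.
    return [
        Part("text", "The first number was "),
        Part("number", str(a), id="first"),
        Part("text", " and the second "),
        Part("number", str(b), id="second"),
        Part("text", "."),
    ]


def format_to_string(parts: list[Part]) -> str:
    # The id is dropped here: it has no effect on plain string output.
    return "".join(p.value for p in parts)


parts = format_to_parts(3, 3)
# The two number parts have the same value but remain distinguishable by id.
by_id = {p.id: p for p in parts if p.id is not None}
```

Looking up `by_id["first"]` and `by_id["second"]` lets a caller style or wrap each placeholder separately, even though both format to the same text.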

The value of the `u:id` _option_ MUST be a _literal_ or a
@@ -31,6 +42,12 @@ and the `u:id` option is ignored.

### `u:locale`

> [!IMPORTANT]
> This _option_ has a status of **Draft**.
> It is proposed for inclusion in a future release and is not Stable.

Implementations MAY support this option.
Member

I think this should be SHOULD.

I know we're discussing the ICU4X objection. The common use case for this option is basically "I18N demos", which I write a lot of 😉:

The default date format in locale: {$locale :x:localeGetDisplayName} is {$now :date u:locale=$locale}

There are cases in which users wish to tailor the locale, for example using the -u extension for options but do not wish to transmit the "decorated" locale everywhere. Some of these items are option values on built-in functions, but overriding (for example) the digit shaping in the locale works for functions that have no such options.

A fairly common case is wanting to override the locale to be und/Locale.ROOT or some POSIX variant to get a locale-neutral representation.

Encouraging implementations to provide this feature with the stronger normative SHOULD does not mean that implementations that cannot support it are "bad".
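As a rough illustration of the override being discussed, the sketch below replaces the locale in a per-expression function context when a `u:locale` option is present. `FunctionContext`, `with_u_locale`, and the treatment of an empty value are assumptions made for this example, not part of the spec text, and no validation of the locale tags is attempted.

```python
from dataclasses import dataclass, replace
from typing import Mapping


@dataclass(frozen=True)
class FunctionContext:
    """Hypothetical function context; the field names are assumptions."""
    locales: tuple[str, ...]
    direction: str = "ltr"


def with_u_locale(ctx: FunctionContext, options: Mapping[str, str]) -> FunctionContext:
    """Return a context whose locales are replaced by the u:locale option, if set."""
    raw = options.get("u:locale")
    if raw is None:
        return ctx
    # Assumption: the value is a comma-delimited list of locale identifiers.
    tags = tuple(tag.strip() for tag in raw.split(",") if tag.strip())
    if not tags:
        # Assumption: an empty value leaves the context unchanged.
        return ctx
    return replace(ctx, locales=tags)


message_ctx = FunctionContext(locales=("fr-CA",))
neutral_ctx = with_u_locale(message_ctx, {"u:locale": "und"})
```

Here `neutral_ctx` applies only to the one expression carrying the option; `message_ctx` is untouched for the rest of the message.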

Collaborator Author

I don't think i18n demos is a use case that we ought to support if/as it adds complexity to an implementation.

If the numbering system is desirable to fix for a single function in a single message, wouldn't it also be desirable to fix for all messages, and therefore be encoded in the locale that's used to format the message or messages?

For ensuring that something is formatted in a locale-neutral representation, wouldn't a programmer want to submit that to the message formatter in a pre-formatted form, to guarantee that localization will not affect it at all?

Member

Consider a case like:

The message shouted: {$text :transform type=uppercase u:locale=$textLocale} <- what if the string is Turkish?

There are other cases. I think that on-going discussion suggests moving u:locale to draft and that we rebuild a design document rather than including it in 47 as stable.
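To make the Turkish case concrete: the sketch below is a hypothetical stand-in for the custom `:transform` function, showing why uppercasing depends on the locale of the text rather than the locale of the message. `upper_for_locale` is an invented name, and it only handles the dotted and dotless i of Turkish and Azerbaijani, not full locale-aware casing.

```python
def upper_for_locale(text: str, locale: str) -> str:
    """Uppercase text, applying the Turkish/Azerbaijani dotted-i rule when needed."""
    language = locale.split("-")[0].lower()
    if language in ("tr", "az"):
        # In these languages, lowercase "i" uppercases to "İ" (U+0130).
        # Dotless "ı" (U+0131) already uppercases to "I" under the default rules.
        text = text.replace("i", "\u0130")
    return text.upper()


shouted_tr = upper_for_locale("istanbul", "tr")
shouted_en = upper_for_locale("istanbul", "en")
```

With these inputs, `shouted_tr` starts with `İ` (U+0130) while `shouted_en` starts with a plain `I`, so a message formatted in `en` that embeds Turkish text needs a per-expression locale to get the casing right.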

Member

Thank you very much for illuminating a use case; this is more than I've seen before.

However, I'm not convinced this rises to the level of a SHOULD.

Here's my litmus test: if you are in a situation where you are able to choose your MF implementation, then your use case is a MAY. If you are in a situation where your environment chooses your MF implementation for you, then your use case is a SHOULD.

Given that Addison is writing his own strings for his own personal use, I think this lands in the MAY category.

Member

> Given that Addison is writing his own strings for his own personal use, I think this lands in the MAY category.

I don't understand this statement? Every MF2 user is "writing their own strings for their own personal use"? Do you mean "you're writing your own function :transform"?

Most users will be in situations in which their MF2 implementation is the one in their platform or in a library such as ICU. Their goal and task is to write messages, not to cherry-pick different choices.

I don't think we have time this morning to make u:locale satisfy everyone and I think we should employ our WG process--which requires a design doc. We have ample time in v48 to do this correctly and zero time to get it right just now. Hence, I propose moving u:locale to DRAFT.


Replaces the _locale_ defined in the _function context_ for this _expression_.

A comma-delimited list consisting of
@@ -67,6 +84,8 @@ not valid, or some other reason.

### `u:dir`

Implementations SHOULD support this option.
Member

I don't think we can or should make any u: option MUST in this release, but really everyone should implement this feature. Otherwise string metadata about direction has nowhere to go. For those playing at home, my UTW presentation explains why.


Replaces the base directionality defined in
the _function context_ for this _expression_
and applies bidirectional isolation to it.
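One way to picture the isolation step for plain string output is wrapping the formatted value in Unicode directional isolate controls. This is a sketch under assumptions: the `isolate` helper is hypothetical, the set of accepted values (`ltr`, `rtl`, `auto`) is assumed rather than taken from the lines shown here, and an implementation may handle isolation differently, for example through markup in a structured output.

```python
# Unicode directional isolate controls.
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate(text: str, direction: str) -> str:
    """Wrap text in the isolate pair matching the requested base direction."""
    # Assumption: these are the direction values a caller may pass.
    openers = {"ltr": LRI, "rtl": RLI, "auto": FSI}
    opener = openers.get(direction)
    if opener is None:
        raise ValueError(f"unsupported direction: {direction!r}")
    return opener + text + PDI


isolated = isolate("\u05e9\u05dc\u05d5\u05dd", "rtl")
```

The wrapped value keeps its own base direction and cannot reorder the surrounding message text, which is the point of carrying direction metadata with the string.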