Assign the same CJK width to canonically equivalent strings by Jules-Bertholet · Pull Request #52 · unicode-rs/unicode-width

Assign the same CJK width to canonically equivalent strings #52

Merged

Jules-Bertholet
Contributor

UAX 11:

Modern Rendering Practice. […] The set of characters with mappings to legacy character sets that have been assigned ambiguous width constitute a superset of the set of such characters that may be rendered as wide characters in a given context. In particular, an application might find it useful to treat characters from alphabetic scripts as narrow by default. Conversely, many of the symbols in the Unicode Standard have no mappings to legacy character sets, yet they may be rendered as “wide” characters if they appear in an East Asian context. An implementation might therefore elect to treat them as ambiguous even though they are classified as neutral here.

"Treat characters from alphabetic scripts as narrow by default" is the biggest change this PR makes. To achieve full canonical equivalence, we also need to adjust the width of a few mathematical symbols with diagonal strikethrough, and of U+0387 GREEK ANO TELEIA.
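To see why these adjustments are needed, the sketch below compares a deliberately naive CJK width for a string against the width of its NFD form, using only the East_Asian_Width and General_Category data exposed by Python's standard `unicodedata` module. The helpers `naive_cjk_width` and `compare` are hypothetical names made up for this illustration; they are not the crate's algorithm or API, and only show the kind of mismatch the PR removes.

```python
import unicodedata


def naive_cjk_width(s: str) -> int:
    """Hypothetical per-code-point CJK width (NOT the crate's algorithm).

    Nonspacing and enclosing marks count as 0 columns; Wide, Fullwidth and
    Ambiguous characters count as 2; everything else counts as 1.
    """
    total = 0
    for ch in s:
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue  # combining marks occupy no columns of their own
        eaw = unicodedata.east_asian_width(ch)
        total += 2 if eaw in ("W", "F", "A") else 1
    return total


def compare(s: str) -> tuple[int, int]:
    """Return (width of s as given, width of its canonical decomposition)."""
    nfd = unicodedata.normalize("NFD", s)
    return naive_cjk_width(s), naive_cjk_width(nfd)


if __name__ == "__main__":
    samples = [
        "\u00e9",  # LATIN SMALL LETTER E WITH ACUTE: Ambiguous, but "e" is Narrow
        "\u2260",  # NOT EQUAL TO: Ambiguous, but "=" is Narrow
        "\u0387",  # GREEK ANO TELEIA: Neutral, but U+00B7 MIDDLE DOT is Ambiguous
    ]
    for s in samples:
        composed, decomposed = compare(s)
        print(f"U+{ord(s):04X}: as given = {composed}, NFD = {decomposed}")
```

In each sample the two numbers differ, so a width function built directly on the raw property values cannot be invariant under canonical equivalence. Treating alphabetic characters as narrow addresses the first case; the second and third correspond to the struck-through mathematical symbols and U+0387 mentioned above.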

@Manishearth merged commit d00d357 into unicode-rs:master May 22, 2024
2 checks passed
@Jules-Bertholet deleted the canonically-equivalent-eaw branch May 22, 2024 21:33