-
Notifications
You must be signed in to change notification settings - Fork 459
Use custom symbols based on the language's defaultNumberingSystem #470
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use custom symbols based on the language's defaultNumberingSystem #470
Conversation
0861e18
to
c2ba4f3
Compare
c2ba4f3
to
5c36cd2
Compare
5c36cd2
to
2f00832
Compare
465f40c
to
52c4ab5
Compare
on the defaultNumberingSystem. Translated the numbers and dates to the locale’s defaultNumberingSystem.
52c4ab5
to
9981cb0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @alexandrukis, and thanks for the patch! Sorry it's taken a while for anyone to look at it.
Anyway, there's a couple comments in-code, and in addition I'm thinking numbering system selection should probably be an option – probably so it defaults to the locale's numbering system, but could be changed to 'latn'
, etc? What do you think?
@@ -286,6 +286,28 @@ def dst(self, dt): | |||
return ZERO | |||
|
|||
|
|||
def translate(string, keys, values): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll probably just want to use str.translate
(and its friend str.maketrans
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my initial instinct, but there are certain problems with it, such as:
- for
python2
,translate
(and its friendmaketrans
) are actually in thestring
module - codec errors explained here
So I decided I'll just roll my own.
babel/messages/catalog.py
Outdated
@@ -415,11 +416,11 @@ def _set_mime_headers(self, headers): | |||
self._num_plurals = int(params.get('nplurals', 2)) | |||
self._plural_expr = params.get('plural', '(n != 1)') | |||
elif name == 'pot-creation-date': | |||
self.creation_date = _parse_datetime_header(value) | |||
self.creation_date = _parse_datetime_header(to_latn_numbering_system(value, locale=self.locale)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think pofiles can have headers in non-latin numbering?
babel/messages/catalog.py
Outdated
elif name == 'po-revision-date': | ||
# Keep the value if it's not the default one | ||
if 'YEAR' not in value: | ||
self.revision_date = _parse_datetime_header(value) | ||
self.revision_date = _parse_datetime_header(to_latn_numbering_system(value, locale=self.locale)) | < 8000 /tr>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think pofiles can have headers in non-latin numbering?
scripts/import_cldr.py
Outdated
|
||
# If the default_number_system_node is non-existent, the package will inherit it from its parent | ||
if default_number_system_node is not None: | ||
number_symbols['defaultNumberingSystem'] = numbering_systems[default_number_system_node.text] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this data doesn't belong in the number_symbols
tree.
It would probably be a good idea to parse <defaultNumberingSystem>
and <otherNumberingSystems>
into, say, numbering_systems
– something like (example from el.xml):
<defaultNumberingSystem>latn</defaultNumberingSystem>
<otherNumberingSystems>
<native>latn</native>
<traditional>grek</traditional>
</otherNumberingSystems>
->
"numbering_systems": {
"default": "latn",
"native": ["latn"],
"traditional": ["grek"]
}
(Do note how there can be multiple instances of the same otherNumberingSystems tag according to the DTD, hence a list.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, this copies the content of the numbering system into the locale, instead of the ID? That doesn't sound DRY.
scripts/import_cldr.py
Outdated
if default_number_system_node is not None: | ||
number_symbols['defaultNumberingSystem'] = numbering_systems[default_number_system_node.text] | ||
|
||
for elem in tree.findall('.//numbers/symbols'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we start to have support for non-latn numbering systems, I think we should also import <symbols>
for all the numbering systems declared in the XML – for instance, ur.xml has <symbols>
for both latn
and arabext
, zh.xml has four of them... This code seems like it will erroneously merge the various numbering systems?
The resultant shape of number_symbols
should then be something like
"number_symbols": {
"latn": {"decimal": ...},
"arabext": {"decimal": ...},
}
@alexandrukis Are you still around? :) |
Hey, I am, but I'm experiencing some really busy months. If anyone needs this, I can make an effort and try to get it merged. Is that the case? |
…46-use-custom-symbols-based-on-the-defaultNumberingSystem # Conflicts: # babel/dates.py # babel/numbers.py # tests/test_numbers.py # tests/test_util.py
79b4453
to
f1225bb
Compare
Kept only the ids of the used numberingsystems in the locale. Moved the actual data into a global. Imported symbols for every numbering system.
f1225bb
to
3300559
Compare
Codecov Report
@@ Coverage Diff @@
## master #470 +/- ##
=========================================
+ Coverage 90.22% 90.33% +0.1%
=========================================
Files 24 24
Lines 4042 4076 +34
=========================================
+ Hits 3647 3682 +35
+ Misses 395 394 -1
Continue to review full report at Codecov.
|
Hey @akx ! I have made some time to implement most of the changes you've requested such as moving the actual number systems outside the locale (and keep only the ids) and loading all the symbols. Regarding the usage of other number systems, that's a little more complicated. For example, let's assume a user wants to format a date in arabic. The date format is Now, I don't really have much experience with i18n or with Please let me know what you think. |
@akx Are you still around? :) |
Previously the script could have inadvertently merged formatting rules between numbering systems due to the XML selectors used. This makes sure only Latin rules are used for the time being. When support for other numbering systems is properly added (see #470), these checks can be changed.
Previously the script could have inadvertently merged formatting rules between numbering systems due to the XML selectors used. This makes sure only Latin rules are used for the time being. When support for other numbering systems is properly added (see #470), these checks can be changed.
This is not being implemented? |
This PR tackles two of the problems mentioned here: