-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Properly parse Plural-Forms header of PO/MO files (2.6) #5348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…cached-engine SASL with Memcached is only supported with binary protocol
This just contains the bare minimum to interpret common plural formulas, but new operators can be easily added.
Now, pretty much every language plural case is covered and the implementation follows to the standard.
Removed usages of create_function() as well.
return (int)($a < $b); | ||
} | ||
|
||
protected function _greaterThanOperator($a, $b) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally include doc blocks for every function.
I don't know about this; plural rules don't change (at all? much? At least not with any significant frequency) - is it really necessary to parse plural rules at run time? Is a plural-rule-parsing "framework" really necessary? For reference I used the following logic to parse plural rules and convert them to php code. i.e.:
Output:
Which after injecting some whitespace and reformatting I simply used to generate the plural rule function, applied here that could be:
That's a lot easier to read and maintain than a plural rule parsing class. |
@AD7six You’re right that plurals do not change very frequently. Actually they don’t change, they just start being used when you start translating to some “new” language CakePHP don’t know yet. But translators probably don’t want to have to wait for CakePHP each time this happens. I don’t know to what extend the current naive implementation covers languages, but only to think that a language as widely used as Arabic isn’t supported don’t make me trust it at all. That’s the reason I implemented plural formula parsing. That being said, I admit your suggested approach is smart and could make both translators and CakePHP developers happy. But when exactly such code generation would take place? |
I completely agree.
Only as a developer aide - and in fact it wouldn't be present in any resultant class. Some years ago plural rules declarations looked to me like an alien language - and deriving what logic they represented was some confusing complex task. I'm sure for many developers/users it still is quite confusing, however for me (and I hope/assume for you) the mystery is all gone since it's just a simple mathematical formula of nested ternary expressions. We could parse these rules but to a certain extent that's propagating the confusion There's a known finite list of plural rule forms (16 of them). That list of rules might be updated - but at a glacial pace, so I don't think it's necessary to consider rules not expressed there. Given that I suggest:
What do you think about that @jiru (or anyone else)? |
I too am surprised that Arabic did not work out of the box as we've had effectively the same code for the last 5 years. I suspect that there are probably a few more omissions but not many. Given that the spoken/written languages of the world evolve pretty slowly a finite list/map of rules might be simpler to maintain in the long term and run more efficiently. |
I wasn't really aware of 3.0's plural rules class, it would be relatively low effort to back port it to 2.6 and preferable compared to having/maintaining 2 different implementations. |
Backporting 3.0's plural rules class does sound better. |
@ADmad Backporting and adding Arabic support sounds like a good idea to me. |
@markstory PluralRules class already has support for Arabic. |
I must have misread one of the earlier comments then, I thought it had 16 forms - my bad. |
6 forms not 16 😄 16 is the total number of plural rules covering all languages. Though the rule numbering isn't exactly standard. |
If it helps, there are 17 listed plural rules, since I can't 0-base count =). There's a reference also on unicode.org: http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html Which could probably be considered canonical |
Righto, it's 17. Working with I18n messes with one's brains 😛 I think I can patch the existing 2.x code to correctly handle Arabic. Later on whenever someone has the time the 3.0 PluralRule class can be backported for 2.x. |
So does the 3.x PluralRule class need to be updated? If not I can backport it to 2.x. I'm confused as to which list on unicode.org has 17 rules, I counted more than that for Cardinal-Integer, and Ordinal-Integer. The MDN plural rule list has 17 rules of which PluralRules only implements 16. |
@markstory unicode lists 26 rulesets, whereas MDN has 17. ADmad already added one correction to the 3.0 plural rules class, but it already catered for Arabic for example. With some tedious cross referencing, we could identify which languages/rulesets are missing - I don't know if that's really worth it. If it is I could do that and update the switch of doom. |
Closing in favor of #5360 |
This is a backport of cakephp/cakephp#5348 to CakePHP 1.3. The CakePHP implementation assumes a static well-known finite list of plural rules (in _pluralGuess()). However, not only Arabic wasn’t initially supported (and we fixed it the hard way in 0e20cd0), but now Russian and Belarusian rules have been updated recently on Transifex in 3b6e63b (for good, they now follows the Unicode reference [1]), which totally breaks the original implementation. The algorithm first converts the expression to postfix using a shunting-yard algorithm, and then evaluates it each time a pluralized string is translated. [1] http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html#rules
So this follows #5347, this time targeting the 2.6 branch, and hopefully compatible with PHP 5.2. I hope it’s fine even if it includes a few other commits not from me.