8000 [Translation] Add icu message support for the lokalise provider by DanielBadura · Pull Request #45368 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

[Translation] Add icu message support for the lokalise provider #45368

8000 New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: 5.4
Choose a base branch
from

Conversation

DanielBadura
Copy link
Contributor
@DanielBadura DanielBadura commented Feb 9, 2022
Q A
Branch? 5.4
Bug fix? yes
New feature? no
Deprecations? no
Tickets -
License MIT
Doc PR -

This PR would add the support for the ICU message format when using the lokalise provider. I saw that the unit tests are written as it would support the ICU message format so therefore the PR aims the 5.4 branch as a bugfix.

The API from lokalise needs a flag for the format to be downloaded, otherwise it return the file in the "classic" format with % as the delimiter for the placeholders. Also the file was always saved without the suffix. For this i looked up how the ICU support was implemented in symfony and tried to adapt this approach.

I think i would need some assitance how to write tests with and without the intl extension. Right now, some tests fail since the intl extension in enabled and i did not adjust them yet.

@carsonbot
Copy link

Hey!

To help keep things organized, we don't allow "Draft" pull requests. Could you please click the "ready for review" button or close this PR and open a new one when you are done?

Note that a pull request does not have to be "perfect" or "ready for merge" when you first open it. We just want it to be ready for a first review.

Cheers!

Carsonbot

@DanielBadura DanielBadura marked this pull request as ready for review February 9, 2022 09:40
@DanielBadura DanielBadura changed the base branch from 6.1 to 5.4 February 9, 2022 09:41
@nicolas-grekas nicolas-grekas added this to the 5.4 milestone Feb 9, 2022
@DanielBadura
Copy link
Contributor Author

I looked up how the extraction command is handling the ICU thing, and there, if i undestand it correctly, all message will extracted as ICU. The only exception is when the translation key is already in a domain which is not ICU. So maybe my approach here is wrong to begin with. So i would really appreciate some help or advices to the right direction! :)

@DanielBadura
Copy link
Contributor Author

I am still having problems even with these changes and try to figure it out with the Lokalise support. The %LANG_ISO% placeholder is not added to the filenames for example.

@DanielBadura
Copy link
Contributor Author

So after a longer discussion with the Lokalise support i got to this version of the LokaliseProvider:

The Diff

diff --git a/src/Symfony/Component/Translation/Bridge/Lokalise/LokaliseProvider.php b/src/Symfony/Component/Translation/Bridge/Lokalise/LokaliseProvider.php
index b86d3924ff..b431d263bf 100644
--- a/src/Symfony/Component/Translation/Bridge/Lokalise/LokaliseProvider.php
+++ b/src/Symfony/Component/Translation/Bridge/Lokalise/LokaliseProvider.php
@@ -14,6 +14,7 @@ namespace Symfony\Component\Translation\Bridge\Lokalise;
 use Psr\Log\LoggerInterface;
 use Symfony\Component\Translation\Exception\ProviderException;
 use Symfony\Component\Translation\Loader\LoaderInterface;
+use Symfony\Component\Translation\MessageCatalogue;
 use Symfony\Component\Translation\MessageCatalogueInterface;
 use Symfony\Component\Translation\Provider\ProviderInterface;
 use Symfony\Component\Translation\TranslatorBag;
@@ -37,14 +38,16 @@ final class LokaliseProvider implements ProviderInterface
     private $logger;
     private $defaultLocale;
     private $endpoint;
+    private $intlIcuEnabled;

-    public function __construct(HttpClientInterface $client, LoaderInterface $loader, LoggerInterface $logger, string $defaultLocale, string $endpoint)
+    public function __construct(HttpClientInterface $client, LoaderInterface $loader, LoggerInterface $logger, string $defaultLocale, string $endpoint, bool $intlIcuEnabled = false)
     {
         $this->client = $client;
         $this->loader = $loader;
         $this->logger = $logger;
         $this->defaultLocale = $defaultLocale;
         $this->endpoint = $endpoint;
+        $this->intlIcuEnabled = $intlIcuEnabled;
     }

     public function __toString(): string
@@ -102,7 +105,14 @@ final class LokaliseProvider implements ProviderInterface

         foreach ($translations as $locale => $files) {
             foreach ($files as $filename => $content) {
-                $translatorBag->addCatalogue($this->loader->load($content['content'], $locale, str_replace('.xliff', '', $filename)));
+                $domain = str_replace('.' . $locale . '.xliff', '', $filename);
+
+                if ($this->intlIcuEnabled) {
+                    $domain .= MessageCatalogue::INTL_DOMAIN_SUFFIX;
+                }
+
+                $catalogue = $this->loader->load($content['content'], $locale, $domain);
+                $translatorBag->addCatalogue($catalogue);
             }
         }

@@ -149,6 +159,8 @@ final class LokaliseProvider implements ProviderInterface
                 'filter_langs' => array_values($locales),
                 'filter_filenames' => array_map([$this, 'getLokaliseFilenameFromDomain'], $domains),
                 'export_empty_as' => 'skip',
+                'placeholder_format' => $this->intlIcuEnabled ? 'icu' : 'symfony',
+                'include_tags' => ['symfony'],
             ],
         ]);

@@ -183,6 +195,7 @@ final class LokaliseProvider implements ProviderInterface
                     'android' => null,
                     'other' => null,
                 ],
+                'tags' => ['symfony'],
             ];
         }

@@ -352,6 +365,6 @@ final class LokaliseProvider implements ProviderInterface

     private function getLokaliseFilenameFromDomain(string $domain): string
     {
-        return sprintf('%s.xliff', $domain);
+        return sprintf('%s.%s.xliff', $domain, '%LANG_ISO%');
     }
 }

In this diff are 3 changes, 2 of them which are imho important, the last one less important but nice to have.

  1. The most important the ICU Change proposed in this PR waiting for comments if this is a bugfix and if the configuration suggestion is good to go.
  2. The %LANG_ISO% placeholder Lokalise uses for replacement of the language suffix. We need to add this placeholder ourself. Lokalise cannot detect it via API right now, that's why i'm also replacing here the locale: $domain = str_replace('.' . $locale . '.xliff', '', $filename);. This should also be a bugfix imho.
  3. Adding the possibility to add a tag for the translation keys. With this developers can better filter the translations via Lokalise features. We are needing this since we are using Lokalise also with our frontend and there we are not needing the backend / symfony translations. But i think this should also be optional and not hardcoded like it is in the diff. That should be a new feature and is the less important one imho.

What do you think about these points? I would happy to create these into issues / PR's.

@fabpot
Copy link
Member
fabpot commented Mar 26, 2022

@welcoMattic Can you have a look at the latest comment from @DanielBadura?

@welcoMattic
Copy link
Member
welcoMattic commented Mar 28, 2022

@DanielBadura

So after a longer discussion with the Lokalise support i got to this version of the LokaliseProvider:
The Diff

In this diff are 3 changes, 2 of them which are imho important, the last one less important but nice to have.

  1. The most important the ICU Change proposed in this PR waiting for comments if this is a bugfix and if the configuration suggestion is good to go.

  2. The %LANG_ISO% placeholder Lokalise uses for replacement of the language suffix. We need to add this placeholder ourself. Lokalise cannot detect it via API right now, that's why i'm also replacing here the locale: $domain = str_replace('.' . $locale . '.xliff', '', $filename);. This should also be a bugfix imho.

For the first 2 points, I'm ok with it, you did great work, thank you. Just a small comment about parameter naming.

  1. Adding the possibility to add a tag for the translation keys. With this developers can better filter the translations via Lokalise features. We are needing this since we are using Lokalise also with our frontend and there we are not needing the backend / symfony translations. But i think this should also be optional and not hardcoded like it is in the diff. That should be a new feature and is the less important one imho.

What do you think about these points? I would happy to create these into issues / PR's.

About this, I have been wondering for some time if a particular configuration per provider is something to add. And your comment goes in this direction: for example, we could allow users to define one or more tags to add to their Symfony translations, but without forcing it.

We should consider it as a feature to be done in another PR.

@welcoMattic
Copy link
Member
welcoMattic commented Apr 6, 2022

@DanielBadura could you have a look to failing checks? Otherwise, I think this PR is ready to merge for me

@DanielBadura
Copy link
Contributor Author

Yes i will try to look at it the next days @welcoMattic :)

@DanielBadura DanielBadura force-pushed the lokalise-provider-icu branch from 74633ef to 692f87a Compare June 15, 2022 13:25
@DanielBadura
Copy link
Contributor Author

So i rebased again and added a default value for the provider creation for the tests. That resolved most of the failing test. Now only my new tests are failing :D. But I'm really not sure why I went through it mutltiple times now and i dont understand why the test is failing.. Maybe someone can also have a look? 😅

@welcoMattic
Copy link
Member

@DanielBadura I'll try to take a look on it this week

Co-authored-by: Mathieu Santostefano <msantostefano@protonmail.com>
@DanielBadura
Copy link
Contributor Author

code style checks from fabbot seems unrelated 🤔

@nicolas-grekas
Copy link
Member

Sorry to come late in the review, but can't we use the intl-icu suffix in the domain to decide for the correct format?

@DanielBadura
Copy link
Contributor Author

Maybe this is possible for sending the files based on that. But if we have no files local and want to read the files from the provider we need to pass the option to get it in the right format.

@nicolas-grekas
Copy link
Member

Can't we set some metadata on the provider side when pushing to know the format?

@DanielBadura
Copy link
Contributor Author
DanielBadura commented Aug 3, 2022

Im not sure how this can help us since we need to know the format before the query the provider. The format is part of the inital request to retrieve the data. If i understand correctly what you try to achieve is that lokalise is giving us the data in a specific format if we tell them so in some kind of metadata when uploading. AFAIK that is not a feature lokalise e.g. is offering. Maybe with their automation feature the user can somehow achieve that but Im not sure about that. Or did I misunderstood you?

EDIT:
Also what happens if we want to use translation files from mutliple sources? Like also pushing translations from frontend and want to reuse them also in the backend?

@nicolas-grekas
Copy link
Member
nicolas-grekas commented Aug 3, 2022

You've got me right. Is lokalize the only provider that allow fetching the messages in various formats? I feel like the other providers don't deal with formats and give back strings as is. Am I right?
What about putting a dummy message as a marker that ICU was used in the source? Would that work as a kind of metadata?
I'm thinking loud and I don't know the topic well, sorry if that doesn't make sense.

@DanielBadura
Copy link
Contributor Author

I can not answer this question about the other providers since i only worked with Lokalise so far. But i looked briefly into the documentation for Loco and there it seems they also offer this. See here the export API we are using in the LocoProvider: https://localise.biz/api/docs/export/exportall and the param doc https://localise.biz/help/developers/printf

Side note: The namings for all the providers can be confusing.

So i assumed you meant Lokalise with lokalize.

@nicolas-grekas
Copy link
Member

I'm going to delegate to @welcoMattic at this point 😬

@welcoMattic
Copy link
Member

@DanielBadura yes, the Provider names are confusing, but we can not do anything about that except to be rigorous when mentioning them.

About the main problem, which I understand is telling Lokalise if we want to download translation under ICU format or "symfony" format, IIRC Lokalise is the only provider (for now) which lets us define the format at download by exposing a parameter in the query, so let's use it.

About @nicolas-grekas comment:

Sorry to come late in the review, but can't we use the intl-icu suffix in the domain to decide for the correct format?

No we can't, because at the provider level, we only know the domain itself, not the variant (with or without +intl-icu suffix). It on purpose, to avoid translations duplication on providers side when user switch from the old format to ICU.

So, for me the only way to do, is the one has been done here: expose a config option for Lokalise provider, to explicitly ask for ICU format or "symfony" format.

Now, one question remains: should we set the config option at icu or symfony by default?

@nicolas-grekas
Copy link
Member
nicolas-grekas commented Aug 3, 2022

icu if extension_loaded('intl')?

@DanielBadura
Copy link
Contributor Author

icu if extension_loaded('intl')?

I had considered this too in the first place. But imho thats not a good approach since you could have intl extension enabled but still using the "old" / "symfony" format. So from my point of view configuring and let the user decide is the best option.

Also as i said before, i think it would be good to configure it for the whole component. Then this config could also be used to determine in which format the translations should be created. But this is another topic but maybe relevant.

@nicolas-grekas
Copy link
Member

But if the provider is in control, we don't care about the exact format on the app's side, do we? If not, we should fetch the most capable format, which is ICU. Why would the app want to decide what it works with when it can delegate to the provider?

@welcoMattic
Copy link
Member

@DanielBadura about Loco (localise.biz), they also expose a parameter to force icu format at export (see https://localise.biz/api/docs/export/exportall). Are you ok to open another PR to add support for icu format on Loco Provider, after this one?

@DanielBadura
Copy link
Contributor Author
DanielBadura commented Aug 3, 2022

Well since I'm not working with Loco or generally with translations right now, I can not really test it. 😅

@welcoMattic
Copy link
Member

Well since I'm not working with Loco or generally with translations right now, I can not really test it. 😅

Ok, no problem, I will take it

Copy link
@TwanVermeulen TwanVermeulen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a suggestion that will fix a possible issue for the plurals

@@ -149,6 +156,7 @@ private function exportFiles(array $locales, array $domains): array
'filter_langs' => array_values($locales),
'filter_filenames' => array_map([$this, 'getLokaliseFilenameFromDomain'], $domains),
'export_empty_as' => 'skip',
'placeholder_format' => $this->intlIcuEnabled ? 'icu' : 'symfony',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'placeholder_format' => $this->intlIcuEnabled ? 'icu' : 'symfony',
'placeholder_format' => $this->intlIcuEnabled ? 'icu' : 'symfony',
'plural_format' => $this->intlIcuEnabled ? 'icu' : 'symfony',

A different parameter is required to ensure that the plurals are also in the expected ICU format. The param is called plural_format which should contain the same value as placeholder_format. See https://developers.lokalise.com/reference/download-files for all params. It would be great if that can be included. Without this option users might get invalid ICU translation files

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change, you will still get plurals in the old format from Lokalise, which the Symfony Translator is not expecting in ICU files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be really weird behaviour IMHO, but better safe then sorry i guess :D

Could you verify this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird, but I guess @TwanVermeulen is right. According to the documentation https://developers.lokalise.com/reference/api-plurals-and-placeholders#plural-format-descriptions I guess we have to pass both parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add it, but as i said - cant really verify for now as I'm not working anymore on the project which needed this patch. So is this verified that this is needed and working?

@DanielBadura
Copy link
Contributor Author

So what could be the next steps to finish this PR? IIRC we did not conclude on how the configuration should be done.

@welcoMattic
Copy link
Member

@DanielBadura sorry for the delay. To finish this PR I think we need to be sure about the plural parameter value.

About the configuration, I think we are ok. We need to enable the ICU format per Provider. As I mentioned earlier, I will take care of Loco Provider once this PR will be merged.

@TwanVermeulen
Copy link

I'm not sure if it helps, but I did verify this myself + I got the confirmation from the Lokalise support desk. If you want to receive the plurals in ICU format, you need to specify this via plural_format configuration.

@nicolas-grekas
Copy link
Member

I'd like to have an answer to my previous comment:

But if the provider is in control, we don't care about the exact format on the app's side, do we? If not, we should fetch the most capable format, which is ICU. Why would the app want to decide what it works with when it can delegate to the provider?

WDYT? I'm not happy with adding a config option...

@DanielBadura
Copy link
Contributor Author

IMHO the app is in control in which format it is using the translations. I'm not sure how we want to delegate this to the provider to decide 🤔.

@welcoMattic
Copy link
Member

I'd like to have an answer to my previous comment:

But if the provider is in control, we don't care about the exact format on the app's side, do we? If not, we should fetch the most capable format, which is ICU. Why would the app want to decide what it works with when it can delegate to the provider?

WDYT? I'm not happy with adding a config option...

Some apps are still working with the old plural format, not ICU. For those apps, we must keep an option to not fetch translations in ICU format, otherwise I guess the fetched translations will be in a weird format (old plural format treated as ICU).

Did I miss something?

@DanielBadura
Copy link
Contributor Author

How can we move forward here?

@@ -857,6 +857,7 @@ private function addTranslatorSection(ArrayNodeDefinition $rootNode, callable $e
->fixXmlConfig('locale')
->children()
->scalarNode('dsn')->end()
->booleanNode('intl_icu_enabled')->defaultFalse()->end()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
->booleanNode('intl_icu_enabled')->defaultFalse()->end()
->booleanNode('intl_icu')->defaultFalse()->end()

I think it's better to name it short.

@welcoMattic
Copy link
Member

Some apps are still working with the old plural format, not ICU. For those apps, we must keep an option to not fetch translations in ICU format, otherwise I guess the fetched translations will be in a weird format (old plural format treated as ICU).
Did I miss something?

I'm still convinced that we must leave the control of ICU or not to the app. For example, an app with the special %count% parameter in a trans call will fail with ICU formatted translation, IIRC the difference between both formats.

The configuration option seems to be our only solution here. It's totally acceptable for me.

@nicolas-grekas @DanielBadura WYDT?

To move forward, this branch needs to be rebased. @DanielBadura, are the functional tests still relevant? (I mean, if you test this branch with real API calls, Lokalise responds as expected?)

@DanielBadura
Copy link
Contributor Author

I'm still convinced that we must leave the control of ICU or not to the app. For example, an app with the special %count% parameter in a trans call will fail with ICU formatted translation, IIRC the difference between both formats.

The configuration option seems to be our only solution here. It's totally acceptable for me.

I think the same, hence why i proposed the configuration option.

To move forward, this branch needs to be rebased. @DanielBadura, are the functional tests still relevant? (I mean, if you test this branch with real API calls, Lokalise responds as expected?)

I could rebase this branch. About the testing part: I'm no longer working on the project which had this implementation, so i cant really test it right now with real API calls to Lokalise.

@fabpot fabpot modified the milestones: 5.4, 6.4 Dec 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0