8000 [Form][Security][Validator] Normalize translation files by nicolas-grekas · Pull Request #53404 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

[Form][Security][Validator] Normalize translation files #53404

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 4, 2024

Conversation

nicolas-grekas
Copy link
Member
@nicolas-grekas nicolas-grekas commented Jan 4, 2024
Q A
Branch? 5.4
Bug fix? no
New feature? no
Deprecations? no
Issues -
License MIT

I used the below script to spot mistakes and normalize translation files.

I propose to accept English where translations are missing because this makes it easier to automate and also to spot what's missing (opening only one file is needed.)

Carsonbot would need to be updated to account for this when checking for missing translations /cc @Nyholm

<?php

use Symfony\Component\Finder\Finder;
use Symfony\Component\Translation\Loader\XliffFileLoader;
use Symfony\Component\Translation\MessageCatalogue;

require __DIR__.'/vendor/autoload.php';

function dumpXliff1(string $defaultLocale, MessageCatalogue $messages, string $domain)
{
    $dom = new \DOMDocument('1.0', 'utf-8');
    $dom->formatOutput = true;

    $xliff = $dom->appendChild($dom->createElement('xliff'));
    $xliff->setAttribute('version', '1.2');
    $xliff->setAttribute('xmlns', 'urn:oasis:names:tc:xliff:document:1.2');

    $xliffFile = $xliff->appendChild($dom->createElement('file'));
    $xliffFile->setAttribute('source-language', str_replace('_', '-', $defaultLocale));
    $xliffFile->setAttribute('target-language', 'no' === $messages->getLocale() ? 'nb' : str_replace('_', '-', $messages->getLocale()));
    $xliffFile->setAttribute('datatype', 'plaintext');
    $xliffFile->setAttribute('original', 'file.ext');

    $xliffBody = $xliffFile->appendChild($dom->createElement('body'));
    foreach ($messages->all($domain) as $source => $target) {
        $translation = $dom->createElement('trans-unit');
        $metadata = $messages->getMetadata($source, $domain);

        $translation->setAttribute('id', $metadata['id']);

        $s = $translation->appendChild($dom->createElement('source'));
        $s->appendChild($dom->createTextNode($source));

        $text = 1 === preg_match('/[&<>]/', $target) ? $dom->createCDATASection($target) : $dom->createTextNode($target);

        $targetElement = $dom->createElement('target');

        if ('en' !== $messages->getLocale() && $target === $source && 'Error' !== $source) {
            $targetElement->setAttribute('state', 'needs-translation');
        }
        if (isset($metadata['target-attributes'])) {
            foreach ($metadata['target-attributes'] as $key => $value) {
                $targetElement->setAttribute($key, $value);
            }
        }

        $t = $translation->appendChild($targetElement);
        $t->appendChild($text);

        $xliffBody->appendChild($translation);
    }

    return preg_replace('/^ +/m', '$0$0', $dom->saveXML());
}


foreach (['Security/Core' => 'security', 'Form' => 'validators', 'Validator' => 'validators'] as $component => $domain) {
    $dir = __DIR__.'/src/Symfony/Component/'.$component.'/Resources/translations';

    $enCatalogue = (new XliffFileLoader())->load($dir.'/'.$domain.'.en.xlf', 'en', $domain);

    file_put_contents($dir.'/'.$domain.'.en.xlf', dumpXliff1('en', $enCatalogue, $domain));
    $finder = new Finder();

    foreach ($finder->files()->in($dir)->name('*.xlf') as $file) {
        $locale = substr($file->getBasename(), 1 + strlen($domain), -4);

        $catalogue = (new XliffFileLoader())->load($file, $locale, $domain);
        $localeCatalogue = new MessageCatalogue($locale);

        foreach ($enCatalogue->all($domain) as $id => $translation) {
            $metadata = [];
            if ($catalogue->defines($id, $domain)) {
                $translation = $catalogue->get($id, $domain);
                $metadata = $catalogue->getMetadata($id, $domain);
            }
            $metadata['id'] = $enCatalogue->getMetadata($id, $domain)['id'];
            $localeCatalogue->set($id, $translation, $domain);
            $localeCatalogue->setMetadata($id, $metadata, $domain);
        }

        file_put_contents($file, dumpXliff1('en', $localeCatalogue, $domain));
    }
}

@carsonbot carsonbot added this to the 5.4 milestone Jan 4, 2024
@carsonbot carsonbot changed the title Normalize translation files [Form][Security][Validator] Normalize translation files Jan 4, 2024
@Nyholm
8000
Copy link
Member
Nyholm commented Jan 4, 2024

I propose to accept English where translations are missing because this makes it easier to automate and also to spot what's missing.

But this will make it hard to know if a translation is missing or not. Do we really want to force users to do this?

$missing = $original !== $translation

@nicolas-grekas nicolas-grekas force-pushed the tr-cs branch 3 times, most recently from f06e7de to 6aa5cec Compare January 4, 2024 11:09
@nicolas-grekas
Copy link
Member Author
nicolas-grekas commented Jan 4, 2024

I added state="needs-translation" to all <target> that aren't translated, as defined by the xliff spec for this purpose.

Copy link
Member
@Nyholm Nyholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

(Just update according to the test. Ie, the test is correct. )

@nicolas-grekas
Copy link
Member Author

PR ready, tests fixed!

@fabpot
Copy link
Member
fabpot commented Jan 4, 2024

I don't think the current xliff reader takes needs-translation into account. Something to add in this PR as well, right?

@thunderer
Copy link
Contributor

@nicolas-grekas I'm wondering, this will cause conflicts in all existing translation PRs, would it be possible to merge them before this one? Afterwards it seems like you could rerun the script to update the files.

@Nyholm
Copy link
Member
Nyholm commented Jan 4, 2024

I don't think the current xliff reader takes needs-translation into account.

Oh.. Does this mean my fallback locale is never used?
Ie:

framework:
    default_locale: 'fr'
    translator:
        fallbacks: ['sv']

If validators.fr.xlf is missing a translation, will it pick it up from validators.sv.xlf? Or, will the XliffReader think the translation is there?

8000

@nicolas-grekas
Copy link
Member Author

xliff reader takes needs-translation into account

Indeed. Updated!

@@ -111,6 +111,10 @@ private function extractXliff1(\DOMDocument $dom, MessageCatalogue $catalogue, s
continue;
}

if (isset($translation->target) && 'needs-translation' === (string) $translation->target->attributes()['state']) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would deserve a changelog entry IMO, while sneaking it in the normalization PR makes it invisible.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #54541

@stof
Copy link
Member
stof commented Jan 4, 2024

@nicolas-grekas can you update carsonbot to make it aware of those needs-translation attributes to consider them missing ?

nicolas-grekas added a commit that referenced this pull request Jan 5, 2024
…tor implementations to be used (xabbuh)

This PR was merged into the 5.4 branch.

Discussion
----------

[Form][Security][Validator] prevent incompatible Translator implementations to be used

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Issues        | Fix #53404 (comment)
| License       | MIT

Commits
-------

86a21dd prevent incompatible Translator implementations to be used
symfony-splitter pushed a commit to symfony/validator that referenced this pull request Jan 5, 2024
…tor implementations to be used (xabbuh)

This PR was merged into the 5.4 branch.

Discussion
----------

[Form][Security][Validator] prevent incompatible Translator implementations to be used

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Issues        | Fix symfony/symfony#53404 (comment)
| License       | MIT

Commits
-------

86a21dd578 prevent incompatible Translator implementations to be used
fabpot added a commit that referenced this pull request Jan 6, 2024
This PR was merged into the 5.4 branch.

Discussion
----------

Add .github/sync-translations.php

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | no
| New feature?  | no
| Deprecations? | no
| Issues        | -
| License       | MIT

That's the script I used in #53404

With current branch numbers as example: when adding a new message to translate on 7.1, we would ask the author to add it to the "en.xlf" file on 5.4, to run this script, and only then to send the PR on 5.4
This would ensure new messages are duplicated in all languages.

We could also run chatgpt on that PR to pre-fill all translations, building on #53417

Commits
-------

9c4cb74 Add .github/sync-translations.php
fabpot added a commit that referenced this pull request Jan 6, 2024
…colas-grekas)

This PR was merged into the 5.4 branch.

Discussion
----------

[Translation] Add missing translations using ChatGPT

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | no
| New feature?  | no
| Deprecations? | no
| Issues        | -
| License       | MIT

This is a follow up of #53404

The process around the maintenance of our translation files is too heavy currently. I feel like spending hours on the topic, and I guess it's the same for contributors.

Instead, I propose to account for the GPT tools that are really good at translating our short messages.

In order to still allow and call for human review,  I propose to use the `state="needs-review-translation"` on `<target>` tags as defined by XLIFF.

With some help from carsonbot, we'll be able to call for reviews per language. My proposal would be to make it able to send PRs that remove the `state` attribute but call for reviews by a native speaker before we could merge. We could even make the reviewer the author of the git commit if we want to keep some sort of history on the topic.

~`@thunderer` I'd prefer being explicit about using GPT. I'm going to replace your PRs by this one sorry.~

Commits
-------

2bbbff9 Add missing translations using ChatGPT
@kevinpapst
Copy link
kevinpapst commented Apr 10, 2024

I will open a bug report, because ignoring translations that have a needs-translation state should not be treated like missing translations.

Edit: #54541

@nicolas-grekas
Copy link
Member Author

I tackled this aspect in #54536

@nicolas-grekas
Copy link
Member Author

Ah no sorry, not the exact same topic 😬

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants
0