Integrate IANA language registry with language-data and MediaWiki (let MediaWiki "knows" all languages with ISO 639-1/2/3 codes)
Open, Needs Triage, Public

Description

This would allow 7,000+ languages to be used in Wikidata terms, monolingual texts, and lexemes; Abstract Wikipedia would also benefit from it.

(For translating language names, see T231755: Local language name should be translatable in translatewiki.net)


Original:

I had a look at that tracker and found http://unicode.org/cldr/trac/ticket/9137 where two of the codes here (fkv and sje) were already requested but rejected. Given the following comment, it seems like it would be a waste of time to request the addition of more languages there:

We agreed to document there is no intent for CLDR to have the English names of all languages (there are over 7,000 of them), and point to http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry as a source for any extra ones that people need.

(This covers all languages, not only those MediaWiki supports.) See parent tasks for use cases. The names should preferably be translatable, but there should be a way to prevent duplicate translations.

http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry is a good resource.
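For reference, the registry is a plain-text file of records separated by `%%` lines, each record being a list of `Field: value` pairs. A minimal Python sketch of extracting a code-to-English-name map from it could look like the following. The sample text is abridged (the real file also has a `File-Date` header and fields such as `Added`), and the function names `parse_registry` and `language_names` are made up for illustration; this is not how any existing extension does it.

```python
import re

# Abridged sample records in the registry's layout. A real run would read
# the full file from the IANA URL instead of this string.
SAMPLE = """\
Type: language
Subtag: es
Description: Spanish
Description: Castilian
%%
Type: language
Subtag: fkv
Description: Kven Finnish
%%
Type: language
Subtag: sje
Description: Pite Sami
%%
Type: script
Subtag: Arab
Description: Arabic
"""


def parse_registry(text):
    """Yield one dict per record; each field maps to a list of its values."""
    for block in re.split(r"^%%\s*$", text, flags=re.MULTILINE):
        record = {}
        last_key = None
        for line in block.splitlines():
            if not line.strip():
                continue
            if line[0].isspace() and last_key is not None:
                # Folded continuation line: append to the previous value.
                record[last_key][-1] += " " + line.strip()
                continue
            key, _, value = line.partition(":")
            last_key = key.strip()
            record.setdefault(last_key, []).append(value.strip())
        if record:
            yield record


def language_names(text):
    """Map each language subtag to its first listed English description."""
    names = {}
    for record in parse_registry(text):
        if record.get("Type") == ["language"]:
            names[record["Subtag"][0]] = record["Description"][0]
    return names


names = language_names(SAMPLE)
```

Taking only the first `Description` is a simplification; some subtags list several names, and a real integration would have to decide which one to show.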

We could also adapt an existing MediaWiki extension, such as CLDR.

Event Timeline

Would this be intended to provide a full solution to T151269? If so, it would also need to support non-standard MediaWiki languages (e.g. "nl-informal") and IETF language tags with countries, scripts or other variants (e.g. pt-br, ku-arab, be-tarask).

IETF language tags with countries, scripts or other variants can be generated easily, because the set of valid (non-private-use) subtags is finite.
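To illustrate the point about finiteness, here is a rough Python sketch that enumerates `language[-Script][-REGION]` combinations from given subtag lists. The helper name `candidate_tags` and the tiny input lists are hypothetical; in practice the lists would come from the registry. Note that this only produces well-formed combinations, not necessarily meaningful ones, and the count grows multiplicatively with the size of each list.

```python
from itertools import product


def candidate_tags(languages, scripts, regions):
    """Return every language[-Script][-REGION] combination of the inputs."""
    tags = []
    # None stands for "subtag omitted", so bare language codes are included.
    for lang, script, region in product(
        languages, [None, *scripts], [None, *regions]
    ):
        parts = [lang.lower()]
        if script:
            parts.append(script.title())  # scripts are conventionally Titlecase
        if region:
            parts.append(region.upper())  # regions are conventionally UPPERCASE
        tags.append("-".join(parts))
    return tags


tags = candidate_tags(["pt", "ku"], ["Arab"], ["BR"])
```

With these inputs the result has eight entries, including `pt-BR` and `ku-Arab`. MediaWiki itself writes such codes in lowercase (pt-br, ku-arab), so a real integration would lowercase them before use.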

The Babel extension already bundles the IANA registry, but does not register the English names with MediaWiki.

What do you mean by "list name of all languages"? A special page like Special:SiteMatrix but just with a super-long table of language codes and names?

It would be more useful to have MediaWiki find all the language codes (and names) it actually needs, e.g. for #language, #babel, ULS. The main obstacle to that is having a map from custom to "real" language codes i.e. T59133: Search in CLDR-aliased language codes, but output corresponding MediaWiki locale code (or $wgLanguageCode or wikiId?), AFAICT.
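As a rough idea of what such a map could look like, here is a Python sketch with a two-way lookup between MediaWiki's custom codes and standard tags. The three entries are illustrative assumptions only and would need to be checked against MediaWiki's own mapping; the names `CUSTOM_TO_STANDARD`, `to_standard` and `to_mediawiki` are invented for this example.

```python
# Illustrative entries only (assumed, not an authoritative list).
CUSTOM_TO_STANDARD = {
    "nl-informal": "nl-x-informal",
    "zh-classical": "lzh",
    "zh-min-nan": "nan",
}

# Reverse index, keyed by the lowercased standard tag.
STANDARD_TO_CUSTOM = {
    standard.lower(): custom for custom, standard in CUSTOM_TO_STANDARD.items()
}


def to_standard(code):
    """Translate a MediaWiki code to a standard tag; unknown codes pass through."""
    return CUSTOM_TO_STANDARD.get(code.lower(), code)


def to_mediawiki(tag):
    """Translate a standard tag to the MediaWiki code, defaulting to lowercase."""
    return STANDARD_TO_CUSTOM.get(tag.lower(), tag.lower())
```

For example, a search that matched `lzh` would output `zh-classical` via `to_mediawiki`, while `pt-BR` would simply be lowercased to `pt-br`.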

No. As the Babel extension already bundles the IANA registry, other extensions could reuse that data (e.g. for T151269: Add English names for languages which don't yet have one). Most of the content in https://phabricator.wikimedia.org/diffusion/ECLD/browse/master/LocalNames/LocalNamesEn.php could then be eliminated.

However, even with this information available from the IANA registry, allowing the language names to be translated is still useful.

I'm going to add another use case.

Bugreporter renamed this task from Create or adapt an extension to list name of all languages with ISO 639-1/2/3 codes to Integrate IANA language registry with language-data and MediaWiki (let MediaWiki "knows" all languages with ISO 639-1/2/3 codes). Jun 15 2021, 10:43 AM
Bugreporter updated the task description.