[go: up one dir, main page]

Page MenuHomePhabricator

Create new translate_message_group_subscriptions table on Wikimedia wikis with the Translate extension installed
Closed, ResolvedPublic4 Estimated Story Points

Description

Related to T348501: Minimal prototype implementation for watching message groups.

Table information

Added in 986865: Add table to store user message group subscriptions | https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Translate/+/986865

Note that the patch has been reviewed by @Ladsgroup.

Why? To keep track of the message groups that the user is subscribed to.

We've enabled the group subscription feature on translatewiki.net where this table already exists

Table structure

CREATE TABLE /*_*/translate_message_group_subscriptions (
  tmgs_subscription_id INT UNSIGNED AUTO_INCREMENT NOT NULL,
  tmgs_user_id INT UNSIGNED NOT NULL,
  tmgs_group VARBINARY(200) NOT NULL,
  INDEX translate_tmgs_user_id (tmgs_user_id),
  INDEX translate_tmgs_group (tmgs_group),
  PRIMARY KEY(tmgs_subscription_id)
) /*$wgDBTableOptions*/;

More information

Should this table be replicated to wiki replicas (does it not contain private data)?
It shouldn’t be replicated. It contains data similar to the watchlist table.

Will you be doing cross-joins with the wiki metadata?
Currently we're not doing any joins with wiki metadata, and I don't foresee doing so in the future either.

Size of the table (number of rows expected).
With the current usage of the table, there will be at most:

1 record per message group subscription for a user.

As per translatewiki.net stats for active translators from 2023, there are about 1,600 of them. Assuming that each of these translators subscribe to about 100 message groups, we will have around 160,000 records.

Expected growth per year (number of rows).

Assuming an increase of 200 active translators every year, and about 50 new message group subscriptions for all the users, I'd say about 100,000 records.

These are worst case estimates and the actuals will be much smaller.

For example on translatewiki.net, where the feature has been available for 3 months, we currently have around 450 records.

Expected amount of queries, both writes and reads (per minute, per hour...per day, any of those are ok)

Difficult to say. 1 query will be run every time the user access Special:Translate.

Examples of query

-- Insert a new group subscription for a user
INSERT INTO translate_message_group_subscriptions (tmgs_group, tmgs_user_id)
VALUES ('groupId', 'userId')
ON DUPLICATE KEY UPDATE tmgs_group = 'groupId' AND tmgs_user_id = 'userId';

-- Fetch all users subscribed to a group
SELECT tmgs_group, tmgs_user_id FROM translate_message_group_subscriptions WHERE tmgs_group IN ('group1', 'group2');

-- Fetch all subscriptions for a user
SELECT tmgs_group, tmgs_user_id FROM translate_message_group_subscriptions WHERE tmgs_user_id = "userId";

-- Remove subscriptions
DELETE FROM translate_message_group_subscriptions WHERE tmgs_group= 'groupId' AND tmgs_user_id = 'userId';

The release plan for the feature (are there specific wikis you'd like to test first etc).
Enable out on testwiki first, followed by mediawiki.org and then roll out on all wikis where Translate extension is enabled.

The table will be created first, and then $wgTranslateEnableMessageGroupSubscription = true will be set to enable the feature.


See https://wikitech.wikimedia.org/wiki/Creating_new_tables for more.

Event Timeline

@abi_ So we decided to put this in x1, correct? let me know if you need help creating it in x1.

Tacsipacsi subscribed.

I’ve updated the description: it shouldn’t be replicated, the fact that I subscribed to a message group is private data.


Actually, what purpose does the tmgs_subscription_id column? The watchlist table also has a separate primary key column, but it has been added in T125990, related to the watchlist expiry (it’s used as a foreign key target). Here, I don’t anticipate any foreign keys pointing to this table, so just a primary key of (tmgs_user_id, tmgs_group) would suffice.

Even if it is going to x1, it should still be filtered just in case cc @ABran-WMF

Actually, what purpose does the tmgs_subscription_id column? The watchlist table also has a separate primary key column, but it has been added in T125990, related to the watchlist expiry (it’s used as a foreign key target). Here, I don’t anticipate any foreign keys pointing to this table, so just a primary key of (tmgs_user_id, tmgs_group) would suffice.

The queries that we run are usually made on either the tmgs_user_id or the tmgs_group but not both columns so having separate indexes on them made sense. Some discussions around it happened here: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Translate/+/986865/comment/4acd205b_d2db64da/

The queries that we run are usually made on either the tmgs_user_id or the tmgs_group but not both columns so having separate indexes on them made sense.

Before the mentioned commit (43e386ca16411096bbbd7f14f9f4e15c5e268fe7), the watchlist table had no primary key at all, and apparently worked well for over a decade, so

CREATE TABLE translate_message_group_subscriptions (
  tmgs_user_id INT UNSIGNED NOT NULL,
  tmgs_group VARBINARY(200) NOT NULL,
  INDEX translate_tmgs_user_id (tmgs_user_id),
  INDEX translate_tmgs_group (tmgs_group)
);

(two indices on the two columns, no primary key) could also be an option. Or creating all three indices.

By the way, there are very much queries made on both columns: MessageGroupSubscription::isUserSubscribedTo(), which is run each time one opens Special:Translate (or selects a different message group), queries both columns.

We shouldn't create this table in production until the replication filters are in place though, to avoid having to resanitize everything - @ABran-WMF can you take care of that?

Change #1062436 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] mariadb: exclude translate_message_group_subscriptions from replication

https://gerrit.wikimedia.org/r/1062436

Change #1062436 merged by Arnaudb:

[operations/puppet@production] mariadb: exclude translate_message_group_subscriptions from replication

https://gerrit.wikimedia.org/r/1062436

The queries that we run are usually made on either the tmgs_user_id or the tmgs_group but not both columns so having separate indexes on them made sense.

Before the mentioned commit (43e386ca16411096bbbd7f14f9f4e15c5e268fe7), the watchlist table had no primary key at all, and apparently worked well for over a decade, so

As per: https://www.mediawiki.org/wiki/MediaWiki_database_policy#Schema_changes,

All new tables must have a primary key. When a candidate for primary key could not be created (for example, if all columns can be repeated), a separate auto_increment column, or other arbitrary field (depending on the case), must be added.

By the way, there are very much queries made on both columns: MessageGroupSubscription::isUserSubscribedTo(), which is run each time one opens Special:Translate (or selects a different message group), queries both columns.

You are right, but there will be queries made to just the group (to fetch notifications to be sent) and user (to display groups the user has subscribed to) columns, hence individual indexes made sense.

As per: https://www.mediawiki.org/wiki/MediaWiki_database_policy#Schema_changes,

All new tables must have a primary key. When a candidate for primary key could not be created (for example, if all columns can be repeated), a separate auto_increment column, or other arbitrary field (depending on the case), must be added.

I see. I also found it a bit strange, but I thought that if there’s precedence for this, it’s probably acceptable. Apparently it isn’t.

You are right, but there will be queries made to just the group (to fetch notifications to be sent) and user (to display groups the user has subscribed to) columns, hence individual indexes made sense.

But the two-column index also makes sense. And if it exists, why not use it as a primary key? By the way, if MariaDB is smart enough, probably it can use a (tmgs_user_id, tmgs_group) index to filter for the user, simply ignoring the end, so only two indices are needed – the primary containing two columns, and another one containing just the message group names. Or maybe it’s even acceptable to not have an index for the message group name? The only case when we filter only by message group name (i.e. query all users) is when there are new messages in a group – which is a write action, thus relatively rare. (The queries by user all happen on read actions.)

sanitarium instances have been restarted with the patch

But the two-column index also makes sense. And if it exists, why not use it as a primary key? By the way, if MariaDB is smart enough, probably it can use a (tmgs_user_id, tmgs_group) index to filter for the user, simply ignoring the end, so only two indices are needed – the primary containing two columns, and another one containing just the message group names. Or maybe it’s even acceptable to not have an index for the message group name? The only case when we filter only by message group name (i.e. query all users) is when there are new messages in a group – which is a write action, thus relatively rare. (The queries by user all happen on read actions.)

Two column index would work, and would be more efficient but having a separate primary key would allow us to be flexible. I guess we can add it when the need arises. I'll update the structure of the schema as following:

-- Drop primary key
ALTER TABLE mw.translate_message_group_subscriptions MODIFY COLUMN tmgs_subscription_id int(10) unsigned NOT NULL;
ALTER TABLE mw.translate_message_group_subscriptions DROP PRIMARY KEY;
-- Drop old indexes
ALTER TABLE mw.translate_message_group_subscriptions DROP INDEX translate_tmgs_group;
ALTER TABLE mw.translate_message_group_subscriptions DROP INDEX translate_tmgs_user_id;
-- Add new indexes
ALTER TABLE mw.translate_message_group_subscriptions ADD CONSTRAINT translate_message_group_subscriptions_pk PRIMARY KEY (tmgs_user_id,tmgs_group);
CREATE INDEX translate_message_group_subscriptions_tmgs_group_IDX USING BTREE ON mw.translate_message_group_subscriptions (tmgs_group);

-- New definition for mw.translate_message_group_subscriptions
CREATE TABLE `translate_message_group_subscriptions` (
  `tmgs_user_id` int(10) unsigned NOT NULL,
  `tmgs_group` varbinary(200) NOT NULL,
  PRIMARY KEY (`tmgs_user_id`,`tmgs_group`),
  KEY `translate_message_group_subscriptions_tmgs_group_IDX` (`tmgs_group`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=binary;

SQL generated using Dbeaver.

yeah, adding a PK is important for the sake of replication, backup, integration checks and recovery. It doesn't need to be a dedicated auto_increment column so current schema looks fine to me. If the virtual domain part and the pk part are merged, this is good to go?

Change #1067314 had a related patch set uploaded (by Abijeet Patro; author: Abijeet Patro):

[mediawiki/extensions/Translate@master] MessageGroupSubscription: Use composite primary key

https://gerrit.wikimedia.org/r/1067314

Change #1067351 had a related patch set uploaded (by Abijeet Patro; author: Abijeet Patro):

[mediawiki/extensions/Translate@master] MessageGroupSubscriptionStore: Use virtual domain

https://gerrit.wikimedia.org/r/1067351

Change #1067314 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] MessageGroupSubscription: Use composite primary key

https://gerrit.wikimedia.org/r/1067314

Change #1067351 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] MessageGroupSubscriptionStore: Use virtual domain

https://gerrit.wikimedia.org/r/1067351

This table with the new schema is deployed on translatewiki.net. Did not notice any issues.

Why has x1 been suggested as the location? The table has not been designed for cross-wiki use.

Why has x1 been suggested as the location? The table has not been designed for cross-wiki use.

That's a common misconception. x1 has a shared database for cross-wiki use cases ("wikishared") but also every wiki gets its own databases (so we do have a metawiki database in x1 too) which devs can put tables there.

The reason it's being recommended instead of core databases (when they don't need to join with core tables) is that we don't have a lot of space in core db servers. I wrote in more details in a couple of places (this for wikidata, for commons and here as a more general note)

Thanks for the clarification. Maybe https://wikitech.wikimedia.org/wiki/MariaDB#x1 could be updated to clarify, although I now notice there are [wiki] items on the list (and outdated team name, I'll fix that).

Thanks. Added a sentence there. Edit mercilessly.

Currently waiting to create the tables on x1

I will help getting it done, just quite busy these days.

Created the table on testwiki

mwscript sql.php --wiki=testwiki --cluster=extension1 /srv/mediawiki/php-1.43.0-wmf.22/extensions/Translate/sql/mysql/translate_message_group_subscriptions.sql

Checked if the table is created

mwscript mysql.php --wiki testwiki --cluster extension1

Table description:

wikiadmin2023@10.x.x.x(testwiki)> describe translate_message_group_subscriptions;
+--------------+------------------+------+-----+---------+-------+
| Field        | Type             | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+-------+
| tmgs_user_id | int(10) unsigned | NO   | PRI | NULL    |       |
| tmgs_group   | varbinary(200)   | NO   | PRI | NULL    |       |
+--------------+------------------+------+-----+---------+-------+
2 rows in set (0.001 sec)

I have a patch to enable the message group subscription feature on testwiki: 1072166: Enable message group subscription feature for Test Wikipedia | https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1072166

I confirm that it's created on x1

Change #1075521 had a related patch set uploaded (by Abijeet Patro; author: Abijeet Patro):

[operations/mediawiki-config@master] Translate: Add VirtualDomainsMapping

https://gerrit.wikimedia.org/r/1075521

We enabled the message group subscription feature on testwiki today, but encountered database errors.

Error 1146: Table 'testwiki.translate_message_group_subscriptions' doesn't exist
Function: MediaWiki\Extension\Translate\MessageGroupProcessing\MessageGroupSubscriptionStore::getSubscriptions

I realized that I did not add the VirtualDomainsMapping. I've submitted the patch to do so here: 1075521: Translate: Add VirtualDomainsMapping | https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1075521

Change #1075521 merged by jenkins-bot:

[operations/mediawiki-config@master] Translate: Add VirtualDomainsMapping

https://gerrit.wikimedia.org/r/1075521

Mentioned in SAL (#wikimedia-operations) [2024-09-26T07:04:59Z] <kartik@deploy1003> Started scap sync-world: Backport for [[gerrit:1075521|Translate: Add VirtualDomainsMapping (T372287)]]

Mentioned in SAL (#wikimedia-operations) [2024-09-26T07:07:45Z] <kartik@deploy1003> abi, kartik: Backport for [[gerrit:1075521|Translate: Add VirtualDomainsMapping (T372287)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-09-26T07:18:24Z] <kartik@deploy1003> Finished scap sync-world: Backport for [[gerrit:1075521|Translate: Add VirtualDomainsMapping (T372287)]] (duration: 13m 25s)

Seen on testwiki yesterday (in case this is something to worry about)

Error
labels.normalized_message
[{reqId}] {exception_url}   PHP Deprecated: Caller from MediaWiki\User\UserGroupManager::getUserGroupMemberships ignored an error originally raised from MediaWiki\Extension\Translate\MessageGroupProcessing\MessageGroupSubscriptionStore::getSubscriptions:
FrameLocationCall
from/srv/mediawiki/php-1.43.0-wmf.24/includes/debug/MWDebug.php(385)
#0[internal function]MWExceptionHandler::handleError(int, string, string, int, array)
#1/srv/mediawiki/php-1.43.0-wmf.24/includes/debug/MWDebug.php(385)trigger_error(string, int)
#2/srv/mediawiki/php-1.43.0-wmf.24/includes/db/MWLBFactory.php(453)MediaWiki\Debug\MWDebug::sendRawDeprecated(string, bool, string)
#3[internal function]MWLBFactory::logDeprecation(string)
#4/srv/mediawiki/php-1.43.0-wmf.24/includes/libs/rdbms/database/TransactionManager.php(191)call_user_func(array, string)
#5/srv/mediawiki/php-1.43.0-wmf.24/includes/libs/rdbms/database/Database.php(986)Wikimedia\Rdbms\TransactionManager->assertTransactionStatus(Wikimedia\Rdbms\DatabaseMySQL, array, string)
#6/srv/mediawiki/php-1.43.0-wmf.24/includes/libs/rdbms/database/Database.php(635)Wikimedia\Rdbms\Database->assertQueryIsCurrentlyAllowed(string, string)
#7/srv/mediawiki/php-1.43.0-wmf.24/includes/libs/rdbms/database/Database.php(1344)Wikimedia\Rdbms\Database->query(Wikimedia\Rdbms\Query, string)
#8/srv/mediawiki/php-1.43.0-wmf.24/includes/libs/rdbms/database/DBConnRef.php(127)Wikimedia\Rdbms\Database->select(array, array, array, string, array, array)
#9/srv/mediawiki/php-1.43.0-wmf.24/includes/libs/rdbms/database/DBConnRef.php(351)Wikimedia\Rdbms\DBConnRef->__call(string, array)
#10/srv/mediawiki/php-1.43.0-wmf.24/includes/libs/rdbms/querybuilder/SelectQueryBuilder.php(746)Wikimedia\Rdbms\DBConnRef->select(array, array, array, string, array, array)
#11/srv/mediawiki/php-1.43.0-wmf.24/includes/user/UserGroupManager.php(771)Wikimedia\Rdbms\SelectQueryBuilder->fetchResultSet()
#12/srv/mediawiki/php-1.43.0-wmf.24/includes/user/UserGroupManager.php(737)MediaWiki\User\UserGroupManager->getUserGroupMemberships(MediaWiki\User\User, int)
#13/srv/mediawiki/php-1.43.0-wmf.24/includes/user/UserGroupManager.php(323)MediaWiki\User\UserGroupManager->getUserGroups(MediaWiki\User\User, int)
#14/srv/mediawiki/php-1.43.0-wmf.24/includes/Permissions/PermissionManager.php(1539)MediaWiki\User\UserGroupManager->getUserEffectiveGroups(MediaWiki\User\User)
#15/srv/mediawiki/php-1.43.0-wmf.24/includes/Permissions/PermissionManager.php(1490)MediaWiki\Permissions\PermissionManager->getUserPermissions(MediaWiki\User\User)
#16/srv/mediawiki/php-1.43.0-wmf.24/includes/Permissions/UserAuthority.php(285)MediaWiki\Permissions\PermissionManager->userHasRight(MediaWiki\User\User, string)
#17/srv/mediawiki/php-1.43.0-wmf.24/includes/Permissions/UserAuthority.php(144)MediaWiki\Permissions\UserAuthority->internalAllowed(string, null, bool, null)
#18/srv/mediawiki/php-1.43.0-wmf.24/includes/user/User.php(2174)MediaWiki\Permissions\UserAuthority->isAllowed(string, null)
#19/srv/mediawiki/php-1.43.0-wmf.24/includes/user/User.php(1433)MediaWiki\User\User->isAllowed(string)
#20/srv/mediawiki/php-1.43.0-wmf.24/includes/block/BlockManager.php(758)MediaWiki\User\User->getBlock()
#21/srv/mediawiki/php-1.43.0-wmf.24/includes/MediaWikiEntryPoint.php(408)MediaWiki\Block\BlockManager->trackBlockWithCookie(MediaWiki\User\User, MediaWiki\Request\WebResponse)
#22/srv/mediawiki/php-1.43.0-wmf.24/includes/MediaWikiEntryPoint.php(189)MediaWiki\MediaWikiEntryPoint->commitMainTransaction()
#23/srv/mediawiki/php-1.43.0-wmf.24/includes/MediaWikiEntryPoint.php(172)MediaWiki\MediaWikiEntryPoint->doPrepareForOutput()
#24/srv/mediawiki/php-1.43.0-wmf.24/includes/MediaWiki.php(90)MediaWiki\MediaWikiEntryPoint->prepareForOutput()
#25/srv/mediawiki/php-1.43.0-wmf.24/includes/api/ApiMain.php(960)MediaWiki::preOutputCommit(MediaWiki\Context\DerivativeContext)
#26/srv/mediawiki/php-1.43.0-wmf.24/includes/api/ApiMain.php(903)ApiMain->executeActionWithErrorHandling()
#27/srv/mediawiki/php-1.43.0-wmf.24/includes/api/ApiEntryPoint.php(153)ApiMain->execute()
#28/srv/mediawiki/php-1.43.0-wmf.24/includes/MediaWikiEntryPoint.php(200)MediaWiki\Api\ApiEntryPoint->execute()
#29/srv/mediawiki/php-1.43.0-wmf.24/api.php(44)MediaWiki\MediaWikiEntryPoint->run()
#30/srv/mediawiki/w/api.php(3)require(string)
#31{main}

Seen on testwiki yesterday (in case this is something to worry about)
....

Apologies for not responding to this earlier. I'd investigated this earlier and it was because of a missing virtual domain mapping. This was added later: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1075521

List of wikis to create the table in:

  • advisorswiki
  • aewikimedia
  • amwikimedia
  • azwikimedia
  • bdwikimedia
  • betawikiversity
  • bewikimedia
  • brwikimedia
  • cawikimedia
  • collabwiki
  • commonswiki
  • foundationwiki
  • frwiktionary
  • gewikimedia
  • grwikimedia
  • hiwikimedia
  • idwikimedia
  • incubatorwiki
  • legalteamwiki
  • maiwikimedia
  • mediawikiwiki
  • metawiki
  • nowikimedia
  • otrs_wikiwiki
  • outreachwiki
  • plwikimedia
  • ptwikisource
  • punjabiwikimedia
  • ruwikimedia
  • sewikimedia
  • sourceswiki
  • specieswiki
  • testcommonswiki
  • testwiki
  • testwikidatawiki
  • u4cwiki
  • uawikimedia
  • vewikimedia
  • wbwikimedia
  • wikidatawiki
  • wikifunctionswiki
  • wikimania2012wiki
  • wikimania2013wiki
  • wikimania2014wiki
  • wikimania2015wiki
  • wikimania2016wiki
  • wikimania2017wiki
  • wikimania2018wiki
  • wikimaniawiki

These are wikis where translate extension is enabled.

Command to run:

mwscript sql.php --wiki=<wiki-id> --cluster=extension1 /srv/mediawiki/php-1.43.0-wmf.28/extensions/Translate/sql/mysql/translate_message_group_subscriptions.sql

If you don't mind, please update tables catalog and add these tables. We already have everything for other translate tables: https://github.com/wikimedia/operations-puppet/blob/production/modules/mediawiki/files/mariadb/tables-catalog.yaml

If you don't mind, please update tables catalog and add these tables. We already have everything for other translate tables: https://github.com/wikimedia/operations-puppet/blob/production/modules/mediawiki/files/mariadb/tables-catalog.yaml

I will do this tomorrow. This has to be updated manually I assume?

Change #1082549 had a related patch set uploaded (by Abijeet Patro; author: Abijeet Patro):

[operations/puppet@production] tables-catalog: Add translate_message_group_subscriptions table

https://gerrit.wikimedia.org/r/1082549

If you don't mind, please update tables catalog and add these tables. We already have everything for other translate tables: https://github.com/wikimedia/operations-puppet/blob/production/modules/mediawiki/files/mariadb/tables-catalog.yaml

Done in:

  1. 1082546: tables-catalog: Add translate_cache table | https://gerrit.wikimedia.org/r/c/operations/puppet/+/1082546
  2. 1082549: tables-catalog: Add translate_message_group_subscriptions table | https://gerrit.wikimedia.org/r/c/operations/puppet/+/1082549

Change #1082602 had a related patch set uploaded (by Abijeet Patro; author: Abijeet Patro):

[operations/puppet@production] createExtensionTables: Add translate_cache, translate_message_group_subscriptions tables

https://gerrit.wikimedia.org/r/1082602

Change #1082602 abandoned by Abijeet Patro:

[operations/puppet@production] createExtensionTables: Add translate_cache, translate_message_group_subscriptions tables

Reason:

Incorrectly submitted

https://gerrit.wikimedia.org/r/1082602

Change #1082603 had a related patch set uploaded (by Abijeet Patro; author: Abijeet Patro):

[mediawiki/extensions/WikimediaMaintenance@master] createExtensionTables: Add tables for translate extension

https://gerrit.wikimedia.org/r/1082603

Change #1082603 merged by jenkins-bot:

[mediawiki/extensions/WikimediaMaintenance@master] createExtensionTables: Add tables for translate extension

https://gerrit.wikimedia.org/r/1082603

Change #1082549 merged by Ladsgroup:

[operations/puppet@production] tables-catalog: Add translate_message_group_subscriptions table

https://gerrit.wikimedia.org/r/1082549

Table has been created on all wikis were Translate extension is enabled even though the subscription feature may not be enabled there yet. We also updated the scripts that run when a new wiki is created in order to automatically create the table where needed.