We plan to put chart definitions in the Data: namespace on Commons, and to use the existing Data:*.tab pages on Commons as data sources for charts.
We will need to track which pages on other wikis use each Data:*.chart and Data:*.tab page, so that we can do the following:
- On each chart page and tabular data page, display a list of pages that use it
- When a chart page or tabular data page is edited, purge/rerender the pages that use it
To explore how to do this, we can look at:
- The GlobalUsage extension, which has a global database table for usage tracking and enqueues purge jobs when a file is changed. This appears to do everything we need, but unfortunately its database schema is specific to images. We could explore generalizing this to non-image resources.
- The JsonConfig extension, which provides the content model for chart and tabular data pages, and supports loading them cross-wiki. It doesn't do any usage tracking, but this could be a logical place to either put the usage tracking itself, or to put code that instructs the GlobalUsage extension to track usages.
- How Wikibase does its own usage tracking
- This stalled RFC from 2020: T253026: Introduce a centralized Dependency Tracking Service
Whatever we do here, we should discuss it with the MediaWiki-Platform-Team, even if we do the work ourselves.
Current proposal
Add a new DB table that belongs to Commons but lives on x1 instead of s4 (meaning it couldn't be joined with tables from the commonswiki DB). Use this to track usage of JsonConfig pages similarly to how the GlobalUsage extension currently tracks image usage, with similar invalidation handling through the job queue to ensure that when a chart is edited, pages that use it are invalidated.
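The invalidation flow described above could look roughly like the following. This is a minimal sketch, not GlobalUsage's or JsonConfig's actual code: the in-memory `usages` mapping, the `record_usage` and `jobs_for_edited_target` helpers, the job dictionary shape, and the batch size are all hypothetical stand-ins for the usage table and the job queue.

```python
from collections import defaultdict


def record_usage(usages, target, source_page):
    """Remember that source_page (wiki, namespace, title) uses target."""
    usages[target].add(source_page)


def jobs_for_edited_target(usages, target, batch_size=2):
    """Build purge jobs for every page that uses `target`.

    Pages are grouped per wiki, because each wiki would process its own
    purge jobs, and split into batches so that no single job grows too large.
    """
    by_wiki = defaultdict(list)
    for wiki, namespace, title in sorted(usages.get(target, ())):
        by_wiki[wiki].append((namespace, title))

    jobs = []
    for wiki, pages in by_wiki.items():
        for start in range(0, len(pages), batch_size):
            jobs.append({"wiki": wiki, "pages": pages[start:start + batch_size]})
    return jobs
```

In this sketch, an edit to a chart that is used on three enwiki pages and one dewiki page would produce two jobs for enwiki (with a batch size of 2) and one for dewiki.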
Proposed DB schema:
-- This has the same schema as the linktarget table, but since we intend for globaljsonlinks to be in x1, we can't join against linktarget
CREATE TABLE globaljsonlinks_target (
  gjlt_id BIGINT UNSIGNED AUTO_INCREMENT NOT NULL,
  gjlt_namespace INT NOT NULL,
  gjlt_title VARBINARY(255) NOT NULL,
  UNIQUE INDEX gjlt_namespace_title (gjlt_namespace, gjlt_title),
  PRIMARY KEY (gjlt_id)
);

CREATE TABLE globaljsonlinks_source_ns (
  gjlsn_id INT UNSIGNED AUTO_INCREMENT NOT NULL,
  gjlsn_wiki VARBINARY(32) NOT NULL,
  gjlsn_namespace VARBINARY(255) NOT NULL,
  UNIQUE INDEX gjlsn_wiki_namespace (gjlsn_wiki, gjlsn_namespace),
  PRIMARY KEY (gjlsn_id)
);

CREATE TABLE globaljsonlinks (
  gjl_source_wiki_ns INT UNSIGNED NOT NULL, /* refers to globaljsonlinks_source_ns.gjlsn_id */
  gjl_source_title VARBINARY(255) NOT NULL,
  gjl_target BIGINT UNSIGNED NOT NULL, /* refers to globaljsonlinks_target.gjlt_id */
  INDEX gjl_target_source (gjl_target, gjl_source_wiki_ns, gjl_source_title),
  PRIMARY KEY (gjl_source_wiki_ns, gjl_source_title, gjl_target)
);
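To make the intended access patterns concrete, here is a rough sketch of recording a usage and listing the pages that use a given target. It runs against an in-memory SQLite database rather than MariaDB, so the column types are simplified. It also assumes the source column is named gjl_source_wiki_ns throughout. The helper names (`open_db`, `add_link`, `pages_using`) are hypothetical and not part of JsonConfig.

```python
import sqlite3

# SQLite adaptation of the proposed tables (assumption: simplified types).
SCHEMA = """
CREATE TABLE globaljsonlinks_target (
  gjlt_id INTEGER PRIMARY KEY AUTOINCREMENT,
  gjlt_namespace INTEGER NOT NULL,
  gjlt_title TEXT NOT NULL,
  UNIQUE (gjlt_namespace, gjlt_title)
);
CREATE TABLE globaljsonlinks_source_ns (
  gjlsn_id INTEGER PRIMARY KEY AUTOINCREMENT,
  gjlsn_wiki TEXT NOT NULL,
  gjlsn_namespace TEXT NOT NULL,
  UNIQUE (gjlsn_wiki, gjlsn_namespace)
);
CREATE TABLE globaljsonlinks (
  gjl_source_wiki_ns INTEGER NOT NULL,
  gjl_source_title TEXT NOT NULL,
  gjl_target INTEGER NOT NULL,
  PRIMARY KEY (gjl_source_wiki_ns, gjl_source_title, gjl_target)
);
CREATE INDEX gjl_target_source
  ON globaljsonlinks (gjl_target, gjl_source_wiki_ns, gjl_source_title);
"""


def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def _get_or_create(db, insert_sql, select_sql, params):
    # Insert the row if it is missing, then read back its id.
    db.execute(insert_sql, params)
    return db.execute(select_sql, params).fetchone()[0]


def add_link(db, wiki, source_ns, source_title, target_ns, target_title):
    """Record that a page on `wiki` uses the given target page."""
    target_id = _get_or_create(
        db,
        "INSERT OR IGNORE INTO globaljsonlinks_target"
        " (gjlt_namespace, gjlt_title) VALUES (?, ?)",
        "SELECT gjlt_id FROM globaljsonlinks_target"
        " WHERE gjlt_namespace = ? AND gjlt_title = ?",
        (target_ns, target_title),
    )
    source_id = _get_or_create(
        db,
        "INSERT OR IGNORE INTO globaljsonlinks_source_ns"
        " (gjlsn_wiki, gjlsn_namespace) VALUES (?, ?)",
        "SELECT gjlsn_id FROM globaljsonlinks_source_ns"
        " WHERE gjlsn_wiki = ? AND gjlsn_namespace = ?",
        (wiki, source_ns),
    )
    db.execute(
        "INSERT OR IGNORE INTO globaljsonlinks VALUES (?, ?, ?)",
        (source_id, source_title, target_id),
    )


def pages_using(db, target_ns, target_title):
    """List (wiki, namespace, title) for every page using the target."""
    rows = db.execute(
        """
        SELECT gjlsn_wiki, gjlsn_namespace, gjl_source_title
        FROM globaljsonlinks_target
        JOIN globaljsonlinks ON gjl_target = gjlt_id
        JOIN globaljsonlinks_source_ns ON gjlsn_id = gjl_source_wiki_ns
        WHERE gjlt_namespace = ? AND gjlt_title = ?
        ORDER BY gjlsn_wiki, gjlsn_namespace, gjl_source_title
        """,
        (target_ns, target_title),
    )
    return rows.fetchall()
```

The lookup in `pages_using` filters on the target first and then walks the links, which is the access pattern the gjl_target_source index appears designed for. Both the usage list on a chart page and the purge-on-edit step would need this kind of query.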
Open questions
- What should the scope of this table be?
  - Cross-wiki usage of JsonConfig pages hosted on Commons
- Should it also include image links in the future (replacing globalimagelinks), or should those live in a separate table with a similar schema?
  - No, those should live in a separate table
- Should it be limited to Charts usage only? Should it be slightly broader but limited to JsonConfig usage?
  - It should be limited to JsonConfig usage, but not necessarily Charts usage
- Depending on the scope, should the namespace of the target page (gjlt_namespace) exist as a field, or should it be implicit (Data for the Charts/JsonConfig table, File for the image table)?
  - The JsonConfig extension currently allows its pages to live in different namespaces, so in theory we still need the namespace field. But in practice we probably don't, so we could still change this by limiting the scope to JsonConfig pages in the Data namespace specifically.
- Depending on the scope, what should the name of the table be? (e.g. globallinks if broad, globalchartlinks or globaljsonlinks if narrow)
  - globaljsonlinks
- Which extension should define this table and maintain/use its contents?
  - The JsonConfig extension
See also: