Investigate how many reflists are produced by templates
Open, Needs Triage, Public

Description

We've realized that we need to "move main ref to reflist" in some cases, but templated reflists (reflists produced by a template rather than written directly in the page) are a major obstacle to this. Let's measure how common templated reflists are.

Additional side question: how often does a single page contain multiple reflists for the same group?

Requirements

  • Add code to the scraper to tally reflists coming from a template.
  • Run the scraper on the hewiki, dewiki, and enwiki dumps, or do a full run on all dumps.

Implementation hints

  • Demonstrate a reflist template usage on the Beta Cluster: https://en.wikipedia.beta.wmflabs.org/wiki/User:Adamw/Example/Reflist
  • Start the snippet snipper: livebook server notebooks/test-snipper.livemd
  • Use fetch_parsed_page to pull the reflist example page and save to test/data.
  • In cite_refs.ex
    • Follow the pattern for transclusions_with_contained_refs to do something similar for reflists. Search through all transclusion outputs to find produced reflists.
    • Save the name of every template that might be producing a reflist. Tally the number of reflists that come from a template.
    • Write tests along with implementation. Analyze the fixture and assert that the final page aggregation includes the new fields.
  • Document the new fields in metrics.md
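
The tallying step in the hints above can be sketched roughly as follows. This is a minimal illustration in Python, not the scraper's actual Elixir code in cite_refs.ex. It assumes the parsed page is Parsoid-style HTML in which template output carries typeof="mw:Transclusion" with the template name under data-mw parts[].template.target.wt, and reference lists carry typeof="mw:Extension/references". The names tally_reflists, templated_reflists and plain_reflists are hypothetical, chosen only for this example.

```python
import json
from collections import Counter
from html.parser import HTMLParser

# Void elements have no closing tag, so they are never pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


def first_template_name(data_mw):
    """Return the first template name found in a data-mw JSON blob.

    Assumes the shape {"parts": [{"template": {"target": {"wt": NAME}}}]}.
    Falls back to "(unknown)" when that shape is not present.
    """
    try:
        data = json.loads(data_mw or "{}")
    except json.JSONDecodeError:
        return "(unknown)"
    if not isinstance(data, dict):
        return "(unknown)"
    for part in data.get("parts", []):
        if isinstance(part, dict) and "template" in part:
            target = part["template"].get("target", {})
            return target.get("wt", "(unknown)").strip()
    return "(unknown)"


class ReflistTally(HTMLParser):
    """Counts reflists and records which template, if any, encloses each."""

    def __init__(self):
        super().__init__()
        self.stack = []             # per open element: enclosing template or None
        self.templated = Counter()  # template name -> reflists it produced
        self.plain = 0              # reflists not inside any transclusion

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        types = (attrs.get("typeof") or "").split()
        # Inherit the enclosing template from the parent element, if any.
        template = self.stack[-1] if self.stack else None
        if "mw:Transclusion" in types:
            template = first_template_name(attrs.get("data-mw"))
        if "mw:Extension/references" in types:
            if template is None:
                self.plain += 1
            else:
                self.templated[template] += 1
        if tag not in VOID:
            self.stack.append(template)

    def handle_endtag(self, tag):
        # Assumes well-formed markup, so each end tag closes the top element.
        if tag not in VOID and self.stack:
            self.stack.pop()


def tally_reflists(html):
    """Return per-page counts of templated and plain reflists."""
    parser = ReflistTally()
    parser.feed(html)
    parser.close()
    return {
        "templated_reflists": dict(parser.templated),
        "plain_reflists": parser.plain,
    }
```

The sketch only sees reflists nested inside the element that carries the transclusion marker. Template output spread across sibling nodes that share an about id would need extra handling, and the real markup should be checked against the saved fixture before relying on any of these assumptions.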

Event Timeline

Dropping this task because we already know the general answer: it's a huge issue. Entire wikis, such as English Wikipedia, recommend in their manual of style that the references tag be wrapped in a template in all articles.

Reopening and bringing into the sprint to support a discussion about our incremental approach to the subref creation workflow.

awight moved this task from Doing to Sprint Backlog on the WMDE-TechWish-Sprint-2024-09-04 board.