We've realized that we need to "move main ref to reflist" in some cases, but templated reflists are a major obstacle to this. Let's measure how common this case is.
Side question: how often does a single page have multiple reflists for the same group?
Requirements
- Add code to the scraper to tally reflists coming from a template.
- Run the scraper on the hewiki, dewiki and enwiki dumps, or do a full run on all dumps.
Implementation hints
- Demonstrate reflist template usage on the Beta Cluster: https://en.wikipedia.beta.wmflabs.org/wiki/User:Adamw/Example/Reflist
- Start the snippet snipper: livebook server notebooks/test-snipper.livemd
- Use fetch_parsed_page to pull the reflist example page and save to test/data.
- In cite_refs.ex:
  - Follow the pattern for transclusions_with_contained_refs to do something similar for reflists. Search through all transclusion outputs to find produced reflists.
  - Save the name of every template that might be producing a reflist. Tally the number of reflists that come from a template.
- Write tests along with the implementation. Analyze the fixture and assert that the final page aggregation includes the new fields.
- Document the new fields in metrics.md
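
A rough sketch of the per-page tally described in the hints above, written in Python purely for illustration (the scraper itself is Elixir). It assumes the reflists on a page have already been extracted into simple records. The record keys `group` and `template` and the output field names are hypothetical, not the scraper's actual schema; the real field names would be whatever ends up documented in metrics.md.

```python
from collections import Counter


def tally_reflists(reflists):
    """Aggregate reflist statistics for one page.

    `reflists` is a list of dicts with two hypothetical keys:
      - "group": the reference group name ("" or None for the default group)
      - "template": name of the template that produced the reflist,
        or None if the reflist was written directly in the page
    """
    # Reflists that were produced by a template transclusion.
    templated = [r for r in reflists if r.get("template")]

    # Number of reflists per group, to answer the side question about
    # multiple reflists for the same group on one page.
    per_group = Counter(r.get("group") or "" for r in reflists)

    return {
        "reflist_count": len(reflists),
        "templated_reflist_count": len(templated),
        "reflist_templates": sorted({r["template"] for r in templated}),
        "groups_with_multiple_reflists": sorted(
            group for group, count in per_group.items() if count > 1
        ),
    }


# Example page: two reflists for the default group (one templated, one not)
# and one templated reflist for the "note" group.
example_page = [
    {"group": "", "template": "Template:Reflist"},
    {"group": "", "template": None},
    {"group": "note", "template": "Template:Notelist"},
]
example_result = tally_reflists(example_page)
```

Keeping the template names alongside the counts makes it possible to see later which templates account for most templated reflists, rather than only how many there are.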