Verify proposed A/B test wikis will provide sufficient edit volume to draw conclusions from
Closed, Resolved (Public)

Description

T345298 will propose a list of wikis we will invite to participate in the A/B test that will evaluate the impact of the initial reference check (T342930).

This task covers determining whether the proposed wikis have enough edit volume for the Editing Team to draw meaningful conclusions.

Decision(s) to be made

  • 1. What – if any – adjustments will need to be made to the list of wikis T345298 proposes to ensure the Editing Team can draw meaningful conclusions from the analysis we will do in T342930?

Event Timeline

MNeisler triaged this task as Medium priority. Dec 1 2023, 5:11 PM
MNeisler moved this task from Triage to Current Quarter on the Product-Analytics board.

Some data on the initial list of wikis proposed in T345298

Diversity of Sample

  • 12 wikis total including a good portion of larger wikis. Note: It might be good to add three more small to mid-size wikis to increase the diversity of the sample if possible.
  • Includes several wikis (frwiki, ptwiki, yowiki, igwiki) with a high proportion or high total number of distinct editors from Sub-Saharan Africa.

Estimate of data sample size:

  • 9 of the proposed wikis average over 30 junior contributors a day. See superset chart.
  • 8 of the 12 proposed wikis have over 40 monthly new active editors (based on Jan 2023 data in the wiki comparison sheet).
  • We can expect edit check to be shown to 5 to 20% of all users that make an edit attempt based on the rate of edit check being shown at current partner wikis (see data in T345298#9401401).
  • Assuming similar rates apply to these A/B test wikis, I estimated the number of users we would expect to see edit check at each wiki over a two-week test duration.
    • Larger wikis: 200 to 500 distinct users
    • Medium-sized wikis: 100 to 200 distinct users
    • Small wikis: likely only 2 to 5 distinct users

The estimated sample size for larger and medium-sized wikis would be sufficient to complete an analysis. Small wikis would likely need to be excluded from any per-wiki analysis.
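The arithmetic behind these ranges can be sketched as below. This is a minimal illustration, not the query used for the estimate: it assumes the 5 to 20% exposure rate from the partner wikis applies unchanged, and the function name and input counts are hypothetical placeholders rather than data from any wiki.

```python
def expected_exposed_users(edit_attempt_users, low_pct=5, high_pct=20):
    """Return a (low, high) range of distinct users expected to see edit check.

    edit_attempt_users: distinct users making an edit attempt during the
    test window (hypothetical input, not taken from the task).
    low_pct / high_pct: exposure rate bounds in percent; the defaults are
    the 5 to 20% range observed at current partner wikis.
    """
    if edit_attempt_users < 0:
        raise ValueError("edit_attempt_users must be non-negative")
    if not 0 <= low_pct <= high_pct <= 100:
        raise ValueError("expected 0 <= low_pct <= high_pct <= 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    low = edit_attempt_users * low_pct // 100        # round down
    high = -(-edit_attempt_users * high_pct // 100)  # round up
    return low, high


# Hypothetical two-week counts, for illustration only.
print(expected_exposed_users(2500))  # (125, 500)
print(expected_exposed_users(40))    # (2, 8)
```

With inputs of this order, a wiki with a few thousand users attempting edits lands in the hundreds of exposed users, while a wiki with a few dozen lands in the single digits, which is why small wikis drop out of per-wiki analysis.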

Update based on the final 8 selected wikis: arwiki, frwiki, jawiki, viwiki, afwiki, itwiki, ptwiki, zhwiki

Diversity of Sample

  • 8 wikis total, including 4 large ones (frwiki, itwiki, zhwiki, ptwiki), based on monthly active editors and monthly unique devices.
  • We considered including some additional small wikis as well; however, we had to consider whether the edit check feature would likely be shown there. At some of these smaller wikis, new content edits by newcomers primarily consist of creating new articles through translations, so we would have very limited data to assess any impact at those wikis.
  • The selected wiki list includes wikis (frwiki, ptwiki, arwiki, afwiki) with a high proportion or high total number of distinct editors from Sub-Saharan Africa.
  • All of these wikis have a high percentage of mobile unique devices (over 50%) and majority-mobile editors (over 30%), indicating we should have a decent sample size of mobile data to review for the AB test.

Estimate of data sample size:

  • 8 of the proposed wikis average over 100 junior contributors a day.
  • All wikis except afwiki have well over 40 monthly new active editors (based on Jan 2023 data in the wiki comparison sheet).
  • We can expect edit check to be shown to 5 to 20% of all users (including desktop and mobile web) that make an edit attempt, based on the rate of edit check being shown at current partner wikis (see data in T345298#9401401).

Assuming similar rates apply to these A/B test wikis, I estimate the number of users we would expect to see edit check at each wiki over a two-week test duration:

  • Larger wikis: 200 to 500 distinct users
  • Medium-sized wikis: 100 to 200 distinct users
  • Small wikis: likely only 2 to 5 distinct users

Based on these details, I believe we will have a sufficient sample size to complete the analysis planned in T342930.

I plan to check the A/B test results after around 2 weeks to confirm whether we have enough data or need to run the test for longer. This will be done as part of QA for T352122.