Analytics-Kanban (Group)
Archived · Public

Details

Description

Superseded by Data-Engineering-Kanban, see T287531: Create project tag for Data-Engineering. This was the Kanban board for the Analytics team. People moved tasks here from the Analytics board.

For an explanation of the tags and other conventions, see https://www.mediawiki.org/wiki/Analytics/Development_Process#Kanban

Recent Activity

Mon, Nov 4

nshahquinn-wmf updated the task description for T292479: wmfdata.mariadb relies on analytics-mysql being available.
Mon, Nov 4, 11:30 PM · Data-Engineering, Product-Analytics, Analytics-Kanban, Wmfdata-Python

Sep 28 2024

mpopov closed T190769: Notebook machine to double as RStudio Server?, a subtask of T224658: Newpyter - SWAP Juypter Rewrite, as Invalid.
Sep 28 2024, 6:21 PM · Analytics-Kanban, Analytics

Sep 25 2024

Maintenance_bot removed a project from T174640: Invalid "wikimedia" family in unique devices data due to misplaced WMF-Last-Access-Global cookie : Patch-For-Review.
Sep 25 2024, 1:32 PM · Analytics-Kanban, Traffic, SRE

Jun 26 2024

Maintenance_bot removed a project from T288853: Migrated Server-side EventLogging events recording http.client_ip as 127.0.0.1 : Patch-For-Review.
Jun 26 2024, 3:31 PM · MW-1.38-notes (1.38.0-wmf.1; 2021-09-21), Analytics-Kanban, SRE, Traffic, Metrics Platform, Data-Engineering, Growth-Team, Product-Analytics, Analytics

Jun 23 2024

Pelajanela added a comment to T263908: Article on Carles Puigdemont has inflated pageviews in many projects.

Thank you for bringing back this case/issue to the "table", @Larske!
It seems that the manifestation of this inauthentic behavior has vanished in Bulgarian Wikipedia: the article for Puigdemont in Bulgarian finished in 13th place in 2023 but is nowhere to be seen in the monthly top 100 most-viewed articles charts for the last three months. It was at #35 in January 2024.
And, as you can see following the link, mobile views represent only 0.1%
I hope this case will lead to some engineering breakthrough, because once a vulnerability in automated detection has been exploited in such a way, it seems very unlikely to me that it will not happen again or that no one else will reproduce the manipulation.

Jun 23 2024, 4:22 PM · Analytics-Kanban, Pageviews-Anomaly
Larske added a comment to T263908: Article on Carles Puigdemont has inflated pageviews in many projects.

Any updates on this? (I can't find any "answer with future steps")

Jun 23 2024, 3:40 PM · Analytics-Kanban, Pageviews-Anomaly

Jun 18 2024

Ottomata added a subtask for T206785: Modern Event Platform: Stream Intake Service (EventGate): Implementation: T256891: EventGate and EventStreams rate limiting.
Jun 18 2024, 12:45 PM · Analytics-Kanban, Platform Team Legacy (Watching / External), Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics

May 16 2024

Maintenance_bot removed a project from T220456: Many small wikis missing from mediawiki_history dataset: Patch-For-Review.
May 16 2024, 9:32 PM · Analytics-Kanban, Analytics-Data-Quality, Analytics, Product-Analytics

Apr 4 2024

Maintenance_bot removed a project from T214089: Update git lfs on stat1006/7: Patch-For-Review.
Apr 4 2024, 8:30 AM · git-lfs, User-Elukey, Analytics-Kanban, Analytics
hashar added a project to T214089: Update git lfs on stat1006/7: git-lfs.
Apr 4 2024, 8:26 AM · git-lfs, User-Elukey, Analytics-Kanban, Analytics

Mar 11 2024

nshahquinn-wmf added a comment to T198425: "Total Article Count" (a.k.a "pages to date") Wikistats metric (per project and overall).
In T198425#9604028, @Sj wrote:

For en:wp, this seems to differ from WP:Statistics by a factor of two. (looks like the latter shows ~500 articles/day)

Mar 11 2024, 8:07 PM · Data-Engineering, Analytics-Kanban, Analytics, Data-Engineering-Wikistats

Mar 6 2024

Restricted Application added a project to T198425: "Total Article Count" (a.k.a "pages to date") Wikistats metric (per project and overall): Data-Engineering.

For en:wp, this seems to differ from WP:Statistics by a factor of two. (looks like the latter shows ~500 articles/day)

Mar 6 2024, 3:27 AM · Data-Engineering, Analytics-Kanban, Analytics, Data-Engineering-Wikistats

Jan 17 2024

Harej added a project to T139324: Make top pages for WP:MED articles: Wikimedia-Medicine.
Jan 17 2024, 4:00 AM · Wikimedia-Medicine, Analytics-Kanban

Jan 16 2024

gerritbot added a comment to T273642: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover.

Change 709713 merged by Btullis:

[operations/puppet@production] Switch presto from Puppet to PKI certificates

https://gerrit.wikimedia.org/r/709713

Jan 16 2024, 11:06 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters

Jan 8 2024

lbowmaker closed T266798: [Event Platform] Enable canary events for all MediaWiki streams, a subtask of T251609: Automate ingestion and refinement into Hive of event data from Kafka using stream configs and canary/heartbeat events, as Resolved.
Jan 8 2024, 7:50 PM · Data-Engineering, MW-1.36-notes (1.36.0-wmf.8; 2020-09-08), MW-1.35-notes (1.35.0-wmf.41; 2020-07-14), Patch-For-Review, Analytics-Kanban, Analytics, MediaWiki-extensions-EventLogging, Event-Platform

Dec 19 2023

gerritbot added a comment to T273642: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover.

Change 709737 abandoned by Btullis:

[operations/puppet@production] Add presto keytabs to the cluster coordinator replica role

Reason:

No longer needed.

https://gerrit.wikimedia.org/r/709737

Dec 19 2023, 10:38 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters

Dec 13 2023

BTullis closed T292388: Move the Analytics/DE testing infrastructure to Pontoon as Declined.

I'm declining this task, as we haven't invested any more time into pontoon recently and seem unlikely to do so in the near future.
The kerberos automation in T292389 is still potentially useful though and still has patches open for review, so we should decide whether or not we will ever want to move forward with that too.

Dec 13 2023, 5:11 PM · Pontoon, Data-Engineering, Analytics-Kanban

Dec 7 2023

Ottomata removed a parent task for T251609: Automate ingestion and refinement into Hive of event data from Kafka using stream configs and canary/heartbeat events: T238230: Decommission EventLogging backend components by migrating to MEP.
Dec 7 2023, 7:43 PM · Data-Engineering, MW-1.36-notes (1.36.0-wmf.8; 2020-09-08), MW-1.35-notes (1.35.0-wmf.41; 2020-07-14), Patch-For-Review, Analytics-Kanban, Analytics, MediaWiki-extensions-EventLogging, Event-Platform
Ottomata closed T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned as Resolved.

There is only one remaining schema to migrate (T323828), so we have completed this task! We have 'determined' what to do with all schemas. Resolving!

Dec 7 2023, 7:39 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform
Ottomata updated the task description for T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned.
Dec 7 2023, 7:38 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform
Ottomata closed T353014: Decommission all legacy EventLogging MobileWikiApp* schemas., a subtask of T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned, as Declined.
Dec 7 2023, 7:31 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform

Dec 4 2023

Maintenance_bot removed a project from T287864: Deploy an-test-coord1002 to facilitate failover testing of analytics coordinator role: Patch-For-Review.
Dec 4 2023, 12:11 PM · Data-Engineering, Analytics-Kanban
gerritbot added a comment to T287864: Deploy an-test-coord1002 to facilitate failover testing of analytics coordinator role.

Change 714753 abandoned by Btullis:

[operations/puppet@production] Add replica hadoop coordinator role in the test cluster

Reason:

No longer required

https://gerrit.wikimedia.org/r/714753

Dec 4 2023, 11:58 AM · Data-Engineering, Analytics-Kanban

Dec 1 2023

gerritbot added a comment to T236895: ArticlePlaceholder dashboard stopped tracking page views.

Change 572713 abandoned by Ladsgroup:

[analytics/refinery@master] Pass spark_job_jar as an argument in ArticlePlaceholder oozie job

Reason:

Waaay too outdated, it should be replaced with an Airflow DAG.

https://gerrit.wikimedia.org/r/572713

Dec 1 2023, 4:29 PM · Analytics-Radar, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Analytics-Kanban, User-Ladsgroup, Patch-For-Review, [DEPRECATED] wdwb-tech, Wikidata, ArticlePlaceholder

Nov 23 2023

Maintenance_bot removed a project from T211247: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline: Patch-For-Review.
Nov 23 2023, 10:10 AM · Data-Engineering, Analytics-Kanban, Platform Team Legacy (Watching / External), Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics
Gehel reopened T213561: Discovery for Kafka cluster brokers, a subtask of T211247: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline, as Open.
Nov 23 2023, 9:46 AM · Data-Engineering, Analytics-Kanban, Platform Team Legacy (Watching / External), Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics

Nov 21 2023

Ottomata updated the task description for T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned.
Nov 21 2023, 7:20 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform
Aklapper placed T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned up for grabs.

@Ottomata: Per emails from Sep18 and Oct20 and https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup , I am resetting the assignee of this task because there has not been progress lately (please correct me if I am wrong!). Resetting the assignee avoids the impression that somebody is already working on this task. It also allows others to potentially work towards fixing this task. Please claim this task again when you plan to work on it (via Add Action...Assign / Claim in the dropdown menu) - it would be welcome. Thanks for your understanding!

Nov 21 2023, 8:21 AM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform

Nov 20 2023

Maintenance_bot removed a project from T181646: Enable ores::base on stat1006: Patch-For-Review.
Nov 20 2023, 12:33 PM · Analytics-Kanban, Machine-Learning-Team, ORES, Analytics
isarantopoulos moved T181646: Enable ores::base on stat1006 from Unsorted to 2023-2024 Q3 Done on the Machine-Learning-Team board.
Nov 20 2023, 12:18 PM · Analytics-Kanban, Machine-Learning-Team, ORES, Analytics
isarantopoulos moved T277609: Generate dump of scored-revisions from 2018-2020 for English Wikipedia from Unsorted to 2023-2024 Q3 Done on the Machine-Learning-Team board.
Nov 20 2023, 11:44 AM · Analytics-Kanban, Data-Services, artificial-intelligence, editquality-modeling, ORES, Analytics, Machine-Learning-Team

Nov 10 2023

lbowmaker moved T292479: wmfdata.mariadb relies on analytics-mysql being available from Data Products & Metrics to Icebox (not considered in current quarter) on the Data-Engineering board.
Nov 10 2023, 2:42 PM · Data-Engineering, Product-Analytics, Analytics-Kanban, Wmfdata-Python
lbowmaker removed a project from T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned: Data Engineering and Event Platform Team.
Nov 10 2023, 2:29 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform
lbowmaker moved T292388: Move the Analytics/DE testing infrastructure to Pontoon from Radar (External Teams) to Icebox (not considered in current quarter) on the Data-Engineering board.
Nov 10 2023, 1:23 PM · Pontoon, Data-Engineering, Analytics-Kanban

Oct 20 2023

lbowmaker moved T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned from Parent Tasks/Epics to Event Platform Backlog on the Data Engineering and Event Platform Team board.
Oct 20 2023, 2:57 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform

Oct 19 2023

Ottomata updated the task description for T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned.
Oct 19 2023, 1:07 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform
matmarex closed T349005: Decommission Schema:CentralAuth, a subtask of T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned, as Resolved.
Oct 19 2023, 11:34 AM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform
Maintenance_bot removed a project from T255026: Upgrade schema[12]00[12] to Debian Buster: Patch-For-Review.
Oct 19 2023, 11:11 AM · Analytics-Kanban, Analytics-Clusters

Oct 17 2023

Ottomata closed T185233: Modern Event Platform as Resolved.

We've made good progress for the Stream Processing component.

Oct 17 2023, 2:30 PM · Data-Engineering, Data Engineering and Event Platform Team, Platform Team Workboards (Initiatives), Core Platform Team Initiatives (Modern Event Platform (TEC2)), Goal, Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics-Kanban
Ottomata updated the task description for T185233: Modern Event Platform.
Oct 17 2023, 2:28 PM · Data-Engineering, Data Engineering and Event Platform Team, Platform Team Workboards (Initiatives), Core Platform Team Initiatives (Modern Event Platform (TEC2)), Goal, Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics-Kanban
Aklapper added a comment to T185233: Modern Event Platform.

Does this task serve any purpose in itself that is not covered by the Event-Platform project tag?

Oct 17 2023, 7:35 AM · Data-Engineering, Data Engineering and Event Platform Team, Platform Team Workboards (Initiatives), Core Platform Team Initiatives (Modern Event Platform (TEC2)), Goal, Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics-Kanban

Oct 16 2023

matmarex added a subtask for T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned: T349005: Decommission Schema:CentralAuth.
Oct 16 2023, 3:43 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform

Oct 3 2023

Ottomata closed T253392: Document in-schema who sets which fields, a subtask of T185233: Modern Event Platform, as Declined.
Oct 3 2023, 7:14 PM · Data-Engineering, Data Engineering and Event Platform Team, Platform Team Workboards (Initiatives), Core Platform Team Initiatives (Modern Event Platform (TEC2)), Goal, Services (watching), MediaWiki-extensions-EventLogging, Event-Platform, Analytics-Kanban
phuedx closed T330766: Decommission the EditorActivation instrument, a subtask of T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned, as Resolved.
Oct 3 2023, 5:55 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform

Sep 22 2023

Maintenance_bot removed a project from T280649: Add a spark job loading Cassandra 3: Patch-For-Review.
Sep 22 2023, 1:11 PM · Analytics-Kanban
gerritbot added a comment to T280649: Add a spark job loading Cassandra 3.

Change 681682 abandoned by Joal:

[analytics/refinery@master] Cleanup cassandra double loading

Reason:

Oozie has been deprecated, those jobs are now in Airflow.

https://gerrit.wikimedia.org/r/681682

Sep 22 2023, 12:57 PM · Analytics-Kanban

Sep 15 2023

Maintenance_bot removed a project from T266573: eventgate-analytics-external occasionally seems to fail lookups of dynamic stream config from MW EventStreamConfig API: Patch-For-Review.
Sep 15 2023, 1:13 PM · Data-Engineering, Event-Platform, Analytics-Kanban, Analytics

Aug 30 2023

phuedx closed T344167: Decommission the EchoMail and EchoInteraction instruments, a subtask of T282131: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned, as Resolved.
Aug 30 2023, 1:13 PM · Data-Engineering, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), Fundraising-Backlog, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, Analytics-Kanban, MediaWiki-extensions-EventLogging, Event-Platform

Aug 10 2023

gerritbot added a comment to T278424: Upgrade the Hadoop coordinators to Debian Buster.

Change 947857 merged by Btullis:

[operations/puppet@production] Create component/libmysql-java for bullseye

https://gerrit.wikimedia.org/r/947857

Aug 10 2023, 4:10 PM · Analytics-Kanban, Patch-For-Review, Analytics-Clusters
gerritbot added a comment to T278424: Upgrade the Hadoop coordinators to Debian Buster.

Change 947857 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Create component/libmysql-java for bullseye

https://gerrit.wikimedia.org/r/947857

Aug 10 2023, 2:41 PM · Analytics-Kanban, Patch-For-Review, Analytics-Clusters