ops-monitoring-bot (Operations Monitoring Bot)
User · Bot

User Details

User Since: Aug 12 2016, 1:45 PM (421 w, 3 d)
Roles: Bot
Availability: Available
LDAP User: Unknown
MediaWiki User: Unknown

Bot managed by SRE for automated interaction with Phabricator from monitoring tools.

Recent Activity

Today

ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by cgoubert: for 1 hosts: mw2431.codfw.wmnet

Tue, Sep 10, 9:35 AM · serviceops, SRE
ops-monitoring-bot created T374422: Degraded RAID on cloudvirt2004-dev.
Tue, Sep 10, 8:59 AM · SRE, DC-Ops, ops-codfw
ops-monitoring-bot added a comment to T363210: kafka-main200[6789] and kafka-main2010 implementation tracking.

Icinga downtime and Alertmanager silence (ID=1ecd31b5-5c44-49dc-a69c-a3104ecc9241) set by jayme@cumin1002 for 1 day, 0:00:00 on 2 host(s) and their services with reason: Hardware refresh

kafka-main[2002,2007].codfw.wmnet
Tue, Sep 10, 8:23 AM · serviceops
ops-monitoring-bot created T374409: Degraded RAID on wikikube-worker2092.
Tue, Sep 10, 2:14 AM · SRE, ops-codfw, DC-Ops

Yesterday

ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by cgoubert: for 2 hosts: mw[2428-2429].codfw.wmnet

Mon, Sep 9, 4:20 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by cgoubert@cumin1002 Renumbering for host wikikube-worker2106.codfw.wmnet completed:

  • wikikube-worker2106.codfw.wmnet (PASS)
    • Successfully reimaged node wikikube-worker2106.codfw.wmnet
    • Successfully set BGP to true in Netbox
    • Pooled and uncordoned node wikikube-worker2106.codfw.wmnet
Mon, Sep 9, 4:19 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2106.codfw.wmnet with OS bullseye completed:

  • wikikube-worker2106.codfw.wmnet (PASS)
Mon, Sep 9, 4:11 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by cgoubert@cumin1002 Renumbering for host wikikube-worker2105.codfw.wmnet completed:

  • wikikube-worker2105.codfw.wmnet (PASS)
    • Successfully reimaged node wikikube-worker2105.codfw.wmnet
    • Successfully set BGP to true in Netbox
    • Pooled and uncordoned node wikikube-worker2105.codfw.wmnet
Mon, Sep 9, 4:11 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2105.codfw.wmnet with OS bullseye completed:

  • wikikube-worker2105.codfw.wmnet (PASS)
Mon, Sep 9, 4:01 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikikube-worker2095.codfw.wmnet with OS bullseye completed:

  • wikikube-worker2095 (WARN)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Unable to downtime the new host on Icinga/Alertmanager, the sre.hosts.downtime cookbook returned 99
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202409091526_hnowlan_3513763_wikikube-worker2095.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Mon, Sep 9, 3:46 PM · serviceops, SRE
ops-monitoring-bot added a comment to T374272: asw2-d2-eqid <-> asw2-d4-eqiad vcp link flapping.

Icinga downtime and Alertmanager silence (ID=81e99a80-f593-4494-a565-ea730a19fbc7) set by cmooney@cumin1002 for 1:00:00 on 1 host(s) and their services with reason: replace vcp link from d2 port 51 to d4 port 52

asw2-d-eqiad
Mon, Sep 9, 3:27 PM · ops-eqiad, SRE, DC-Ops, Infrastructure-Foundations, netops
ABran-WMF awarded T374095: Degraded RAID on db2198 a Party Time token.
Mon, Sep 9, 3:12 PM · DBA, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin1002 for host wikikube-worker2095.codfw.wmnet with OS bullseye

Mon, Sep 9, 3:09 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2106.codfw.wmnet with OS bullseye

Mon, Sep 9, 2:50 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node was started by cgoubert@cumin1002 Renumbering for host wikikube-worker2106.codfw.wmnet

Mon, Sep 9, 2:49 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2105.codfw.wmnet with OS bullseye

Mon, Sep 9, 2:45 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node was started by cgoubert@cumin1002 Renumbering for host wikikube-worker2105.codfw.wmnet

Mon, Sep 9, 2:45 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2429 to wikikube-worker2106 completed:

  • mw2429 (PASS)
    • ✔️ Downtimed host on Icinga/Alertmanager
    • ✔️ Disabled puppet
    • ✔️ Netbox updated
    • ✔️ BMC Hostname updated
    • ✔️ DNS updated
    • ✔️ Switch description updated
    • ✔️ Removed from DebMonitor
    • ✔️ Removed from Puppet master and PuppetDB
    • Rename completed 👍 - now please run the re-image cookbook on the new name with --new
Mon, Sep 9, 2:40 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2428 to wikikube-worker2105 completed:

  • mw2428 (PASS)
    • ✔️ Downtimed host on Icinga/Alertmanager
    • ✔️ Disabled puppet
    • ✔️ Netbox updated
    • ✔️ BMC Hostname updated
    • ✔️ DNS updated
    • ✔️ Switch description updated
    • ✔️ Removed from DebMonitor
    • ✔️ Removed from Puppet master and PuppetDB
    • Rename completed 👍 - now please run the re-image cookbook on the new name with --new
Mon, Sep 9, 2:31 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by kamila@cumin1002 Renumbering for host wikikube-worker2104.codfw.wmnet completed:

  • wikikube-worker2104.codfw.wmnet (PASS)
    • Successfully reimaged node wikikube-worker2104.codfw.wmnet
    • Successfully set BGP to true in Netbox
    • Pooled and uncordoned node wikikube-worker2104.codfw.wmnet
Mon, Sep 9, 2:02 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker2104.codfw.wmnet with OS bullseye completed:

  • wikikube-worker2104.codfw.wmnet (PASS)
Mon, Sep 9, 1:54 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker2104.codfw.wmnet with OS bullseye

Mon, Sep 9, 12:43 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node was started by kamila@cumin1002 Renumbering for host wikikube-worker2104.codfw.wmnet

Mon, Sep 9, 12:43 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw2431 to wikikube-worker2104 completed:

  • mw2431 (PASS)
    • ✔️ Downtimed host on Icinga/Alertmanager
    • ✔️ Netbox updated
    • ✔️ BMC Hostname updated
    • ✔️ DNS updated
    • ✔️ Switch description updated
    • ✔️ Removed from DebMonitor
    • ✔️ Removed from Puppet master and PuppetDB
    • Rename completed 👍 - now please run the re-image cookbook on the new name with --new
Mon, Sep 9, 12:39 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2379.codfw.wmnet

Mon, Sep 9, 11:09 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2435.codfw.wmnet

Mon, Sep 9, 11:09 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2434.codfw.wmnet

Mon, Sep 9, 11:09 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2430.codfw.wmnet

Mon, Sep 9, 11:09 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2423.codfw.wmnet

Mon, Sep 9, 11:09 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2422.codfw.wmnet

Mon, Sep 9, 11:09 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2420.codfw.wmnet

Mon, Sep 9, 11:08 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2407.codfw.wmnet

Mon, Sep 9, 11:07 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2406.codfw.wmnet

Mon, Sep 9, 11:07 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2402.codfw.wmnet

Mon, Sep 9, 11:07 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2389.codfw.wmnet

Mon, Sep 9, 11:07 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2388.codfw.wmnet

Mon, Sep 9, 11:06 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2387.codfw.wmnet

Mon, Sep 9, 11:06 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2386.codfw.wmnet

Mon, Sep 9, 11:06 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2385.codfw.wmnet

Mon, Sep 9, 11:06 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2384.codfw.wmnet

Mon, Sep 9, 11:06 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2383.codfw.wmnet

Mon, Sep 9, 11:05 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2382.codfw.wmnet

Mon, Sep 9, 11:05 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2381.codfw.wmnet

Mon, Sep 9, 11:05 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2380.codfw.wmnet

Mon, Sep 9, 11:05 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2378.codfw.wmnet

Mon, Sep 9, 11:03 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2377.codfw.wmnet

Mon, Sep 9, 11:03 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2319.codfw.wmnet

Mon, Sep 9, 11:03 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2318.codfw.wmnet

Mon, Sep 9, 11:03 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2317.codfw.wmnet

Mon, Sep 9, 11:02 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2316.codfw.wmnet

Mon, Sep 9, 11:02 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2312.codfw.wmnet

Mon, Sep 9, 11:02 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2296.codfw.wmnet

Mon, Sep 9, 11:01 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2295.codfw.wmnet

Mon, Sep 9, 11:01 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2293.codfw.wmnet

Mon, Sep 9, 11:01 AM · serviceops, SRE
ops-monitoring-bot added a comment to T373579: Productionize db22[21-40].

Icinga downtime and Alertmanager silence (ID=59ed8fec-fb98-42d5-a3be-b1cfde886f73) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2227.codfw.wmnet - T373579

db2227.codfw.wmnet
Mon, Sep 9, 9:22 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T373579: Productionize db22[21-40].

Icinga downtime and Alertmanager silence (ID=04d38476-a5a0-4975-806d-17d6841b33f9) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2227.codfw.wmnet - T373579

db2127.codfw.wmnet
Mon, Sep 9, 9:21 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T372817: reimage gerrit1004.wikimedia.org as phab1005.eqiad.wmnet.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: gerrit1004.wikimedia.org

Mon, Sep 9, 9:18 AM · SRE, DC-Ops, ops-eqiad, collaboration-services
ops-monitoring-bot added a comment to T332011: Migrate dragonfly-supernodes to Bookworm.

Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin2002 for host dragonfly-supernode2001.codfw.wmnet with OS bookworm completed:

  • dragonfly-supernode2001 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202409090840_elukey_1736022_dragonfly-supernode2001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Mon, Sep 9, 8:54 AM · User-Elukey, serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2054.codfw.wmnet

Mon, Sep 9, 8:37 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2055.codfw.wmnet

Mon, Sep 9, 8:36 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2057.codfw.wmnet

Mon, Sep 9, 8:36 AM · serviceops, SRE
ops-monitoring-bot added a comment to T373980: Hosts using nftables are not reachable via ssh from alert[12]002. Reboot needed..

Host rebooted by jelto@cumin1002 with reason: reboot lists2001 to fix nftables/ferm issue

Mon, Sep 9, 8:35 AM · collaboration-services, Infrastructure-Foundations, SRE Observability (FY2024/2025-Q1), Observability-Alerting
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2055.codfw.wmnet

Mon, Sep 9, 8:35 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2035.codfw.wmnet

Mon, Sep 9, 8:35 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2033.codfw.wmnet

Mon, Sep 9, 8:35 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2029.codfw.wmnet

Mon, Sep 9, 8:34 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2028.codfw.wmnet

Mon, Sep 9, 8:34 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2027.codfw.wmnet

Mon, Sep 9, 8:34 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2025.codfw.wmnet

Mon, Sep 9, 8:34 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2018.codfw.wmnet

Mon, Sep 9, 8:31 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2010.codfw.wmnet

Mon, Sep 9, 8:31 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2008.codfw.wmnet

Mon, Sep 9, 8:31 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2034.codfw.wmnet

Mon, Sep 9, 8:27 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: kubernetes2031.codfw.wmnet

Mon, Sep 9, 8:26 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2332.codfw.wmnet

Mon, Sep 9, 8:25 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2322.codfw.wmnet

Mon, Sep 9, 8:25 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2321.codfw.wmnet

Mon, Sep 9, 8:25 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: mw2320.codfw.wmnet

Mon, Sep 9, 8:25 AM · serviceops, SRE
ops-monitoring-bot added a comment to T332011: Migrate dragonfly-supernodes to Bookworm.

Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin2002 for host dragonfly-supernode2001.codfw.wmnet with OS bookworm

Mon, Sep 9, 8:20 AM · User-Elukey, serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by jayme@cumin1002 Renumbering for host kubestage2002.codfw.wmnet completed:

  • kubestage2002.codfw.wmnet (FAIL)
    • Successfully cordoned node kubestage2002.codfw.wmnet
    • Failed to reimage node kubestage2002.codfw.wmnet, sre.hosts.reimage returned 99
Mon, Sep 9, 7:37 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by jayme@cumin1002 for host kubestage2002.codfw.wmnet with OS bookworm executed with errors:

  • kubestage2002.codfw.wmnet (PASS)
    • Successfully cordoned node kubestage2002.codfw.wmnet
Mon, Sep 9, 7:37 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by jayme@cumin1002 for host kubestage2002.codfw.wmnet with OS bookworm

Mon, Sep 9, 7:37 AM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node was started by jayme@cumin1002 Renumbering for host kubestage2002.codfw.wmnet

Mon, Sep 9, 7:34 AM · serviceops, SRE

Fri, Sep 6

ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by kamila@cumin1002 Renumbering for host wikikube-worker2103.codfw.wmnet completed:

  • wikikube-worker2103.codfw.wmnet (PASS)
    • Successfully reimaged node wikikube-worker2103.codfw.wmnet
    • Successfully set BGP to true in Netbox
    • Pooled and uncordoned node wikikube-worker2103.codfw.wmnet
Fri, Sep 6, 4:30 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker2103.codfw.wmnet with OS bullseye completed:

  • wikikube-worker2103.codfw.wmnet (PASS)
Fri, Sep 6, 4:24 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikikube-worker2095.codfw.wmnet with OS bullseye executed with errors:

  • wikikube-worker2095 (FAIL)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wikikube-worker2095.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Fri, Sep 6, 4:04 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker2103.codfw.wmnet with OS bullseye

Fri, Sep 6, 3:32 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node was started by kamila@cumin1002 Renumbering for host wikikube-worker2103.codfw.wmnet

Fri, Sep 6, 3:29 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw2430 to wikikube-worker2103 completed:

  • mw2430 (PASS)
    • ✔️ Downtimed host on Icinga/Alertmanager
    • ✔️ Netbox updated
    • ✔️ BMC Hostname updated
    • ✔️ DNS updated
    • ✔️ Switch description updated
    • ✔️ Removed from DebMonitor
    • ✔️ Removed from Puppet master and PuppetDB
    • Rename completed 👍 - now please run the re-image cookbook on the new name with --new
Fri, Sep 6, 3:24 PM · serviceops, SRE
ops-monitoring-bot added a comment to T374247: LInk errors from lvs1017 to ssw1-e1-eqiad.

Icinga downtime and Alertmanager silence (ID=c63ff66a-28d3-4567-b7cc-a03c0da01345) set by cmooney@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: Move traffic off lvs1017 to lvs1020 to troubleshoot faulty link

lvs1017.eqiad.wmnet
Fri, Sep 6, 3:15 PM · Traffic, ops-eqiad, Infrastructure-Foundations, netops, SRE, DC-Ops
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by hnowlan@cumin1002 Renumbering for host wikikube-worker2098.codfw.wmnet completed:

  • wikikube-worker2098.codfw.wmnet (PASS)
    • Successfully reimaged node wikikube-worker2098.codfw.wmnet
    • Successfully set BGP to true in Netbox
    • Pooled and uncordoned node wikikube-worker2098.codfw.wmnet
Fri, Sep 6, 3:03 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikikube-worker2098.codfw.wmnet with OS bullseye completed:

  • wikikube-worker2098.codfw.wmnet (PASS)
Fri, Sep 6, 2:52 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin1002 for host wikikube-worker2095.codfw.wmnet with OS bullseye

Fri, Sep 6, 2:44 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by hnowlan@cumin1002 Renumbering for host wikikube-worker2095.codfw.wmnet completed:

  • wikikube-worker2095.codfw.wmnet (FAIL)
    • Failed to reimage node wikikube-worker2095.codfw.wmnet, sre.hosts.reimage returned 99
Fri, Sep 6, 2:43 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikikube-worker2095.codfw.wmnet with OS bullseye executed with errors:

  • wikikube-worker2095.codfw.wmnet (PASS)
Fri, Sep 6, 2:43 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin1002 for host wikikube-worker2095.codfw.wmnet with OS bullseye

Fri, Sep 6, 2:42 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node was started by hnowlan@cumin1002 Renumbering for host wikikube-worker2095.codfw.wmnet

Fri, Sep 6, 2:42 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by jayme@cumin1002 Renumbering for host wikikube-worker2102.codfw.wmnet completed:

  • wikikube-worker2102.codfw.wmnet (PASS)
    • Successfully reimaged node wikikube-worker2102.codfw.wmnet
    • Successfully set BGP to true in Netbox
    • Pooled and uncordoned node wikikube-worker2102.codfw.wmnet
Fri, Sep 6, 2:17 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.hosts.reimage started by jayme@cumin1002 for host wikikube-worker2102.codfw.wmnet with OS bullseye completed:

  • wikikube-worker2102.codfw.wmnet (PASS)
Fri, Sep 6, 2:13 PM · serviceops, SRE
ops-monitoring-bot added a comment to T372878: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets.

Cookbook cookbooks.sre.k8s.renumber-node started by jayme@cumin1002 Renumbering for host kubestage2001.codfw.wmnet completed:

  • kubestage2001.codfw.wmnet (PASS)
    • Successfully cordoned node kubestage2001.codfw.wmnet
    • Successfully reimaged node kubestage2001.codfw.wmnet
    • Pooled and uncordoned node kubestage2001.codfw.wmnet
Fri, Sep 6, 1:56 PM · serviceops, SRE