As part of the parent task we're migrating the alert hosts to new hardware. While checking outstanding alerts I found a few hosts whose checks work from alert1001 (old hardware) but not from alert2002. Specifically, SSH is not reachable on:
durum1001 durum2002 durum3004 durum4001 durum5001 durum5002 durum6002 etherpad2002 gitlab1003 gitlab1004
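A quick way to reproduce the reachability check by hand from the alert host is to probe TCP port 22 on each affected host. This is a minimal sketch, assuming `nc` (netcat) is installed there; the function name `probe_ssh` is hypothetical and this is not the check the alerting system itself runs:

```shell
#!/usr/bin/env bash
# Probe TCP port 22 on each host given as an argument and print
# "<host> ok" or "<host> unreachable". Uses nc's -z (connect only,
# send no data) and -w (timeout in seconds) options.
probe_ssh() {
    local host
    for host in "$@"; do
        if nc -z -w 3 "$host" 22 >/dev/null 2>&1; then
            printf '%s ok\n' "$host"
        else
            printf '%s unreachable\n' "$host"
        fi
    done
}

# Example, run from alert2002:
# probe_ssh etherpad2002.codfw.wmnet
```

Running the same probe from alert1001 and alert2002 should show whether the difference depends on the source host.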
Upon checking these hosts I noticed they all have nftables enabled. I'll use alert2002 and etherpad2002 as an example. On etherpad2002 everything seems in order, since alert2002's addresses are present in the accept rules and in the MONITORING_HOSTS sets:
root@etherpad2002:~# grep -ir $(dig +short alert2002.wikimedia.org) /etc/nftables
/etc/nftables/input/10_full-monitoring-metrics-access-tcp.nft:ip saddr { 10.192.16.75, 10.192.32.67, 208.80.153.42, 208.80.153.84, 208.80.154.78, 208.80.154.88 } tcp dport 1-65535 accept
/etc/nftables/input/10_full-monitoring-metrics-access-tcp.nft:ip6 saddr { 2620:0:860:102:10:192:16:75, 2620:0:860:103:10:192:32:67, 2620:0:860:2:208:80:153:42, 2620:0:860:3:208:80:153:84, 2620:0:861:3:208:80:154:78, 2620:0:861:3:208:80:154:88 } tcp dport 1-65535 accept
/etc/nftables/input/10_full-monitoring-metrics-access-udp.nft:ip saddr { 10.192.16.75, 10.192.32.67, 208.80.153.42, 208.80.153.84, 208.80.154.78, 208.80.154.88 } udp dport 1-65535 accept
/etc/nftables/input/10_full-monitoring-metrics-access-udp.nft:ip6 saddr { 2620:0:860:102:10:192:16:75, 2620:0:860:103:10:192:32:67, 2620:0:860:2:208:80:153:42, 2620:0:860:3:208:80:153:84, 2620:0:861:3:208:80:154:78, 2620:0:861:3:208:80:154:88 } udp dport 1-65535 accept
/etc/nftables/sets/MONITORING_HOSTS_ipv6.nft: 2620:0:860:2:208:80:153:42
/etc/nftables/sets/MONITORING_HOSTS_ipv4.nft: 208.80.153.42
And yet the SYNs sent by alert2002 arrive on etherpad2002 and get no answer:
root@etherpad2002:~# tcpdump -i any 'host alert2002.wikimedia.org'
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
08:58:54.229394 ens13 In IP6 alert2002.wikimedia.org.34744 > etherpad2002.codfw.wmnet.ssh: Flags [S], seq 3918410496, win 43200, options [mss 1440,sackOK,TS val 1948470522 ecr 0,nop,wscale 9], length 0
08:58:55.246920 ens13 In IP6 alert2002.wikimedia.org.34744 > etherpad2002.codfw.wmnet.ssh: Flags [S], seq 3918410496, win 43200, options [mss 1440,sackOK,TS val 1948471540 ecr 0,nop,wscale 9], length 0
08:58:57.262928 ens13 In IP6 alert2002.wikimedia.org.34744 > etherpad2002.codfw.wmnet.ssh: Flags [S], seq 3918410496, win 43200, options [mss 1440,sackOK,TS val 1948473556 ecr 0,nop,wscale 9], length 0
I'm sure I'm missing something obvious here, though I can't quite figure out what! cc @Muehlenhoff
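When the files under /etc/nftables look right but packets are still dropped, one thing that could be checked is whether rules from the previous iptables setup are still loaded in the kernel alongside the nftables ruleset. This is a minimal sketch, assuming the `iptables-legacy-save` and `ip6tables-legacy-save` commands from Debian's iptables package are available on the host. The helper name `summarise_legacy_rules` is hypothetical, and this is a guess at a useful diagnostic, not a confirmed root cause:

```shell
#!/usr/bin/env bash
# Summarise iptables-save style output read from stdin.
# Prints "rules=<n> drop_policies=<m>", where <n> counts appended
# rules ("-A ..." lines) and <m> counts chains whose default policy
# is DROP (":CHAIN DROP ..." lines).
summarise_legacy_rules() {
    awk '
        /^-A /              { rules++ }
        /^:[^ ]+ DROP( |$)/ { drops++ }
        END { printf "rules=%d drop_policies=%d\n", rules, drops }
    '
}

# Example, run as root on the affected host (assumed commands):
# iptables-legacy-save  | summarise_legacy_rules
# ip6tables-legacy-save | summarise_legacy_rules
# nft list ruleset      # the ruleset the kernel has actually loaded
```

Non-zero counts from the legacy commands would suggest that old rules are still active in addition to what is configured under /etc/nftables.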
actions
Per @Muehlenhoff, it turns out that the hosts that have switched from iptables to nftables need to be rebooted:
- doc1003
- doc2002
- durum1001
- durum1002
- durum2001
- durum2002
- durum3003
- durum3004
- durum4001
- durum4002
- durum5001
- durum5002
- durum6001
- durum6002
- durum7001
- durum7002
- aphlict1002
- aphlict2001
- etherpad1004
- etherpad2002
- gerrit1003
- gerrit2002
- gerrit2003
- gitlab1003
- gitlab1004
- gitlab2002
- lists1004
- lists2001
- miscweb1003
- miscweb2003
- planet1003
- planet2003
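To track which of these hosts still need their reboot, one option is to compare each host's boot time with the time of the nftables switch. This is a rough sketch, assuming Linux's /proc/stat and GNU `stat`. The function name `booted_before_change` is hypothetical, and using the mtime of /etc/nftables as the switch time is only an approximation:

```shell
#!/usr/bin/env bash
# Succeed (exit 0) if the boot time is older than the change time,
# i.e. the host has not been rebooted since the change. Both
# arguments are seconds since the epoch.
booted_before_change() {
    local boot_epoch=$1 change_epoch=$2
    [ "$boot_epoch" -lt "$change_epoch" ]
}

# Example on a single host:
# boot=$(awk '/^btime / { print $2 }' /proc/stat)
# change=$(stat -c %Y /etc/nftables)   # approximation of the switch time
# booted_before_change "$boot" "$change" && echo "reboot still pending"
```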