This task outlines the overall steps needed (and links sub-tasks) for the action items from the operations team's breakout group on hardware provisioning & automation.
The notes of the session are listed on https://office.wikimedia.org/wiki/Operations/Operations_Meeting_Notes/2015-10-12_Ops_Offsite/hardware-automation-workflow
Action Items:
- servermon fully in production (big one)
  - related T88424: Migrate racktables to servermon
  - related T84001: alternatives to racktables ?
  - mac address and network port info missing (among others)
- easier allocation of IPs for management (in dns.git)
  - this is easier for codfw, since it's properly ordered. eqiad is ordered by service group and horrible to work with.
  - we should clean up and renumber eqiad when this is automated.
- the same concept can be applied to add/remove machines from puppet.git (e.g. dhcp entries)
  - this is easier for codfw, since it's properly ordered. eqiad is ordered by service group and horrible to work with.
- PXE-boot linux image to run administration tasks
  - related T78135: Provide a pxe-bootable rescue image
  - run only from a dedicated "install vlan"
- investigate ssh keys for idrac/ilo
  - related {T113557}
- block access to the mgmt vlan from non-ops bastions
- {T79294} OLD rtimport task for this.
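
For the "easier allocation of IPs for management" item above, a minimal sketch of what the automation could look like once a subnet is allocated sequentially (as codfw is). This is only an illustration: the function name `next_free_ip`, the `reserved` parameter, and the example addresses are assumptions, and parsing the already-assigned addresses out of dns.git is left out entirely.

```python
import ipaddress


def next_free_ip(subnet, allocated, reserved=1):
    """Return the lowest unallocated host address in `subnet`, or None if full.

    `allocated` is an iterable of address strings already in use.
    `reserved` is how many leading host addresses to skip (e.g. a gateway);
    this is a hypothetical convention, not something taken from dns.git.
    """
    network = ipaddress.ip_network(subnet)
    used = {ipaddress.ip_address(addr) for addr in allocated}
    for index, host in enumerate(network.hosts()):
        if index < reserved:
            continue  # leave the reserved leading addresses alone
        if host not in used:
            return str(host)
    return None


if __name__ == "__main__":
    # Example subnet and addresses are made up for illustration.
    print(next_free_ip("10.0.0.0/29", ["10.0.0.2", "10.0.0.3"]))
```

The same lookup only works cleanly when the subnet is filled in order, which is why renumbering eqiad would have to come along with any automation like this.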
Additionally, there has long been discussion about scripting all mgmt tasks.
- Interesting link for HP mgmt wrapper in python: https://github.com/seveas/python-hpilo
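
As a rough idea of what scripting mgmt tasks on top of python-hpilo might look like, here is a small sketch that collects the power status of several hosts. The helper names `power_report` and `connect` are hypothetical; `hpilo.Ilo(hostname, login, password)` and `get_host_power_status()` come from python-hpilo, which is a third-party package and is only imported when `connect` is actually called. Hostnames and credentials would have to come from elsewhere.

```python
def power_report(clients):
    """Map each hostname to its power status, or to an error string.

    `clients` maps hostname -> an object exposing get_host_power_status(),
    such as an hpilo.Ilo instance.
    """
    report = {}
    for hostname, client in clients.items():
        try:
            report[hostname] = client.get_host_power_status()
        except Exception as exc:
            # One unreachable mgmt interface should not abort the whole run.
            report[hostname] = "error: %s" % exc
    return report


def connect(hostname, login, password):
    """Open an iLO client; requires the python-hpilo package to be installed."""
    import hpilo  # third-party, imported lazily so power_report works without it

    return hpilo.Ilo(hostname, login, password)
```

Keeping the reporting loop separate from the connection code means the loop can be exercised with stub objects, without touching any real mgmt interface.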