Infrastructure:Incident 2015-04-23

62 bytes removed, 20:55, 3 January 2019
TODO: Move to "Incident 2015-04-23"
 
In the early morning of April 23, 2015, [[gp:Whittemore Hall|Whittemore Hall]] lost power, which brought down VTLUUG infrastructure. Problems bringing the hardware back up were compounded by:
* Maintenance deferred for far too long, which prevented some machines from booting on their own.
Some steps that should be taken to reduce the length and impact of future outages include:
* Rebuild VMs such as [[Infrastructure:milton|milton]] that have accumulated too much deferred maintenance, and move them to [[cyberdelia]].
* Install NTP on all servers.
* Find a dedicated sysadmin.
* Create a disaster recovery plan.
* Reduce services to match maintenance capabilities.
[[Category:Incidents]]
[[Category:Needs restoration]]
