Linux and Unix Users Group at Virginia Tech Wiki

Infrastructure:Incident 2015-04-23

248 bytes added, 20:57, 3 January 2019
** This is a club of Linux users; someone should step up!
* Lack of a disaster recovery plan.
* Undocumented systems with several different iterations of init scripts, none of which were removed, and many broken, unused packages.
Some steps that should be taken to reduce the length and impact of future outages include:
* Rebuild VMs like [[Infrastructure:Milton|milton]], which have had too much maintenance deferred, and move them to [[Infrastructure:Cyberdelia|cyberdelia]].
* Install NTP on all servers.
* Find a dedicated sysadmin.
* Create a disaster recovery plan.
* Reduce services to match maintenance capabilities.
 
[[Category:Incidents]]