I have to start thinking about how to monitor a VMware cluster.
I think the basic things to monitor are:
- Disk capacity
- Hardware failures
vSphere has a lot of alarms, all of them customizable, so I think the first step will be to review these alarms and make an initial selection among them.
I think it is also useful to keep strictly hardware-failure alarms separate from other alarms, and to separate node-level alarms from VM-level alarms.
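To make that split concrete for myself, here is a minimal Python sketch of how the alarm list could be bucketed. Everything in it is an assumption of mine: the `Alarm` class, the `group_alarms` helper, the "node"/"vm" labels and the hand-written sample list are only illustrative, and nothing is read from vCenter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    name: str       # illustrative alarm name, typed by hand, not read from vCenter
    level: str      # "node" or "vm" (hypothetical labels for this sketch)
    hardware: bool  # True if this is strictly a hardware-failure alarm


def group_alarms(alarms):
    """Bucket alarm names by (level, kind), where kind is 'hardware' or 'other'."""
    groups = defaultdict(list)
    for alarm in alarms:
        kind = "hardware" if alarm.hardware else "other"
        groups[(alarm.level, kind)].append(alarm.name)
    return dict(groups)


# Hand-written sample inventory; a real list would have to come from vSphere.
SAMPLE_ALARMS = [
    Alarm("Host hardware fan status", "node", True),
    Alarm("Host hardware power status", "node", True),
    Alarm("Host memory usage", "node", False),
    Alarm("Virtual machine CPU usage", "vm", False),
]

if __name__ == "__main__":
    for (level, kind), names in sorted(group_alarms(SAMPLE_ALARMS).items()):
        print(f"{level}/{kind}: {', '.join(names)}")
```

Grouping by a `(level, kind)` pair keeps the two distinctions independent, so each bucket could later get its own notification rule.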
So far I have not found any best-practice guidelines, so I will start by reviewing the basic alarms and thinking about what I would want to know in order to prevent a disaster :)
I will update this post!