Cluster monitoring

I need to start thinking about how to monitor a VMware cluster.

I think the basic things to monitor are:

  • CPU
  • Memory
  • Disk capacity
  • Network
  • Hardware failures

vSphere has a lot of alarms, all customizable, so I think the first step will be to review these alarms and make an initial selection among them.

I think it is also useful to keep strictly hardware-failure alarms separate from the other alarms, and to separate node-level (host) alarms from VM-level alarms.
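To make that split concrete, here is a minimal sketch of how I might sort a list of alarms into those groups. It does not talk to vSphere at all: the `Alarm` record, the `group_alarms` helper, and the alarm names are all made up for illustration and are not actual vSphere alarm definitions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm record, not a vSphere API object."""
    name: str
    scope: str       # "host" (node level) or "vm" (VM level)
    hardware: bool   # True for strictly hardware-failure alarms


# Illustrative names only; a real list would come from reviewing the vSphere alarms.
ALARMS = [
    Alarm("host fan failure", "host", True),
    Alarm("host CPU usage high", "host", False),
    Alarm("host memory usage high", "host", False),
    Alarm("vm CPU usage high", "vm", False),
    Alarm("vm disk almost full", "vm", False),
]


def group_alarms(alarms):
    """Group alarm names by (scope, kind), where kind is 'hardware' or 'other'."""
    groups = defaultdict(list)
    for alarm in alarms:
        kind = "hardware" if alarm.hardware else "other"
        groups[(alarm.scope, kind)].append(alarm.name)
    return dict(groups)


if __name__ == "__main__":
    for (scope, kind), names in sorted(group_alarms(ALARMS).items()):
        print(f"{scope}/{kind}: {', '.join(names)}")
```

Keeping the two axes (host vs. VM, hardware vs. other) as separate fields means each group could later get its own notification or escalation rule.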

So far I have not found any best-practice guidelines, so I will start by reviewing the basic alarms and thinking about what I would want to know in order to prevent a disaster :)

I will update this post!

