System Health and Performance Monitoring

Dave Kinchlea

System health and performance monitoring is typically not a job distinct from system administration except in the largest of environments. The entire system requires significant monitoring to ensure that the environment is functioning properly and performing well. This may include any or all of the following:

  • SNMP monitoring -- real-time statistics from various devices. These are discrete values drawn from a large set of different metrics: % free RAM, % free disk, number of errors reported, number of bytes received and transmitted, etc.
  • Syslog / NT Event Manager -- system-wide logging services that applications and services use to log events, usually tagged with severity "levels" such as debug, info, warning, error, and critical
  • External service monitoring (SiteScope, Spong, BMC Patrol, Nagios) -- these are applications designed to monitor your application as if they were end users. This provides real-time information about the performance of the system as well as valuable data for trend analysis
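
As a rough illustration of how these pieces fit together, here is a minimal Python sketch using only the standard library. It is not SNMP and not any of the tools named above: it reads one local metric (% free disk) as a stand-in for a polled value, maps it to a log level, and times an end-user-style check. The thresholds, the logger name, and the function names (classify, percent_free_disk, timed_probe, report_disk) are all assumptions made for the example.

```python
import logging
import shutil
import time
from typing import Callable, Tuple

log = logging.getLogger("healthcheck")  # hypothetical logger name

# Hypothetical thresholds (percent free); tune these for your environment.
WARN_BELOW = 20.0
CRIT_BELOW = 10.0


def classify(percent_free: float,
             warn: float = WARN_BELOW,
             crit: float = CRIT_BELOW) -> int:
    """Map a percent-free reading to a logging level."""
    if percent_free < crit:
        return logging.CRITICAL
    if percent_free < warn:
        return logging.WARNING
    return logging.INFO


def percent_free_disk(path: str = ".") -> float:
    """Return the percentage of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def timed_probe(check: Callable[[], bool]) -> Tuple[bool, float]:
    """Run an end-user-style check; return (succeeded, seconds elapsed).

    Any exception raised by the check counts as a failure.
    """
    start = time.monotonic()
    try:
        ok = bool(check())
    except Exception:
        ok = False
    return ok, time.monotonic() - start


def report_disk(path: str = ".") -> int:
    """Log the free-disk reading at the level its value warrants."""
    free = percent_free_disk(path)
    level = classify(free)
    log.log(level, "disk %s: %.1f%% free", path, free)
    return level


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    report_disk(".")
    # Stand-in check; a real probe might fetch a page or run a login.
    ok, elapsed = timed_probe(lambda: True)
    log.info("probe ok=%s elapsed=%.3fs", ok, elapsed)
```

Recording the elapsed time from each probe, rather than only pass/fail, is what makes the same check usable for trend analysis later.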