Monitoring Server
Monitored Systems and Services
System Name | CPU Usage | Current Load | # Users | DRBD | Disk Space | Heartbeat | LDAP | Memory | # Network Connections | Network I/O | RAID | SMART | SSH | # Processes | HTTP | MySQL | Samba | System Temperature | PSU | PING | Other |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Alan's Desktop | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes | No |
brems.iac.isu.edu | Yes | Yes | Yes | No | Yes | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | No | No | No | Averaging Test, Cluster Room Temperatures |
Computers to Monitor
- Brems
- Slave nodes
- Slurm queue
- Inca
- Webserver
- Wiki
- Seattle
- Backup server
- File server
Non-computer Things to Monitor
- Cluster Room Temp
- CleanRoom temp probes
Things to Monitor
- RAID status (number of up drives)
- Hard drive space (df)
- Memory usage
- Load average
- Temperature (CPU, case, etc.) via lm-sensors
- CPU utilization
- Fan speed/failure
- ITRC
- Number of connections (netstat?)
- Network I/O
- Individual process times
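A few of the items above (disk space, memory usage, load average, RAID status) can be read straight from a Linux host without extra tools. The following is only a minimal sketch and not the configuration actually used on these machines: it assumes Python 3 on Linux, assumes any RAID is Linux software RAID reported in /proc/mdstat, and every function name in it is hypothetical.

```python
import os
import re
import shutil


def disk_used_percent(path="/"):
    """Percentage of space used on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_used_percent(meminfo):
    """Percentage of RAM in use, from MemTotal and MemAvailable (or MemFree)."""
    total = meminfo["MemTotal"]
    available = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    return 100.0 * (total - available) / total


def raid_up_drives(mdstat_text):
    """Return {array_name: (up, expected)} from /proc/mdstat-style text.

    Assumes Linux software RAID (md), where a status line such as
    "[2/1] [U_]" means 2 drives expected and 1 currently up.
    """
    arrays = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if counts and current:
            arrays[current] = (int(counts.group(2)), int(counts.group(1)))
            current = None
    return arrays


def read_text(path):
    """Return the contents of `path`, or an empty string if it is missing."""
    if not os.path.exists(path):
        return ""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    print("disk used %%: %.1f" % disk_used_percent("/"))
    print("load average (1, 5, 15 min):", os.getloadavg())
    meminfo = parse_meminfo(read_text("/proc/meminfo"))
    if "MemTotal" in meminfo:
        print("memory used %%: %.1f" % memory_used_percent(meminfo))
    for name, (up, expected) in raid_up_drives(read_text("/proc/mdstat")).items():
        state = "OK" if up == expected else "DEGRADED"
        print("%s: %d/%d drives up (%s)" % (name, up, expected, state))
```

The parsing functions take text rather than file paths, so they can be checked against saved samples from each host before being wired into whatever monitoring system is chosen. Hardware RAID controllers, temperatures, and fan speeds need vendor tools or lm-sensors and are not covered by this sketch.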