Server Monitoring Script (with reporting to Slack)

Purpose

This Python script is a modified version of mfcodeworks' original, designed to run in the background as a supervisor-managed process that starts on every boot. It gathers the following information and sends updates as a Slack message:

  • UUID (Unique for each system to avoid overlapping hostname for multi-network monitoring)
  • Hostname
  • CPU
  • Memory
  • Network Usage
  • Network Cards
  • Hard Drives
  • GPU Memory
  • System OS
  • System Uptime
  • UTC Timestamp
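Several of these fields can be read with the Python standard library alone. The following is a minimal sketch under stated assumptions, not the script's actual collection code: the function name `collect_basic_info` is hypothetical, and using `uuid.getnode()` as the per-machine identifier is an assumption made for illustration.

```python
import os
import platform
import shutil
import socket
import uuid
from datetime import datetime, timezone


def collect_basic_info(root_path="/"):
    """Gather a few of the listed fields using only the standard library."""
    usage = shutil.disk_usage(root_path)  # named tuple: total, used, free (bytes)
    return {
        # Assumption: the MAC-derived node id stands in for the system UUID.
        "uuid": str(uuid.getnode()),
        "hostname": socket.gethostname(),
        "system": {"name": platform.system(), "version": platform.release()},
        "cpu_count": os.cpu_count(),
        "root_drive_used_percent": round(usage.used / usage.total * 100, 1),
        # UTC timestamp in ISO 8601 form, truncated to whole seconds.
        "timestamp": datetime.now(timezone.utc).replace(microsecond=0).isoformat(),
    }


if __name__ == "__main__":
    import json

    print(json.dumps(collect_basic_info(), indent=2))
```

CPU load, memory, network and GPU figures are left out of this sketch because they generally need a third-party library or platform-specific calls.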

The script produces JSON output at a fixed interval, for use with any software or server that accepts JSON input, and sends a Slack message like:

Server Monitor LIMIT (root_drive_used_percent) REACHED

hostname : name-of-your-server
system : {'name': 'Linux', 'version': '4.9.0-11-amd64'}
uptime : 68129
cpu_count : 16
cpu_usage : 0.2
memory_used_percent : 46.3
root_drive_used_percent : 88.6
timestamp : 2020-08-05T03:28:05+00:00
gpu_memory_max_used_percent : 26.53
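A report in this shape could be rendered from a metrics dictionary roughly as follows. This is a sketch, not the script's own formatter: `format_slack_message` is a hypothetical name, and joining several reached limits with commas is an assumption, since the sample above shows only one.

```python
def format_slack_message(metrics, reached_limits):
    """Render a metrics dict as a plain-text report like the sample above."""
    header = "Server Monitor"
    if reached_limits:
        # Assumption: multiple limit names are comma-separated in the header.
        header += " LIMIT ({}) REACHED".format(", ".join(reached_limits))
    # One "key : value" line per metric, in the dict's insertion order.
    lines = ["{} : {}".format(key, value) for key, value in metrics.items()]
    return header + "\n\n" + "\n".join(lines)


if __name__ == "__main__":
    sample = {"hostname": "name-of-your-server", "root_drive_used_percent": 88.6}
    print(format_slack_message(sample, ["root_drive_used_percent"]))
```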

This script can be installed on several machines that report to a central Slack channel.

The destination, checking interval, number of sending attempts after a failure, and timeout between resending attempts can be set through arguments. Run python3 server_monitor.py -h for details.
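To illustrate how "attempts" and "timeout" interact, here is a minimal retry loop. It is a sketch only: `send_with_retries` and its parameters are hypothetical names, the defaults simply mirror the example config values, and the script's real retry logic may differ.

```python
import time


def send_with_retries(send, payload, attempts=10, timeout=60, sleep=time.sleep):
    """Call send(payload) until it succeeds or the attempts run out.

    Returns True on success, False if every attempt raised an exception.
    """
    for attempt in range(1, attempts + 1):
        try:
            send(payload)
            return True
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            print("attempt {}/{} failed: {}".format(attempt, attempts, exc))
            if attempt < attempts:
                sleep(timeout)  # wait before resending
    return False
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.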

Usage

Shallow-clone the repo with

git clone --depth 1 https://github.com/ohjho/Server-Monitoring-Script.git

Make sure you have Python 3.7 installed, then install the required libraries

pip3 install -r requirements.txt

To test the script output run

python3 server_monitor.py

Use with a config file

You can supply a config file with python3 server_monitor.py -c config.yaml to gain more control over the monitor and to receive Slack messages only when limits are reached. Here's an example config file:

dev:
  environment:
    interval: 30    # seconds, monitoring interval
    attempts: 10    # deprecated: retry send message for this many times
    timeout: 60     # deprecated: send message timeout after waiting for this many seconds
    dest: "https://slack.com/api/chat.postMessage"

    machine_ulimit: # only send messages if these upper limits are reached
      memory_used_percent: 80
      root_drive_used_percent: 80
      gpu_memory_max_used_percent: 50

    slack:          # see section on slack setup below
      token: "enter your token here"
      channel: "#unixn00bs"
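The upper-limit check implied by machine_ulimit can be sketched as a simple comparison. The function name `reached_limits` is hypothetical, and treating a value equal to the limit as "reached" is an assumption; the script may use a strict comparison instead.

```python
def reached_limits(metrics, upper_limits):
    """Return the metric names whose value meets or exceeds its upper limit."""
    return [
        name
        for name, limit in upper_limits.items()
        if name in metrics and metrics[name] >= limit
    ]


# Values mirror the example config and the sample Slack message in this README.
machine_ulimit = {
    "memory_used_percent": 80,
    "root_drive_used_percent": 80,
    "gpu_memory_max_used_percent": 50,
}
metrics = {
    "memory_used_percent": 46.3,
    "root_drive_used_percent": 88.6,
    "gpu_memory_max_used_percent": 26.53,
}
print(reached_limits(metrics, machine_ulimit))  # ['root_drive_used_percent']
```

With these numbers only the root drive crosses its limit, which matches the header of the sample message.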

Slack Setup

Create a new app at api.slack.com.

After your app is created, in OAuth & Permissions features, under Scopes make sure you have the following Bot Token Scopes:

[Screenshot: Bot Token Scopes]

Install the app, then copy the Bot User OAuth Access Token for use with the server_monitor script.
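To check the token independently of the monitor, a message can be posted to the chat.postMessage endpoint using only the standard library. This is a sketch, not the script's own sending code: `build_slack_request` and `post_to_slack` are hypothetical names, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Same value as `dest` in the example config.
SLACK_URL = "https://slack.com/api/chat.postMessage"


def build_slack_request(token, channel, text, url=SLACK_URL):
    """Build (but do not send) a chat.postMessage request."""
    body = json.dumps({"channel": channel, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": "Bearer " + token,  # the bot token copied above
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def post_to_slack(token, channel, text):
    """Send the message and return Slack's decoded JSON reply."""
    request = build_slack_request(token, channel, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Slack's reply is a JSON object with an `ok` field; when it is false, the `error` field names the problem (for example a missing scope or an unknown channel).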

Linux Autostart with supervisord

Install supervisor (also written in Python):

sudo apt-get install -y supervisor

Make a config file (e.g. server_monitor.conf) in /etc/supervisor/conf.d/ with the following:

[program:server_monitor]
command=python3 server_monitor.py -c config.yaml
directory=/path/to/Server-Monitoring-Script
autostart=true
autorestart=true
startretries=3
stderr_logfile=~/log/server_monitor.err.log
stdout_logfile=~/log/server_monitor.out.log
user=your_user_name

Make sure ~/log/ (or wherever you choose to keep the logs) exists, then start supervisor and load the new config:

sudo service supervisor start
sudo supervisorctl reread
sudo supervisorctl update

You can use sudo supervisorctl to manage services (type help for a list of commands):

  • fg server_monitor: brings the process to the foreground so you can see its output. Press Ctrl+C to exit fg.

To enable the web interface, add this to /etc/supervisor/supervisord.conf:

[inet_http_server]
port = 0.0.0.0:8080
username = user # Basic auth username
password = pass # Basic auth password

Then restart the server.

Credits

MF Softworks for the original code

JHO for these new features:
