Watchdog
Since
LeoFS v1.2.2
Purpose
The watchdog mechanism monitors CPU, Erlang-IO, disk, and storage-cluster status to keep a node running stably. It also communicates with the MQ mechanism
and the auto-compaction/compaction mechanism: the watchdog affects their processing, namely the consumption of MQ messages and the deletion of unnecessary objects.
Getting Started
Prerequisites
Note
This mechanism is supported on CentOS 6.5/7.0, Ubuntu Server 14.04 LTS, FreeBSD, and SmartOS. Disk monitoring
uses the iostat and df commands, so you need to install them before you start using the watchdog.
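As a quick pre-flight check for the note above, the following minimal sketch reports whether the iostat and df commands can be found on PATH. The helper name `missing_commands` is hypothetical and is not part of LeoFS; the check only confirms that the executables exist, not that the watchdog can parse their output on a given platform.

```python
import shutil


def missing_commands(required=("iostat", "df")):
    """Return the commands from `required` that are not found on PATH."""
    return [cmd for cmd in required if shutil.which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        # Install the missing tools before enabling the disk watchdog.
        print("Missing commands:", ", ".join(missing))
    else:
        print("iostat and df are both available.")
```

Run it on each node where you intend to enable disk monitoring.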
Configuration of LeoFS Gateway
Note
All watchdogs are disabled by default. Before using this mechanism, you need to turn them on.
Modify the watchdog section of leo_gateway.conf.
LeoFS Gateway’s watchdog properties:
Property | Default value | Description |
---|---|---|
rex (rpc-server) | ||
watchdog.rex.interval | 5 | Watch interval (sec) |
watchdog.rex.threshold_mem_capacity | 33554432 | Threshold memory capacity for binaries (bytes) |
CPU | ||
watchdog.cpu.is_enabled | false | Is cpu-watchdog enabled? [true|false] |
watchdog.cpu.interval | 5 | Watch interval (sec) |
watchdog.cpu.raised_error_times | 3 | An error is raised to subscribers when the number of errors reaches this value. |
watchdog.cpu.threshold_cpu_load_avg | 5.0 | Threshold CPU load average for 1min/5min |
watchdog.cpu.threshold_cpu_util | 100 | Threshold CPU utilization (%) |
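As an illustration, a watchdog section of leo_gateway.conf that turns on the CPU watchdog might look like the sketch below. The property names and values are taken from the table above, with only is_enabled changed from its default; the `key = value` layout and the `##` comment style are assumptions about the file syntax, so compare them with the leo_gateway.conf shipped with your release.

```
## Sketch only: "key = value" syntax and "##" comments are assumed
## CPU watchdog, enabled (default is false)
watchdog.cpu.is_enabled = true
watchdog.cpu.interval = 5
watchdog.cpu.raised_error_times = 3
watchdog.cpu.threshold_cpu_load_avg = 5.0
watchdog.cpu.threshold_cpu_util = 100
```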
Configuration of LeoFS Storage
Note
All watchdogs are disabled by default. Before using this mechanism, you need to turn them on.
Modify the watchdog section of leo_storage.conf.
LeoFS Storage’s watchdog properties:
Property | Default value | Description |
---|---|---|
rex (rpc-server) | ||
watchdog.rex.interval | 10 | Watch interval (sec) |
watchdog.rex.threshold_mem_capacity | 33554432 | Threshold memory capacity for binaries (bytes) |
CPU | ||
watchdog.cpu.is_enabled | false | Is cpu-watchdog enabled? [true|false] |
watchdog.cpu.interval | 10 | Watch interval (sec) |
watchdog.cpu.raised_error_times | 5 | An error is raised to subscribers when the number of errors reaches this value. |
watchdog.cpu.threshold_cpu_load_avg | 5.0 | Threshold CPU load average for 1min/5min |
watchdog.cpu.threshold_cpu_util | 100 | Threshold CPU utilization (%) |
DISK | ||
watchdog.disk.is_enabled | false | Is disk-watchdog enabled? [true|false] |
watchdog.disk.interval | 10 | Watch interval (sec) |
watchdog.disk.raised_error_times | 5 | An error is raised to clients when the number of errors reaches this value. |
watchdog.disk.threshold_disk_use | 85 | Threshold disk usage (capacity) (%); leo_watchdog uses the df command |
watchdog.disk.threshold_disk_util | 100 | Threshold disk utilization (%); leo_watchdog uses the iostat command |
watchdog.disk.threshold_disk_rkb | 98304 | Threshold disk read (KB/sec) |
watchdog.disk.threshold_disk_wkb | 98304 | Threshold disk write (KB/sec) |
watchdog.disk.target_devices | [] | Target devices for checking disk utilization |
Cluster | ||
watchdog.cluster.is_enabled | false | Is cluster-watchdog enabled? [true|false] |
watchdog.cluster.interval | 1 | Watch interval (sec) |
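As an illustration, a watchdog section of leo_storage.conf that turns on the disk and cluster watchdogs might look like the sketch below. The property names and values come from the table above, with only the is_enabled flags changed from their defaults; the `key = value` layout and the `##` comment style are assumptions about the file syntax. watchdog.disk.target_devices is left out because the table does not show how a non-empty device list is written.

```
## Sketch only: "key = value" syntax and "##" comments are assumed
## Disk watchdog, enabled (default is false)
watchdog.disk.is_enabled = true
watchdog.disk.interval = 10
watchdog.disk.raised_error_times = 5
watchdog.disk.threshold_disk_use = 85
watchdog.disk.threshold_disk_util = 100
watchdog.disk.threshold_disk_rkb = 98304
watchdog.disk.threshold_disk_wkb = 98304

## Cluster watchdog, enabled (default is false)
watchdog.cluster.is_enabled = true
watchdog.cluster.interval = 1
```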