Difference between revisions of "Resource Monitoring Tools"
Latest revision as of 01:46, 27 October 2023
There are programs available to watch various aspects of your system.
Network
iftop
See active internet connections. e.g.
# iftop -i eth1
This will show you websites that keep a connection open while the tab is left open: a privacy and security nightmare. This is one reason why JavaScript is bad.
Alternative:
netstat | head -n 20
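To go one step past eyeballing the first 20 lines, the netstat output can be summarised per remote host. This is only a sketch: count_established is a made-up helper name, not part of the original page, and it assumes a netstat that supports -t and -n and prints the foreign address in column 5 and the state in column 6.

```shell
#!/bin/bash
# Hypothetical helper (not from the original page): reads `netstat -tn`
# output on stdin and prints "count remote-host" for every ESTABLISHED
# connection, busiest host first.
count_established() {
    awk '$6 == "ESTABLISHED" {
             host = $5
             sub(/:[0-9]+$/, "", host)   # drop the trailing :port
             count[host]++
         }
         END { for (h in count) print count[h], h }' | sort -rn
}

# Usage (assumes netstat accepts -t and -n):
#   netstat -tn | count_established
```

A host that stays at the top of this list while its tab sits idle is the kind of lingering connection iftop makes visible.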
Speed/Bandwidth
iperf3
iperf
ping with the -i flag to set the interval to less than 0.1 seconds (Unix only, and not BusyBox).
speedtest-cli, watched alongside OpenWrt's default LuCI status page (real-time graphs, traffic of the LAN / WAN interface).
ethtool will tell you whether your NIC supports 1000 Mb/s.
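The ethtool check can be scripted roughly as below. This is a sketch under assumptions: supports_gigabit and negotiated_speed are hypothetical helper names, and the parsing assumes the usual ethtool layout with a "Supported link modes:" block and a "Speed:" line, which may differ between drivers and versions.

```shell
#!/bin/bash
# Hypothetical helpers (not from the original page). Both read the output
# of `ethtool <interface>` on stdin.

# Succeeds (exit 0) if a 1000base* mode appears in "Supported link modes".
supports_gigabit() {
    awk '
        /Supported link modes:/ { in_modes = 1; if (/1000base/) found = 1; next }
        in_modes && /:/         { in_modes = 0 }   # next "Key: value" line ends the block
        in_modes && /1000base/  { found = 1 }
        END { exit !found }'
}

# Prints the currently negotiated speed, e.g. 1000Mb/s.
negotiated_speed() {
    awk -F': ' '/^[[:space:]]*Speed:/ { print $2 }'
}

# Usage (eth1 is only an example interface name):
#   ethtool eth1 | supports_gigabit && echo "gigabit capable"
#   ethtool eth1 | negotiated_speed
```

Comparing the two is useful: a NIC that supports 1000 but negotiates 100Mb/s usually points at the cable or the switch port.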
RAM
See RAM usage. Can be watched, to monitor swapping. e.g.
$ vmstat 3
Leave it running. It will update every 3 seconds.
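If you would rather not stare at the columns, the swap-in (si) and swap-out (so) fields can be filtered so that only intervals with swapping are printed. A sketch, assuming the procps vmstat layout where si and so are columns 7 and 8; swap_activity is a made-up name.

```shell
#!/bin/bash
# Hypothetical helper (not from the original page): reads vmstat output on
# stdin and prints a line only for samples where si or so is non-zero.
# Header lines are skipped because their 7th/8th fields are not numbers.
swap_activity() {
    awk '$7 ~ /^[0-9]+$/ && $8 ~ /^[0-9]+$/ && ($7 + $8 > 0) {
             print "swapping: si=" $7 " so=" $8
         }'
}

# Usage:
#   vmstat 3 | swap_activity
```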
htop
Start htop and go into the menus. Change the update rate to 0.1 seconds.
I think this view is superior to the default. It might slow down the machine, so use it with discretion (i.e. don't leave it running).
Filesystem
iotop
See HDD accesses. e.g.
# iotop --only
# iotop -o
The only flag shows only the processes that are actually doing I/O.
# iotop -d 0.01 or -d 0.1
The delay flag can be set to less than 1 second. Some writes are missed otherwise.
See also:
https://hackaday.com/2020/11/05/linux-fu-monitor-disks/
https://en.wikipedia.org/wiki/Bonnie%2B%2B
I can't remember how often I test file system speed, though. I am not working in a data center, so it has never been necessary.
List Open Files
lsof
Note: there are different types of lsof (e.g. BusyBox's).
Filesystem metadata
# dumpe2fs /dev/sda1 | less
Monitor Library Reads from PID
$ ltrace -p -pidhere-
See what a program is doing (ltrace traces the library calls it makes). Note: not available in the ARM deb repos.
cron monitoring scripts
monitor ip address up/down via ping
#!/bin/bash
SERVERIP=$1
LOGFILE=$1_$(date +%A)_LOG
HISTORYFILE=$1_$(date +%A)_LOCKFILE
NOTIFYEMAIL=myemail@address.com

#setup this script in cron each minute, and also
#crontab requires historyfile / lockfile to be removed or emptied (rm file, or : > file) each day or each hour, whatever you prefer.
#note: echo "" > file leaves a newline in the file, which test -s counts as non-empty
#mkdir /var/log/networkalerts
#e.g. $ script.sh <ipaddress>
# in /etc/crontab (% must be escaped as \% inside a crontab line)
#*/3 * * * * root /root/email_alerts/test_up.sh 192.168.1.1
#tune this frequency based on your priority
#0 */2 * * * root rm /var/log/networkalerts/*LOCKFILE
#0 0 * * * root rm /var/log/networkalerts/*$(date +\%A)*LOG

#keep track of time
date >> "/var/log/networkalerts/$LOGFILE"
ping -c 6 "$SERVERIP" >> "/var/log/networkalerts/$LOGFILE"
#nothing after ping, as we need return value
#if return val is error (see man on ping regarding count and deadline)
# == or -eq can be used. == is intuitive, therefore better
if test $? == 1
then
    #if file empty
    #[ -s FILE ] True if FILE exists and has a size greater than zero. Thus, you get "empty.txt" if "diff.txt" is not e>
    #https://stackoverflow.com/questions/9964823/how-to-check-if-a-file-is-empty-in-bash
    #https://mywiki.wooledge.org/BashGuide/TestsAndConditionals for all the other tests like -s
    # [! -s file] to invert didn't work because of missing spaces (i think)
    # must be space between [ and -s and also last bracket. test brackets are unintuitive so don't use them.
    # if [ -s /var/log/networkalerts/$HISTORYFILE ]
    if test -s "/var/log/networkalerts/$HISTORYFILE"
    then
        exit 5
    else
        # Use your favorite mailer here:
        # wiki.zoneminder.com/Email explains how to configure email for devuan
        echo "alert" | mutt -s "Network Down" -- "$NOTIFYEMAIL"
        #lock file / history file
        echo "alertsent" > "/var/log/networkalerts/$HISTORYFILE"
    fi
fi
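The lock file / history file idea in that script (send one alert, then stay quiet until cron clears the file) can be pulled out into a small function. This is just a sketch of the same test -s logic; should_alert is a hypothetical name and the lock file path is whatever you pass in.

```shell
#!/bin/bash
# Hypothetical helper (not from the original page).
# Returns 0 (send the alert) only the first time it is called for a given
# lock file. Later calls return 1 until the file is removed or truncated.
should_alert() {
    local lockfile=$1
    if test -s "$lockfile"; then
        return 1                      # an alert was already sent
    fi
    echo "alertsent" > "$lockfile"    # remember that we alerted
    return 0
}

# Usage inside a monitoring script (path is an example only):
#   if should_alert /var/log/networkalerts/myhost_LOCKFILE; then
#       echo "alert" | mutt -s "Network Down" -- myemail@address.com
#   fi
```

Keeping the check and the write together makes it harder to forget one of them when the same pattern is reused for other alerts.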
hdd full
#usage: feed $1 company name/subject
df -h | grep 100%
if [ $? -eq 0 ]; then
    #send email
    #quote $1 so a subject with spaces stays one argument
    echo "hdd full" | mutt -s "$1" alerts@emailaddress
else
    echo "do nothing"
fi
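The grep above only fires once a filesystem reads exactly 100%, which may be too late. A possible variation is to compare the Capacity column against a threshold. This is a sketch, not the page's method: full_filesystems is a made-up name, the default of 90 is arbitrary, and it assumes POSIX-format output from df -P with mount points that contain no spaces.

```shell
#!/bin/bash
# Hypothetical helper (not from the original page): reads `df -P` output on
# stdin and prints "mountpoint use%" for every filesystem at or above the
# threshold given as $1 (default 90, an arbitrary choice).
full_filesystems() {
    local threshold=${1:-90}
    awk -v limit="$threshold" 'NR > 1 {
             use = $5
             sub(/%$/, "", use)          # "99%" -> "99"
             if (use + 0 >= limit + 0) print $6, $5
         }'
}

# Usage:
#   df -P | full_filesystems 95
```

If the function prints anything, that output can be piped to mutt as the body of the alert in place of the fixed "hdd full" text.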
cpu temperature
#usage: feed $1 company name/subject
sensors | grep -e temp -e Core | cut -c 16-19 | sort | grep '[[:digit:]]' | cut -c 1-2 > /tmp/tmp

input="/tmp/tmp"
overtemp=0
while IFS= read -r line
do
    if [ "$line" -gt 60 ]; then
        echo "$line"
        echo "overtemperature detected."
        overtemp=1
    fi
done < "$input"

#send email, only if a reading was over the limit
if [ "$overtemp" -eq 1 ]; then
    echo "cpu temperature overload detected" | mutt -s "$1" alerts@email
fi

#fan
fanspeed=$(sensors | grep -e fan1 | cut -c 14-17 | sort | grep '[[:digit:]]')
if [ "$fanspeed" -gt 4000 ]; then
    echo "fan speed overload detected" | mutt -s "$1" alerts@email
fi
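The cut -c column positions depend on how sensors happens to align its output on a given machine. A less position-dependent way to get at the numbers might look like this. It is a sketch only: max_temp is a hypothetical name, it assumes readings are printed with a leading + sign (so sub-zero readings are ignored), and it takes the first such number on each line as the current reading.

```shell
#!/bin/bash
# Hypothetical helper (not from the original page): reads `sensors` output
# on stdin and prints the highest current reading, in whole degrees, among
# lines that mention "Core" or "temp". The first +NN on a line is taken as
# the current value; the high/crit limits later on the line are ignored.
max_temp() {
    awk '/Core|temp/ && match($0, /\+[0-9]+/) {
             t = substr($0, RSTART + 1, RLENGTH - 1) + 0
             if (t > max) max = t
         }
         END { print max + 0 }'
}

# Usage (60 is the limit used in the script above):
#   if [ "$(sensors | max_temp)" -gt 60 ]; then echo "overtemperature detected."; fi
```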
monitor hdd usage
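#must run as root for access to dmesg
#COMPANY is not set in this script; set it in the environment before running
LOGFILE=/root/file.log
SUBJECT="hdd details"
echo "" > "$LOGFILE"
echo "$COMPANY" >> "$LOGFILE"
echo "" >> "$LOGFILE"
echo "" >> "$LOGFILE"
df -h >> "$LOGFILE"
echo "" >> "$LOGFILE"
echo "" >> "$LOGFILE"
lsblk >> "$LOGFILE"
echo "" >> "$LOGFILE"
echo "" >> "$LOGFILE"
dmesg | grep -e sda -e sdb -e sdc -e sdd -e sde >> "$LOGFILE"
#quote the subject so "hdd details" stays one argument;
#mutt wants -- between the attachment list and the recipient
echo "" | mutt -s "$SUBJECT" -a "$LOGFILE" -- alerts@email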