Difference between revisions of "Resource Monitoring Tools"

From Steak Wiki

Latest revision as of 01:46, 27 October 2023

There are programs available to watch various aspects of your system.

Network

iftop

See active network connections, e.g.:

# iftop -i eth1

This will show you websites that don't close their connection while the tab is left open: a privacy and security nightmare, and one reason why JavaScript is bad.

Alternative:

netstat | head -n 20

Speed/Bandwidth

iperf3
iperf
ping with the -i flag, to set the interval to less than 0.1 seconds (Unix only, and not the busybox ping).
speedtest-cli, watched alongside the OpenWrt default LuCI status page (real-time graphs, traffic of the LAN / WAN interface).
ethtool will tell you whether your NIC supports 1000 Mb/s.
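
The ethtool check can be scripted. This is only a sketch, assuming the usual ethtool output layout with a "Supported link modes:" field. supports_gigabit is a made-up helper name, and eth0 is a placeholder interface.

```shell
#!/bin/sh
# supports_gigabit (hypothetical helper): reads `ethtool <iface>` output on stdin,
# exits 0 if a 1000base* mode is listed under "Supported link modes:".
supports_gigabit() {
    awk '
        /Supported link modes:/ { in_modes = 1 }
        # any other "Label:" line ends the supported-modes field
        in_modes && /:/ && !/Supported link modes:/ { in_modes = 0 }
        in_modes && /1000base/ { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# usage (eth0 is a placeholder):
#   ethtool eth0 | supports_gigabit && echo "gigabit capable"
```

Only the supported-modes field is checked, so a gigabit link partner on the other end of the cable does not give a false positive.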

RAM

vmstat shows RAM usage. It can be left running to monitor swapping, e.g.:

$ vmstat 3

Leave it running. It will update every 3 seconds.
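
To pick out only the samples where swapping happened, the output can be filtered. A rough sketch, assuming the standard procps column order where si and so are columns 7 and 8. swap_activity is a made-up helper name.

```shell
#!/bin/sh
# swap_activity (hypothetical helper): reads vmstat output on stdin and prints
# only the sample lines where si (column 7) or so (column 8) is nonzero.
# Header lines are skipped because their first field is not a number.
swap_activity() {
    awk '$1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0)'
}

# usage:
#   vmstat 3 | swap_activity
```

If your vmstat prints its columns in a different order, adjust the field numbers.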

htop

In htop, go into the menus and change the update rate to 0.1 seconds.

I think this view is superior to the default. It might slow down the machine, so use it with discretion (i.e. don't leave it running).

Filesystem

iotop

See HDD accesses, e.g.:

# iotop --only
# iotop -o

The --only (-o) flag shows only the processes that are actually doing I/O.

# iotop -d 0.1
# iotop -d 0.01

The delay flag (-d) can be set to less than 1 second. Some writes are missed otherwise.


See also:
https://hackaday.com/2020/11/05/linux-fu-monitor-disks/
https://en.wikipedia.org/wiki/Bonnie%2B%2B

I can't remember how often I test file system speed, though. I am not working in a data center, so it has never been necessary.

List Open Files

lsof

Note: there are different implementations of lsof (e.g. busybox's).

Filesystem metadata

# dumpe2fs /dev/sda1 | less

Monitor Library Reads from PID

$ ltrace -p -pidhere-

See what a running program is doing, by tracing its library calls. (Note: not available on ARM deb repos)

cron monitoring scripts

monitor ip address up/down via ping

#!/bin/bash

SERVERIP=$1
LOGDIR=/var/log/networkalerts
LOGFILE=$LOGDIR/$1_$(date +%A)_LOG
HISTORYFILE=$LOGDIR/$1_$(date +%A)_LOCKFILE
NOTIFYEMAIL=myemail@address.com

#set this script up in cron to run every few minutes. cron must also
#blank (echo "" > file) or remove the historyfile / lockfile each day or each hour, whatever you prefer.
#mkdir /var/log/networkalerts
#e.g. $ script.sh <ipaddress>
# in /etc/crontab (a bare % is special in a crontab, so it is escaped as \%)
#*/3 * * * *   root /root/email_alerts/test_up.sh 192.168.1.1 #tune this frequency based on your priority
#0 */2 * * * root rm /var/log/networkalerts/*LOCKFILE
#0 0 * * *   root rm /var/log/networkalerts/*$(date +\%A)*LOG

#keep track of time
  date >> "$LOGFILE"
  ping -c 6 "$SERVERIP" >> "$LOGFILE"
#nothing after ping, as we need return value
#a nonzero return val is an error (see man on ping regarding count and deadline):
# 1 means no reply was received, 2 means some other error
  if test $? -ne 0
  then
#if file empty
#[ -s FILE ] True if FILE exists and has a size greater than zero. Thus, you get "empty.txt" if "diff.txt" is not e>
#https://stackoverflow.com/questions/9964823/how-to-check-if-a-file-is-empty-in-bash
#https://mywiki.wooledge.org/BashGuide/TestsAndConditionals    for all the other tests like -s
#   [! -s file] to invert didn't work because of the missing spaces: it must be [ ! -s file ]
# there must be a space between [ and -s and also before the last bracket. test brackets are unintuitive so don't use them.
#  if [ -s "$HISTORYFILE" ]
  if test -s "$HISTORYFILE"
   then
    #an alert was already sent, so stay quiet
    exit 5
   else
    # Use your favorite mailer here:
    # wiki.zoneminder.com/Email explains how to configure email for devuan
    echo "alert" | mutt  -s "Network Down" -- "$NOTIFYEMAIL"
    #lock file / history file
    echo "alertsent" > "$HISTORYFILE"
   fi
  fi
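
The lock file logic in the script above can be pulled out into a small function, which makes it easier to try by hand. A sketch only: alert_once is a made-up name, and the mutt line in the usage comment is the same mailer call as in the script.

```shell
#!/bin/sh
# alert_once (hypothetical helper): succeeds (returns 0) only the first time
# it is called for a given lock file, until that file is blanked or removed.
alert_once() {
    lockfile=$1
    if test -s "$lockfile"; then
        return 1    # lock file is non-empty: an alert was already sent
    fi
    echo "alertsent" > "$lockfile"
    return 0
}

# usage:
#   ping -c 6 "$SERVERIP" > /dev/null ||
#     { alert_once "$HISTORYFILE" && echo "alert" | mutt -s "Network Down" -- "$NOTIFYEMAIL"; }
```

Blanking the lock file from cron re-arms the alert, the same as in the full script.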

hdd full

#usage: feed $1 company name/subject
#-q keeps grep quiet, as only its return value is needed
if df -h | grep -q 100%; then
#send email
  echo "hdd full" | mutt -s "$1" alerts@emailaddress
else
  echo "do nothing"
fi
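
Waiting for 100% is a bit late. A possible variation that warns at a chosen percentage, sketched under the assumption that df -P (POSIX output format) is available and that mount points contain no spaces. full_filesystems is a made-up helper name.

```shell
#!/bin/sh
# full_filesystems (hypothetical helper): reads `df -P` output on stdin and
# prints "mountpoint use%" for every filesystem at or above the given limit.
full_filesystems() {
    limit=$1
    awk -v limit="$limit" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)               # strip the percent sign
            if (use + 0 >= limit) print $6, $5
        }
    '
}

# usage:
#   df -P | full_filesystems 90
```

An empty result means nothing is over the limit, so the cron script only needs to send mail when the output is non-empty.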

cpu temperature

#usage: feed $1 company name/subject
#the cut columns depend on the layout of your sensors output, so adjust them to suit
sensors | grep -e temp -e Core | cut -c 16-19 | sort | grep '[[:digit:]]' | cut -c 1-2 > /tmp/tmp

input="/tmp/tmp"
overtemp=0
while IFS= read -r line
do
  if [ "$line" -gt 60 ]; then
    echo "$line"
    echo "overtemperature detected."
    overtemp=1
  fi
done < "$input"

#send email, but only if a reading was over the limit
if [ "$overtemp" -eq 1 ]; then
  echo "cpu temperature overload detected" | mutt -s "$1" alerts@email
fi

#fan
fanspeed=$(sensors | grep -e fan1 | cut -c 14-17 | sort | grep '[[:digit:]]')
if [ "$fanspeed" -gt 4000 ]; then
  echo "fan speed overload detected" | mutt -s "$1" alerts@email
fi
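
The fixed cut columns above break if the sensors layout shifts. A rough alternative that picks the first signed number on each Core/temp line instead, assuming lm-sensors style lines such as "Core 0:  +45.0°C  (high = ...)". hot_sensors is a made-up helper name.

```shell
#!/bin/sh
# hot_sensors (hypothetical helper): reads `sensors` output on stdin and prints
# "label value" for each Core/temp line whose first reading exceeds the limit.
hot_sensors() {
    limit=$1
    awk -v limit="$limit" '
        /^(Core|temp)/ && match($0, /[+-][0-9]+(\.[0-9]+)?/) {
            value = substr($0, RSTART, RLENGTH) + 0    # first signed number on the line
            label = substr($0, 1, index($0, ":") - 1)  # text before the colon
            if (value > limit) print label, value
        }
    '
}

# usage:
#   sensors | hot_sensors 60
```

As with the script above, mail would only be sent when the output is non-empty.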

monitor hdd usage

#must run as root for access to dmesg
#usage: feed $1 company name (an assumption, as COMPANY was never set in the original)

COMPANY=$1
LOGFILE=/root/file.log
SUBJECT="hdd details"

#build the report in one go, with blank lines between the parts
{
  echo ""
  echo "$COMPANY"
  echo ""
  echo ""
  df -h
  echo ""
  echo ""
  lsblk
  echo ""
  echo ""
  dmesg | grep -e sda -e sdb -e sdc -e sdd -e sde
} > "$LOGFILE"

#quote the subject, or mutt takes its second word as a recipient
#mutt wants -- between the attachment and the address
echo "" | mutt -s "$SUBJECT" -a "$LOGFILE" -- alerts@email