Network health overview with mtr, ss, lsof and iperf3

Radu Zaharia
Sep 17, 2023

We have said many times that personal and home network security is a habit. It’s not about the operating system you install, how careful you are when browsing the Internet, or even which antivirus you choose. You can either trust that your network is not compromised, or you can build a few habits that let you verify it. The most important habit is, of course, the dreaded continuous log scanning. Logs are weird, and scanning them by hand is time-consuming, if not sometimes impossible, but taking the time to build a good fail2ban configuration takes one worry out of the way.

Log scanning allows us to create a few rules that automatically react when something goes wrong. But sometimes malicious activity can be neither recognized nor understood from the default Linux logs. Sometimes our Internet connection seems weird, and we don’t understand why. We know something is wrong, but we cannot really pinpoint the issue. And other times, there is weird file access that goes undetected simply because we don’t know what to look for. In these instances, Linux provides a few tools that help us understand our environment. Let’s talk about them one at a time.

Revealing network issues with mtr

Typical mtr output

We talked before about mtr. The Linux mtr tool, preinstalled with Fedora distributions, runs a complex traceroute combined with ping over the given target, giving us complete information about what path the network infrastructure takes to the target, along with how responsive the nodes in the path are. We can see the typical output in the screenshot above. We have all the hosts on the path from our system to linux.org, and we also have the ping result for each one.

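A well-functioning network will have zero packet loss at the destination. Loss shown only at an intermediate hop is not always a problem, because many routers rate-limit or deprioritize their replies to probes while still forwarding traffic normally; loss that carries through to the final hop is the stronger signal. A good network path will also have low ping response times. As we see in the screenshot above, one such result is over 100ms. This may be considered weird if it persists. But let’s see what a concerning result would look like: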

Issues in the home network

When we look at the result above, we see not only ping results over 100ms, but also packet loss on both the home computer and one of the nodes in the network. This may indicate trouble with the outside network infrastructure, like hackers trying to redirect legitimate calls to malicious servers while making it seem normal, or trouble with the home network, like a faulty Wi-Fi setup or, worse, a poorly masked man-in-the-middle attack.

So, what’s the habit here? Perform mtr scans regularly and see what changes from one run to the next. This operation can be scheduled with cron and the output can be scanned automatically with fail2ban. And what are we looking for? Packet loss, as we saw, and slow ping responses. Packet loss should be reported immediately, while a slow ping response should be compared with the previous value and, if it persists for a long time, treated as a warning sign.
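As a rough illustration of the automated check, here is a minimal Python sketch that reads the text produced by mtr in report mode and lists hops with packet loss or a high average response time. It is a plain script standing in for the scanning step, not a fail2ban filter. The function names, the 100 ms threshold and the sample report are assumptions made for this example, and the regular expression assumes the usual report layout (hop number, host, Loss%, Snt, Last, Avg and so on), which may vary between mtr versions.

```python
import re
import subprocess

# Matches one hop line of the text report, for example:
#   "  2.|-- 10.0.0.1    20.0%    10   12.1 130.5  10.2 300.3  80.4"
HOP_LINE = re.compile(
    r"^\s*(?P<hop>\d+)\.\|--\s+(?P<host>\S+)\s+(?P<loss>[\d.]+)%?"
    r"\s+(?P<sent>\d+)\s+(?P<last>[\d.]+)\s+(?P<avg>[\d.]+)"
)


def run_mtr(target, cycles=10):
    """Run mtr in report mode against target and return its text output."""
    result = subprocess.run(
        ["mtr", "--report", "--report-cycles", str(cycles), target],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_report(text):
    """Turn the report text into a list of per-hop dictionaries."""
    hops = []
    for line in text.splitlines():
        match = HOP_LINE.match(line)
        if match:
            hops.append({
                "hop": int(match["hop"]),
                "host": match["host"],
                "loss": float(match["loss"]),
                "avg_ms": float(match["avg"]),
            })
    return hops


def find_problems(hops, max_avg_ms=100.0):
    """Return one note per hop with packet loss or a slow average response."""
    notes = []
    for hop in hops:
        if hop["loss"] > 0:
            notes.append(f"hop {hop['hop']} ({hop['host']}): {hop['loss']}% packet loss")
        if hop["avg_ms"] > max_avg_ms:
            notes.append(f"hop {hop['hop']} ({hop['host']}): average {hop['avg_ms']} ms")
    return notes


# Illustrative report text, not taken from a real run.
SAMPLE_REPORT = """\
Start: 2023-09-17T10:00:00+0300
HOST: fedora                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- _gateway                   0.0%    10    0.5   0.6   0.4   0.9   0.1
  2.|-- 10.0.0.1                  20.0%    10   12.1 130.5  10.2 300.3  80.4
  3.|-- example.net                0.0%    10   25.0  24.8  23.9  26.1   0.6
"""

if __name__ == "__main__":
    for note in find_problems(parse_report(SAMPLE_REPORT)):
        print(note)
```

In a scheduled job, the sample text would be replaced with the output of run_mtr("linux.org"), and the notes could be written to a log file for later review.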

The automated mtr scan should always run against the same target so that the comparison makes sense. There is no harm in scanning several separate targets at the same time, as long as each target is only ever compared with its own previous results. There is no sense in comparing the linux.org result with the google.com one: they will differ, and the difference will tell us nothing.
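To keep each target compared only with itself, the results can be stored under the target name. The sketch below is one possible way to do it in Python. The file name, the function names and the 1.5x tolerance are assumptions chosen for illustration, and the latency values would come from whatever parses the mtr output.

```python
import json
from pathlib import Path

# Hypothetical location for the stored results.
HISTORY_FILE = Path("mtr_history.json")


def load_history(path=HISTORY_FILE):
    """Load the {target: last average latency in ms} mapping, if present."""
    return json.loads(path.read_text()) if path.exists() else {}


def save_history(history, path=HISTORY_FILE):
    """Write the mapping back to disk for the next run."""
    path.write_text(json.dumps(history, indent=2))


def check_target(history, target, avg_ms, tolerance=1.5):
    """Record avg_ms for target and warn if it grew past tolerance x the last value.

    Only the previous value of the same target is consulted, so results
    for different targets are never compared with each other.
    """
    previous = history.get(target)
    history[target] = avg_ms
    if previous is not None and avg_ms > previous * tolerance:
        return f"{target}: average latency rose from {previous} ms to {avg_ms} ms"
    return None


if __name__ == "__main__":
    history = load_history()
    warning = check_target(history, "linux.org", 24.8)
    if warning:
        print(warning)
    save_history(history)
```

Comparing only against the last run is the simplest choice. A longer history would make it possible to tell a one-off spike from a persistent slowdown.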

Revealing unwanted network connections with ss

Typical ss output

Again, ss is not an alien tool to us. We use it to see all the network connections open on the current computer. Above we can see a typical result and, depending on what services and applications we run, it will be longer or shorter. For a system that was just started and has nothing running, it would look more like this:

Output of ss on a system with no applications running

This is a normal output after a cold system start. We can also see the ports where applications listen for incoming connections, to reveal unwanted servers that may be running:

Output of ss with the l parameter, showing listening sockets

What is the habit? The same: run ss from time to time with the parameters we deem correct for our network. Again, a single data point says little about our system and our network. But running it frequently and comparing the results will immediately reveal outliers and new connections, some potentially unwanted. As before, this operation can be scheduled with cron, and the difference between runs can be monitored with fail2ban.
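The comparison between two runs can be sketched in a few lines of Python. This is only an illustration of the idea, not a fail2ban setup. It assumes the output comes from ss -tunH (TCP and UDP, numeric addresses, no header line) with the usual column order, and it keys each connection on protocol and peer address, because the local port of an outgoing connection changes on every run. The function names and sample lines are made up for the example.

```python
import subprocess


def run_ss():
    """Return the current TCP/UDP connections as text (numeric, no header)."""
    result = subprocess.run(
        ["ss", "-tunH"], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_peers(text):
    """Collect (protocol, peer address:port) pairs from ss output.

    Expected columns: Netid State Recv-Q Send-Q Local:Port Peer:Port
    """
    peers = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 6:
            peers.add((fields[0], fields[5]))
    return peers


def new_peers(previous, current):
    """Return the peers present now that were absent in the previous run."""
    return sorted(current - previous)


# Illustrative snapshots; the addresses are documentation examples.
BEFORE = "tcp ESTAB 0 0 192.168.68.110:43210 198.51.100.20:443\n"
AFTER = (
    "tcp ESTAB 0 0 192.168.68.110:51876 198.51.100.20:443\n"
    "tcp ESTAB 0 0 192.168.68.110:50022 203.0.113.7:4444\n"
)

if __name__ == "__main__":
    for protocol, peer in new_peers(parse_peers(BEFORE), parse_peers(AFTER)):
        print(f"new {protocol} connection to {peer}")
```

In practice the previous snapshot would be saved to a file by the scheduled job and loaded on the next run.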

Each new connection found is a potential threat and may require a manual review. This is similar to older Windows firewall tools, which would show a notification for each new connection detected. The difference here is that we control when we run the tools and what we want to do with the results. Maybe we are not interested in every new connection that shows up, or maybe we have a list we want to keep track of. Either way, automatically monitoring network connections is a healthy and necessary habit.
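One way to cut down the manual reviews is to keep a short list of what we expect and only report the rest. The Python sketch below applies that idea to listening sockets. It assumes the text comes from ss -tulnH (TCP and UDP, listening, numeric, no header), and the expected ports and sample lines are hypothetical values for illustration.

```python
def listening_ports(text):
    """Collect (protocol, port) pairs from listening-socket output of ss.

    Expected columns: Netid State Recv-Q Send-Q Local:Port Peer:Port
    """
    found = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        # The port is whatever follows the last colon, which also
        # works for IPv6 forms such as "[::]:8080".
        port = fields[4].rsplit(":", 1)[-1]
        if port.isdigit():
            found.add((fields[0], int(port)))
    return found


def unexpected(found, expected):
    """Return the listening ports that are not on the expected list."""
    return sorted(found - expected)


# Hypothetical list of services we know about on this machine.
EXPECTED = {("udp", 53), ("tcp", 22)}

# Illustrative output, not taken from a real system.
SAMPLE = """\
udp UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:*
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
tcp LISTEN 0 4096 [::]:8080 [::]:*
"""

if __name__ == "__main__":
    for protocol, port in unexpected(listening_ports(SAMPLE), EXPECTED):
        print(f"unexpected listener: {protocol} port {port}")
```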

Revealing unwanted file access with lsof

Open files by the firefox process

After unwanted network access comes unwanted file access. We talked about lsof before. It’s not a trivial tool to use, and it’s surely even harder to use without automated checks, but it is extremely useful. As we see above, with the -c parameter it will report all open files for a given process. If we skip the -c parameter, it will report all open files for all processes in the system. The output is really verbose, so be ready. Since everything is a file in Linux, there will be many files open.

We may also filter for open files in a given folder, using the +D parameter: lsof +D /var/home/radu. This is again useful to see who wanders in folders they should not. Usually, system applications will work in system folders, while malicious applications will prefer home or network folders. This is a good criterion to check automatically. A typical home folder check output would be:

Xwayland   3670 radu   27u   REG   0,40     1904 454488 /var/home/radu/.cache/nvidia/GLCache/c5ffef57514e7536ba38c6beb1768c70/a872de409e78cfba/a8cd1738c531e7c9.toc
gnome-sof 2098 radu 273r DIR 0,40 18 415 /var/home/radu/.cache/flatpak/system-cache
evolution 47563 radu 31u REG 0,40 16384 13227 /var/home/radu/.var/app/org.gnome.Evolution/config/evolution/mail/properties.db
evolution 47563 radu 13u REG 0,40 28672 14621 /var/home/radu/.pki/nssdb/cert9.db
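Output like this can be grouped by process to see at a glance who touches the home folder. The Python sketch below assumes the default lsof column layout (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME) with every column filled in, as in the lines above. Some file types leave a column empty, so a sturdier script would probably use the machine-readable -F output of lsof instead. The function names and the list of known commands are assumptions made for the example.

```python
from collections import defaultdict


def parse_lsof(text):
    """Group open file names by command from default lsof output."""
    by_command = defaultdict(list)
    for line in text.splitlines():
        # Split into at most 9 fields so file names with spaces stay whole.
        fields = line.split(None, 8)
        # The header line is skipped because its PID column is not a number.
        if len(fields) == 9 and fields[1].isdigit():
            by_command[fields[0]].append(fields[8])
    return dict(by_command)


def unknown_commands(by_command, known):
    """Return the commands with open files that are not on the known list."""
    return sorted(set(by_command) - set(known))


# Sample lines in the same shape as the home folder check above.
SAMPLE = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
gnome-sof 2098 radu 273r DIR 0,40 18 415 /var/home/radu/.cache/flatpak/system-cache
evolution 47563 radu 31u REG 0,40 16384 13227 /var/home/radu/.var/app/org.gnome.Evolution/config/evolution/mail/properties.db
evolution 47563 radu 13u REG 0,40 28672 14621 /var/home/radu/.pki/nssdb/cert9.db
"""

if __name__ == "__main__":
    open_files = parse_lsof(SAMPLE)
    # Hypothetical list of processes we expect to see in the home folder.
    for command in unknown_commands(open_files, known={"evolution"}):
        print(f"{command}: {len(open_files[command])} open file(s)")
```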

We can also check for open files for a given user with lsof -u radu or file access for everyone except the given user with lsof -u ^radu. Both options are very good when searching for unknown threats.

What is the habit then? Again, frequent automatic checks for files open in home folders. If we really want to be thorough, we can check all open files. And what are we looking for? As before, differences between open files across work sessions, sensitive files opened by processes we don’t recognize, weird process names and weird file names. All of that is again a perfect job for fail2ban.

Revealing bandwidth misallocation with iperf3

Typical output of iperf3

We talked about iperf3 before. Sometimes a simple network speed check tells us more than all the tools above combined. It is a quick check, easy to understand and compare, and it has two uses. We can check the speed against an Internet target, which reveals the external network speed, and we can check between two computers in the same network, revealing the internal network speed.

To check the external speed, we just run iperf3 against a given Internet target, as in the screenshot above. The target has to be a host that runs an iperf3 server and accepts our connection. The result is not particularly important on its own, but when compared with previous results it speaks volumes. Why did the speed decrease on that particular day? Why did the speed stay lower after that particular day? Why did the speed suddenly increase? All these questions are excellent conversation starters when thinking about the home network.

To check the internal network speed, we run iperf3 -s on one computer and iperf3 -c 192.168.68.116 on the other. The IP address given is of course that of the first computer, the performance server. This will measure the speed between them in the home network:

Internal network speed test with iperf3

Again, the result is inconclusive on its own, but when compared with a previous scan it sparks the same questions as before: why did the speed increase or decrease, and why then? This will allow us to make deeper scans on the day the difference started. Of course, speed checks should be automated with cron and differences should be scanned with fail2ban. The frequency can be daily, even hourly: with the default settings the test runs for about ten seconds and is a good indicator of the overall network performance and health.
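For automated comparisons, iperf3 can print its result as JSON with the -J option, which is easier to read from a script than the text output. The Python sketch below is one way to extract the received throughput and compare it with the previous run. It assumes a TCP test, where the summary sits under end and sum_received in the JSON. The function names, the 20% threshold and the trimmed sample are assumptions for illustration.

```python
import json
import subprocess


def run_iperf3(server):
    """Run an iperf3 client test against server and return the JSON text."""
    result = subprocess.run(
        ["iperf3", "-c", server, "-J"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def received_mbps(report_json):
    """Extract the received throughput in Mbit/s from an iperf3 JSON report."""
    report = json.loads(report_json)
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def speed_change(previous_mbps, current_mbps, threshold=0.2):
    """Return the relative change if it is at least threshold, else None.

    A negative value means the speed dropped, a positive one that it rose.
    """
    change = (current_mbps - previous_mbps) / previous_mbps
    return change if abs(change) >= threshold else None


# Heavily trimmed, illustrative report; a real one has many more fields.
SAMPLE = '{"end": {"sum_received": {"bits_per_second": 400000000.0}}}'

if __name__ == "__main__":
    current = received_mbps(SAMPLE)
    change = speed_change(previous_mbps=940.0, current_mbps=current)
    if change is not None:
        print(f"speed changed by {change:+.0%} (now {current:.0f} Mbit/s)")
```

Both increases and decreases are reported, since either one raises the question of what changed on that day.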

There you have it. I know we talked about all of these tools before, but I wanted to round them up in a single article to give a quick and reliable all-around approach to home network issue detection. I know there are probably better automated tools out there from big brands, but they also usually cost a lot of money. If you master these tools, you may get the same results without spending any money at all. If you feel this is too much, then a security tool that performs all the above checks is worth investing in.

When dealing with the home network we should not rely on trust. We should rely on metrics. I hope this article will give a good bootstrap on what we are looking for and how to collect and compare this data. Thank you for reading and see you next time!
