Traceroute in Python: A Powerful Wrapper Guide Exposed

Image: a penguin wrapped by a python, illustrating this article’s idea of wrapping the Linux ‘traceroute’ program for Python users. Part of this image was generated by DALL-E 2.

This week, we will create a Linux program wrapper for the ‘traceroute’ program to make tracing a network route easier for a Python user. By the end of this article, I will have demonstrated a few useful techniques in data parsing and object orientation that you might find helpful.

The Target: ‘traceroute,’ Linux, and Python

Here is an abstract view of the entire system:

Diagram: a high-level view of the system, with ‘Python Business Logic’ at the top, the ‘Python Linux Layer’ in the middle, and ‘Linux’ at the bottom.

In the diagram, the ‘Python Linux Layer’ is the thing we want to program. ‘Traceroute’ resides at the lowest ‘Linux’ layer.

We want to wrap the ‘traceroute’ tool on a Linux system and make it usable in a Python program. ‘Traceroute’ lists the routers and gateways that a network packet passes through on its way to a remote host.

A Plan

We need to be able to:

  • Call the program from Python and capture its data
  • Parse the data into an importable and usable object

The first will be relatively easy, but the second is where the work lies. If we do it in a systematic way and keep the goal in mind, however, it shouldn’t be too difficult.

Calling the ‘traceroute’ Program and Storing its Output

First things first, we need to call the program…

import subprocess

subprocess.call(["traceroute", "-4", "periaptapis.net"])

Wow! That was easy!

To explain this a little bit, we are using ‘traceroute’ in IPv4 mode (the ‘-4’ flag).

Right now, it is printing directly to the console, which is not ideal — we need that data for parsing!

To accomplish this, we will switch the function we call from ‘call’ to ‘run.’ This function has a ‘capture_output’ parameter (available since Python 3.7). According to the documentation, when ‘capture_output’ is set to True, ‘run’ returns a ‘CompletedProcess’ object whose ‘stdout’ attribute contains the program’s output.

import subprocess

ret = subprocess.run(["traceroute", "-4", "periaptapis.net"], capture_output=True)

We can access that captured output like this:

print(ret.stdout.decode())

We use the ‘decode’ method because stdout is in bytes and we need a string.
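
As a side note, ‘run’ can also do the decoding itself when it is given ‘text=True.’ The sketch below is one possible way to wrap the call, not the approach used in the rest of this article. The helper name ‘run_and_capture’ and its ‘timeout_s’ parameter are hypothetical, and the demo runs the Python interpreter itself rather than ‘traceroute’ so that it works on any machine.

```python
import subprocess
import sys


def run_and_capture(args, timeout_s=60.0):
    """Run a command and return its standard output as a str (hypothetical helper)."""
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the captured bytes to str
        timeout=timeout_s,    # give up if the command runs too long
    )
    if completed.returncode != 0:
        # Surface the command's own error message instead of failing silently
        raise RuntimeError(
            f"{args[0]} exited with code {completed.returncode}: "
            f"{completed.stderr.strip()}"
        )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in command; replace with ["traceroute", "-4", "periaptapis.net"] on Linux
    print(run_and_capture([sys.executable, "-c", "print('hello')"]))
```

Checking the return code matters here because ‘traceroute’ reports problems such as an unknown host on stderr, which would otherwise go unnoticed.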

Wow! Only 3 lines of code and we already have the data in a place that is very accessible!

Data Parsing

One of the most useful skills I think I’ve gained from programming in Python is data parsing.

Specifically, as a firmware engineer, I produce a lot of serial log data for the boards that I work on. Without the data parsing techniques I have accumulated, I would be spending the majority of my time scrolling long text files looking for less than 1% of the information they hold. Not just once either, because I (like every other programmer) never get it right on the first try.

Data parsing is also useful elsewhere in website scraping, big data collection, and more. Once you get fast enough, it will be hard not to write a parser for everything.

Taking a Peek at our ‘traceroute’ Output Data

The 3-line program we wrote above produces the following output:

traceroute to periaptapis.net (147.182.254.138), 30 hops max, 60 byte packets
 1  home (192.168.0.1)  0.267 ms  0.618 ms  0.583 ms
 2  boid-dsl-gw13.boid.qwest.net (184.99.64.13)  3.425 ms  3.177 ms  3.032 ms
 3  184-99-65-97.boid.qwest.net (184.99.65.97)  3.143 ms  3.458 ms  3.038 ms
 4  ae27.edge6.Seattle1.Level3.net (4.68.63.133)  13.147 ms  13.216 ms  30.300 ms
 5  * * *
 6  * DIGITAL-OCE.edge9.SanJose1.Level3.net (4.71.117.218)  29.985 ms *
 7  * * *
 8  * * *
 9  * * *
10  147.182.254.138 (147.182.254.138)  31.194 ms  30.308 ms  29.753 ms

Oof, that is going to be a pain to parse out the data we want. Let’s take this apart.

The first line is something we already know – we are tracing the route to ‘periaptapis.net.’ From there, each line represents a station that the packet is routed through. These “stations” are things like routers and gateways. The last line is the host that we wanted to reach.

By default, there are 3 queries per station. This helps the user know a couple of things, like average round-trip time and whether there are other hosts that respond to packets at that station number. Below is the breakdown for one of our lines of output:

Figure: one line of ‘traceroute’ output, broken down into the following components:
  1. The station number
  2. The hostname and IP address of the station
  3. Query #1 round-trip time
  4. Query #2 round-trip time
  5. Query #3 round-trip time

If there is an asterisk in a field, the station didn’t respond to the packet we sent. Also, there is a chance that we get a different host in any of the three queries for a given station number. This is due to network configurations to handle incoming requests.
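
To make the breakdown above concrete, here is a small sketch that splits one default-format line into those five components. The function name ‘split_station_line’ and the regular expression are illustrative assumptions, and the sketch only handles the simple case: a line that does not start with a station number followed by ‘host (ip)’ returns None.

```python
import re

# Station number, hostname, IP address in parentheses, then the rest of the line
STATION_LINE = re.compile(
    r"^\s*(?P<station>\d+)\s+(?P<host>\S+) "
    r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\)(?P<rest>.*)$"
)


def split_station_line(line):
    """Return (station, host, ip, [round-trip times in ms]) or None."""
    match = STATION_LINE.match(line)
    if match is None:
        return None
    # Every "<number> ms" left on the line is one query's round-trip time
    times = [float(t) for t in re.findall(r"(\d+\.\d+) ms", match.group("rest"))]
    return int(match.group("station")), match.group("host"), match.group("ip"), times


if __name__ == "__main__":
    line = " 2  boid-dsl-gw13.boid.qwest.net (184.99.64.13)  3.425 ms  3.177 ms  3.032 ms"
    print(split_station_line(line))
    print(split_station_line(" 5  * * *"))
```

Lines such as station 6 in the output above, where a different host answers part-way through, are exactly what makes the default format awkward to parse.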

Simplifying Output

To simplify the output of the ‘traceroute’ program and make our parsing lives easier, we will introduce the ‘-n’ and ‘-N1’ flags to our ‘traceroute’ program call.

-N1

The ‘-N1’ flag tells the ‘traceroute’ program to send only one probe packet at a time instead of several simultaneously.

Sending one packet at a time is just a kindness to the devices we will be tracing so we don’t send a bunch of packets each time we want to trace a route. The downside is that the results will be slower to come in.

-n

The ‘-n’ simply tells traceroute to not print the hostnames, simplifying things further by obtaining only the IP address of the station.

With those two flags added, our call now looks like this:

ret = subprocess.run(["traceroute", "-4", "-N1", "-n", "periaptapis.net"], capture_output=True)

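Our call also produces clean output. The listing below shows one latency per station; if your ‘traceroute’ still prints three per line (three queries per station is the default, controlled by the ‘-q’ option), the parsing code later in this article reads only the first address and latency on each line: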

traceroute to periaptapis.net (147.182.254.138), 30 hops max, 60 byte packets
 1  192.168.0.1  0.657 ms
 2  184.99.64.13  9.288 ms
 3  184.99.65.97  3.681 ms
 4  4.68.63.133  14.012 ms
 5  4.69.219.65  29.552 ms
 6  4.71.117.218  30.166 ms
 7  *
 8  *
 9  *
10  147.182.254.138  30.837 ms

This will make our lives much easier!

Coding an Object

Finally, we get to creating the class that another Python program can import and use. We will call it ‘TraceRouteInstance,’ and its constructor takes a single argument, ‘hostname_or_ip,’ which we will use in our ‘traceroute’ call.

So far, our object looks like this:

import subprocess

class TraceRouteInstance:
    def __init__(self, hostname_or_ip: str):
        self.traceroute_data = subprocess.run(
            ["traceroute", "-4", "-N1", "-n", hostname_or_ip], 
            capture_output=True
            )

We store our ‘traceroute’ return data in a ‘traceroute_data’ variable for later parsing. To store each of the stations, we will use a named tuple, shown below:

from collections import namedtuple

Station = namedtuple('Station', ['ip', 'latency_ms'])

The syntax might look a little strange, but it enables us to reference the data in the tuple by a name which looks cleaner in the long run.
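
For readers who have not used named tuples before, this short sketch shows what referencing the data by name looks like in practice. The values are taken from the simplified output above; the variable name ‘hop’ is just for illustration.

```python
from collections import namedtuple

# Field names must be valid Python identifiers (letters, digits, underscores)
Station = namedtuple("Station", ["ip", "latency_ms"])

hop = Station(ip="192.168.0.1", latency_ms=0.657)

print(hop.ip)          # access by name
print(hop[1])          # still a tuple, so access by index also works
ip, latency = hop      # and it unpacks like any other tuple
print(hop._asdict())   # convert to a dictionary, e.g. for JSON output
```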

Parsing

Next we need to parse the output.

One way to do this would be by stripping leading and trailing white space on each line, then checking for a digit at the first character. Stripping the line of white space at the start will ensure that the line has a digit as its first character (if it is a line we care about).

This is how a newcomer might approach the problem.
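
For comparison, the string-manipulation approach could look roughly like the sketch below. The function name ‘parse_line_with_split’ is hypothetical, and the code assumes the simplified one-latency-per-line output shown earlier.

```python
def parse_line_with_split(line):
    """Return (ip, latency_ms), ('*', '*') for a silent station, or None."""
    stripped = line.strip()
    # Lines we care about start with the station number
    if not stripped or not stripped[0].isdigit():
        return None

    fields = stripped.split()
    if len(fields) >= 2 and fields[1] == "*":
        return ("*", "*")      # the station did not respond
    if len(fields) < 3:
        return None            # not enough fields to hold an IP and a latency

    try:
        latency = float(fields[2])
    except ValueError:
        return None            # third field was not a number
    return (fields[1], latency)


if __name__ == "__main__":
    print(parse_line_with_split(" 1  192.168.0.1  0.657 ms"))
    print(parse_line_with_split(" 7  *"))
```

It works, but each assumption about the line format turns into another length check or ‘try’ block.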

However, to avoid so much string manipulation, I always suggest using regular expressions to extract data from lines of text. Regular expressions are difficult to understand in the beginning, but once you understand them, they will save you headaches down the road and immensely reduce the complexity of data parsing.

Enter Regular Expressions

Let’s take a look at the regular expression we will be using for this exercise:

"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"

In this, we have 3 named groups: ‘station_number,’ ‘ip_address,’ and ‘latency.’ We can use this regular expression to search a line and reference the named groups to extract the data we want.
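
Here is a quick sketch of that regular expression applied to single lines from the simplified output, to show what the named groups return. The variable names are illustrative; note that every group comes back as a string, so numbers still need converting.

```python
import re

station_regex = re.compile(
    r"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"
)

match = station_regex.search(" 1  192.168.0.1  0.657 ms")
print(match.group("ip_address"))   # one group by name
print(match.groupdict())           # all named groups as a dictionary

# A station that did not respond has no IP or latency, so there is no match
no_match = station_regex.search(" 7  *")
print(no_match)
```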

Our parse function looks like this:

    def parse_data(self):
        station_regex = r"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"

        for line in self.traceroute_data.split("\n"):
            re_match = re.search(station_regex, line)

            if re_match:
                ip_address = re_match.group("ip_address")
                latency = float(re_match.group("latency"))

                self.route_list.append(Station(ip_address, latency))
            elif '*' in line:
                self.route_list.append(Station('*', '*'))

We take the output line-by-line, searching the line for our regular expression. If the regular expression is found, we use the named groups to extract the data from the line, adding it to our ‘route_list.’

If we don’t find the regular expression but we see an asterisk, we assume that the station didn’t respond and add a default value to our ‘route_list.’
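
To try the parsing logic without running ‘traceroute,’ the same loop can be sketched as a standalone function that is fed saved output. ‘parse_traceroute_text’ and ‘SAMPLE_OUTPUT’ are hypothetical names, and the sample text is the simplified output shown earlier.

```python
import re
from collections import namedtuple

Station = namedtuple("Station", ["ip", "latency_ms"])

STATION_REGEX = re.compile(
    r"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"
)

SAMPLE_OUTPUT = """traceroute to periaptapis.net (147.182.254.138), 30 hops max, 60 byte packets
 1  192.168.0.1  0.657 ms
 2  184.99.64.13  9.288 ms
 3  184.99.65.97  3.681 ms
 4  4.68.63.133  14.012 ms
 5  4.69.219.65  29.552 ms
 6  4.71.117.218  30.166 ms
 7  *
 8  *
 9  *
10  147.182.254.138  30.837 ms
"""


def parse_traceroute_text(text):
    """Turn simplified traceroute output into a list of Station tuples."""
    route = []
    for line in text.split("\n"):
        match = STATION_REGEX.search(line)
        if match:
            route.append(Station(match.group("ip_address"), float(match.group("latency"))))
        elif "*" in line:
            # No IP/latency but an asterisk: the station did not respond
            route.append(Station("*", "*"))
        # Anything else (the header line, blank lines) is ignored
    return route


if __name__ == "__main__":
    for station in parse_traceroute_text(SAMPLE_OUTPUT):
        print(station)
```

Keeping the parser separate from the ‘subprocess’ call like this also makes it easy to test against output captured on another machine.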

Final Object Code

Finally, our importable ‘traceroute’ Python wrapper looks like this:

import subprocess
import re
from collections import namedtuple

Station = namedtuple('Station', ['ip', 'latency_ms'])

class TraceRouteInstance:
    """Runs 'traceroute' once and stores the parsed route in 'route_list'."""

    def __init__(self, hostname_or_ip: str):
        # Run traceroute (IPv4, one probe at a time, numeric output) and keep its stdout as text
        self.traceroute_data = subprocess.run(
            ["traceroute", "-4", "-N1", "-n", hostname_or_ip],
            capture_output=True
            ).stdout.decode()
        self.route_list = []

        self.parse_data()

    def parse_data(self):
        # Station number, IPv4 address, and latency, each separated by two spaces
        station_regex = r"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"

        for line in self.traceroute_data.split("\n"):
            re_match = re.search(station_regex, line)

            if re_match:
                ip_address = re_match.group("ip_address")
                latency = float(re_match.group("latency"))

                self.route_list.append(Station(ip_address, latency))
            elif '*' in line:
                # The station did not respond, so store a placeholder entry
                self.route_list.append(Station('*', '*'))

As a simple test, let’s write a main function in this script that utilizes the above object and prints out the trace list.

if __name__ == "__main__":
    tr = TraceRouteInstance("periaptapis.net")
    print(tr.route_list)

If you run this, it will eventually produce the following output:

[Station(ip='192.168.0.1', latency_ms=0.624), Station(ip='184.99.64.13', latency_ms=3.244), Station(ip='184.99.65.97', latency_ms=3.445), Station(ip='4.68.63.133', latency_ms=13.911), Station(ip='4.69.219.65', latency_ms=29.505), Station(ip='4.7.18.10', latency_ms=30.912), Station(ip='*', latency_ms='*'), Station(ip='*', latency_ms='*'), Station(ip='*', latency_ms='*'), Station(ip='147.182.254.138', latency_ms=31.139)]

This output shows that we have successfully extracted the route that a packet takes from my machine to ‘periaptapis.net.’
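
Once the route is a list of named tuples, ordinary Python is enough to work with it. As an illustration, the sketch below filters out the stations that did not respond and finds the slowest one that did. The helper name ‘responding_stations’ is hypothetical, and the list is a shortened copy of the output above.

```python
from collections import namedtuple

Station = namedtuple("Station", ["ip", "latency_ms"])

# Shortened copy of the route printed above
route_list = [
    Station("192.168.0.1", 0.624),
    Station("184.99.64.13", 3.244),
    Station("*", "*"),
    Station("147.182.254.138", 31.139),
]


def responding_stations(route):
    """Keep only the stations that answered (silent ones are stored as '*')."""
    return [station for station in route if station.ip != "*"]


answered = responding_stations(route_list)
slowest = max(answered, key=lambda station: station.latency_ms)

print(len(answered), "of", len(route_list), "stations responded")
print("highest latency:", slowest.ip, slowest.latency_ms, "ms")
```

Filtering first matters because the placeholder latency is the string ‘*’, which cannot be compared with the float latencies.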

Conclusion

In this post, we explored wrapping a Linux program with a Python interface to make it more accessible to other Python programs. If you run this program, however, you will notice how long it takes to create an object. This is because the constructor waits for ‘traceroute’ to finish tracing the route.

In a future post, we might explore options on how to mitigate the effects of this load time for external programs by using the ‘asyncio’ built-in Python library or by multi-threading.
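
As a rough preview of the multi-threading idea, and not necessarily the approach a future post will take, several traces could be started at once with the standard library’s ‘concurrent.futures’ module. In this sketch, ‘trace_many’ is a hypothetical helper and ‘slow_trace’ is a stand-in that only sleeps, so the example runs without ‘traceroute.’

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_trace(target):
    """Stand-in for TraceRouteInstance(target): sleeps instead of tracing."""
    time.sleep(0.2)
    return f"route to {target}"


def trace_many(targets, trace=slow_trace):
    """Run one trace per target in its own thread; return {target: result}."""
    targets = list(targets)
    # One worker per target so that all traces wait at the same time
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        # pool.map returns results in the same order as the inputs
        return dict(zip(targets, pool.map(trace, targets)))


if __name__ == "__main__":
    start = time.monotonic()
    results = trace_many(["periaptapis.net", "example.com", "example.org"])
    print(results)
    print(f"took {time.monotonic() - start:.2f} s")
```

Threads fit here because the program spends nearly all of its time waiting on an external process. With the class from this article, the call would become ‘trace_many(hosts, trace=TraceRouteInstance)’.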

I hope you enjoyed this week’s post! Please add any comments, suggestions, or questions that you may have!

Thank you for reading!

-Travis
