Home network monitoring with pfSense, Protectli, and a screen scraper

Programming Snapshot – Protectli

Article from Issue 208/2018
Author(s): Mike Schilli

What is making the lights on the router flicker so excitedly? An intruder? We investigate with pfSense on a Protectli micro appliance and a screen scraper to email the information.

It's a shame that routers don't simply show the addresses of the network packets passing through them on an LED display. Because I'm curious about what's going on in my home network, I followed a work colleague's advice and bought a micro appliance from Protectli (Figure 1) [1], which runs the FreeBSD-based open source firewall pfSense. The box measures about four by four inches and is passively cooled, so there is no fan noise.

Figure 1: The Protectli micro appliance used as a router.

The installation is a piece of cake – simply load the distribution from the pfSense Community Edition website [2] onto a bootable USB stick, insert an mSATA disk and RAM into the Protectli's small case, and after booting, say yes to the installation prompts. Badda-bing badda-boom, pfSense's web GUI is up and running (Figure 2).

Figure 2: Finally, a device to monitor all of your home's network traffic.

Guardian at the Gate

The Protectli appliance is directly connected to the Internet-facing interface (in my case, a DSL modem to the ISP). On the LAN side, it provides access to the Internet for all devices connected to my home network (in my case, a series of routed subnets).

Equipped with a four-core Celeron, it's powerful enough to inspect every single packet, compile statistics, and even intervene when needed by blocking certain communication attempts according to predefined firewall rules. If I want to know why the router lights are flickering, I only need to open the pfSense GUI to see who is streaming Spotify, watching Netflix, or ordering on Amazon (Figure 2).

In addition to traditional terminal-based tools such as pftop, the firewall GUI also offers very elegant add-on packages such as ntopng, so you can browse through pie charts and HTML tables to find out who uses the most bandwidth or contacts computers in dubious countries (Figure 3).

Figure 3: Pie charts for server and client ports.

Unfortunately, there is no official API for the GUI, only FauxAPI [3], an add-on package for the pfSense distribution that provides limited access to the firewall's internals.

Keys for Protectli

To check at regular intervals what is happening on the Protectli box, I thought it would be easy to write a screen scraper [4] that periodically logs in through the box's login page (Figure 4), captures the data displayed on the dashboard, and mails it.

Figure 4: First hurdle: Login.

However, the first hurdle between a command-line client and the juicy network data is the login page that pfSense so defiantly presents. A look at the HTML code (Figure 5) reveals that the two fields for the username and password are named usernamefld and passwordfld, and the submit button goes by the name login. The Python scraper quickly thrown together in Listing 1 [5] uses the selenium module, installed with pip3, to simulate a browser, and it locates these elements with the find_element_by_name() function.

Listing 1

dash-scraper.py

 

Figure 5: The login page's source code contains fields for the username and password.

For the webdriver.Firefox() call to work with the system's Firefox browser, the Linux distribution needs the geckodriver program, which is available as a TAR file [6]. Unpack the archive and place the binary it contains in a directory listed in $PATH. The script opens the browser, goes to the login page, fills in the form fields, and then clicks the Login button. The selenium module is often used for testing web GUIs and makes it really easy to simulate a user sitting in front of a web user interface.

After logging in, the scraper calls save_screenshot() to save the pfSense dashboard page (Figure 6), with its firewall overview data, to the saved.png file. The script reads the values for the login form from the creds.yaml file on disk and keeps them as username and password entries in the creds dictionary (Figure 7).
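The login and screenshot steps described above can be sketched roughly as follows. This is not the article's Listing 1, only a minimal illustration written against Selenium 4, where find_element_by_name() has been replaced by find_element() with a locator strategy. The function names, the address https://192.168.1.1/, and the placeholder credentials are assumptions made for the example. Only the field names usernamefld, passwordfld, and login and the saved.png file name come from the text.

```python
# Minimal sketch of the login-and-screenshot flow; not the original listing.

BY_NAME = "name"  # same string value as selenium.webdriver.common.by.By.NAME


def log_in_and_capture(driver, url, creds, out_file="saved.png"):
    """Log in to the pfSense GUI and save a screenshot of the page that follows."""
    driver.get(url)
    # Field and button names as found in the login page's HTML source.
    driver.find_element(BY_NAME, "usernamefld").send_keys(creds["username"])
    driver.find_element(BY_NAME, "passwordfld").send_keys(creds["password"])
    driver.find_element(BY_NAME, "login").click()
    # save_screenshot() returns True if the PNG file was written.
    return driver.save_screenshot(out_file)


def main():
    # Imported here so the helper above can be exercised without a browser.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    # Assumption: the GUI is served with a self-signed certificate.
    options.accept_insecure_certs = True
    driver = webdriver.Firefox(options=options)  # relies on geckodriver
    try:
        # Hypothetical address and placeholder credentials.
        creds = {"username": "admin", "password": "changeme"}
        log_in_and_capture(driver, "https://192.168.1.1/", creds)
    finally:
        driver.quit()
```

Calling main() opens Firefox, logs in, and writes saved.png to the current directory. A real script would probably also wait until the dashboard widgets have finished loading before it takes the screenshot.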

Figure 6: pfSense firewall status information.
Figure 7: Sensitive data in the creds.yaml file.
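Reading the credentials file might look something like the sketch below, which assumes the PyYAML package. Of the key names, only smtp_server appears in the article. The other four, along with the helper names check_creds() and load_creds(), are hypothetical and chosen for illustration.

```python
# Minimal sketch for loading creds.yaml; key names are mostly assumed.

# Only smtp_server is named in the article; the other keys are hypothetical.
REQUIRED_KEYS = ("username", "password", "smtp_server",
                 "smtp_user", "smtp_password")


def check_creds(creds):
    """Return creds unchanged, or raise KeyError naming the missing entries."""
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise KeyError("creds.yaml is missing: " + ", ".join(missing))
    return creds


def load_creds(path="creds.yaml"):
    """Parse the YAML file into a dictionary and verify its entries."""
    import yaml  # PyYAML; imported here so check_creds() works without it

    with open(path) as handle:
        # safe_load() returns None for an empty file, hence the fallback.
        creds = yaml.safe_load(handle) or {}
    return check_creds(creds)
```

Because the file holds passwords in plain text, it makes sense to restrict its permissions to the user running the script, for example with chmod 600 creds.yaml.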

One Man Went to Mow

Listing 2 periodically sends the collected data to an email address. It bundles the PNG file created by Listing 1 into an email as an attachment and sends it via an SMTP server. To read the sensitive username and password for the SMTP server, line 10 loads the same YAML file as before and stores its contents in the creds dictionary.

Listing 2

mail.py

 

The script then builds an HTML body with introductory text and an IMG link to the attached image, so that a webmail client can display it inline. Starting in line 31, Listing 2 establishes a connection to the SMTP server, whose address is also retrieved from the creds.yaml file as the smtp_server: mail.provider.net entry. The script uses port 587 and transmits the data in TLS-encrypted form. It attaches the screenshot in MIME format and adds a Content-ID header with the name of an imaginary file in angle brackets.
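The mail assembly described above can be approximated with Python's standard library alone. The following is a sketch rather than the article's Listing 2. The function names, the subject line, the content ID dashboard.png, and the creds keys smtp_user and smtp_password are assumptions. Port 587, TLS, the smtp_server entry, and the Content-ID mechanism are taken from the text.

```python
# Minimal sketch of an HTML mail with an inline screenshot; not the original listing.
import smtplib
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

IMAGE_CID = "dashboard.png"  # made-up name; it only has to match the cid: link


def build_message(sender, recipient, png_bytes):
    """Build an HTML mail that shows the attached PNG inline."""
    msg = MIMEMultipart("related")
    msg["Subject"] = "pfSense dashboard"
    msg["From"] = sender
    msg["To"] = recipient

    # The IMG tag points at the attachment through its content ID.
    html = f'<p>The firewall dashboard:</p><img src="cid:{IMAGE_CID}">'
    msg.attach(MIMEText(html, "html"))

    image = MIMEImage(png_bytes, _subtype="png")
    image.add_header("Content-ID", f"<{IMAGE_CID}>")  # angle brackets required
    image.add_header("Content-Disposition", "inline", filename=IMAGE_CID)
    msg.attach(image)
    return msg


def send_message(msg, creds):
    """Send the mail over port 587, upgrading the connection to TLS first."""
    with smtplib.SMTP(creds["smtp_server"], 587) as smtp:
        smtp.starttls()
        # smtp_user and smtp_password are hypothetical key names.
        smtp.login(creds["smtp_user"], creds["smtp_password"])
        smtp.send_message(msg)
```

A caller would read saved.png in binary mode, pass the bytes to build_message(), and hand the result to send_message() together with the creds dictionary.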

Figure 8 shows how the mail arrives in Gmail. Called as a cron job once a day, the script keeps the homeowner up to date with what's happening on the local network.

Figure 8: The screen-scraped dashboard arrives by email.

Infos

  1. Protectli micro appliance: https://www.amazon.com/gp/product/B01GIVQI3M
  2. pfSense Community Edition download: https://www.pfsense.org/download/
  3. pfSense FauxAPI: https://github.com/ndejong/pfsense_fauxapi
  4. Jarmul, Katharine, and Richard Lawson. Python Web Scraping, 2nd ed. Packt Publishing, 2017
  5. Listings for this article: ftp://ftp.linux-magazine.com/pub/listings/linux-magazine.com/208/
  6. geckodriver: https://github.com/mozilla/geckodriver/releases/tag/v0.19.0

The Author

Mike Schilli works as a software engineer in the San Francisco Bay area, California. Each month in his column, which has been running since 1997, he researches practical applications of various programming languages. If you email him at mschilli@perlmeister.com, he will gladly answer any questions.
