URL filtering with Pi-hole
Into the Funnel
As a complement to browser plug-ins, network-based DNS blockers like Pi-hole help protect you against online tracking and unwanted content.
One episode of the award-winning TV series Futurama depicts the Internet as a metaverse in which advertising banners attack users' avatars like birds of prey: "The Internet! My God! It's full of ads!" Even without a metaverse, Internet users today are followed by trackers and cookies and flooded with unwanted advertising. But users can protect themselves against this flood. There are various methods of evading tracking by advertisers, confusing trackers, and keeping unwanted content out of websites. With the help of the free Pi-hole [1], this article looks at a couple of effective approaches that protect you against unwanted content at the server, network, and client levels, while reducing the threat of phishing at the same time.
Proxy Filter with Problems
In the early 2000s, the proxy filter was the best way to protect yourself against unwanted content and against viruses and Trojans from the web. With this approach, clients do not request the content of a website directly from the web, but pass the request to a central proxy server such as Squid. The server then retrieves the content, stores some of it in a local cache, and returns the information to the browser. In times of limited bandwidth, proxies were popular mainly because of their caching function, which meant that less information needed to be retrieved over slow Internet connections. Plugins such as squidGuard blocked unwanted content at the proxy level, while other extensions inspected the content of websites directly and checked for malware.
The proxies' work was made more difficult by an important security feature: HTTPS. Encrypted protocols need to pass through a proxy server without change, meaning that their content cannot be filtered, unless you break the encryption. This method, SSL Bump, is still used, especially by large companies: The proxy server terminates the SSL connection of the accessed website and inspects, filters, and caches the decrypted content. For communication with the client browser, the proxy then encrypts the data again, but uses its own certificate for this purpose. For a scenario such as this, the administrator needs to modify the configurations of all the browsers on the LAN so they accept the proxy's certificate for all URLs.
The biggest problem with a proxy filter of this kind is not technical but legal. It is a de facto man-in-the-middle attack that decrypts all Internet communication of all users. As soon as a user logs on to private services such as home banking, shopping, or social media on a company PC, the company proxy also decrypts and can store private data such as passwords or shopping cart content. Without further safeguards, this intrusion into users' privacy is not permissible. For a filter of this kind, the employer needs an agreement that has been approved by the works council and signed by all employees, and that, for example, generally excludes private use of company PCs. It would also be possible to create filters that completely block access to services such as banking, shopping, and social media.
As an alternative, the company could provide a separate, unfiltered WLAN without a connection to the company network for the users' private devices. Modern filtering proxies such as Squid can also use rule-based forwarding (peek and splice) instead of simply "bumping" every SSL connection. A ruleset decides which connections are broken and examined by the proxy and which are tunneled to the client without decryption. This, in turn, would allow users' private traffic to pass through untouched. But a setup like this renders the proxy ineffective as a security measure for the traffic it lets through. We will not be looking at a Squid setup with bump, peek, and splice in this article, but instead investigating a solution that is legally far less problematic.
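The idea of a ruleset that decides per connection can be sketched in a few lines of Python. This is only a conceptual model, not Squid configuration: the function name decide_action, the suffix list, and the example domains are hypothetical, and the sketch assumes the decision is based on the server name that the client announces when it opens the TLS connection.

```python
def decide_action(server_name, splice_suffixes):
    """Return "splice" (tunnel without decryption) or "bump" (decrypt and inspect).

    server_name is the host name announced by the client; splice_suffixes is a
    hypothetical list of domains whose traffic must stay private.
    """
    host = (server_name or "").rstrip(".").lower()
    for suffix in splice_suffixes:
        # Match the domain itself and any of its subdomains.
        if host == suffix or host.endswith("." + suffix):
            return "splice"
    # Assumption for this sketch: everything not exempted is inspected.
    return "bump"


# Hypothetical list of private services that should pass through untouched.
PRIVATE_SERVICES = ["bank.example", "shop.example"]

for name in ("login.bank.example", "www.example.com", "notbank.example"):
    print(name, "->", decide_action(name, PRIVATE_SERVICES))
```

The suffix check includes the leading dot so that a lookalike such as notbank.example is not mistaken for a subdomain of bank.example.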
Not Ordered, Not Picked Up
Another method for keeping out unwanted content does not filter the packets returned from the Internet but, instead, the outgoing requests. When looking at the HTML code of a web page with advertising, it quickly becomes apparent that the advertising banners do not originate from the addressed target server itself. Instead, the pages embed HTML that includes links to advertising services, as well as tracking cookies that point to advertising providers. These deep links do not point to IP addresses, but to the DNS names of the operators or subdomains.
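To see this for yourself, you can list the host names that a page's embedded resources point to. The following Python sketch uses only the standard library; the sample page and all host names in it are made up for illustration, and the script simply treats every host other than the page's own as third party.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HostCollector(HTMLParser):
    """Collect the host names referenced by src and href attributes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname
                if host:  # relative links have no host name
                    self.hosts.add(host)


def third_party_hosts(html, page_host):
    """Return the sorted host names in html that differ from page_host."""
    collector = HostCollector()
    collector.feed(html)
    return sorted(h for h in collector.hosts if h != page_host)


# Made-up sample page; none of these hosts refer to real services.
SAMPLE_PAGE = """
<html><body>
  <img src="https://www.example.com/logo.png">
  <script src="https://ads.example.net/banner.js"></script>
  <img src="https://tracker.example.org/pixel.gif">
  <a href="/about.html">About</a>
</body></html>
"""

print(third_party_hosts(SAMPLE_PAGE, "www.example.com"))
```

For the sample page, the script reports the two foreign hosts ads.example.net and tracker.example.org, which are exactly the names the browser would have to resolve through DNS before it could load the banner and the tracking pixel.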
This means that the user's browser itself actively requests these banners and trackers, after resolving the DNS name in the embedded link. This is where DNS filtering comes in: It uses blacklists of unwanted domain names and refuses to deliver the IP addresses for these names to the client. Instead of the real IP address, the filtering DNS server simply returns 0.0.0.0. The browser therefore never even requests the embedded content from the Internet. The space normally occupied by the advertising banner remains empty, and the trackers do not receive any feedback from the client. But be careful: For DNS filtering to work, clients and their browsers must use the network's default DNS resolution and not their own DNS servers and methods.
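The principle fits into a few lines of Python. The sketch below is a simplified model and not Pi-hole's actual code: it only matches exact names, the blacklist entries are invented, and a small lookup table (fake_upstream) stands in for a real upstream DNS server.

```python
BLOCKED_ADDRESS = "0.0.0.0"


def normalize(name):
    """DNS names are case-insensitive and may carry a trailing dot."""
    return name.rstrip(".").lower()


def filtered_lookup(name, blocklist, upstream):
    """Answer 0.0.0.0 for blacklisted names; ask the upstream for all others."""
    name = normalize(name)
    if name in blocklist:
        return BLOCKED_ADDRESS
    return upstream(name)


# Invented blacklist and upstream data, for illustration only.
BLOCKLIST = {"ads.example.net", "tracker.example.org"}
FAKE_UPSTREAM = {"www.example.com": "192.0.2.10"}


def fake_upstream(name):
    """Stand-in for a real upstream resolver; returns None for unknown names."""
    return FAKE_UPSTREAM.get(name)


print(filtered_lookup("ADS.example.net.", BLOCKLIST, fake_upstream))
print(filtered_lookup("www.example.com", BLOCKLIST, fake_upstream))
```

The first lookup is answered with 0.0.0.0 without the upstream ever being consulted, whereas the second is passed through and returns the address from the upstream table.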
Pi-hole as a DNS Filter
Pi-hole implements exactly this kind of DNS filtering: It filters out DNS requests for unwanted domains on the fly, hiding advertising content and trackers (Figure 1). Pi-hole is one of the most popular DNS filters. As the name suggests, the tool started life as a piece of software for the Raspberry Pi, but Pi-hole also runs reliably and quickly on other platforms, even when deployed on a larger network.
Some IT managers are reluctant to use Pi-hole because they do not want to replace their existing DNS server and transfer its configuration to Pi-hole. This is especially the case if the existing DNS server resolves local addresses and services such as Kerberos and is perhaps also integrated with the DHCP service. However, because DNS requests can be forwarded from one server to the next without any problems, a Pi-hole setup does not need to replace the existing service at all. Instead, it can act as a kind of overlay, even on the same machine, as in my example.
My setup uses an existing dnsmasq server on the application server running RHEL 8. The service provides the LAN with IP addresses via DHCP, lets physical and virtual systems boot via PXE over the network, and resolves local domain names. The dnsmasq service prefers to use the public Quad9 service 9.9.9.9 as its upstream DNS. Unlike Google's open DNS service on 8.8.8.8, Quad9 does not log all incoming DNS requests including source IP addresses.
Besides the dnsmasq service, the application server is now also running Pi-hole, in a Podman container. In principle, there are two options for running two DNS servers on the same machine. If Pi-hole runs in a container without its own IP address, the existing dnsmasq service must switch to a port other than 53. Alternatively, you can let the Pi-hole container operate on a bridge network and therefore with its own IP address. For this example, I chose the second approach, because my application server uses a whole bunch of other Podman containers with their own IP addresses anyway.
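The overlay idea can be modeled as a chain of resolvers in which each layer answers what it knows and forwards everything else. The Python sketch below is purely conceptual: it assumes that clients ask Pi-hole first, that Pi-hole forwards to dnsmasq, and that dnsmasq forwards to the public upstream service. All names and addresses are invented for illustration.

```python
def make_resolver(records, forward=None):
    """Build a resolver that answers from records and forwards the rest."""

    def resolve(name):
        if name in records:
            return records[name]
        if forward is not None:
            return forward(name)
        return None  # nobody in the chain knows this name

    return resolve


# Invented data. The chain order (Pi-hole, then dnsmasq, then the public
# upstream) is an assumption made for this sketch.
public_dns = make_resolver({
    "www.example.com": "192.0.2.10",
    "ads.example.net": "192.0.2.66",
})
dnsmasq = make_resolver({"nas.lan": "192.168.1.20"}, forward=public_dns)
pihole = make_resolver({"ads.example.net": "0.0.0.0"}, forward=dnsmasq)

for name in ("nas.lan", "www.example.com", "ads.example.net"):
    print(name, "->", pihole(name))
```

Local names are still answered by the existing dnsmasq layer, public names travel all the way to the upstream, and only the blacklisted name is intercepted by the filter layer in front.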