URL filtering with Pi-hole

Into the Funnel

Article from Issue 274/2023

Network-based DNS blockers like Pi-hole complement browser plug-ins and help protect you against online tracking and unwanted content.

One episode of the award-winning TV series Futurama depicts the Internet as a metaverse in which advertising banners attack users' avatars like birds of prey: "The Internet! My God! It's full of ads!" Even without a metaverse, Internet users today are followed by trackers and cookies and flooded with unwanted advertising. However, users can protect themselves against this flood: There are various methods for evading tracking by advertisers, confusing trackers, and keeping unwanted content out of websites. Using the free Pi-hole [1] as an example, this article looks at effective approaches that help protect you against unwanted content at the server, network, and client levels, while reducing the threat of phishing at the same time.

Proxy Filter with Problems

In the early 2000s, the proxy filter was the best way to protect yourself against unwanted content and against threats from viruses and Trojan horses from the web. With a proxy, clients do not request the content of a website directly from the web, but pass the request to a central proxy server such as Squid. The server then retrieves the content, stores some of it in a local cache, and returns the information to the browser. In times of limited bandwidth, proxies were popular mainly because of their caching function, which meant that less information needed to be retrieved over slow Internet connections. Plug-ins such as squidGuard blocked unwanted content at the proxy level, while other extensions inspected the content of websites directly and checked for malware.

The proxies' work was made more difficult by an important security feature: HTTPS. Encrypted protocols need to pass through a proxy server without change, meaning that their content cannot be filtered unless you break the encryption. This method, which Squid calls SSL Bump, is still used, especially by large companies: The proxy server terminates the encrypted connection to the accessed website and inspects, filters, and caches the decrypted content. For communication with the client browser, the proxy then encrypts the data again, but uses its own certificate for this purpose. For a scenario such as this, the administrator needs to modify the configurations of all the browsers on the LAN so that they trust the proxy's certificate for all websites.

The biggest problem with a proxy filter that breaks encryption is not technical but legal. It is a de facto man-in-the-middle attack that decrypts all Internet communication of all users. As soon as users log on to private services such as home banking, shopping, or social media on their company PCs, the company proxy also decrypts and stores private data such as passwords or shopping cart content. This intrusion into the privacy of users is not permissible without their consent. For a filter of this kind, the employer needs an agreement that has been approved by the works council and signed by all employees, and that, for example, generally excludes private use of company PCs. It would also be possible to create filters that completely block access to services such as banking, shopping, and social media.

As an alternative, the company could provide a separate, unfiltered WLAN without a connection to the company network for the users' private devices. Modern filtering proxies such as Squid can also use rule-based forwarding (peek and splice) instead of simply "bumping" every SSL connection. A ruleset decides which connections are broken and examined by the proxy and which are tunneled directly to the client without decryption. This, in turn, would allow users' private traffic to pass through untouched. However, a setup like this renders the proxy ineffective as a security measure for the tunneled connections. We will not be looking at a Squid setup with bump, peek, and splice in this article, but instead investigating a solution that is legally far less problematic.
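The rule-based decision described above can be illustrated with a small conceptual model. The following sketch is not Squid configuration syntax. It simply assumes that the proxy can see the server name that the client announces at the start of the encrypted connection and compares that name against a list of private services. The domain patterns and the function name `decide` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for private services that must never be decrypted.
TUNNEL_PATTERNS = ["*.examplebank.test", "*.exampleshop.test"]


def decide(server_name, tunnel_patterns=TUNNEL_PATTERNS):
    """Return 'tunnel' (pass through untouched) or 'bump' (decrypt and inspect)."""
    if server_name is None:
        # No server name visible: this sketch errs on the side of privacy.
        return "tunnel"
    host = server_name.lower().rstrip(".")
    for pattern in tunnel_patterns:
        # "*.example.test" should also cover the bare domain "example.test".
        bare = pattern[2:] if pattern.startswith("*.") else pattern
        if host == bare or fnmatchcase(host, pattern):
            return "tunnel"
    return "bump"


for name in ("login.examplebank.test", "ads.tracker.test", None):
    print(name, "->", decide(name))
```

The sketch also shows the weakness mentioned above: anything that matches a tunnel pattern bypasses inspection entirely.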

Not Ordered, Not Picked Up

Another method for keeping out unwanted content does not filter the packets returned from the Internet but, instead, the outgoing requests. A look at the HTML code of a web page with advertising quickly reveals that the advertising banners do not originate from the addressed target server itself. Instead, the pages embed HTML that includes links to advertising services, as well as tracking cookies that point to advertising providers. These deep links do not point to IP addresses, but to the DNS names of the operators or their subdomains.
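To see which foreign hostnames a page references, you can parse its HTML and collect the hosts from all `src` and `href` attributes. The following is a minimal sketch using only the Python standard library. The sample page and all hostnames are made up, and the sketch treats every host that differs from the page's own host as third party, which also includes subdomains of the same site.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ThirdPartyHosts(HTMLParser):
    """Collect hostnames in src/href attributes that differ from the page host."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host.lower()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                # Relative links have no hostname and are skipped.
                host = urlparse(value).hostname
                if host and host != self.page_host:
                    self.hosts.add(host)


def third_party_hosts(html, page_host):
    parser = ThirdPartyHosts(page_host)
    parser.feed(html)
    parser.close()
    return sorted(parser.hosts)


# Made-up sample page with one banner and one tracking script.
SAMPLE = """
<html><body>
<img src="/img/logo.png">
<a href="https://news.example.test/article">Read on</a>
<img src="https://banners.adnet.test/b/300x250.gif">
<script src="//track.metrics.test/t.js"></script>
</body></html>
"""

print(third_party_hosts(SAMPLE, "news.example.test"))
# ['banners.adnet.test', 'track.metrics.test']
```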

This means that the user's browser itself actively requests these banners and trackers, after resolving the DNS name in the embedded link. This is where DNS filtering comes in: It uses blacklists of unwanted domains and refuses to deliver the IP addresses of these domains to the client. Instead of the real IP address, the filtering DNS server simply returns 0.0.0.0. Therefore, the browser does not even request the embedded content from the Internet. The space normally occupied by the advertising banner remains empty and the trackers do not receive any feedback from the client. But be careful: For DNS filtering to work, clients and their browsers must use the network's default DNS resolution and not use their own DNS servers or resolution methods, such as DNS over HTTPS.
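The principle is easy to model in a few lines. The sketch below is not how Pi-hole is implemented. It assumes an exact match against a normalized list of blocked names and uses a stand-in function instead of a real upstream DNS server. All names and addresses are placeholders.

```python
BLOCKED_ANSWER = "0.0.0.0"


def normalize(name):
    """Lowercase a DNS name and drop the optional trailing dot."""
    return name.strip().lower().rstrip(".")


def make_filtering_resolver(blocklist, upstream):
    """Wrap an upstream resolver with a blocklist check."""
    blocked = {normalize(domain) for domain in blocklist}

    def resolve(name):
        if normalize(name) in blocked:
            # Blocked names never reach the upstream resolver.
            return BLOCKED_ANSWER
        return upstream(name)

    return resolve


def fake_upstream(name):
    """Stand-in for a real DNS server; always returns a documentation address."""
    return "192.0.2.10"


resolve = make_filtering_resolver(
    ["banners.adnet.test", "track.metrics.test"], fake_upstream
)
print(resolve("Banners.AdNet.test."))  # 0.0.0.0
print(resolve("news.example.test"))    # 192.0.2.10
```

Because the check happens before any upstream lookup, the operator of a blocked domain never sees a request from the client.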

Pi-hole as a DNS Filter

Pi-hole, one of the most popular DNS filters, works in exactly this way: It filters out DNS requests for unwanted domains on the fly, hiding advertising content and trackers (Figure 1). As the name suggests, the tool started life as a piece of software for the Raspberry Pi, but Pi-hole runs reliably and quickly on all other platforms, even when deployed on a larger network.

Figure 1: On the fly, Pi-hole filters DNS requests for unwanted URLs and hides advertising content and trackers.

Some IT managers are reluctant to use Pi-hole because they do not want to replace their existing DNS server and transfer its configuration to Pi-hole. This is especially the case if the existing DNS server resolves local addresses and services such as Kerberos and is perhaps also integrated with the DHCP service. However, because DNS requests can easily be forwarded from one server to the next, a Pi-hole setup does not need to replace the existing service at all. Instead, it can act as a kind of overlay, even on the same machine, as in my example.

My setup uses an existing dnsmasq server on the application server running RHEL 8. The service provides the LAN with IP addresses via DHCP, lets physical and virtual systems boot via PXE over the network, and resolves local domain names. The dnsmasq service uses the public Quad9 service (9.9.9.9) as its preferred upstream DNS. Unlike Google's open DNS service on 8.8.8.8, Quad9 states that it does not log incoming DNS requests along with their source IP addresses.
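The overlay idea can be pictured as a chain of resolvers in which each layer either answers a request itself or hands it on to the next one. The following sketch models such a chain with plain functions: a filter layer in front, a local DNS layer behind it, and a public upstream at the end. It is a conceptual model only. The hostnames, addresses, and function names are invented, and each answer is returned together with the layer that produced it.

```python
def public_upstream(name):
    """Stand-in for a public resolver; returns a documentation address."""
    return ("198.51.100.7", "public")


def local_dns(records, upstream):
    """Answer local names from a table and forward everything else."""
    def resolve(name):
        key = name.lower().rstrip(".")
        if key in records:
            return (records[key], "local")
        return upstream(name)
    return resolve


def filter_overlay(blocked, downstream):
    """Answer blocked names with 0.0.0.0 and forward everything else."""
    def resolve(name):
        if name.lower().rstrip(".") in blocked:
            return ("0.0.0.0", "filter")
        return downstream(name)
    return resolve


# Hypothetical local records and blocked names.
local = local_dns({"nas.lan.test": "10.0.0.5"}, public_upstream)
resolver = filter_overlay({"track.metrics.test"}, local)

for name in ("track.metrics.test", "nas.lan.test", "news.example.test"):
    print(name, resolver(name))
```

The existing local layer is left unchanged in this model. The filter only wraps it, which is why no configuration has to be migrated.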

Besides the dnsmasq service, the application server is now also running Pi-hole, in a Podman container. In principle, there are two options for running two DNS servers on the same machine. If Pi-hole runs in a container without its own IP address, the existing dnsmasq service must switch to a port other than 53. Alternatively, you can let the Pi-hole container operate on a bridge network and therefore with its own IP address. For this example, I chose the second approach, because my application server uses a whole bunch of other Podman containers with their own IP addresses anyway.
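Why one of the two services has to move to another port or another IP address can be demonstrated with plain sockets. The sketch below binds a UDP socket to an address and then tries to bind a second socket to the same port. It deliberately uses a free port chosen by the operating system instead of port 53, because binding to port 53 normally requires elevated privileges. The helper names are made up for this example.

```python
import socket


def try_bind(address, port):
    """Return a bound UDP socket, or None if the bind fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
        return sock
    except OSError:
        sock.close()
        return None


def second_bind_succeeds(first_addr, second_addr):
    """Bind to first_addr, then try the same port number on second_addr."""
    first = try_bind(first_addr, 0)  # port 0: let the OS pick a free port
    if first is None:
        raise RuntimeError("could not bind the first socket")
    port = first.getsockname()[1]
    second = try_bind(second_addr, port)
    first.close()
    if second is None:
        return False
    second.close()
    return True


# Same address and same port: the second service cannot start.
print(second_bind_succeeds("127.0.0.1", "127.0.0.1"))
```

Calling the function with two different local addresses corresponds to the second option above, where the container receives its own IP address and both services can keep using port 53.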
