Build your own web server in a few simple steps
Self Made
If you want to learn a little bit more about the communication between a web browser and an HTTP server, why not build your own web server and take a closer look?
Programming your own web server might seem like a difficult and unnecessary undertaking. Any number of freely available web servers exist in the Linux space, from popular all-rounders like Apache or NGINX to lightweight alternatives like Cherokee or lighttpd (pronounced "lighty").
But sometimes you don't need a full-blown web server. If you just want to share a couple of HTML pages locally on your own network or offer people the ability to upload files, Linux on-board tools are all it takes. A simple shell script is fine as a basic framework that controls existing tools from the GNU treasure chest. Network communication is handled by Netcat [1], aka the Swiss army knife of TCP/IP.
Getting Ready
With a project like this, the best place to start is at the root. Because a web server is still a server at the end of the day, it needs to constantly listen on a given port and respond appropriately to requests. Usually, web servers listen on port 80 for unencrypted HTTP requests and on port 443 for encrypted HTTPS requests. The web server I'll describe in this article listens on ports 8080 and 8081 and communicates without encryption. If you are using a firewall and want to test the server on the local network, remember to allow these two ports in the firewall.
A web server needs a root folder from which it loads the requested HTML files. It also needs a directory in which it can store uploaded files. Your first step is to define a configuration using a series of simple variables at the start of the server script (Listing 1). You also need to create the directories, along with a FIFO file, either manually or using the Bash test builtin. The server6.sh script, which is included with the code from this article [2], offers a solution.
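As a rough idea of what creating the directories and the FIFO with the test builtin could look like, here is a minimal sketch. It is not taken from server6.sh; it only assumes the variable names and values from Listing 1 and the standard mkdir and mkfifo tools.

```shell
# Values copied from the article's configuration (Listing 1)
HTTP_HOME=http_home
HTTP_UPLOAD=${HTTP_HOME}/upload
FIFO_GET=fifo_get

# Create the root folder and the upload directory if they are missing
[ -d "$HTTP_UPLOAD" ] || mkdir -p "$HTTP_UPLOAD"

# Create the named pipe (FIFO) if it is missing; -p tests for a FIFO
[ -p "$FIFO_GET" ] || mkfifo "$FIFO_GET"
```

Because each command only runs if the test fails, the lines can be executed repeatedly without errors.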
Listing 1
Configuration
HTTP_HOME=http_home
HTTP_UPLOAD=${HTTP_HOME}/upload
CACHE_DATEI=${HTTP_UPLOAD}/filetoprocess
FIFO_GET=fifo_get
HTTP_GET_PORT=8080
HTTP_POST_PORT=8081
# Replace enp2s0 with the name of your own network interface
MEINE_IP=$(ip addr show enp2s0 | grep -Eo "([0-9]{1,3}\.){3}[0-9]+" | sed 1q)
In the last line of Listing 1, you can see that your own IP address is also important. You will need to modify the network device specification (the Ethernet interface enp2s0 in this example) to suit your own system. When a web browser tries to submit a file via a web form, it needs a target address. The simplest kind of request to start with, however, is the GET request. When a browser sends a GET request, it expects the content of a web page in response, and it displays this content in the browser window.
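To see what the grep and sed pipeline in the last line of Listing 1 does, you can feed it some text instead of live ip output. The sample text and the address 192.168.1.23 below are invented for illustration; only the pipeline itself comes from the listing.

```shell
# Invented sample that imitates two lines of "ip addr show" output
sample_output='2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 192.168.1.23/24 brd 192.168.1.255 scope global enp2s0'

# grep -Eo prints every dotted-quad match on its own line,
# and sed 1q keeps only the first one
first_ip=$(printf '%s\n' "$sample_output" | grep -Eo "([0-9]{1,3}\.){3}[0-9]+" | sed 1q)

echo "$first_ip"
```

The broadcast address also matches the pattern, which is why the pipeline ends with sed 1q.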
You'll also need to create some sample HTML files for testing your homegrown server. (See the box entitled "Sample Files.")
Sample Files
Files for testing the web server are easily scripted. The function in Listing 2 runs through a for loop seven times. The routine uses a here document (heredoc) to support the entry of HTML code almost 1:1 (third line). Heredocs let you refer to the variable set in the for statement, which then simply contains the sequence number.
Heredocs help to define sections of text in many programming languages. Unlike conventional output via echo or printf, line breaks, indents, and some special characters are preserved in the text. Bash also supports the use of variables in heredocs.
In this way, you can create as many HTML files as you need with just a few lines of code. You could optionally integrate additional dynamic content that you generate with a script within the heredoc.
Listing 2
Creating Sample Files
function create_files () {
  for x in {1..7}; do
    cat <<-FILE > ${HTTP_HOME}/datei${x}.html
<html><head><meta charset="utf-8">
<title>Page ${x}</title>
</head><body>
<p> $( date ) </p>
<p> Page ${x} </p>
</body></html>
FILE
  done
}
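The variable handling in heredocs is easy to try out in isolation. The following sketch uses two hypothetical helper functions, show_expanded and show_literal, that are not part of the article's server script. It shows that Bash expands variables in a heredoc unless the delimiter is quoted.

```shell
# Unquoted delimiter: ${page} is replaced by its value
show_expanded () {
  local page=$1
  cat <<EOF
<p>Page ${page}</p>
EOF
}

# Quoted delimiter: the text is passed through literally
show_literal () {
  cat <<'EOF'
<p>Page ${page}</p>
EOF
}

show_expanded 3
show_literal
```

The first call prints the paragraph with the number 3 filled in; the second prints the placeholder unchanged.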
GET Requests
Responding to a GET request entails much more than just sending the content of a file. HTTP and HTTPS require that additional information be sent along with the transmission. If you want to know what a response from a genuine web server looks like, type the following command:
wget --spider -S "https://www.zeit.de/index"
The wget utility downloads a web page from the terminal. The --spider option tells wget to behave like a web spider; in other words, it won't download the actual content but will check that the content is there and will receive the transmission information associated with an HTTP request. The -S option prints the headers sent by the server.
In the first line, the server confirms that it is happy to take the HTTP request: HTTP/1.1 200 OK. Further lines in the form of name-value pairs (such as Connection: keep-alive or Content-Length: 300) are used to send back additional information or instructions.
It also appears that this service is a well-secured web server, because it does not reveal precisely what kind of server program it is. Many servers identify themselves at this point, for example as server: nginx. This is not advisable, because such disclosures make things easier for attackers. If you want Netcat to behave like a genuine web server, you'll need a way to generate the header information associated with HTTP.
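A minimal sketch of such a header generator might look as follows. The function name build_response is hypothetical, and the sketch uses printf, which is a different technique from the one the article's server uses later. It terminates the header lines with CRLF, as the HTTP/1.1 specification requires.

```shell
# Print a complete HTTP response (status line, headers, blank line, body)
# for the body text passed as the first argument
build_response () {
  local body=$1
  local length
  # Count the bytes of the body; tr removes padding that some wc versions add
  length=$(printf '%s' "$body" | wc -c | tr -d ' ')
  printf 'HTTP/1.1 200 OK\r\n'
  printf 'Connection: close\r\n'
  printf 'Content-Length: %s\r\n' "$length"
  printf '\r\n'
  printf '%s' "$body"
}

build_response '<p>Hello</p>'
```

The Content-Length value has to match the number of bytes in the body, otherwise the browser may cut off the page or wait for data that never arrives.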
Netcat
Netcat is available on virtually any Linux system and can be used for many purposes given a little creativity on the user's part, although it admittedly has some limitations. You can emulate basic network operations using Netcat, but complex interactions are difficult or impossible. You definitely don't want to try to compete with Apache or NGINX just using Netcat.
If you want Netcat to permanently listen on a port and also send different responses, you have to combine it in a loop with a FIFO file. FIFO refers to the "first in, first out" principle. This means that the information comes back out of the file in the same order in which it was sent in [3]. Listing 3 shows an example.
Listing 3
Netcat Response
while true; do
  respond < $FIFO_GET | netcat -l $HTTP_GET_PORT > $FIFO_GET
done
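The first in, first out behavior of a FIFO file can be checked without Netcat. The following sketch uses a temporary directory and a pipe named demo_fifo, both chosen purely for illustration. One process writes three lines into the pipe while another reads them back.

```shell
# Create a private directory for the demo pipe (names are illustrative)
demo_dir=$(mktemp -d)
demo_fifo="${demo_dir}/demo_fifo"
mkfifo "$demo_fifo"

# Writer: runs in the background, because opening a FIFO blocks
# until another process opens the other end
printf 'first\nsecond\nthird\n' > "$demo_fifo" &

# Reader: receives the lines in the order in which they were written
received=$(cat "$demo_fifo")
wait

echo "$received"
rm -r "$demo_dir"
```

Unlike a regular file, the FIFO stores nothing on disk; the data only passes through from the writer to the reader.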
The FIFO file connects Netcat with the respond function, which is shown in Listing 4. Netcat listens on the specified port and writes what it receives to the FIFO file. On the left side of the pipe, you can see the call to the function that reads the browser request from the FIFO file. It evaluates the request and then sends a matching response, containing an HTTP header and HTML data, back through the pipe to Netcat. The respond function decides what to return to the browser.
Listing 4
FIFO File
01 function respond () {
02   read get_or_post address httpversion
03   if [ ${#address} = 1 ]; then
04     list_dir
05   elif [ ${#address} -gt 1 ]; then
06     return_file $address
07   fi
08 }
This variant is already a fairly powerful solution. If the length of the requested address is 1 (line 3), then it is /, and Netcat returns a directory listing. If the length is greater than 1 (line 5), Netcat returns the content of a file from the root directory. To get the web server to return a list of the files contained in the root folder, a very simple ls directory_name is all that is needed. However, the results then need to be embedded in suitable HTML code so that the links work and the browser can actually use them (Figure 1). The sed [4] stream editor is recommended for converting a directory listing into HTML code.
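The way read splits the request line and how the length test works can be tried out without a browser. The sketch below uses a hypothetical function, classify_request, which takes a request line as its argument instead of reading it from Netcat. It only reports the decision that the server would make.

```shell
# Split a request line into its three fields and report the decision
classify_request () {
  local method address httpversion
  read -r method address httpversion <<< "$1"
  if [ "${#address}" -eq 1 ]; then
    # A single character can only be "/"
    echo "directory listing"
  elif [ "${#address}" -gt 1 ]; then
    # ${address:1} drops the leading slash, leaving the file name
    echo "file ${address:1}"
  fi
}

classify_request "GET / HTTP/1.1"
classify_request "GET /datei3.html HTTP/1.1"
```

In a real request, the line ends with a carriage return, which ends up in the third field and therefore does not affect the length of the address.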
Listing 5 shows the functions referenced in Listing 4. In the list_dir function, the directory content is output with a simple ls command. Sed then converts the results into plain vanilla HTML. The files generated by the function from Listing 2, which reside in the root directory, already contain HTML code. The server uses the return_file function in line 19 of Listing 5 to send a file back to the browser with a matching header.
Listing 5
Output
01 function list_dir () {
02   local output=$( ls --hide=upload -1 $HTTP_HOME | sed -r '
03     1 i<html><head><meta charset="utf-8"><title>Content</title></head>\
04     <body style="margin: 45px; font-family: sans-serif">
05     s#(.*)#<li><a href="\1">\1</a></li>#
06     $ a</body></html>
07   ' )
08
09   local content_length="Content-Length: $( cat <<<$output | wc --bytes )"
10
11   cat <<<$output | sed '
12     1 i HTTP/1.1 200 OK
13     1 i Server: Your GET SERVER
14     1 i Connection: close
15     1 i '"$content_length"'\n
16   '
17 }
18
19 function return_file () {
20   content=$( cat ${HTTP_HOME}/${1:1} )
21   if [[ $? -eq 0 ]]; then
22     length=$( cat <<<${content} | wc --bytes )
23     cat <<<${content} | sed -r '
24       1 i HTTP/1.1 200 OK
25       1 i Server: Your GET SERVER
26       1 i Connection: close
27       1 i Content-Length: '"$length"'\n'
28   else
29     cat <<-ERROR
30 HTTP/1.1 404 Not Found
31 Connection: close
32 Content-Length: 42
33
34 The requested page does not exist, sorry!
35 ERROR
36   fi
37 }
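The substitution that turns file names into links can be tested separately from the server. In the following sketch, the wrapper function names_to_links is hypothetical, and the file names are passed in with printf instead of ls. The sed expression itself is the one from the list_dir function.

```shell
# Wrap every input line in a list item that links to the file of that name
names_to_links () {
  sed -E 's#(.*)#<li><a href="\1">\1</a></li>#'
}

printf 'datei1.html\ndatei2.html\n' | names_to_links
```

The # characters serve as delimiters for the s command, so the slashes in the closing HTML tags do not have to be escaped.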
Because Netcat is continuously available for requests in the loop and sends a header and the corresponding HTML, a browser in the local network thinks it is dealing with a real web server.
However, it can also happen that the user manually requests a page in the browser that does not exist. This leads to the infamous 404 error, which you have probably seen on the web before [5]. The custom web server can handle this case, too. If the cat command in the first line of the return_file function (line 20) throws an error, the else branch starting at line 28 is executed. The web browser then displays a message that the requested page does not exist.
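The Content-Length value of 42 in the error response is the length of the message text plus the closing newline. Instead of counting by hand, you could let the shell compute the value, as in the following sketch. The function name not_found_response is hypothetical and not part of the article's script.

```shell
# Print a 404 response whose Content-Length is computed from the message
not_found_response () {
  local message='The requested page does not exist, sorry!'
  local length
  # The body consists of the message plus one newline character
  length=$(printf '%s\n' "$message" | wc -c | tr -d ' ')
  printf 'HTTP/1.1 404 Not Found\n'
  printf 'Connection: close\n'
  printf 'Content-Length: %s\n' "$length"
  printf '\n'
  printf '%s\n' "$message"
}

not_found_response
```

If you change the wording of the message later, the header stays correct without further edits.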