Workshop – Accessing log data with Loki

Log Study

Lead Image © Kheng Ho Toh, 123RF.com

Loki is a powerful, scalable, and easy-to-use solution for aggregating log data.

One day, during one of my company's cloud project meetings, a developer colleague said, "I need to find a way to quickly access logs for debugging and troubleshooting." I already had some experience with the Grafana-Prometheus stack, so I said I would help find a solution.

It turns out, the solution we settled on was Loki [1], from Grafana Labs. The Grafana Labs website describes Loki as "…a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus." Loki is designed to aggregate logs efficiently, extracting metrics and alerts – all without requiring a massive indexing configuration. Once you have extracted the information you need, you can then use Grafana to visualize the data.

This workshop offers a quick look at how to access log data using Loki. In this scenario, I will push logs generated by an Apache web server hosting a sample Nextcloud deployment, then evaluate the data using Loki's own query language, LogQL.

In addition to a Loki server, I'll install the companion application Promtail [2], which Grafana Labs maintains as an agent to push data to Loki.

Log Factory

The first step is to start logging. If you already have a log-filled folder, just skip this step. In this case, I'll run a Nextcloud Docker container and bind mount the Apache web server log folder inside the container to a local folder on the workstation. This step will allow Promtail, which will run locally, to access the log files.

Of course, you need to have the Docker engine installed. If you don't, see the box entitled "Get Docker Ready."

Get Docker Ready

This tutorial uses Docker as a way to spin up services quickly without installing unnecessary packages. In order to use Docker, you'll need the Docker engine, which you can install with a one-liner:

# curl -sSL https://get.docker.com | sudo bash

This command will fetch the latest official installation script. The script detects which Linux distribution you're running, adds the proper package manager repositories, and installs the required packages.

Warning: Make sure you check the contents of a script every time you plan to pipe directly to sudo bash.
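
To confirm the engine is working before moving on, you can run Docker's own test image, which simply prints a greeting and exits:

sudo docker run --rm hello-world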

docker run --name nextcloud -d -p 8080:80 -v /somelogsdir:/var/log/apache2 nextcloud

Once the container is running, you can point a browser to:

http://localhost:8080

and perform some actions on the Nextcloud instance. It doesn't really matter what you do, as long as it generates entries in Apache's default log files, access.log and error.log.
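
If you'd rather generate traffic from the command line, a quick loop does the job just as well; even requests for nonexistent pages will land in access.log:

for i in $(seq 1 20); do curl -s -o /dev/null http://localhost:8080/; done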

Deploying

Download the binary release of the most recent version of both Loki and Promtail. For Loki, fetch both the binary distribution and a sample config file:

wget https://raw.githubusercontent.com/grafana/loki/v2.2.1/cmd/loki/loki-local-config.yaml -O loki_config.yaml
wget https://github.com/grafana/loki/releases/download/v2.2.1/loki-linux-amd64.zip
unzip loki-linux-amd64.zip

The Loki sample config is good enough for this workshop as is, so I'll start Loki with it and keep it running:

./loki-linux-amd64 -config.file=loki_config.yaml

If you see a bunch of "creating table" messages, that is a good sign – it means Loki is creating the structure that will host the log entries.
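
You can also verify that the server is up by querying Loki's readiness endpoint, which answers with ready once startup has finished:

curl -s http://localhost:3100/ready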

The next step is to set up and run Promtail:

wget https://raw.githubusercontent.com/grafana/loki/v2.2.1/cmd/promtail/promtail-local-config.yaml -O promtail_config.yaml
wget https://github.com/grafana/loki/releases/download/v2.2.1/promtail-linux-amd64.zip
unzip promtail-linux-amd64.zip

Before running Promtail, you'll need to tweak the configuration to set up a Loki URL and log source folder (lines 9 and 18 in Listing 1).

Listing 1

Promtail Configuration

01 server:
02  http_listen_port: 9080
03  grpc_listen_port: 0
04
05 positions:
06  filename: /tmp/positions.yaml
07
08 clients:
09   - url: "http://localhost:3100/loki/api/v1/push"
10 scrape_configs:
11 - job_name: apache
12   static_configs:
13   - targets:
14       - localhost
15     labels:
16       job: "apache"
17       instance: "localserver"
18       __path__: /somelogsdir/*.log

Once you have successfully configured Promtail, run it with:

./promtail-linux-amd64 -config.file=promtail_config.yaml

If everything is working as expected, you won't see any output yet: Promtail quietly pushes the logs to Loki in the background.
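
As a quick sanity check, Promtail serves a small status page on the http_listen_port defined in Listing 1; its /targets page shows which files are currently being tailed:

curl -s http://localhost:9080/targets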

Evaluation

Loki has no built-in UI, so the only way to query logs at this point is to make use of the excellent Loki RESTful API (see the box entitled "Structure").

Structure

Once the logs are stored in Loki, they are organized into streams. Each stream is identified by labels. Some labels are generated automatically (for example, filename) and some are custom-made (see Listing 1, lines 16 and 17).

In this case, I'll use {job="apache"} and {instance="localserver"}.

Loki ultimately stores the log entries as pairs composed of a timestamp and the actual log content.
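
If you're unsure which labels exist, the Loki API will list them, as well as the values recorded for a given label (two quick examples against the default port):

curl -s http://localhost:3100/loki/api/v1/labels | jq
curl -s http://localhost:3100/loki/api/v1/label/job/values | jq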

The following query asks Loki to provide the most recent log entries, limiting the result to three entries:

curl -G -s  "http://localhost:3100/loki/api/v1/query_range?limit=3" --data-urlencode 'query={job="apache"}' | jq

The results of the query appear in Listing 2.

Listing 2

Sample Log Query Result Object

{
  "status": "success",
  "data": {
    "resultType": "streams",
    "result": [
      {
        "stream": {
          "filename": "/somelogsdir/access.log",
          "instance": "localserver",
          "job": "apache"
        },
        "values": [
          [
            "1620748522681322318",
            "172.17.0.1 - - [11/May/2021:15:55:22 +0000] \"GET /csrftoken HTTP/1.1\" 200 928 \"-\" \"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0\""
          ],
          [
            "1620748504911781382",
            "172.17.0.1 - - [11/May/2021:15:55:04 +0000] \"GET /csrftoken HTTP/1.1\" 200 929 \"-\" \"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0\""
          ],
          [
            "1620748336477761747",
            "172.17.0.1 - - [11/May/2021:15:50:54 +0000] \"GET /cron.php HTTP/1.1\" 200 931 \"-\" \"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0\""
          ]
        ]
      }
    ],
    "stats": {}
  }
}
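
By default, query_range looks at the last hour and returns entries newest first. You can narrow the window and flip the order with the start, end, and direction parameters (Unix timestamps in nanoseconds; a quick sketch, with timestamps you would adjust to your own data):

curl -G -s "http://localhost:3100/loki/api/v1/query_range" --data-urlencode 'query={job="apache"}' --data-urlencode 'start=1620745200000000000' --data-urlencode 'end=1620748800000000000' --data-urlencode 'direction=forward' | jq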

LogQL

Loki supports complex queries across (potentially) terabytes of logs.

You can compose your own queries using the LogQL query language, which is a modified version of the Prometheus language (PromQL). LogQL might look complicated at first, but you'll soon discover that it is basically a glorified grep.

LogQL queries consist of two parts:

  • a log stream selector
  • a log pipeline

Start by selecting one or more streams, and then apply a pipeline operator specifying the string you're looking for. For example, you might want to look for all accesses coming from a specific IP address:

{job="apache"} |= "172.17.0.1"

Or you might be interested in entries related to a specific page:

{job="apache"} |= "cron.php"

The following query restricts the search to a specific file, loosely looking for Firefox accesses through regexp syntax:

{job="apache",filename="/somelogsdir/access.log"} |~ "Firefox.*"

You can also filter entries out of the result. For instance, if you wish to exclude entries with an HTTP 200 status code, use the != operator; delimiting the search string with backticks (`) tells LogQL to take it literally, with no escape processing:

{job="apache"} != `HTTP/1.1" 200`

LogQL also provides a way to parse structured log formats (like JSON and logfmt) using a log parser. This option won't be useful in the case of Apache logs, but the topic is well covered in the Loki documentation [3].
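To give you a taste of the syntax, a hypothetical service that emits JSON logs could be filtered on an extracted field like this (the job name and the level field are made up for illustration):

{job="myapp"} | json | level="error"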

Grafana and Visualization

You can display the log data in a visually meaningful form using Grafana. To speed things up, I'll deploy Grafana through the official Docker image, making sure to run it in host network mode so that it can reach Loki on localhost:

docker run -d --network host grafana/grafana
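
Incidentally, if you prefer to set the admin password up front rather than through the web UI, the official image honors Grafana's standard environment variables, such as GF_SECURITY_ADMIN_PASSWORD (choose your own value, obviously):

docker run -d --network host -e GF_SECURITY_ADMIN_PASSWORD=mysecret grafana/grafana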

Once Grafana is up, point your browser to http://localhost:3000 and log in with the default admin credentials (admin/admin; you'll be prompted to choose a new password). Then click on Configuration | Data Sources and select Add Data Source.

Select Loki as the data source type and enter http://localhost:3100 as the HTTP URL parameter (Figure 1).

Figure 1: Connecting the Grafana instance to Loki.

The Save & Test button will make sure settings are validated. You can finally move to the Explore tab, where you can freely query any data source (Figure 2). Once the query is entered and verified, share a short link with coworkers by clicking on the Share icon in the upper part of the page (again Figure 2).

Figure 2: Comfortable querying and result sharing with Grafana.

Conclusions

Loki lets you set up a complete log aggregation infrastructure in a very short time span, without having to write a single line of code (see the box entitled "Loki vs Elasticsearch"). All components can run inside a Docker container or in a Kubernetes cluster when it's time to deploy Loki as a production application.

Loki vs Elasticsearch

In the log aggregation ecosystem, Elasticsearch (and, in general, the ELK stack) is certainly a popular choice, but in my opinion, Loki might be better for certain use cases. I give the advantage to Loki for:

  • Scalability: Whereas Elasticsearch indexes every element of a log entry up front, Loki indexes only the stream labels and relies on brute-force text search. Because the log data itself is stored unstructured, Loki can handle a larger amount of data with fewer resources than Elasticsearch.
  • Metrics format: Loki stores logs with the same stream-based structure that Prometheus uses for its TSDB. This approach means that an application stack (Grafana, Prometheus, and Loki) can pinpoint an application issue starting from a metric, or the other way around.

The Loki project has a comparison page for your consideration [4]. Always choose the best tool for your use case.

Loki also supports third-party storage for its logs collection (such as AWS S3 or Apache Cassandra). Next time you deploy a machine or a service, install a Promtail agent and give Loki a try. You'll be surprised by how quickly you can get productive.

The Author

Stefano Chittaro manages multicloud deployments with a special focus on automation and observability.