A new semantic search engine for the KDE desktop
Loaded for Bear

Baloo replaces Nepomuk as the semantic search engine on the KDE desktop, but it gets off to a bumpy start.
The Nepomuk semantic search has been highly controversial since its introduction in KDE – both for users and for application developers. The release of KDE Applications 4.13 brings a new solution named Baloo to the KDE desktop – but not without some growing pains.
Desktop environments are pretty dumb, as users discover when they look for a file whose name they simply cannot remember. Semantic desktops were designed to address this problem. A semantic desktop has access to information about data, such as where the data came from, when it was created, and how individual pieces of data relate to each other. Using this information, a user can, for example, search for a file on the hard disk that John Doe emailed in March.
The KDE developers wanted to give their desktop environment an appropriate semantic search engine. They turned to the results of the Nepomuk (Networked Environment for Personalized, Ontology-Based Management of Unified Knowledge; Figure 1) research project [1], which the European Union funded from 2006 to 2008 to the tune of several million euros. The project was intended to provide everything necessary to promote and simplify the development of a semantic desktop.
RDF and Other Calamities
Nepomuk required the RDF (Resource Description Framework) to describe and store relations [2]. The implementation of Nepomuk in KDE collects information about stored data from all KDE applications, links and processes this information, and serves it up to the search function.
Since the introduction of Nepomuk in KDE 4, many users have repeatedly complained about its poor performance and lack of stability. Application developers, in turn, found the API too complicated and called for extra features. Although the KDE developers made improvements over the years, many users continue to disable Nepomuk. Nepomuk generated excessive load especially in its interactions with Akonadi, the underpinnings of the PIM programs.
Virtuoso, Vishuoso, and a Restart
The KDE developers identified the Virtuoso database, which ran in the background and which Nepomuk used to store its RDF data, as the major obstacle to better performance. Although Virtuoso itself worked pretty fast, it hogged extremely large amounts of memory. The KDE developers therefore started to implement their own RDF store named Vishuoso [3]. However, they continually stumbled over the requirements of the EU research project, which were partly vague, incomplete, and sometimes even redundant [4].
For these reasons, the KDE developers decided to take a radical step: They ditched RDF and the old Nepomuk and developed a successor named Baloo [5]. In contrast to Nepomuk, Baloo, which is named after a character in The Jungle Book (think "Bear Necessities"), is designed to be more frugal with resources, be more reliable, and deliver superior results faster. Internally, Baloo is modular, so in future, it will be easier to add new features and improve existing ones. Additionally, the design was overhauled with the intention of preventing failures.
The KDE developers have not written Baloo entirely from scratch; rather, they have recycled parts of the existing code. For this reason, they refer to Baloo as the "next generation of Nepomuk search." The changes to the programming interface are therefore manageable; application developers should find it relatively easy to adapt their software.
Relations and Stores
Baloo manages relations between two "uniquely identifiable identifiers." A file has the unique identifier file:<x>, whereas Akonadi creates a unique identifier of the form akonadi:?item=<x> for most PIM data.
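To make the two identifier forms concrete, the following Python sketch splits such a string into its kind and its <x> part. The function name parse_identifier is hypothetical, and treating <x> as an opaque string is an assumption; the sketch does not reflect how Baloo itself handles identifiers.

```python
from typing import Tuple
from urllib.parse import parse_qs


def parse_identifier(identifier: str) -> Tuple[str, str]:
    """Split 'file:<x>' or 'akonadi:?item=<x>' into (kind, x).

    Hypothetical helper for illustration; <x> is kept as an opaque string.
    """
    scheme, sep, rest = identifier.partition(":")
    if not sep:
        raise ValueError(f"no scheme in {identifier!r}")
    if scheme == "file" and rest:
        return scheme, rest
    if scheme == "akonadi" and rest.startswith("?"):
        # Everything after '?' is a query string; we expect exactly one 'item'.
        items = parse_qs(rest[1:]).get("item", [])
        if len(items) == 1:
            return scheme, items[0]
    raise ValueError(f"unrecognized identifier: {identifier!r}")
```

Calling parse_identifier("akonadi:?item=7") yields the pair ("akonadi", "7"); anything that matches neither form raises a ValueError.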
Each relation ends up in a separate and appropriate data store. In the simplest case, this is a table with two columns in a SQLite database. Baloo deliberately does not require a specific storage format or data store API. The advantage of this setup is that the data store can be tailored to match the information to be stored, thus letting Baloo store the data in the best possible way.
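The simplest case described above, a two-column table in a SQLite database, might look roughly like the following sketch. The table name relation, the column names, and the helper functions are assumptions made for illustration; Baloo's actual schema is not shown in this article.

```python
import sqlite3
from typing import List


def open_relation_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create (if needed) a two-column table that holds one kind of relation."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS relation ("
        " subject TEXT NOT NULL,"
        " object TEXT NOT NULL,"
        " PRIMARY KEY (subject, object))"
    )
    return conn


def add_relation(conn: sqlite3.Connection, subject: str, obj: str) -> None:
    """Record that 'subject' is related to 'obj'; duplicates are ignored."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO relation (subject, object) VALUES (?, ?)",
            (subject, obj),
        )


def related_to(conn: sqlite3.Connection, subject: str) -> List[str]:
    """Return all identifiers related to 'subject', in sorted order."""
    rows = conn.execute(
        "SELECT object FROM relation WHERE subject = ? ORDER BY object",
        (subject,),
    )
    return [row[0] for row in rows]
```

A store for a relation such as "this file was attached to that email" would then hold rows like ("file:12", "akonadi:?item=34"), where both values are made-up identifiers in the forms described earlier.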
The search itself is performed by the search stores. Each search store only takes care of certain data. For example, one exclusively searches for files (File Search), another searches in email (Email Search), and a third searches in contacts (Contact Search). Each of these search stores provides a specific API through which the search can be triggered; however, they can also provide additional APIs.
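The division of labor between search stores can be sketched in Python as follows. The class names, the types() and search() methods, and the run_query() dispatcher are all hypothetical; they only illustrate the idea of one store per kind of data behind a common entry point and are not Baloo's C++ API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SearchStore(ABC):
    """Common entry point that every (hypothetical) search store offers."""

    @abstractmethod
    def types(self) -> List[str]:
        """Names of the data types this store is responsible for."""

    @abstractmethod
    def search(self, term: str) -> List[str]:
        """Return identifiers of matching items."""


class InMemoryStore(SearchStore):
    """Toy store that does a case-insensitive substring match on stored text."""

    def __init__(self, handled_type: str, documents: Dict[str, str]) -> None:
        self._type = handled_type
        self._documents = documents

    def types(self) -> List[str]:
        return [self._type]

    def search(self, term: str) -> List[str]:
        needle = term.lower()
        return sorted(
            ident for ident, text in self._documents.items()
            if needle in text.lower()
        )


def run_query(stores: List[SearchStore], wanted_type: str, term: str) -> List[str]:
    """Hand the query only to the stores responsible for 'wanted_type'."""
    results: List[str] = []
    for store in stores:
        if wanted_type in store.types():
            results.extend(store.search(term))
    return results
```

With one store registered for "File" and another for "Email", a query for files never touches the email data, which is the point of keeping the stores separate.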
Currently, Baloo only manages files and comes with a matching search store and data store. The collected data is stored in a SQLite database. The search is additionally supported by the Xapian software, which indexes the collected dataset. Akonadi already stores all its PIM data itself; the search in this dataset is handled by search stores for contacts and email [6]. Thanks to this new architecture with individual data and search stores, Baloo is unlikely to require too much memory and should deliver matching search results extremely quickly.
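What "indexing the collected dataset" means can be illustrated with a toy inverted index in Python. This is not Xapian and does not use its API; the TinyIndex class and its methods are invented for this sketch, which only shows the general technique of mapping each term to the identifiers that contain it.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


class TinyIndex:
    """Toy inverted index: maps each term to the identifiers containing it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    @staticmethod
    def _terms(text: str) -> List[str]:
        # Lowercase the text and split it into runs of ASCII letters and digits.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, identifier: str, text: str) -> None:
        """Index 'text' under the given identifier."""
        for term in self._terms(text):
            self._postings[term].add(identifier)

    def search(self, query: str) -> List[str]:
        """Return identifiers that contain every term of the query."""
        terms = self._terms(query)
        if not terms:
            return []
        hits = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self._postings.get(term, set())
        return sorted(hits)
```

Because a lookup only touches the posting sets of the query terms, the cost of a search depends on how many items match rather than on how many items were indexed. A real engine such as Xapian adds ranking, stemming, and on-disk storage on top of this basic idea.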