Total Recoll
Lost and Found
KDE's unofficial search engine may be the most usable choice of all.
Search has been KDE's weak point for several years. Nepomuk [1], which was introduced in the fourth release series as a sophisticated search engine, proved difficult to configure and use. Last year, Nepomuk was replaced by the supposedly easier-to-use Baloo [2], but it has been greeted with no greater enthusiasm. Currently, the most usable alternative for drive searches is Recoll [3] (Figure 1), which combines simplicity with advanced configuration options, for those who want them, that are explained in a comprehensive manual [4].
Like Baloo, Recoll is a Qt tool that uses the Xapian search engine library [5]. Recoll's main difference from Nepomuk and Baloo is that it does not install as a daemon by default, a practice that has gained both of them a reputation for being a drain on system resources. Recoll is not a standard KDE package, but it is well enough known that major distributions carry it. If your distro does not include the Recoll packages in its repository, the manual includes detailed information about building from source.
Once Recoll is installed, it needs to index your files. In my experience, Recoll indexed about a terabyte of files in one hour (Figure 2), about the same as either Nepomuk or Baloo; however, unlike my experience with Nepomuk, the process did not noticeably slow operations running concurrently.
To reduce the work of updating the index substantially, go to Preferences | Index Configuration, where you can exclude paths (e.g., library directories) and specify languages to include in searches. From Preferences | Index Scheduling (Figure 3), you can create a cron job to run indexing regularly. Although you can also update the index manually, creating a cron job avoids relying on your memory and allows you to run the update at a time when it is less likely to interfere with other applications. Still another solution is to update the index at each login.
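For readers who would rather write the schedule by hand, a daily cron entry can be sketched as below. This is only an illustration, not necessarily what the Index Scheduling dialog writes: the helper name cron_line and the 3:30am run time are assumptions made for the example, while recollindex is Recoll's command-line indexer.

```python
def cron_line(hour: int, minute: int = 0, command: str = "recollindex") -> str:
    """Return a crontab entry that runs `command` once a day at hour:minute.

    `cron_line` is a hypothetical helper for illustration only.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0..59")
    # crontab field order: minute, hour, day of month, month, day of week, command
    return f"{minute} {hour} * * * {command}"


# A nightly update at 3:30am, when the desktop is probably idle.
print(cron_line(3, 30))
```

The printed line can be pasted into the file opened by crontab -e. Choosing an hour when the machine is on but unused follows the same reasoning as the scheduling advice above.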
Before using Recoll, you might also check the list of supporting packages [6] that Recoll requires to read certain files. Many of these packages are probably already on most desktop computers, but if you need support for a specific file format, taking a moment to check is only sensible.
You can customize Recoll further by setting a keyboard binding or adding other customizations described in the manual, but the steps above are the minimum configuration you'll need to set up Recoll.
Simple and Advanced Searches
After basic configuration, simple searches are as easy as entering a term in the search field on the right side of the toolbar and pressing the Enter key (see Figure 1). By default, searches use stemming, so that searching for order will also return results like orders and ordered. Because Recoll has already indexed the files, results return in a matter of seconds.
As you scroll down the results window, clicking an Open link opens the file in its native application. Recoll is free software running on Linux, so you cannot open MS Word files in Microsoft Office; instead, they open in the Antiword document reader.
If this simple search does not help you locate a file, you can refine it in a number of ways. To start, you can get results more quickly if, instead of accepting the default choice All, you choose one of the half dozen common file types, such as media (graphic or sound), presentation, spreadsheet, or text. Alternatively, you can eliminate common file types by selecting other.
Another option is to refine a query. For example, selecting File name rather than the default Any term speeds search results by searching only for file names and not scanning file contents. More elaborately, by selecting Query language, you can narrow the search in other ways, many of which might be familiar to you from web searches. For example, prefacing a search term with a minus sign (-) excludes the term from the results; terms can also be joined by OR or AND to combine several conditions in one search. In files with metadata, Recoll can also search some of the more common fields: preface the search term with title:, author:, ext: (extension), dir: (directory path), or date: (followed by a date or date range in YYYY-MM-DD form).
As in a web search, quotation marks define a phrase and the exact order in which the words must appear. However, you can also add a variety of letters immediately after the last quotation mark (i.e., with no space between) to refine a search further. For instance, adding C turns on case sensitivity, and adding D makes the search sensitive to diacritics.
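To make the query language syntax concrete, the sketch below assembles query strings of the kind you would type into the search field with Query language selected. The helpers phrase and build_query are hypothetical names invented for this example and are not part of Recoll. The code only builds strings and does not talk to Recoll.

```python
def phrase(words: str, modifiers: str = "") -> str:
    """Quote a phrase; modifier letters go right after the closing quote."""
    return f'"{words}"{modifiers}'


def build_query(terms=(), exclude=(), **fields) -> str:
    """Join plain terms, excluded terms, and field:value filters with spaces.

    Hypothetical helper: it only formats text for the search field.
    """
    parts = list(terms)
    # A leading minus sign excludes a term from the results.
    parts += [f"-{term}" for term in exclude]
    for name, value in fields.items():
        # Field values that contain spaces are quoted as phrases.
        if " " in value:
            value = phrase(value)
        parts.append(f"{name}:{value}")
    return " ".join(parts)


print(build_query(["search"], exclude=["nepomuk"], author="byfield", ext="pdf"))
print(build_query([phrase("Total Recoll", "C")], dir="/home/bruce/articles"))
```

The first call prints search -nepomuk author:byfield ext:pdf, and the second prints "Total Recoll"C dir:/home/bruce/articles, a case-sensitive phrase search restricted to one directory. The directory path is an invented example.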
Finally, Recoll also supports basic wild cards: an asterisk (*) for any number of characters, a question mark for a single character, and a set of characters in square brackets (e.g., [123]) to match any one of the specified characters. However, as the manual warns, wild cards should be used sparingly, especially at the start and end of search terms. Although wild cards might increase the likelihood that the term you are looking for will be in the search results, they can also slow the search and return many more unwanted results to scroll through.
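The three wildcard forms can be tried out with Python's standard fnmatch module, which uses the same asterisk, question mark, and square bracket notation. This is a stand-in for illustration: Recoll expands wildcards against the terms in its own index, and the vocabulary list and the expand helper below are invented for the example.

```python
from fnmatch import fnmatchcase

# A made-up list of index terms to match against.
vocabulary = ["order", "ordered", "orders", "preorder", "border",
              "file1", "file2", "file4"]


def expand(pattern: str, terms: list) -> list:
    """Return the terms that match a wildcard pattern, in their original order."""
    return [term for term in terms if fnmatchcase(term, pattern)]


print(expand("order*", vocabulary))      # ['order', 'ordered', 'orders']
print(expand("?order", vocabulary))      # ['border']
print(expand("file[123]", vocabulary))   # ['file1', 'file2']
print(expand("*order", vocabulary))      # ['order', 'preorder', 'border']
```

Even in this tiny vocabulary, the leading asterisk in the last pattern pulls in the unrelated term border, which is the kind of unwanted result the manual's warning has in mind.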
The only trouble with some of these options is that you either have to memorize them or run Recoll with the manual open in another window. You might prefer to run Tools | Advanced search instead, which offers all these options in a graphical interface (Figure 4).
The mouseovers in the Advanced search dialog are extremely detailed and provide more than enough information to let you use the options in the Find and Filter features with a minimum of memorization.
Another way to save time with Recoll is to click Tools | Document history, which records the last 100 documents opened from the search results. Because of its limited number of records, calling up the document history (Figure 5) is often quicker than repeating a search.
The Problem of Integration
As an alternative to Nepomuk or Baloo, Recoll has at least two advantages. First, its interface is simple to use and, for many users, probably less intimidating as well. Second, its advanced search options are better integrated into the interface, and many are close enough to those of web search engines that they are easy to learn.
Unfortunately, Recoll is not integrated into the latest versions of KDE Plasma. The manual does discuss using a Unity Lens and an obsolete KRunner plugin, but what Recoll could really use is integration into the Dolphin file manager. If such a thing existed, it would show those who have had trouble with previous search indexers exactly what was intended.
As things are, non-integration is a small price to pay for Recoll's speed and convenience. As long as you are willing to memorize some of its advanced features, Recoll is a powerful tool that is totally undeserving of its obscurity. Like KRunner, it comes very close to being a superior alternative to a standard file manager, especially on large drives.
Info
[1] Nepomuk: https://en.wikipedia.org/wiki/NEPOMUK_%28framework%29
Author
Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest coast art. You can read more of his work at http://brucebyfield.wordpress.com