Total Recoll
Lost and Found
ByKDE's unofficial search engine may be the most usable choice of all.
Searches have been KDE's weak point for several years. Nepomuk [1], which was introduced in the fourth release series as a sophisticated search engine, proved difficult to configure and use. Last year, Nepomuk was replaced by the supposedly easier-to-use Baloo [2], but it has been greeted with no greater enthusiasm. Currently, the most usable alternative for drive searches is Recoll [3] (Figure 1), which combines both simplicity and – for those who want them – advanced configuration options that are explained in a comprehensive manual [4].
Like Baloo, Recoll is a Qt tool that uses the Xapian search engine library [5]. Recoll's main difference from its predecessors is that it does not install as a daemon by default, a practice that has gained both Nepomuk and Baloo a reputation for being a drain on system resources. Recoll is not a standard KDE package, but it is well enough known that major distributions carry it. If your distro does not include the Recoll packages in its repository, the manual includes detailed information about building from source.
Once Recoll is installed, it needs to index your files. In my experience, Recoll indexed about a terabyte of files in one hour (Figure 2), about the same as either Nepomuk or Baloo; however, unlike my experience with Nepomuk, the process did not noticeably slow operations running concurrently.
To reduce updates to the index substantially, go to Preferences | Index Configuration, where you can exclude paths (e.g., library directories) and specify languages to include in searches. From Preferences | Index Scheduling (Figure 3), you can create a cronjob to run indexing regularly. Although you can also update the index manually, creating a cronjob avoids relying on your memory and allows you to run the update at a time it is less likely to interfere with other applications. Still another solution is to update the index at each login.
Before using Recoll, you might also check the list of supporting packages [6] that Recoll requires to read certain files. Many of these packages are probably already on most desktop computers, but if you need support for a specific file format, taking a moment to check is only sensible.
You can customize Recoll further by setting a keyboard binding or adding other customizations described in the manual, but here I will provide the minimum configuration steps you'll need to set up Recoll.
Simple and Advanced Searches
After basic configuration, simple searches are as easy as entering a term in the search field on the right side of the toolbar and pressing the Enter key (see Figure 1). By default, search results include stemming, so that searching for order will include results like preorder and ordered. Because Recoll has already indexed the files, results return in a matter of seconds.
As you scroll down the results window, clicking an Open link opens the file in its native application. Recoll is free software running on Linux, so you cannot open MS Word files in Microsoft Office; instead, they open in the Antiword document reader.
If this simple search does not help you locate a file, you can refine it in a number of ways. To start, you can get results more quickly if, instead of accepting the default choice All, you choose one of the half dozen common file types, such as media (graphic or sound), presentation, spreadsheet, or text. Alternatively, you can eliminate common file types by selecting other.
Another option is to refine a query. For example, selecting File name rather than the default Any term speeds search results by searching only for file names and not scanning file contents. More elaborately, by selecting Query language, you can narrow the search in other ways, many of which might be familiar to you from web searches. For example, prefacing a search term with a minus sign (-) excludes the term from the results; terms can also be separated by OR or AND for multiple simultaneous searches. In files with metadata, Recoll also has the ability to search for some of the more common fields. For example, you can search such formats by prefacing the search term with title:, author:, ext: (extension), dir: (directory path), or <YYYY-MM-DD>: for a date.
As in a web search, quotation marks define a phrase and the exact order in which the words must appear. However, you can also add a variety of letters immediately after the last quotation mark (i.e., with no space between) to refine a search further. For instance, adding C turns on case sensitivity, or adding D includes diacritics in the results.
Finally, Recoll also supports basic wild cards: an asterisk (*) for any number of characters, a question mark for a single character, and a set of characters in square brackets (e.g., [123]) to match any one of the specified characters. However, as the manual warns, wild cards should be used sparingly, especially at the start and end of search terms. Although wild cards might increase the likelihood that the term you are looking for will be in the search results, they can also slow the search and return many more unwanted results to scroll through.
The only trouble with some of these options is that you either have to memorize them or run Recoll with the manual open in another window. You might prefer to run Tools | Advanced search instead, which offers all these options in a graphical interface (Figure 4).
The mouseovers in the Advanced search dialog are extremely detailed and provide more than enough information to let you use the options in the Find and Filter features with a minimum of authorization.
Another way to save time with Recoll is to click Tools | Document history, which records the last 100 documents opened from the search results. Because of its limited number of records, calling up the document history (Figure 5) is often quicker than repeating a search.
The Problem of Integration
As an alternative to Nepomuk or Baloo, Recoll has at least two advantages. First, its interface is simple to use and, for many users, probably less intimidating as well. Second, its advanced search options are better integrated into the interface, and many are close enough to those of web browsers that they are easy to learn.
Unfortunately, Recoll is not integrated into the latest versions of KDE Plasma. The manual does discuss using a Unity Lens and an obsolete Krunner plugin, but what Recoll could really use is integration into the Dolphin file manager. If such a thing existed, it would show those who have had trouble with previous search indexers exactly what was intended.
As things are, non-integration is a small price to pay for Recoll's speed and convenience. As long as you are willing to memorize some of its advanced features, Recoll is a powerful tool that is totally undeserving of its obscurity. Like Krunner, it comes very close to being a superior alternative to a standard file manager, especially on large drives.
Info |
[1] Nepomuk: https://en.wikipedia.org/wiki/NEPOMUK_%28framework%29 |
Author |
Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest coast art. You can read more of his work at http://brucebyfield.wordpress.com |
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
Latest Cinnamon Desktop Releases with a Bold New Look
Just in time for the holidays, the developer of the Cinnamon desktop has shipped a new release to help spice up your eggnog with new features and a new look.
-
Armbian 24.11 Released with Expanded Hardware Support
If you've been waiting for Armbian to support OrangePi 5 Max and Radxa ROCK 5B+, the wait is over.
-
SUSE Renames Several Products for Better Name Recognition
SUSE has been a very powerful player in the European market, but it knows it must branch out to gain serious traction. Will a name change do the trick?
-
ESET Discovers New Linux Malware
WolfsBane is an all-in-one malware that has hit the Linux operating system and includes a dropper, a launcher, and a backdoor.
-
New Linux Kernel Patch Allows Forcing a CPU Mitigation
Even when CPU mitigations can consume precious CPU cycles, it might not be a bad idea to allow users to enable them, even if your machine isn't vulnerable.
-
Red Hat Enterprise Linux 9.5 Released
Notify your friends, loved ones, and colleagues that the latest version of RHEL is available with plenty of enhancements.
-
Linux Sees Massive Performance Increase from a Single Line of Code
With one line of code, Intel was able to increase the performance of the Linux kernel by 4,000 percent.
-
Fedora KDE Approved as an Official Spin
If you prefer the Plasma desktop environment and the Fedora distribution, you're in luck because there's now an official spin that is listed on the same level as the Fedora Workstation edition.
-
New Steam Client Ups the Ante for Linux
The latest release from Steam has some pretty cool tricks up its sleeve.
-
Gnome OS Transitioning Toward a General-Purpose Distro
If you're looking for the perfectly vanilla take on the Gnome desktop, Gnome OS might be for you.