Total Recoll
Lost and Found
KDE's unofficial search engine may be the most usable choice of all.
Searches have been KDE's weak point for several years. Nepomuk [1], which was introduced in the fourth release series as a sophisticated search engine, proved difficult to configure and use. Last year, Nepomuk was replaced by the supposedly easier-to-use Baloo [2], but it has been greeted with no greater enthusiasm. Currently, the most usable alternative for drive searches is Recoll [3] (Figure 1), which combines both simplicity and – for those who want them – advanced configuration options that are explained in a comprehensive manual [4].
Like Baloo, Recoll is a Qt tool that uses the Xapian search engine library [5]. Recoll's main difference from its predecessors is that it does not install as a daemon by default, a practice that has gained both Nepomuk and Baloo a reputation for being a drain on system resources. Recoll is not a standard KDE package, but it is well enough known that major distributions carry it. If your distro does not include the Recoll packages in its repository, the manual includes detailed information about building from source.
Once Recoll is installed, it needs to index your files. In my experience, Recoll indexed about a terabyte of files in one hour (Figure 2), about the same as either Nepomuk or Baloo; however, unlike my experience with Nepomuk, the process did not noticeably slow operations running concurrently.
To reduce updates to the index substantially, go to Preferences | Index Configuration, where you can exclude paths (e.g., library directories) and specify languages to include in searches. From Preferences | Index Scheduling (Figure 3), you can create a cron job to run indexing regularly. Although you can also update the index manually, creating a cron job avoids relying on your memory and allows you to run the update at a time when it is less likely to interfere with other applications. Still another solution is to update the index at each login.
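A scheduled index update amounts to a crontab entry that runs the indexer. As a rough sketch, the Python snippet below assembles such an entry. The recollindex command is Recoll's command-line indexer; the helper name cron_line and the 3:00am schedule are assumptions chosen for illustration and are not necessarily what the Index Scheduling dialog writes.

```python
def cron_line(minute: int, hour: int, command: str = "recollindex") -> str:
    """Return a crontab entry that runs `command` once a day at hour:minute."""
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError("minute must be 0-59 and hour 0-23")
    # Fields: minute, hour, day of month, month, day of week, command
    return f"{minute} {hour} * * * {command}"


# Assumed schedule: every night at 03:00, when the desktop is likely idle
print(cron_line(0, 3))
```

The printed line could be pasted into the file that crontab -e opens, although letting the dialog manage the entry is the less error-prone route.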
Before using Recoll, you might also check the list of supporting packages [6] that Recoll requires to read certain files. Many of these packages are probably already on most desktop computers, but if you need support for a specific file format, taking a moment to check is only sensible.
You can customize Recoll further by setting a keyboard binding or adding other customizations described in the manual, but the steps above are the minimum configuration you'll need to set up Recoll.
Simple and Advanced Searches
After basic configuration, simple searches are as easy as entering a term in the search field on the right side of the toolbar and pressing the Enter key (see Figure 1). By default, search results include stemming, so that searching for order will include results like orders and ordered. Because Recoll has already indexed the files, results return in a matter of seconds.
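To see why stemming widens a search, consider the toy sketch below. It is a deliberately naive suffix stripper written for illustration, not the stemmer that Xapian actually uses, and the names toy_stem and matches are hypothetical.

```python
SUFFIXES = ("ing", "ed", "s")  # toy list, far smaller than a real stemmer's rules


def toy_stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query: str, indexed_words: list[str]) -> list[str]:
    """Return the indexed words that share a stem with the query."""
    stem = toy_stem(query)
    return [w for w in indexed_words if toy_stem(w) == stem]


# "border" contains "order" but has a different stem, so it is not returned
print(matches("order", ["orders", "ordered", "ordering", "border"]))
```

The point of the sketch is that matching happens on the reduced form of each word, so one query term stands in for a whole family of inflections.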
As you scroll down the results window, clicking an Open link opens the file in its native application. Recoll is free software running on Linux, so you cannot open MS Word files in Microsoft Office; instead, they open in the Antiword document reader.
If this simple search does not help you locate a file, you can refine it in a number of ways. To start, you can get results more quickly if, instead of accepting the default choice All, you choose one of the half dozen common file types, such as media (graphic or sound), presentation, spreadsheet, or text. Alternatively, you can eliminate common file types by selecting other.
Another option is to refine a query. For example, selecting File name rather than the default Any term speeds search results by searching only for file names and not scanning file contents. More elaborately, by selecting Query language, you can narrow the search in other ways, many of which might be familiar to you from web searches. For example, prefacing a search term with a minus sign (-) excludes the term from the results; terms can also be separated by OR or AND for multiple simultaneous searches. In files with metadata, Recoll also has the ability to search for some of the more common fields. For example, you can search such formats by prefacing the search term with title:, author:, ext: (extension), dir: (directory path), or date: followed by a date in YYYY-MM-DD form.
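If you build the same kinds of queries repeatedly, it can help to see how the pieces fit together. The following sketch assembles a query-language string from a few Python values. The function name build_query and the sample terms are hypothetical, and the helper assumes that no value contains spaces, because those would need quoting.

```python
def build_query(terms, exclude=(), **fields):
    """Assemble a query-language string from plain Python values."""
    parts = list(terms)
    # A leading minus sign excludes a term from the results
    parts.extend(f"-{term}" for term in exclude)
    # Field prefixes such as title:, author:, ext:, or dir:
    parts.extend(f"{name}:{value}" for name, value in fields.items())
    return " ".join(parts)


# Find "invoice", skip anything mentioning "draft", restrict by author and extension
print(build_query(["invoice"], exclude=["draft"], author="smith", ext="pdf"))
```

The resulting string, invoice -draft author:smith ext:pdf, is what you would type into the search field with Query language selected.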
As in a web search, quotation marks define a phrase and the exact order in which the words must appear. However, you can also add a variety of letters immediately after the last quotation mark (i.e., with no space between) to refine a search further. For instance, adding C turns on case sensitivity, and adding D makes the search sensitive to diacritics.
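The placement of the modifier letters is easy to get wrong, so here is a small sketch that formats a phrase with them attached. The helper name phrase and the sample phrases are assumptions for illustration; only the C and D letters come from the text above.

```python
def phrase(words: str, case_sensitive: bool = False, diacritics: bool = False) -> str:
    """Quote a phrase and append modifier letters directly after the closing quote."""
    modifiers = ""
    if case_sensitive:
        modifiers += "C"
    if diacritics:
        modifiers += "D"
    # No space may separate the closing quotation mark from the letters
    return f'"{words}"{modifiers}'


print(phrase("Linux Magazine", case_sensitive=True))
print(phrase("café society", diacritics=True))
```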
Finally, Recoll also supports basic wild cards: an asterisk (*) for any number of characters, a question mark for a single character, and a set of characters in square brackets (e.g., [123]) to match any one of the specified characters. However, as the manual warns, wild cards should be used sparingly, especially at the start and end of search terms. Although wild cards might increase the likelihood that the term you are looking for will be in the search results, they can also slow the search and return many more unwanted results to scroll through.
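These three wild cards behave like shell-style globbing, which Python's standard fnmatch module also implements. The sketch below uses fnmatch as an analogy only; Recoll expands wild cards against its own index, and the list of terms here is made up for the demonstration.

```python
from fnmatch import fnmatchcase

# Made-up sample terms standing in for words in an index
terms = ["report", "report1", "report4", "reporting", "repast"]


def expand(pattern: str) -> list[str]:
    """Return the sample terms that a shell-style wildcard pattern matches."""
    return [t for t in terms if fnmatchcase(t, pattern)]


print(expand("report*"))      # any number of characters, including none
print(expand("report?"))      # exactly one character
print(expand("report[123]"))  # one character from the set
print(expand("*t*"))          # wildcards at both ends match every sample term
```

The last pattern shows why the manual urges restraint: a wild card at the start of a term gives the search nothing to narrow down, so nearly everything matches.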
The only trouble with some of these options is that you either have to memorize them or run Recoll with the manual open in another window. You might prefer to run Tools | Advanced search instead, which offers all these options in a graphical interface (Figure 4).
The mouseovers in the Advanced search dialog are extremely detailed and provide more than enough information to let you use the options in the Find and Filter features with a minimum of effort.
Another way to save time with Recoll is to click Tools | Document history, which records the last 100 documents opened from the search results. Because of its limited number of records, calling up the document history (Figure 5) is often quicker than repeating a search.
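A capped history of this kind is straightforward to model. The sketch below is only an illustration of a bounded list and says nothing about how Recoll stores its history internally; the file names are hypothetical.

```python
from collections import deque

HISTORY_LIMIT = 100  # the number of records mentioned above

# A deque with maxlen silently drops the oldest entry once it is full
history = deque(maxlen=HISTORY_LIMIT)

# Hypothetical file names standing in for documents opened from search results
for n in range(1, 151):
    history.append(f"doc{n}.odt")

print(len(history))  # 100
print(history[0])    # doc51.odt, the oldest entry still kept
print(history[-1])   # doc150.odt, the most recently opened
```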
The Problem of Integration
As an alternative to Nepomuk or Baloo, Recoll has at least two advantages. First, its interface is simple to use and, for many users, probably less intimidating as well. Second, its advanced search options are better integrated into the interface, and many are close enough to those of web browsers that they are easy to learn.
Unfortunately, Recoll is not integrated into the latest versions of KDE Plasma. The manual does discuss using a Unity Lens and an obsolete KRunner plugin, but what Recoll could really use is integration into the Dolphin file manager. If such a thing existed, it would show those who have had trouble with previous search indexers exactly what was intended.
As things are, non-integration is a small price to pay for Recoll's speed and convenience. As long as you are willing to memorize some of its advanced features, Recoll is a powerful tool that is totally undeserving of its obscurity. Like KRunner, it comes very close to being a superior alternative to a standard file manager, especially on large drives.
Info
[1] Nepomuk: https://en.wikipedia.org/wiki/NEPOMUK_%28framework%29
Author
Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest coast art. You can read more of his work at http://brucebyfield.wordpress.com