Tutorials – Recoll
Even in the age of cloud computing, personal computers often hold thousands of files: text files, spreadsheets, word processing docs, configuration files, and HTML files, as well as email and other message formats. If it takes too long to find the file you need, chase it down with the Recoll local search engine.
Recoll [1] is free software for Linux and Windows systems that adds a local search engine to your desktop or local network. And if you think that desktop search engines don't make sense in this age of cloud computing, I beg to differ!
Look inside any school, NGO, small/medium enterprise, or individual computer used for more than a few years: Almost always, you will find big archives of mostly textual content that will never be uploaded to the cloud or otherwise exposed to an online search engine. Sometimes the reason is mere lack of time, bandwidth, or money. Sometimes it is privacy. Sometimes the reason is easier compliance with regulations like the European General Data Protection Regulation (GDPR) [2]. In all cases, deploying local search capability could make thousands and thousands of files much more useful for their owners.
Recoll is an excellent answer to the need for a local search engine. The Recoll search tool offers flexible interfaces, good documentation, and easy installation. Recoll can analyze and index text inside all the most common document formats, even when those documents are "hidden" inside other files (for example, an OpenDocument file zipped and attached to an email message), and a relatively simple search language lets you query the result. In most cases, you can preview or open the files found with your search by just clicking on them inside the Recoll window.
The first part of this tutorial explains how Recoll works and how to install it and configure its most critical functions. The second part describes the Recoll search syntax and offers a few tips to help you make the best use of Recoll.
Architecture and User Interfaces
Strictly speaking, Recoll is just a wrapper, albeit a great one, for the open source information retrieval library called Xapian [3]. It is Xapian, not Recoll, that performs all the high-level indexing and classification of your documents. Xapian is also directly usable via scripts in Perl, Java, and other languages. But it is Recoll that makes the desktop search really usable, by doing all the rest of the work, from overall configuration to obscure, low-level tasks like stemming. Stemming is the process of reducing similar words to their common root. It is thanks to stemming that you can search for a word like "hacker" and receive results for "hackers" or "hacking" in addition to the original search term.
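To make the idea of stemming concrete, here is a deliberately naive sketch written as a shell function. It is an illustration only: the function name toy_stem and its suffix list are assumptions made up for this example, and real stemmers are language-aware and far more careful than this.

```shell
# toy_stem: strip one common English suffix from a word (illustration only;
# this is NOT the algorithm that Xapian or Recoll actually use).
toy_stem() {
    # Lowercase first so that "Hacking" and "hacking" are treated alike.
    word=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    # Try longer suffixes before shorter ones.
    for suffix in ers ing er s; do
        case $word in
            *"$suffix")
                printf '%s\n' "${word%"$suffix"}"
                return 0
                ;;
        esac
    done
    # No known suffix: the word is its own root.
    printf '%s\n' "$word"
}

# All three variants reduce to the same root, "hack".
for w in hacker hackers hacking; do
    toy_stem "$w"
done
```

Because all three words collapse to the same root, an index keyed on roots can match any of them from a single query term.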
The other tasks that Recoll handles directly are extracting text from your files, decoding your queries and, of course, presenting their results in a format that makes it easy to browse and open them from a graphical interface.
With the right libraries and plugins, you can perform Recoll searches directly from Python and other languages, or from desktop environments like Unity or KDE. This article will focus on the native Recoll GUI, its web-based equivalent, and, of course, the evergreen command-line option.
Installation
Recent Recoll versions are available as binary packages for Windows and the most popular Linux distributions. On Ubuntu, for example, type the following commands at the prompt to add a Personal Package Archive (PPA) repository for Recoll and install both the graphical and command-line interfaces:
#> sudo add-apt-repository ppa:recoll-backports/recoll-1.15-on
#> sudo apt-get update
#> sudo apt-get install recoll -y
(Don't be fooled by the 1.15 in the repository name: The command will install the current version of Recoll, whatever it is.) After those commands, typing recoll in the desktop search bar will show you the icon that opens the Recoll native GUI. To search at the command prompt or in a shell script, use the command recollq. Use the recollindex command to generate an index.
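A typical command-line session, index first and then search, can be sketched as a small wrapper function. The name find_docs is hypothetical, and the sketch assumes the default configuration; it simply chains the two commands mentioned above.

```shell
# find_docs: refresh the default index, then run a query (hypothetical helper).
find_docs() {
    if ! command -v recollindex >/dev/null 2>&1 ||
       ! command -v recollq >/dev/null 2>&1; then
        echo "recollindex or recollq not found; is Recoll installed?" >&2
        return 127
    fi
    # Bring the index for the default configuration up to date.
    recollindex || return 1
    # Pass the search terms through to recollq unchanged.
    recollq "$@"
}

# Example: find_docs hacker
```

Updating the index before every search keeps results fresh at the cost of some waiting time; on a large archive you would more likely run recollindex separately and call recollq on its own.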
You must install the Recoll web interface separately. Go to the GitHub page for the web interface [4], download the master.zip archive for your version of Recoll, and unzip it to expand a folder called recoll-webui-master. The file inside the folder called webui-standalone.py is a mini web server, which you can reach with your browser at the address http://localhost:8080. The mini web server is a bit slow, but it works right away for all the users of the local network, with one (well documented) caveat: You cannot directly open local files from the links in its listings unless you explicitly authorize Firefox to do so (see the box entitled "Authorizing Firefox").
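Starting the mini web server could look something like the sketch below. The function name start_webui and the PYTHON variable are assumptions introduced for this example; which Python interpreter the script needs depends on the web interface version you downloaded, so adjust it accordingly.

```shell
# start_webui: launch the standalone web server from the unzipped folder.
# PYTHON is an assumption: set it to the interpreter your web UI version needs.
start_webui() {
    webui_dir=${1:-recoll-webui-master}
    if [ ! -f "$webui_dir/webui-standalone.py" ]; then
        echo "webui-standalone.py not found in $webui_dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left alone.
    ( cd "$webui_dir" && "${PYTHON:-python3}" webui-standalone.py )
}

# Example: start_webui "$HOME/recoll-webui-master"
# Then point your browser at http://localhost:8080
```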
Authorizing Firefox
To authorize Firefox to let you open local files, add the contents of the file examples/firefox-user.js into ~/.mozilla/firefox/<profile>/user.js and restart Firefox.
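One cautious way to do this from the shell is sketched below. The helper name add_firefox_prefs is hypothetical, and you supply both paths yourself: the sketch does not try to guess your Firefox profile directory, and it assumes the examples folder is the one shipped with the web interface.

```shell
# add_firefox_prefs: append the example preferences to a profile's user.js.
# Both arguments are supplied by you; nothing is guessed here.
add_firefox_prefs() {
    prefs_src=$1      # path to examples/firefox-user.js
    profile_dir=$2    # your ~/.mozilla/firefox/<profile> directory
    [ -f "$prefs_src" ] || { echo "missing $prefs_src" >&2; return 1; }
    [ -d "$profile_dir" ] || { echo "missing $profile_dir" >&2; return 1; }
    # Append rather than overwrite, so existing preferences are preserved.
    cat "$prefs_src" >> "$profile_dir/user.js"
}
```

Appending with >> creates user.js if it does not exist yet and leaves any preferences you already had in place.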
If you plan to use Recoll on a regular basis, you might wish to configure your Linux system to start it automatically when the system boots. See your Linux distro's documentation for more on configuring an application to launch at system startup.
Indexing Configuration
No search engine is better than its index. Telling Recoll how to create and maintain the index is the most critical part of the configuration (Figure 1).

Recoll has a system-wide configuration file (/usr/share/recoll/examples/recoll.conf on Ubuntu), but each user also gets a personal configuration, which takes priority. The personal configuration file is stored in $HOME/.recoll/recoll.conf. The first time you start it, the Recoll GUI will ask you how to configure the index and will save your choices in your personal file. Among other things, you can define which file types should be indexed and the default language.
By default, Recoll will only have one index for your whole home directory, but it can handle many totally independent indexes. The only requirement is that each index has a dedicated configuration directory. The simplest way to make Recoll create a separate configuration and index seems to be to create an empty directory and then start the software from the command line with the -c option pointing to it:
#> mkdir $HOME/.recoll-customconfig
#> recoll -c $HOME/.recoll-customconfig
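If you would rather prepare such a configuration directory without the GUI, a minimal sketch follows. The helper name new_recoll_config is hypothetical; the only setting it writes is topdirs, the recoll.conf parameter that lists the directories to index, and everything else is left at Recoll's defaults.

```shell
# new_recoll_config: create a dedicated configuration directory that
# indexes a single top-level directory (hypothetical helper).
new_recoll_config() {
    confdir=$1   # dedicated configuration directory for this index
    topdir=$2    # directory tree that this index should cover
    mkdir -p "$confdir" || return 1
    # Refuse to clobber an existing configuration.
    if [ -e "$confdir/recoll.conf" ]; then
        echo "$confdir/recoll.conf already exists" >&2
        return 1
    fi
    printf 'topdirs = %s\n' "$topdir" > "$confdir/recoll.conf"
}

# Example:
#   new_recoll_config "$HOME/.recoll-customconfig" "$HOME/Documents"
#   recollindex -c "$HOME/.recoll-customconfig"
```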
You can search several indexes simultaneously by adding them in the Preferences | External Index Dialog of the GUI. Don't forget that, when you search on multiple indexes, Recoll will use all their data, but it will only use the configuration of one index: the default index, or the index explicitly set with the RECOLL_CONFDIR environment variable or the -c option.
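On the command line, choosing which configuration is in effect can be sketched as follows. The helper name search_with_config is hypothetical; it sets RECOLL_CONFDIR for a single query only, so your default configuration is untouched for every other command.

```shell
# search_with_config: run one query against a specific configuration
# directory by setting RECOLL_CONFDIR just for that command.
search_with_config() {
    confdir=$1
    shift
    if [ ! -d "$confdir" ]; then
        echo "no such configuration directory: $confdir" >&2
        return 1
    fi
    # The subshell keeps the variable from leaking into the caller's shell.
    ( RECOLL_CONFDIR=$confdir; export RECOLL_CONFDIR; recollq "$@" )
}

# Example: search_with_config "$HOME/.recoll-customconfig" hacker
# The same choice can be made with: recollq -c "$HOME/.recoll-customconfig" hacker
```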