A modern search tool

Command Line – plocate

© Lead Image © Author, 123RF.com

Article from Issue 254/2022
Author(s): Bruce Byfield

As the latest successor to locate, plocate produces some of the quickest search results possible on any system.

Real estate agents sometimes say that the key to success is location, location, location. This saying might almost summarize the history of the locate command [1] and its various successors, particularly mlocate [2] and plocate [3]. As a replacement for find, all three commands use a database solution to reduce search time. While all three share many of the same options, plocate is widely considered the most efficient choice.

Slightly different versions of locate are available in the BSD and GNU findutils, but you can also find locate, mlocate, and plocate as separate commands. Because the number of choices can be confusing, a brief history seems in order.

First released in 1982, locate uses a database that can be read by any user. Unless regular expressions are used, it displays every path on the system that contains the string entered, which is inconvenient if the string is common. However, locate's most serious limitation is that the database has to be updated manually.

Largely because of these problems, especially the need for manual updates, locate was succeeded briefly by slocate (secure locate) [4], which was replaced in turn in 2006 by mlocate (merging locate). Both slocate and mlocate improve on locate in that they include the updatedb utility [5] to update databases, speeding up the process by rescanning only the files and directories whose ctime has changed. Also, both slocate and mlocate show only the files that the current user has access to, thereby improving security, and they allow regular expressions to be used without a specific option. Written by Miloslav Trmac while he participated in the Google Summer of Code, mlocate became the preferred version until 2020, although it seems to have gone through periods of being unmaintained.

Named for the posting lists that inspired it, plocate was written to be a drop-in replacement for mlocate. While it can still use updatedb to create its database, plocate can also use the plocate-build utility [6] to create an index when a root user is logged in. Unlike mlocate, when multiple strings are searched, plocate returns only the files that match all the search strings, rather than any file that matches even one string. Another difference from mlocate: plocate is compatible with systemd and SELinux.

Instead of scanning every entry one at a time, plocate indexes trigrams (i.e., combinations of three bytes at a time) to increase search speeds. Although designed specifically for solid state drives (SSDs), plocate can gain further speed on older hard drives by using the io_uring asynchronous I/O framework introduced in the Linux 5.1 kernel in May 2019 [7]. As its main limitation, plocate's benefits may be lost when searching for strings shorter than three bytes, for non-UTF-8 file names, or for regular expressions with numerous hits. Otherwise, the gains are dramatic: in one published benchmark, plocate finds two files out of 27 million in 0.008 seconds, while mlocate takes 20.118 seconds for the same operation [8]. Even though locate and mlocate are still available, in most circumstances, plocate should be the preferred variant.
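To make the idea of trigrams and posting lists more concrete, the following toy Python sketch builds a tiny in-memory index and answers multi-pattern searches with the "match all patterns" behavior described above. It is only an illustration of the general technique, not plocate's actual code or on-disk format; the function names and the sample paths are invented for the example.

```python
from __future__ import annotations

from collections import defaultdict


def trigrams(text: str) -> set[bytes]:
    """Return every run of three consecutive bytes in the UTF-8 form of text."""
    data = text.encode("utf-8")
    return {data[i:i + 3] for i in range(len(data) - 2)}


def build_index(paths: list[str]) -> dict[bytes, set[int]]:
    """Map each trigram to the IDs of the paths containing it (a posting list)."""
    index: dict[bytes, set[int]] = defaultdict(set)
    for path_id, path in enumerate(paths):
        for gram in trigrams(path):
            index[gram].add(path_id)
    return index


def search(paths: list[str], index: dict[bytes, set[int]],
           patterns: list[str]) -> list[str]:
    """Return the paths that contain every pattern as a substring."""
    candidates = set(range(len(paths)))
    for pattern in patterns:
        # A pattern shorter than three bytes has no trigrams, so it cannot
        # narrow the candidates and every path must be checked below.
        for gram in trigrams(pattern):
            candidates &= index.get(gram, set())
    # Trigram hits only nominate candidates; confirm with a real substring test.
    return [paths[i] for i in sorted(candidates)
            if all(p in paths[i] for p in patterns)]


# Hypothetical sample data, standing in for a real file system.
PATHS = [
    "/etc/passwd",
    "/etc/ssh/sshd_config",
    "/home/ana/notes/ssh-keys.txt",
    "/usr/share/doc/openssh/README",
]
INDEX = build_index(PATHS)

if __name__ == "__main__":
    print(search(PATHS, INDEX, ["ssh"]))         # every path containing "ssh"
    print(search(PATHS, INDEX, ["ssh", "etc"]))  # only paths containing both
    print(search(PATHS, INDEX, ["sh"]))          # short pattern: full scan
```

The last call shows why very short strings lose the benefit of the index: with no trigram to look up, the sketch falls back to testing every path.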

Setting Up plocate

You can find plocate in the official repositories of Arch Linux, Manjaro, Debian (Buster backports and Bullseye releases), and Ubuntu. On other distributions, you can build plocate from source with a C++ compiler and the zstd and libatomic libraries. Whichever way you install it, before you use plocate, you must run updatedb to create at least one database, by default at /var/lib/plocate/plocate.db (Figure 1). This process may take a while the first time if the system is not on an SSD and does not run at least a Linux 5.1 kernel. Otherwise, on a 2TB system, the database is created in a matter of seconds.

With updatedb, you can add --output FILE (-o FILE) to write the database to a non-default location, --database-root PATH (-U PATH) to index only the directory tree below PATH, and --require-visibility 0 (-l 0) to make the entire content of the database visible to ordinary users. Adding --verbose (-v) displays the files and directories onscreen as they are added to the database, which has the advantage of letting you know when the database creation is complete. Alternatively, you can create the database using plocate-build, even building it from a plain text list of file names if you choose. Either way, you can check that plocate is ready to use by doing a simple search (Figure 2).

Figure 1: Entering updatedb -v shows files and directories being added to the plocate database.
Figure 2: Because plocate is so fast, its output starts to display before the piped less command can operate.
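If you prefer to script the readiness check, a small Python wrapper along the following lines might do. This is a minimal sketch: the function name quick_search is invented for the example, and it assumes only that the plocate binary is on the PATH and accepts the --limit option described in this article.

```python
import shutil
import subprocess


def quick_search(pattern, binary="plocate", limit=5):
    """Run a small search; return the matching paths, or None if the tool is absent."""
    if shutil.which(binary) is None:
        return None  # not installed, or not on the PATH
    result = subprocess.run(
        [binary, "--limit", str(limit), pattern],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate file names that are not valid UTF-8
        check=False,       # an unsuccessful search should not raise an exception
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    hits = quick_search("passwd")
    if hits is None:
        print("plocate does not appear to be installed")
    elif not hits:
        print("no matches; has updatedb been run yet?")
    else:
        print("\n".join(hits))
```

An empty result can mean either that nothing matched or that no readable database exists yet, so the sketch only hints at the likely cause rather than diagnosing it.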

Of note, mlocate includes a few options that plocate lacks. For example, mlocate includes --statistics (-S) as well as --stdio (-s), an option that does nothing and is accepted only for compatibility with the BSD and GNU versions of locate. What remains in plocate are a dozen basic options, which are more than enough for almost all operations.

On networks, plocate can search several databases at once with --database PATH (-d PATH). Multiple databases can be specified one per option or in a single option with the names of databases separated by a colon (:). Note that versions of Ubuntu before 21.04 may not have these options and may not mention them in their man pages.

Two other options signal that a search string should be treated as a regular expression: --regexp (-r) signals basic regular expressions, and --regex signals extended regular expressions. Because regular expressions can return a large number of results, these options can noticeably slow a search. Without one of these options, a pattern is not interpreted as a regular expression: regular expression operators such as . or + are treated as regular text, while *, ?, and [] act as shell-style wildcards.
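The two ideas in this section, colon-separated database lists and the difference between literal and regular expression matching, can be sketched in a few lines of Python. The helper names and the database paths are hypothetical, and Python's re module stands in for the POSIX basic and extended dialects that plocate actually uses, so treat this as an approximation of the behavior, not a reimplementation.

```python
import re


def split_databases(options):
    """Flatten repeated -d values, each of which may hold several colon-separated paths."""
    databases = []
    for option in options:
        databases.extend(part for part in option.split(":") if part)
    return databases


def matches(path, pattern, regex=False):
    """Test a path against a pattern, literally by default or as a regular expression."""
    expression = pattern if regex else re.escape(pattern)
    return re.search(expression, path) is not None


if __name__ == "__main__":
    # One database per option and several in a single option give the same list.
    print(split_databases(["/srv/index/local.db", "/srv/index/nas.db"]))
    print(split_databases(["/srv/index/local.db:/srv/index/nas.db"]))

    # Literal search: the dot is an ordinary character.
    print(matches("/home/ana/report-v1-txt", "v1.txt"))
    # Regular expression search: the dot matches any single character.
    print(matches("/home/ana/report-v1-txt", "v1.txt", regex=True))
```

The last two calls differ only in the regex flag, which is roughly the distinction that --regexp and --regex switch on at the command line.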

Other options modify a search's results. With --ignore-case (-i), plocate's default case sensitivity is overridden, so that lowercase and uppercase letters are treated as the same, except in the case of some Unicode case rules, such as a German ß being the same as ss. Like regular expressions, ignoring letter case can slow the search by producing a larger number of results. Other options can reduce the work instead: --limit NUMBER (-l NUMBER) stops a search after the designated number of hits, and --count (-c) prints only the number of matches rather than the matches themselves. By contrast, --basename (-b), which matches only against the file name portion of each path and ignores the directory names, increases search speed only minimally at best.
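As a rough illustration of how these result-modifying options interact, the following Python sketch emulates them on a plain list of paths. The function and its parameter names are made up for the example, the sample paths are hypothetical, and the basename handling assumes that --basename compares the pattern with the final component of each path.

```python
import posixpath


def locate(paths, pattern, ignore_case=False, basename=False,
           limit=None, count=False):
    """Substring search over paths with toy versions of -i, -b, -l, and -c."""
    needle = pattern.lower() if ignore_case else pattern
    hits = []
    for path in paths:
        if limit is not None and len(hits) >= limit:
            break  # like --limit: stop as soon as enough hits are collected
        target = posixpath.basename(path) if basename else path
        if ignore_case:
            # Simple lowercasing only, so "ß" and "ss" still do not match.
            target = target.lower()
        if needle in target:
            hits.append(path)
    return len(hits) if count else hits


# Hypothetical sample data.
PATHS = [
    "/home/ana/Notes.txt",
    "/home/ana/notes/todo.md",
    "/srv/notes-archive/2021.tar",
    "/home/ana/NOTES.bak",
]

if __name__ == "__main__":
    print(locate(PATHS, "notes"))                                   # case sensitive
    print(locate(PATHS, "notes", ignore_case=True))                 # like -i
    print(locate(PATHS, "notes", ignore_case=True, basename=True))  # like -i -b
    print(locate(PATHS, "notes", ignore_case=True, limit=1))        # like -i -l 1
    print(locate(PATHS, "notes", ignore_case=True, count=True))     # like -i -c
```

Note that the basename variant still walks every path; it only changes what each path is compared against, which fits the observation that the option does little for speed.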

A Command for Modern Times

With the recent release of plocate, the locate family of commands seems set for the next few years. While plocate has a few limitations, it uses the latest technology and is adapted to modern computing practices. For example, with current hardware speeds, it no longer makes sense to set whether to follow symbolic links, because following them by default has no noticeable effect.

Even more important, the rise of Linux on the desktop has increased the demand for simplicity. In the past, the find command, with its arcane distinctions, was sufficient because users could be assumed to have the expertise and the patience to wade through the obscurities of its man page. Today, though, with less experienced users, simplicity and efficiency are expected as the norm. In the case of system searches, these expectations have resulted in plocate, one of the increasing number of rewrites of classic Linux commands that users have lived with for so long.

The Author

Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest Coast art (http://brucebyfield.wordpress.com). He is also co-founder of Prentice Pieces, a blog about writing and fantasy at https://prenticepieces.com/.
