Creating more readable regular expressions with Simple Regex Language
What Next?
You might be wondering what you can do with the finished SRL expression, since Grep and most other tools only digest conventional regular expressions.
If you just need a regular expression on the fly, you can always enter the SRL expression into the build tool on the SRL website (as described earlier in this article) and then copy the resulting regular expression to Grep or another tool.
Also, some languages have already begun to implement SRL support. You will find special SRL libraries for JavaScript, PHP, Python, and C++ on GitHub under the MIT license [3]. Java and C# libraries are still pending.
The functions and classes of these SRL libraries accept an SRL expression, evaluate it, and convert it into a regular expression. In the case of PHP, the user just has to create an SRL object and check the text using the method isMatching():
$srl = new SRL('one of "eE" literally "rror"');
$srl->isMatching('Error'); // is True
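For comparison, the same check can be written as a conventional regular expression. The following Python sketch uses only the standard re module. The pattern [eE](?:rror) is an assumed hand translation of the SRL expression above, and is_matching() is a hypothetical helper name; the regular expression that an SRL library actually emits may differ in detail.

```python
import re

# Assumed hand translation of: one of "eE" literally "rror"
PATTERN = re.compile(r'[eE](?:rror)')

def is_matching(text: str) -> bool:
    """Return True if the pattern occurs anywhere in the text."""
    return PATTERN.search(text) is not None

print(is_matching('Error'))    # True
print(is_matching('error'))    # True
print(is_matching('Warning'))  # False
```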
In addition to featured keywords like literally, SRL also offers the keywords shown in Tables 1 to 6. You'll find a detailed reference and many other examples on the official SRL homepage [1].
Table 1
Character Strings
Keyword | Description |
---|---|
literally "string" | Exactly the given character string. |
one of "abc" | One of the characters a, b, or c. |
letter from a to d | One of the characters a, b, c, or d. |
uppercase letter | Any uppercase letter. |
any character | Uppercase or lowercase letter from A to Z, a digit from 0 to 9, or an underscore (_). |
no character | All other (special) characters. |
digit from 1 to 4 | One of the digits 1, 2, 3, or 4. |
anything | Any character with the exception of a line break. |
new line | Line break. |
whitespace | A whitespace character (this includes the space character, the tab character, and the line break). |
no whitespace | Character that is not a whitespace character. |
tab | Tab character. |
backslash | Backslash character (\). |
raw "[a-z]" | Inserts the regular expression [a-z] unchanged. |
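To make Table 1 more concrete, the following Python sketch pairs the keywords with conventional character classes. The mapping is an assumption written by hand, not the output of an SRL library, and CHARACTER_KEYWORDS and matches() are hypothetical names. Note that in Python 3, \w and \s also match non-ASCII letters and whitespace unless re.ASCII is set.

```python
import re

# Assumed regex counterparts of the Table 1 keywords (hand-written).
CHARACTER_KEYWORDS = {
    'literally "string"': r'(?:string)',
    'one of "abc"': r'[abc]',
    'letter from a to d': r'[a-d]',
    'uppercase letter': r'[A-Z]',
    'any character': r'\w',
    'no character': r'\W',
    'digit from 1 to 4': r'[1-4]',
    'anything': r'.',
    'new line': r'\n',
    'whitespace': r'\s',
    'no whitespace': r'\S',
    'tab': r'\t',
    'backslash': r'\\',
}

def matches(keyword: str, text: str) -> bool:
    """Check whether the regex for a keyword matches the whole text."""
    return re.fullmatch(CHARACTER_KEYWORDS[keyword], text) is not None

print(matches('letter from a to d', 'c'))  # True
print(matches('digit from 1 to 4', '7'))   # False
print(matches('any character', '_'))       # True
print(matches('anything', '\n'))           # False
```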
Table 2
Quantifiers
Keyword | Description |
---|---|
exactly 4 times | Something repeats exactly four times. |
between 2 and 4 times | Something repeats between two and four times. |
optional | Something may occur, but does not have to. |
once or more | Something must occur at least once. |
never or more | Something may occur any number of times, including not at all. |
at least 2 times | Something must occur at least twice. |
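The quantifiers in Table 2 have direct counterparts in conventional regular expressions. The Python sketch below applies each one to the character class [0-9]; the mapping is an assumed hand translation, and QUANTIFIERS and digits_match() are hypothetical names.

```python
import re

# Assumed regex counterparts of the Table 2 keywords.
QUANTIFIERS = {
    'exactly 4 times': '{4}',
    'between 2 and 4 times': '{2,4}',
    'optional': '?',
    'once or more': '+',
    'never or more': '*',
    'at least 2 times': '{2,}',
}

def digits_match(quantifier: str, text: str) -> bool:
    """Apply the quantifier to [0-9] and test the whole text."""
    pattern = '[0-9]' + QUANTIFIERS[quantifier]
    return re.fullmatch(pattern, text) is not None

print(digits_match('exactly 4 times', '2024'))     # True
print(digits_match('between 2 and 4 times', '7'))  # False
print(digits_match('never or more', ''))           # True
print(digits_match('at least 2 times', '12345'))   # True
```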
Table 3
Groups
Keyword | Description |
---|---|
capture ( | |
any of ( | Each |
capture ( | Captures the expression |
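In conventional regular expressions, capturing corresponds to a parenthesized group and a choice between alternatives to an alternation. The following Python sketch shows both on the regex side only; FILE_PATTERN and base_name() are hypothetical names, and the example makes no claim about the exact syntax that the SRL group keywords expect.

```python
import re

# (\w+)        capturing group: the base name can be read back after a match
# (?:jpg|png)  non-capturing alternation: either extension is accepted
FILE_PATTERN = re.compile(r'(\w+)\.(?:jpg|png)')

def base_name(filename: str):
    """Return the captured base name, or None if the name does not match."""
    match = FILE_PATTERN.fullmatch(filename)
    return match.group(1) if match else None

print(base_name('photo.png'))  # photo
print(base_name('photo.gif'))  # None
```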
Table 4
Lookarounds
Keyword | Description |
---|---|
if followed by | Checks whether something particular follows (lookahead). |
if not followed by | Checks whether something does not follow (negative lookahead). |
if already had | Checks whether something was preceding (lookbehind). |
if not already had | Checks whether something was not preceding (negative lookbehind). |
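The four lookarounds in Table 4 correspond to the regex constructs (?=...), (?!...), (?<=...), and (?<!...). The Python sketch below demonstrates each with the standard re module; the sample strings and variable names are made up for illustration.

```python
import re

prices = '10 EUR, 20 USD'
amounts = '$5 and 7'

# if followed by: a number directly followed by " EUR"
followed = re.findall(r'\d+(?= EUR)', prices)

# if not followed by: a whole number not followed by " EUR"
not_followed = re.findall(r'\b\d+\b(?! EUR)', prices)

# if already had: a number directly preceded by a dollar sign
preceded = re.findall(r'(?<=\$)\d+', amounts)

# if not already had: a number not preceded by a dollar sign
not_preceded = re.findall(r'(?<!\$)\b\d+', amounts)

print(followed)      # ['10']
print(not_followed)  # ['20']
print(preceded)      # ['5']
print(not_preceded)  # ['7']
```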
Table 5
Flags
Keyword | Description |
---|---|
case insensitive | Matching ignores the difference between uppercase and lowercase. |
multi line | The text to be checked runs over multiple lines. |
all lazy | All quantifiers match lazily, that is, as few characters as possible. |
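The effect of the flags in Table 5 is easiest to see on the regex side. The following Python sketch uses re.IGNORECASE and re.MULTILINE; because Python's re module has no global flag for lazy matching, the last example marks the quantifier itself as lazy with a trailing question mark. The sample strings and variable names are assumptions for illustration.

```python
import re

# case insensitive: corresponds to re.IGNORECASE
case_hit = re.search(r'error', 'ERROR: disk full', re.IGNORECASE) is not None

# multi line: with re.MULTILINE, ^ and $ also match at each line break
log = 'ok\nerror\nok'
with_flag = re.findall(r'^error$', log, re.MULTILINE)
without_flag = re.findall(r'^error$', log)

# lazy matching: .+ takes as much as possible, .+? as little as possible
html = '<b>one</b><b>two</b>'
greedy = re.findall(r'<b>.+</b>', html)
lazy = re.findall(r'<b>.+?</b>', html)

print(case_hit)      # True
print(with_flag)     # ['error']
print(without_flag)  # []
print(greedy)        # ['<b>one</b><b>two</b>']
print(lazy)          # ['<b>one</b>', '<b>two</b>']
```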
Table 6
Anchors
Keyword | Description |
---|---|
start with | Anchors the expression to the start of the string. |
must end | Anchors the expression to the end of the string. |
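The anchors in Table 6 correspond to ^ and $ in conventional regular expressions. This Python sketch contrasts an anchored pattern with an unanchored one; ANCHORED and UNANCHORED are hypothetical names chosen for the example.

```python
import re

# Four digits that must span the whole string
ANCHORED = re.compile(r'^[0-9]{4}$')
# Four digits anywhere in the string
UNANCHORED = re.compile(r'[0-9]{4}')

whole = ANCHORED.search('2024') is not None
embedded_anchored = ANCHORED.search('year 2024!') is not None
embedded_unanchored = UNANCHORED.search('year 2024!') is not None

print(whole)                # True
print(embedded_anchored)    # False
print(embedded_unanchored)  # True
```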
Conclusions
SRL is built on the philosophy that, if regular expressions are easier to read, errors will stand out more quickly. It is worth noting, however, that SRL expressions are also complex and difficult to understand if you aren't accustomed to the syntax. SRL does not currently feature comments, which would help to add clarity, and recursion is also missing. Classic Unix text tools such as Grep do not yet provide SRL support; however, you can convert the expression at the SRL website or use an SRL library with some programming languages.
Despite the problems, SRL is still worth a look. If you only use regular expressions occasionally, or if you are using them for the first time, you will get to your destination much faster with SRL. Even regex old-timers might find they can make their expressions more legible with SRL.
In the future, developer Karim Geiger wants to add support for additional programming languages and also standardize the SRL language to define the syntax and commands more clearly. In the long run, he imagines a kind of compiler that translates regular expressions into SRL. He has stated that a Bash version is not planned but is conceivable.
SRL commands are based on English. Geiger has resisted the suggestion to translate the natural language commands to other written languages. He fears that language proliferation would introduce a complexity that could lead to incompatible versions.
Infos
- [1] Simple Regex Language: https://simple-regex.com
- [2] Build Tool for the SRL: https://simple-regex.com/build
- [3] SRL Libraries: https://github.com/SimpleRegex