Solve Wordle puzzles with regular expressions
Preparing the Dictionary
Change to your home directory, then open the following GitHub page in your browser: https://github.com/dwyl/english-words/blob/master/words_alpha.txt. Press Download, and you will see the start of a list of words. Now right-click and select Save Page As to save the file in your home directory. If you're allergic to the GUI, you can use wget instead.
To make the work a little easier, you should convert the list to uppercase (all Wordle entries are uppercase) with tr (translate or transliterate) and store the results in a file named wordle-caps.txt:
$ tr '[:lower:]' '[:upper:]' < words_alpha.txt > wordle-caps.txt
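Before converting the full list, the effect of tr can be checked on a few words. This is only a minimal sketch: the sample words and the variable name upper are made up for illustration and are not taken from words_alpha.txt.

```shell
# Sample words are made up for illustration; they are not read from words_alpha.txt.
upper=$(printf 'apple\nbrick\nzebra\n' | tr '[:lower:]' '[:upper:]')

# Print the converted words, one per line.
printf '%s\n' "$upper"
```

Each lowercase letter is mapped to its uppercase counterpart, and the line breaks are left alone.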
Your new dictionary file, wordle-caps.txt, should contain just over 370,000 words. Use wc -l (wc is short for word count; the -l option counts lines) to count the lines in the file:
$ wc -l wordle-caps.txt
370102 wordle-caps.txt
For Wordle, you only need the words with exactly five characters from this list. Again, you need the help of grep. Because all of the words are already in uppercase, you only need to output the five-letter words to a text file. The following grep command stores the five-letter words in a file named wordle-complete.txt:
$ grep -o -w "\w\{5\}" wordle-caps.txt > wordle-complete.txt
The -o option tells grep to print only the matching part of each line, while -w restricts matches to whole words. The regex itself, \w\{5\}, matches exactly five consecutive word characters (letters, digits, or underscores). Now run another line count as follows:
$ wc -l wordle-complete.txt
15918 wordle-complete.txt
This leaves you with nearly 16,000 words, which is more than enough to solve the Wordle of the day. Let's find out.
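To see how the -o and -w options interact with the five-character pattern, you can run the extraction on a handful of lines. The sketch below uses made-up sample lines; the second command, with -x and [A-Z], is an assumed stricter alternative and not part of the original recipe.

```shell
# Sample lines are made up: 4, 5, and 6 letters, plus one hyphenated entry.
sample=$(printf 'FOUR\nFIVER\nSIXERS\nHELLO-WORLD\n')

# -w: a match must not touch other word characters; -o: print only the match.
five=$(printf '%s\n' "$sample" | grep -o -w '\w\{5\}')
printf '%s\n' "$five"

# Assumed alternative: -x requires the whole line to be exactly five capitals.
strict=$(printf '%s\n' "$sample" | grep -x '[A-Z]\{5\}')
printf '%s\n' "$strict"
```

With this sample, the first command also extracts both halves of the hyphenated entry, whereas the -x variant only accepts lines that consist of five capital letters and nothing else.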
Grep the Wordle
You only have to do this preliminary work once, so keep the wordle-complete.txt file safe for later. To solve the Wordle shown in Figure 3, you need to start with a random word from your Wordle dictionary. Initially, the game grid shown in Figure 3 is empty. You can run shuf to pick five random five-letter words from the file (Listing 2). If you are not happy with the selection, simply repeat the command.
Listing 2
Game 1, Round 1
$ shuf -n 5 wordle-complete.txt
FANGA
FRASS
SIAFU
MOORY
HALDU
Wow! Listing 2 resulted in an amazing collection of weird and wonderful words. In our example, we went for the word MOORY. When we entered it in Wordle, all the letter fields were gray – so at first glance, this wasn't a good guess. But now we know that the word we are looking for does not contain any of the letters from MOORY. This knowledge is actually helpful in our search for the solution.
The first command from Listing 3 filters out all words from our word list that contain any of the characters M, O, R, or Y. The -v switch (--invert-match) tells grep to print only the lines that do not match the regex that follows. The command saves the results to the file wordle1, which "only" contains 5,362 words. From this list, you can output another five arbitrary words.
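A minimal sketch of how the -v filter behaves, using a few made-up sample words rather than the real word list. Because a bracket expression is a set of characters, listing a letter twice, as in [MOORY], changes nothing, so the sketch uses the shorter [MORY].

```shell
# Sample words are made up for illustration.
words=$(printf 'MOORY\nDATED\nROBIN\nNATAL\nYACHT\n')

# -v keeps only the lines that contain none of the letters M, O, R, or Y.
left=$(printf '%s\n' "$words" | grep -v '[MORY]')
printf '%s\n' "$left"
```

Only the words without any of the four gray letters survive the filter.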
Listing 3
Example 1, Attempt 2
$ grep -v '[MOORY]' wordle-complete.txt > wordle1
$ wc -l wordle1
5362 wordle1
$ shuf -n 5 wordle1
TUDEL
DATED
CEILE
ENCUP
DEFET
From the selection offered, we liked DATED best – well, it was the only word we understood, so hey ho. I wonder if Wordle will agree with us. Transferred to Wordle, the A in the second position and the T in the third position both light up green, so a pretty good guess. We now know that the second letter in the solution we are looking for is an A and the third letter is a T. The D and the E in DATED are shown in gray, so the letters do not appear in the solution.
Armed with this information, we can now narrow down the word list even further. The grep command from line 1 of Listing 4 combines all the conditions into a single call. A circumflex (^) at the start of a bracket expression negates that one expression, in much the same way that the -v switch inverts a whole search: [^ED] matches any single character except E or D. So the full regular expression [^ED][A][T][^ED][^ED] searches for a string of five letters. The first must not be E or D, the second must be an A, the third must be a T, and the fourth and fifth must again not be E or D.
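The positional pattern can be tried on a few words first. This is a sketch with made-up sample words; the anchored variant in the second command is an assumed shorter spelling that should select the same lines as long as every line is exactly five letters long.

```shell
# Sample words are made up for illustration.
words=$(printf 'HATCH\nDATED\nBATAN\nEATEN\nWATAP\nNATAL\nTAXIS\n')

# Pattern as in the text: no E or D in positions 1, 4, 5; A and T fixed in 2 and 3.
hits=$(printf '%s\n' "$words" | grep '[^ED][A][T][^ED][^ED]')
printf '%s\n' "$hits"

# Assumed equivalent: literal A and T without brackets, anchored to the whole line.
same=$(printf '%s\n' "$words" | grep '^[^ED]AT[^ED][^ED]$')
```

DATED and EATEN drop out because of their first letter, and TAXIS because its third letter is not a T.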
Listing 4
Example 1, Attempt 3
01 $ grep '[^ED][A][T][^ED][^ED]' wordle1 > wordle2
02 $ wc -l wordle2
03 55 wordle2
04 $ shuf -n 5 wordle2
05 HATCH
06 BATAN
07 PATTA
08 BATTS
09 WATAP
Our wordle2 file now contains only 55 potential solutions. From this, we again output five random words (line 4). The dictionary defines a watap as a thread made of the stringy roots of various coniferous trees and used by Native Americans, so let's go with it. Again, Wordle isn't entirely happy with our guess: W and P are not in the solution. But we have a matching trio of ATA in the middle of our word, which results in more fodder for grep:
$ grep '[^WP][A][T][A][^WP]' wordle2 > wordle3
Another call to wc -l tells us that wordle3 only contains 10 words, so let's just cat the file and see what we get:
$ cat wordle3
BATAK
BATAN
CATAN
FATAL
KATAT
LATAH
LATAX
NATAL
SATAI
SATAN
Time for some guesswork: FATAL looks like a good choice, but, fatally (ouch), Wordle doesn't see things our way. Not to worry, though: The L in fifth position is marked in green, and the only remaining candidate is NATAL. Lo and behold, we finished the game in five guesses, and it only took that many due to a bit of bad luck at the end.
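The final elimination can also be left to grep. The sketch below feeds the ten words from wordle3 through a pattern that assumes the feedback described above: F is ruled out in first position, and ATAL is confirmed in positions two through five. The variable names are illustrative.

```shell
# The ten candidates listed in wordle3.
wordle3=$(printf 'BATAK\nBATAN\nCATAN\nFATAL\nKATAT\nLATAH\nLATAX\nNATAL\nSATAI\nSATAN\n')

# First letter is anything but F; the remaining four letters are fixed.
last=$(printf '%s\n' "$wordle3" | grep '[^F]ATAL')
printf '%s\n' "$last"
```

A single word is left, which matches the manual reasoning.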
New Day, New Game
Using the same logic, we can tackle the next game (Figure 4). The hard work has already been done (i.e., we have already retrieved a dictionary and created an uppercase word list). Again, we need to start with an arbitrary word, extracting it from the complete Wordle dictionary:
$ shuf -n 5 wordle-complete.txt
CRAPS
DAMON
TAREQ
GEYAN
CLARO
This time we went for CLARO. Not a bad start: It looks like the C is in the right place already. The A is in the solution, but not in third position, so it can occur in the second, fourth, or fifth position; likewise, the O can occur in the second, third, or fourth position. L and R do not occur at any position in the target word. The regex for the C, L, and R conditions is [C][^LR][^LR][^LR][^LR], but we also need to pipe the output through two further greps: After all, the word needs to contain an A and an O, too (Listing 5).
Listing 5
Game 2, Round 2
$ grep -P '[C][^LR][^LR][^LR][^LR]' wordle-complete.txt | grep A | grep O > wordle1
$ wc -l wordle1
71 wordle1
$ shuf -n 5 wordle1
COCOA
CANOE
CHOCA
COMMA
COTTA
Now, I don't drink cocoa and prefer boats that are bigger than canoes, and I'm pretty sure that CHOCA isn't actually a word, so I'll go for the next word on the list, COMMA. After all, I probably type hundreds of them a day. Success! We solved the Wordle in only two guesses.
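The pipeline for this game checks that an A and an O occur somewhere, but it does not yet use the fact that the A cannot be in third position and the O cannot be in fifth. As a sketch of one possible refinement (an assumption, not the command used above), those letters can be added to the negated bracket expressions for the positions in question. The sample combines the five words from Listing 5 with two made-up entries, CACAO and CHAOS, and uses plain grep because the pattern does not need the -P option.

```shell
# Five words from Listing 5 plus two made-up entries that break the position rules.
words=$(printf 'COCOA\nCANOE\nCHOCA\nCOMMA\nCOTTA\nCACAO\nCHAOS\n')

# Pattern as in the text: C first, no L or R anywhere, A and O somewhere.
loose=$(printf '%s\n' "$words" | grep '[C][^LR][^LR][^LR][^LR]' | grep A | grep O)

# Assumed refinement: additionally forbid A in position 3 and O in position 5.
tight=$(printf '%s\n' "$words" | grep '^C[^LR][^LRA][^LR][^LRO]$' | grep A | grep O)
printf '%s\n' "$tight"
```

The looser pipeline keeps all seven sample words, while the refined pattern drops CACAO (O in fifth position) and CHAOS (A in third position).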