Automate your web logins
Log In and Go
Automated web logins with command-line tools and Selenium ensure you don't miss scheduling an activity.
During the COVID-19 lockdown, many facilities such as pools, gyms, and golf courses required people to book on a website before they could visit. These precautions helped to maintain a safe environment; however, the booking process was awkward, and it was easy to miss an activity if you didn't sign up early enough.
Luckily, some great Linux tools can automate web logins. In this article, I share two techniques to create automated web logins. The first technique uses command-line tools like xte and xdotool. This approach allows simple Bash scripts to replicate how you would use keystrokes to access web pages.
The second technique uses the Selenium Python library. The Selenium API allows you to tackle more complex projects by giving you access to the full Document Object Model (DOM) of a web page.
Keyboard Simulation
The most popular choices for keyboard and mouse simulation are the xautomation package [1] and the xdotool utility [2]. The xdotool utility is feature-rich, with special functions for desktop and window management. The xte tool, a part of xautomation, is a little simpler, focusing entirely on keyboard and mouse simulation.
The wmctrl [3] utility is also very useful: It shows which windows are open on your desktop, and it can set the active window by matching a substring of the window title.
In Ubuntu, enter
sudo apt-get install xautomation xdotool wmctrl
to install the xautomation package and the xdotool and wmctrl utilities.
Log In with xte
With the xte utility, you can send a single keyboard character or strings of characters. A Bash script that uses xte commands can emulate the actions you take to log in manually to a web page.
Typically people use the mouse on web pages, which is quite different from logging in 100 percent with the keyboard. Web pages often have a number of clickable items before the main form entry area, so it is important to step through and document the login procedure manually. A good simple example is to try and log in to Netflix (Figure 1).
The Bash script in Listing 1 uses xte to automate the Netflix sign-in. This script opens a Chrome browser page (line 10) and then sets the focus to this page (line 12). Next, it sends the correct tab, text, and return key sequences (lines 15-22).
Listing 1
Netflix Sign-In
01 #!/bin/bash
02 # netflix_login.sh - script logs into Netflix
03 #
04
05 url="https://www.netflix.com/ca/login"
06 email="my_email.com"
07 pwd="my_password"
08
09 # open the browser, wait for the page to load, then set focus to it
10 chromium-browser "$url" &
11 sleep 2
12 wmctrl -a "Netflix - Chromium"
13
14 sleep 1 # allow time to get focus before sending keys
15 xte "key Tab"
16 xte "key Tab"
17 xte "str $email"
18 xte "key Tab"
19 xte "str $pwd"
20 xte "key Tab"
21 xte "key Tab"
22 xte "key Return"
23
24 echo "Netflix Login Done..."
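The keystroke sequence in Listing 1 can also be written down as data before it is turned into commands, which makes the manually documented login procedure easier to adjust. The following Python sketch is only an illustration: The step names (tab, text, enter) and the xte_commands() helper are hypothetical, and the code only builds the xte argument strings used in Listing 1 rather than running them.

```python
# Hypothetical helper: turn a documented login procedure into xte arguments.
# Only the "key <name>" and "str <text>" forms from Listing 1 are used.

def xte_commands(steps):
    """Return one xte argument string per keystroke or text entry."""
    commands = []
    for action, value in steps:
        if action == "tab":
            # value is the number of Tab presses
            commands.extend(["key Tab"] * value)
        elif action == "text":
            commands.append(f"str {value}")
        elif action == "enter":
            commands.append("key Return")
        else:
            raise ValueError(f"unknown step: {action}")
    return commands


# The Netflix procedure from Listing 1, written as steps (placeholder values)
netflix_steps = [
    ("tab", 2), ("text", "my_email.com"),
    ("tab", 1), ("text", "my_password"),
    ("tab", 2), ("enter", None),
]

for cmd in xte_commands(netflix_steps):
    print("xte", repr(cmd))
```

Each resulting string could then be handed to xte, for example with subprocess.run(["xte", cmd]), once the browser window has the focus.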
Setting the window focus can be tricky if you have a number of windows open. The command wmctrl -l lists all open windows, and the command
wmctrl -a '<some title info>'
sets the mouse and keyboard focus to a specific window from a substring of the window title.
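To find a title substring that matches only one window, it can help to filter the wmctrl -l listing first. The sketch below assumes that each output line holds the window ID, the desktop number, the host name, and then the title; the find_window_titles() helper and the sample listing are made up for illustration, and the case-insensitive comparison is a choice made in this sketch.

```python
# Sketch: pick window titles out of `wmctrl -l` output.
# Each line is assumed to hold: window ID, desktop number, host name, title.

def find_window_titles(wmctrl_output, substring):
    """Return the titles in `wmctrl -l` output that contain substring."""
    matches = []
    for line in wmctrl_output.splitlines():
        parts = line.split(None, 3)   # the title itself may contain spaces
        if len(parts) < 4:
            continue                  # skip windows without a title
        title = parts[3]
        if substring.lower() in title.lower():
            matches.append(title)
    return matches


# Made-up sample listing for illustration
sample = """\
0x03a00003  0 mypc Netflix - Chromium
0x04200007  0 mypc user@mypc: ~
0x03a00011  1 mypc Park Booking - Chromium
"""

print(find_window_titles(sample, "netflix"))
```

On a live desktop, the listing could come from subprocess.run(["wmctrl", "-l"], capture_output=True, text=True).stdout. If more than one title is returned, a longer substring is needed before calling wmctrl -a.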
Book with xdotool
The xdotool utility also sends keystrokes and text. Its syntax is very similar to xte, but with a few extra features. A park booking example (Figure 2) is a bit more complex, because a booking time needs to be selected from a list. For this project, the automation script needs to manage eight entry fields (to keep things simple, I'll pass the date in the URL) and select a time slot.
Neither the xte nor xdotool utility supports a search text function. A simple workaround is to use the web browser's search function. By enabling caret (text cursor) navigation, it's possible to move the active cursor location according to the browser's search results.
Pressing F7 brings up the caret browsing dialog (Figure 3). Note that the buttons in this dialog (enable/cancel or Yes/No) can vary between browsers.
The Bash script in Listing 2 uses the browser's search dialog to find and select a 10:00am time slot for a park. One of the first steps is to enable caret navigation (lines 12-13).
Listing 2
Book a Park Visit
01 #!/bin/bash
02 # book10am.sh - make a 10:00 park booking
03 #
04 sdate="startDate=2021-04-23" # adjust the date
05 url="https://book.parkpassproject.com/book?inventoryGroup=1554186518&&inventory=1229284276&$sdate"
06
07 chromium-browser "$url" & # open browser to park booking page
08 sleep 5 # wait for browser to come up
09 wmctrl -a "Chromium"
10 sleep 2
11 # Turn on caret browsing
12 xdotool key F7
13 xdotool key Return
14 sleep 1
15
16 # tab to 'Time Slot' area
17 tabcnt=8
18 xdotool key --repeat $tabcnt --delay 100 Tab
19
20 xdotool key Return
21 sleep 1
22
23 # Search for 10:00 time and select it
24 xdotool key ctrl+f
25 xdotool type '10:00'
26 xdotool key Return
27 # Close find dialog and select time
28 xdotool key Tab Tab Tab Return Return
29
30 echo "Park Time Booking Complete"
A useful feature of xdotool is the --repeat option combined with --delay (lines 17-18). In this script, I used this feature to tab eight times to get to the Time Slot field. A Ctrl+F keystroke opens the browser search dialog (line 24). Next, the xdotool type option passes in the '10:00' time string (line 25). The final step is to close the search dialog and hit Return to select the 10:00 AM – 12:00 PM time slot (line 28).
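The search-and-select steps in lines 24-28 of Listing 2 can be parameterized so that the same sequence works for other times. The sketch below only builds the xdotool argument lists and executes nothing; the find_and_select() helper and its closing_keys parameter are hypothetical, and the default key sequence is simply the one from Listing 2, which may differ in other browsers.

```python
# Sketch: build the xdotool calls that search a page for a time and select it.
# Mirrors lines 24-28 of Listing 2; nothing is executed here.

def find_and_select(search_text,
                    closing_keys=("Tab", "Tab", "Tab", "Return", "Return")):
    """Return xdotool argument lists for a find-in-page selection."""
    return [
        ["xdotool", "key", "ctrl+f"],        # open the browser search dialog
        ["xdotool", "type", search_text],    # type the text to look for
        ["xdotool", "key", "Return"],        # jump to the first match
        ["xdotool", "key", *closing_keys],   # leave the dialog, select the slot
    ]


for args in find_and_select("2:00"):
    print(" ".join(args))
```

Each argument list could be run with subprocess.run(args, check=True), with short pauses in between if the page reacts slowly.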
Script Limitations
The xdotool and xte utilities are great for simple web page automation when the HTML form items are sequential and no special decision making is required. Unfortunately, when I tried to book a park time on the weekend, I started to see some limitations (Figure 4). During busy times, if I tried to book by time, xte and xdotool could not determine whether the time slot was taken. A simple workaround would be to search for the first Available or Not Busy slot, but this doesn't let you pick the times you prefer.
For projects that require some logic (like choosing a good time from a list of times), Selenium with Python is an excellent fit.
Selenium with Python
Selenium [4] is a portable framework for testing web applications, with client-server tools and an IDE. The Selenium WebDriver component (available for Firefox, Google Chrome, Internet Explorer, Safari, Opera, and Edge) sends commands from client APIs directly to a browser. Client APIs are available for C#, Go, Java, JavaScript, PHP, Python, and Ruby. The Selenium Downloads page [5] has details on installation of the WebDriver scripts.
To install the Linux 32-bit Selenium driver (geckodriver) for Firefox, enter:
wget https://github.com/mozilla/geckodriver/releases/download/v0.29.1/geckodriver-v0.29.1-linux32.tar.gz
tar -xvzf geckodriver-v0.29.1-linux32.tar.gz
chmod +x geckodriver
sudo mv geckodriver /usr/local/bin
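A quick way to confirm that the driver ended up somewhere on the search path is to look it up from Python. This is a minimal sketch using only the standard library; the driver_path() helper name is made up for illustration.

```python
import shutil


def driver_path(name="geckodriver"):
    """Return the full path of the driver binary, or None if it is not on PATH."""
    return shutil.which(name)


path = driver_path()
if path:
    print("geckodriver found at", path)
else:
    print("geckodriver is not on PATH")
```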
To install the Selenium library for Python, enter:
pip install selenium
The big difference between the xte or xdotool utility and Selenium is that Selenium can access the HTML code of the selected web page directly.
Log In with Selenium and Python
As with xte and xdotool, you need to do some manual background work before writing the script. Once the required web page is open, you can use the Web Developer Inspector tool to examine the HTML code. To access the Inspector, select Tools | Web Developer | Inspector from the top menubar or use the shortcut Ctrl+Shift+C.
For the Netflix sign-in example, the Email or phone number and Password inputs are needed (Figure 5). When the Inspector is open, items selected on the web page are highlighted in the Inspector pane. In this example, the Email or phone number entry uses id="id_userLoginId", and the password entry uses id="id_password". Listing 3 shows the Python code that signs in to Netflix.
Listing 3
Netflix Sign-In with Selenium
01 #
02 # netflix_login.py - automate Netflix Login
03 #
04 from selenium import webdriver
05
06 url="https://www.netflix.com/ca/login"
07 email="my_email.com"
08 pwd="my_password"
09
10 browser = webdriver.Firefox()
11
12 browser.get(url)
13
14 # wait for page to refresh
15 browser.implicitly_wait(10)
16
17 username = browser.find_element_by_id('id_userLoginId')
18 username.send_keys(email)
19
20 password = browser.find_element_by_id('id_password')
21 password.send_keys(pwd)
22
23 password.submit()
24
25 print("Login Complete")
When a web page is called, it's important to give the page some time to load. The implicitly_wait(10) call (line 15) tells Selenium to keep retrying an element lookup for up to 10 seconds before giving up.
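The idea behind this kind of wait is polling: The check is repeated until it succeeds or a time limit runs out, so the script continues as soon as the page is ready. The following standard-library sketch illustrates that idea only and is not Selenium's implementation; wait_until() and the simulated element_present() check are made-up names.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a page element that only shows up on the third check
attempts = {"count": 0}

def element_present():
    attempts["count"] += 1
    return attempts["count"] >= 3

wait_until(element_present, timeout=5, interval=0.1)
print("ready after", attempts["count"], "checks")
```

Compared with a fixed sleep, the script here waits about 0.2 seconds instead of the full 5 seconds, while still tolerating a slow page.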
HTML items can be found either by ID (find_element_by_id()) or by name (find_element_by_name()). An element object has to be retrieved before you can perform any action on it. Line 17 finds the element with the ID 'id_userLoginId' and stores it in the username object. The send_keys() method is used to pass text strings to <input> tags (lines 18 and 21). Calling the submit() method on any input object will send all the form data as a request to the web server (line 23).
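Selenium releases from 4.3 onward no longer include the find_element_by_*() helpers; the equivalent call is find_element() with a By locator. The sketch below shows the sign-in steps from Listing 3 in that style. The sign_in() function is a hypothetical wrapper, and the fallback By class exists only so that the sketch can be read and tried without Selenium installed.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Fallback for reading or testing without Selenium installed;
    # "id" is the string value Selenium uses for the ID locator.
    class By:
        ID = "id"


def sign_in(browser, url, email, pwd):
    """Fill in and submit the sign-in form using Selenium 4 locators."""
    browser.get(url)
    browser.implicitly_wait(10)   # retry element lookups for up to 10 seconds

    username = browser.find_element(By.ID, "id_userLoginId")
    username.send_keys(email)

    password = browser.find_element(By.ID, "id_password")
    password.send_keys(pwd)
    password.submit()


# Possible use with a real browser (requires Selenium and geckodriver):
#   from selenium import webdriver
#   sign_in(webdriver.Firefox(), "https://www.netflix.com/ca/login",
#           "my_email.com", "my_password")
```

Passing the browser in as a parameter keeps the function independent of the browser choice, so the same steps could be tried with a different WebDriver.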
Selenium Searches
From the earlier park booking example, you saw that xte and xdotool had some limitations when a variable list of options was presented. Luckily, Selenium has a number of functions that can be used for searching HTML tags and text. As in the last example, the first step is to open the web page and inspect the structure manually (Figure 6).
For this example, the Inspector shows that each status entry in the list has a <div class="jss97"> that could have a Past, Available, Not Busy, or Full status. The top-level <div class="jss94"> has both the times and the status messages. Knowing the top-level div class now makes it possible to search for the park's time slots and get the status of each of the times.
Figure 7 shows an example that searches for the first Not Busy time slot. As in the earlier xdotool example, the time slot list needs to be clicked to open. In Python code, this is done by finding and then clicking on the timefield object.
The key piece in this code is:
thetimes = browser.find_elements_by_class_name("jss94")
This operation builds a list (thetimes) of all the time slots with their status messages.
Next, a for loop can examine each time slot. In this example, the code looks for the first time slot that is Not Busy:
# Get the top level div
thetimes = browser.find_elements_by_class_name("jss94")
for itimes in thetimes:
    # Find the first "Not Busy"
    if "Not Busy" in itimes.text:
        print(itimes.text)
        itimes.click()
        break
Logic could be written for different conditions, like looking for time slots between 9 and 11am, and if none are found, then looking for time slots between 2 and 4pm.
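A preference rule of this kind can be kept separate from the browser code by working on the slot texts alone. The sketch below is one possible approach under stated assumptions: Each slot text is assumed to start with a time such as "10:00 AM" and to contain one of the status words seen in the Inspector, and the choose_slot() and start_hour() helpers as well as the sample texts are made up for illustration.

```python
import re

OPEN_STATUSES = ("Available", "Not Busy")   # statuses treated as bookable


def start_hour(slot_text):
    """Return the 24-hour start hour of a slot, or None if no time is found."""
    match = re.search(r"(\d{1,2}):(\d{2})\s*([AP]M)", slot_text, re.IGNORECASE)
    if not match:
        return None
    hour = int(match.group(1)) % 12          # 12 AM and 12 PM both become 0 here
    if match.group(3).upper() == "PM":
        hour += 12                           # PM hours move to 12-23
    return hour


def choose_slot(slot_texts, windows):
    """Return the index of the first open slot in the first window that has one."""
    for first, last in windows:              # windows are tried in the given order
        for index, text in enumerate(slot_texts):
            hour = start_hour(text)
            is_open = any(status in text for status in OPEN_STATUSES)
            if hour is not None and first <= hour < last and is_open:
                return index
    return None


# Made-up slot texts in the assumed "start – end, status" layout
slots = [
    "8:00 AM – 10:00 AM\nPast",
    "10:00 AM – 12:00 PM\nFull",
    "12:00 PM – 2:00 PM\nNot Busy",
    "2:00 PM – 4:00 PM\nAvailable",
]
print(choose_slot(slots, [(9, 11), (14, 16)]))
```

With Selenium, the texts could be collected with [t.text for t in thetimes], and the returned index could then be used to click the matching element in thetimes.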
Final Comments
After using the various methods discussed in this article, I found that:
- Often, apps that I wrote during off hours would not work during peak times, because I had not accounted for the longer page load delays at peak times.
- The browser search dialog with xte/xdotool was extremely useful, because it allowed me to jump to specific areas of a web page rather than tabbing to them.
- Creating apps with xte or xdotool is considerably easier than using Python with Selenium. I found that some web pages were incredibly complex, and it often took some time to find the required IDs that Selenium needed.
- For large web entry pages, you can always create automated web logins by mixing and matching the xte/xdotool utilities and Python.
- Two huge advantages of Selenium are the ability to add decision-making logic and the implicitly_wait() method, which waits only until the page is ready and is a lot more efficient than putting in a long sleep time.
Infos
[1] xautomation man page: https://linux.die.net/man/7/xautomation
[2] xdotool man page: https://man.cx/xdotool
[3] wmctrl man page: https://linux.die.net/man/1/wmctrl
[4] Selenium: https://en.wikipedia.org/wiki/Selenium_(software)
[5] Selenium downloads: https://www.selenium.dev/downloads/