Distributed computing in the service of COVID research

Cycles for Science

Lead Image © Sean_Gladwell, Fotolia.com

Author(s):

Linux and the BOINC distributed computing platform help researchers fight the COVID-19 virus.

COVID-19 has had a dramatic impact on countries around the world. Researchers are continuing their work to develop vaccines and explore other ways of containing the virus. Many research projects require enormous computing capacity, but expensive supercomputers are not always available. Thanks to distributed computing, you can support research efforts by contributing the computing power of your home PC.

The concept of using home computers to assist with research projects has been around for many years. The SETI@home project (Search for Extraterrestrial Intelligence) has offered home users a chance to process radio telescope data since 1999. IBM launched the World Community Grid [1], a central platform for managing volunteer distributed computing projects, in 2004. Since 2005, the World Community Grid has used BOINC [2], a software tool developed at the University of California, Berkeley, for supporting distributed computing.

BOINC (Berkeley Open Infrastructure for Network Computing) separates the computational framework from the scientific content, which makes it quite easy to adapt to a specific research project.

The software distributes independent work units to clients, which means you can integrate computers with different capabilities into the computations without slowing down the project.

BOINC has been available since 2005 as a free tool for Linux. The client not only uses the excess computing power of the CPU but also has a CUDA interface, which means it can access NVIDIA Graphics Processing Units (GPUs) [3].

Fighting Coronavirus

Scripps Research, based in California with subsidiaries in Florida [4], is one of the world's leading biomedical research institutes. More than 3,000 scientists work at the non-profit institution, spread over several institutes. The Forli Lab, which is part of Scripps Research, focuses on molecular biology [5].

As part of the "OpenPandemics – COVID-19" initiative, the laboratory is working with IBM and using the World Community Grid to run elaborate simulations [6] in search of chemical compounds to combat COVID-19. Distributed computing in the World Community Grid is responsible for screening individual compounds for later study.

Getting Started

To donate the surplus computing power of your computer to COVID-19 research, first register with the World Community Grid (WCG). To sign up, you just need to provide an email address and a password.

You will receive a confirmation email, which lets you verify your access to the WCG. Afterwards, with the help of the World Community Grid website, you install the BOINC client and the matching manager on your computer.

To install the client, click on the Download link at the top right of the web page and then select one of the package management systems from the drop-down menu. DEB- and RPM-based distributions are available for selection. After making your choice, you are taken to a page with installation instructions. In most cases, you won't need to download the packages manually because the required applications are available from the software repositories of the popular distributions.

Run the commands in Listing 1 (for DEB-based systems) or Listing 2 (for RPM-based derivatives). The last command launches the BOINC manager. The BOINC client itself has no graphical user interface; it runs as a background service that fetches work units, performs the computations, and communicates with the project server.

Listing 1

Installation on Debian/Ubuntu

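# Install the client (background service) and the graphical manager
sudo apt install boinc-client boinc-manager
# Start the client now and at every boot
sudo systemctl enable boinc-client
sudo systemctl start boinc-client
# Let members of the boinc group read the RPC password file
sudo chmod g+r /var/lib/boinc-client/gui_rpc_auth.cfg
sudo usermod -a -G boinc "$USER"
# Start a new shell so that the group membership takes effect
exec su "$USER"
# Launch the manager against the client's data directory
boincmgr -d /var/lib/boinc-client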

Listing 2

Installation on RPM-based Systems

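# Install the client (background service) and the graphical manager
sudo yum install boinc-client boinc-manager
# Start the client now and at every boot
sudo systemctl enable boinc-client
sudo systemctl start boinc-client
# Let members of the boinc group read the RPC password file
sudo chmod g+r /var/lib/boinc/gui_rpc_auth.cfg
sudo usermod -a -G boinc "$USER"
# Start a new shell so that the group membership takes effect
exec su "$USER"
# Launch the manager against the client's data directory
boincmgr -d /var/lib/boinc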
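
Both listings rely on the new boinc group membership being active in the shell that launches the manager. The following sketch checks this before you start boincmgr. The helper name user_in_group is hypothetical and not part of BOINC; the group name boinc is taken from the listings.

```shell
#!/bin/sh
# Sketch: check whether the current shell already has the boinc group.
# user_in_group is a hypothetical helper, not a BOINC tool.
# $1 = group name, $2 = optional space-separated group list
#      (defaults to the groups of the current process, via id -nG)
user_in_group() {
    group=$1
    groups_list=${2:-$(id -nG)}
    for g in $groups_list; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

if user_in_group boinc; then
    echo "boinc group is active; the manager can read gui_rpc_auth.cfg."
else
    echo "boinc group not active yet; log in again or start a new login shell."
fi
```

If the second message appears even though usermod ran without errors, the current session probably predates the group change.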

In the BOINC Manager, another window opens with a wizard. Now select the OpenPandemics project from the list of existing projects via the World Community Grid entry (Figure 1).

Figure 1: Select the World Community Grid entry – the OpenPandemics projects will then appear on a list of existing projects.

Log in to the BOINC manager with the email address and password that you registered up front. In the main window of the manager, you can now start the computations. The display switches to a progress bar and shows the estimated compute time still required, as well as the time already elapsed.

In the upper part of the window, you can view the active project. If you are participating in several computations, you can change the project by choosing an entry in the selection box. While the tasks are running, a green dot appears to the left of the project's name.

Pressing Stop at the bottom center of the window interrupts the computations; a red dot then replaces the green one. Continue lets you restart the work (Figure 2).

Figure 2: The BOINC Manager shows you all the information required for distributed computing.
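
On a machine without a desktop, the same stop and continue actions can be triggered with the boinccmd command-line tool, which ships with the client. The sketch below wraps its --set_run_mode option. The helper names run_mode_for and boinc_control, the BOINC_DIR default (taken from Listing 1), and the DRY_RUN switch are assumptions made for illustration.

```shell
#!/bin/sh
# Assumption: data directory from Listing 1; use /var/lib/boinc on RPM systems.
BOINC_DIR=${BOINC_DIR:-/var/lib/boinc-client}

# Map a manager-style action to a boinccmd run mode (hypothetical helper).
run_mode_for() {
    case $1 in
        stop)     echo never ;;  # suspend all computation
        continue) echo auto ;;   # run again according to the preferences
        *)        return 1 ;;
    esac
}

# Suspend or resume the client. boinccmd is run inside the data directory
# so that it can read gui_rpc_auth.cfg from there.
# With DRY_RUN set, the command is only printed.
boinc_control() {
    mode=$(run_mode_for "$1") || {
        echo "usage: boinc_control stop|continue" >&2
        return 2
    }
    if [ -n "${DRY_RUN:-}" ]; then
        echo "boinccmd --set_run_mode $mode"
    else
        (cd "$BOINC_DIR" && boinccmd --set_run_mode "$mode")
    fi
}
```

Calling boinc_control stop before a demanding job and boinc_control continue afterwards mirrors the two buttons in the manager.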

Settings

BOINC typically allocates the free resources of your system on its own, so that your regular work on the computer does not suffer from the additional load. The application may also use your graphics card if it is an NVIDIA GPU.

If you have an NVIDIA card, you do not need to download the additional CUDA support that CUDA-based applications typically require from the vendor's website; a working NVIDIA driver is enough. BOINC detects the GPU automatically and integrates it into the computations. BOINC can also detect AMD and Intel GPUs through OpenCL, but whether a GPU is actually used depends on the applications that the individual project provides.

If the system load generated by other applications increases, the BOINC client suspends its work. In this case, the manager remains active, but no further computation takes place. Once the system load has dropped back below a given threshold value, the client automatically resumes its activity. When the activity restarts, the tool attempts to establish balanced load behavior (Figure 3).

Figure 3: The BOINC Manager does not fully load even high-end computers.
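
The suspend-and-resume rule described above boils down to a comparison against a threshold. The following sketch illustrates the idea only; it is not BOINC's implementation, and the function name decide_state and the sample values are made up for the example.

```shell
#!/bin/sh
# Illustration of a threshold rule, not BOINC's actual code.
# $1 = CPU usage of other programs in percent
# $2 = threshold in percent above which computing is suspended
decide_state() {
    awk -v usage="$1" -v limit="$2" 'BEGIN {
        if (usage + 0 > limit + 0) print "suspend"
        else print "run"
    }'
}

decide_state 80 25   # prints: suspend
decide_state 10 25   # prints: run
```

Because the comparison is strict, a load exactly at the threshold still counts as acceptable in this sketch.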

In addition, the manager offers various options for distributing the load, which let you set the thresholds yourself. You can reach the advanced settings via the menu items View | Advanced View. This is where you configure basic settings for the graphics processor via Control in the context menu.

The menu Options | Calculation settings opens a very extensive dialog for fine-tuning the software. The Calculation running tab is where you define the performance settings, including the threshold values for the system load.

The Network tab lets you configure data transfers; this is where you specify, say, the maximum upload and download rates. On the Daily schedule tab, you can additionally define when the BOINC software is allowed to carry out its computation work by creating a weekly schedule. The schedule is divided into two components: the times for calculations and the times for data transfer.

For systems that run around the clock, you can use this feature to shift the times for compute work and transfers to the night. After completing the desired settings, don't forget to press Save to store them.
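
On headless systems, comparable limits can be set in a global_prefs_override.xml file in the client's data directory. The sketch below writes such a file to a path of your choice. The element names follow my understanding of BOINC's preferences format, and all values are made-up examples; the helper name write_prefs_override is hypothetical.

```shell
#!/bin/sh
# Sketch: write example local preference overrides to the file given as $1.
# All values are illustrative assumptions, not recommendations.
write_prefs_override() {
    cat > "$1" <<'EOF'
<global_preferences>
   <start_hour>22.00</start_hour>
   <end_hour>6.00</end_hour>
   <net_start_hour>23.00</net_start_hour>
   <net_end_hour>5.00</net_end_hour>
   <max_bytes_sec_down>512000</max_bytes_sec_down>
   <max_bytes_sec_up>128000</max_bytes_sec_up>
   <suspend_cpu_usage>25</suspend_cpu_usage>
</global_preferences>
EOF
}

# Example: write to a scratch file first and review it
# write_prefs_override /tmp/global_prefs_override.xml
```

Here, computing is limited to the hours between 22:00 and 6:00, transfers to 23:00 through 5:00, transfer rates are given in bytes per second, and computing is suspended when other programs use more than 25 percent of the CPU. After copying the reviewed file into the data directory from Listing 1 or 2, running boinccmd --read_global_prefs_override in that directory should make the client reload it.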

Information

The status displays are also shown in the advanced view of the BOINC manager. This window is structured by tabs as well. You will find a News tab with a very simple feed reader showing you the latest information on the individual projects, along with their keywords. Links take you to websites with the full details.

The Projects tab lists the projects you have selected in tabular form, and Tasks shows you the tasks within the active projects. Each of these tasks has a status indicator; using the progress bars, you can see the progress of the compute work in each task. The software updates this information in near real time (Figure 4).

Figure 4: The detailed view shows you which tasks the software will complete, along with a time scale.
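
The task list is also available on the command line through boinccmd --get_tasks. The sketch below condenses that output to one line per task. It assumes that each task block contains lines of the form "name: ..." and "fraction done: ..."; the filter name task_progress is hypothetical.

```shell
#!/bin/sh
# Sketch: reduce task output read from stdin to "<task name> <percent>".
# Assumes lines such as "   name: X" and "   fraction done: 0.25";
# lines like "   WU name: X" are deliberately not matched.
task_progress() {
    awk -F': ' '
        /^ *name: /          { name = $2 }
        /^ *fraction done: / { printf "%s %.1f%%\n", name, $2 * 100 }
    '
}

# Example with canned input in the assumed format:
printf '   name: task_a\n   WU name: wu_a\n   fraction done: 0.250000\n' |
    task_progress   # prints: task_a 25.0%
```

On a live system, the filter would be fed from the data directory, for example with (cd /var/lib/boinc-client && boinccmd --get_tasks) | task_progress.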

The Transfer, Statistics, and Disk tabs provide graphical views; some of them let you customize the display to suit individual performance criteria.

Conclusions

The World Community Grid is an innovative way to put idle computing capacity at the service of science. The BOINC software does not pose any problems for newcomers and is simple enough to virtually rule out issues due to incorrect use.

The project launched by the Forli Lab and Scripps Research to explore therapeutic options against COVID-19 provides an excellent opportunity to put your surplus resources to good use. But do keep in mind that computing power costs electricity.