Building high-performance clusters with LAM/MPI
On the LAM
The venerable LAM/MPI infrastructure is a stable and practical platform for building high-performance applications.
Nowadays, a large number of commercial applications, especially in the areas of engineering, data mining, scientific/medical research, and oil exploration, are based on the capabilities of parallel computing. In most cases, programmers must design or customize these parallel applications to benefit from the possibilities of the parallel architecture. Message-Passing Interface (MPI) is a set of API functions that pass messages between processes (even between processes that are running on different physical servers) so a program can operate in a completely parallel fashion. MPI has now become a standard in the parallel programming world.
LAM/MPI [1] is an open source implementation of the MPI standard maintained by students and faculty at Indiana University. The LAM/MPI system has been around for 20 years, and it is considered a stable and mature tool. According to the website, LAM/MPI is now in "maintenance mode," with most of the new development work occurring instead on the alternative next-generation Open MPI implementation. However, LAM/MPI is still receiving critical patches and bug fixes, and the large install base of LAM/MPI systems around the world means that developers who specialize in parallel programming are still quite likely to encounter it.
In addition to offering the necessary libraries and files to implement the mandated MPI API functions, LAM/MPI also provides the LAM run-time environment. This run-time environment is based on a user-level daemon that provides many of the services required for MPI programs.
The major components of the LAM/MPI package are well integrated and designed as an extensible component framework with a number of small modules. These modules are selectable and configurable at run time. This component framework is known as the System Services Interface (SSI).
LAM/MPI provides high performance on a variety of platforms, from small off-the-shelf single-CPU clusters to large SMP machines with high-speed networks. In addition to high performance, LAM/MPI comes with a number of usability features that help with developing large-scale MPI applications.
In this article, I take a look at how to get started with parallel programming in LAM/MPI.
The Cluster
The cluster used in this article is based on an Intel blade center with five Intel blades. Each blade had two processors, two Ethernet cards, one FC card, and 8GB of memory. I used Red Hat Linux Enterprise Edition v4 as the operating system for this High-Performance Computing (HPC) example.
For low-cost implementations, you can used SCSI or SAN-based storage (for the master node only) and then make it available to all client nodes through NFS. Another possibility is to use NAS instead of SAN. A cluster filesystem such as GFS, GPFS, or Veritas can provide additional performance improvements; however, note that almost 95% of cluster computing implementations work well with a simple NAS or SAN.
Each blade system should have at least two Ethernet interfaces. You might even consider having additional Ethernet cards on your client or master nodes and then using Ethernet bonding on public network or private network interfaces to improve performance and availability.
On my network, the addressing for intra-cluster communications uses a private, non-routable address space (e.g., 10.100.10.xx/24). Slave nodes use only a 1GB Ethernet port (eth0) with an address in the private network. The master node (eth0) address is used as the gateway for the slave nodes.
SELinux is enabled by default on RHEL 4 systems, but if this enhanced security functionality is not necessary, consider turning it off. The iptables firewall is also enabled by default. If the cluster is on a completely trusted network, you can disable the firewall. If not, the firewall should be enabled on the master node only with SSH access enabled.
If your cluster contains a large number of nodes, you will also want some form of administration tool for central management of the cluster. I'll use the C3 open source cluster management toolset for this example. (See the box titled "Setting up C3.")
Setting Up C3
Several open source distributed shell tools provide cluster management capabilities, but I decided to rely on C3 [2] – an excellent, fast, and efficient tool for centralized cluster management. The C3 suite provides a number of tools for managing files and processes on the cluster.
You can download the C3 RPM from the project website [3] and install it as root. The installer will set up the package in the /opt/c3-4 directory for the current version. Then add the following symbolic links:
ln -s /opt/c3-4/cexec /usr/bin/cexec ln -s /opt/c3-4/cpush /usr/bin/cpush
The next step is to specify the cluster nodes in the /etc/c3.conf file, as shown in Listing 1. (The example in Listing 1 assumes all cluster nodes are able to resolve hostnames with their /etc/hosts files.)
Now you can test functionality of the C3 tools. For instance, the cexec tool executes a command on each cluster node. The command:
#/home/root> cexec df
outputs the filesystem layout on all client nodes.
Listing 1
Example /etc.c3.conf file
Creating a LAM User
Before you start up the LAM run time, create a non-root user to own the LAM daemons. On master node, execute:
/home/root> useradd -d lamuser /home/root> passwd lamuser (enter lamuser passwd at prompt)
To synchronize security information related to lamuser from the master node to all client nodes, use cpush:
/home/root> cpush /etc/passwd cpush /etc/shadow cpush /etc/group cpush /etc/gshadow
Also, you need to set up an SSH passwordless configuration for the lamuser account.
Installing LAM/MPI
Conceptually, using LAM/MPI for parallel computation is very easy. First, you have to launch the LAM run-time environment (or daemon) on all master and client nodes before compiling or executing any parallel code. Then you can compile the MPI programs and run them in the parallel environment provided by the LAM daemon. When you are finished, you can shut down the LAM daemons on the nodes participating in the computation.
Start by installing LAM/MPI on all nodes in the cluster. Red Hat RPM packages, along with source code, are at the LAM/MPI project website [4]. Debian packages are also available [5].
Before you install LAM/MPI, make sure you have the MPI-ready C compilers and the libaio library. The C3 tool suite described earlier is a convenient aid for installing the LAM/MPI RPM on all the files of the cluster. For example, the combination of the cpush, cexec, and rpm commands can install LAM binaries simultaneously on all cluster nodes.
cd /tmp/lam cpush lam-7.10.14.rpm cexec rpm -i lam-7.10.14.rpm
After successful installation of LAM/MPI, you have to set up the PATH environment variable. In this case, the user owning the LAM daemons is lamuser and the default shell is Bash, so I edit /home/lamuser/.bashrc to modify the PATH variable definition so it contains the path to the LAM binaries:
export PATH=/usr/local/lam/bin:$PATH
If you are using the C shell, add the following line to the /home/lamuser/.cshrc file:
set path = (/usr/local/lam/bin $path)
Note that this path might be different depending upon your distribution of Linux and the version of LAM/MPI.
To transfer the modified .bashrc or .cshrc file to all client nodes, use the C3 cpush command.
Now you are ready to define the LAM/MPI cluster nodes that will participate in your parallel computing cluster. The easiest way to define the nodes is to create a file named lamhosts in /etc/lam on the master node and then use cpush to send this file to all of the client nodes.
The lamhosts file should contain resolvable names of all nodes. A lamhosts file for my high-performance computing cluster is shown in Listing 2.
Listing 2
/etc/lam/lamhosts
The cpu setting in the lamhosts file takes an integer value and indicates how many CPUs are available for LAM to use on a specific node. If this setting is not present, the value of 1 is assumed. Notably, this number does not need to reflect the physical number of CPUs – it can be smaller than, equal to, or greater than the number of physical CPUs in the machine. It is used solely as a shorthand notation for the LAM/MPI mpirun command's C option, which means "launch one process per CPU as specified in the boot schema file."
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
Rhino Linux Announces Latest "Quick Update"
If you prefer your Linux distribution to be of the rolling type, Rhino Linux delivers a beautiful and reliable experience.
-
Plasma Desktop Will Soon Ask for Donations
The next iteration of Plasma has reached the soft feature freeze for the 6.2 version and includes a feature that could be divisive.
-
Linux Market Share Hits New High
For the first time, the Linux market share has reached a new high for desktops, and the trend looks like it will continue.
-
LibreOffice 24.8 Delivers New Features
LibreOffice is often considered the de facto standard office suite for the Linux operating system.
-
Deepin 23 Offers Wayland Support and New AI Tool
Deepin has been considered one of the most beautiful desktop operating systems for a long time and the arrival of version 23 has bolstered that reputation.
-
CachyOS Adds Support for System76's COSMIC Desktop
The August 2024 release of CachyOS includes support for the COSMIC desktop as well as some important bits for video.
-
Linux Foundation Adopts OMI to Foster Ethical LLMs
The Open Model Initiative hopes to create community LLMs that rival proprietary models but avoid restrictive licensing that limits usage.
-
Ubuntu 24.10 to Include the Latest Linux Kernel
Ubuntu users have grown accustomed to their favorite distribution shipping with a kernel that's not quite as up-to-date as other distros but that changes with 24.10.
-
Plasma Desktop 6.1.4 Release Includes Improvements and Bug Fixes
The latest release from the KDE team improves the KWin window and composite managers and plenty of fixes.
-
Manjaro Team Tests Immutable Version of its Arch-Based Distribution
If you're a fan of immutable operating systems, you'll be thrilled to know that the Manjaro team is working on an immutable spin that is now available for testing.