A Rasp Pi HAT for clustering Pi Zeros

MPI Code Example

I'm not going to cover "benchmarks" on the ClusterHAT, but it is important to illustrate some real MPI code running on the cluster. Rather than run HPL [17], the High-Performance Linpack benchmark, and argue over tuning options to get the "best" performance, I find it's better to run the NAS Parallel Benchmarks (NPB) [18], a set of fairly simple benchmarks that cover a range of algorithms, primarily drawn from computational fluid dynamics (CFD) [19]. They stress the processor, memory bandwidth, and network bandwidth; are easy to build; and come in several flavors, including MPI. Also, different problem sizes, or "classes," let the benchmarks scale from very small to very large systems.

Because the ClusterHAT is a small cluster, I used only the class A test. In the interest of brevity, I ran only the cg (conjugate gradient, irregular memory access and communication), ep (embarrassingly parallel), is (integer sort, random memory access), and lu (lower-upper Gauss-Seidel solver) applications with four and eight MPI processes. The four-process runs covered two cases: (1) Pi Zeros only and (2) RPi3 only. The eight-process runs used the RPi3 and the Pi Zeros together (a total of eight cores).
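The test matrix just described can be sketched as a short script that prints one launch command per run. This is only an illustration under stated assumptions: it assumes the NPB 3.3 MPI convention of naming binaries `<benchmark>.<class>.<nprocs>` under `bin/`, and it uses `mpirun -np` as a generic launcher. Host placement (which decides whether four processes land on the RPi3 or on the Pi Zeros) depends on your MPI library and is deliberately left out.

```python
# Sketch: enumerate the NPB runs described in the text.
# Assumptions (not from the article): binaries live in ./bin and are named
# <benchmark>.<class>.<nprocs>, as NPB 3.3 MPI builds name them, and
# "mpirun -np" is the launcher. Host placement is not shown, so the
# four-process command covers both the RPi3-only and Pi-Zeros-only cases.

BENCHMARKS = ["cg", "ep", "is", "lu"]  # the four NPB tests used
NPB_CLASS = "A"                        # only class A was run
PROCESS_COUNTS = [4, 8]                # 4 = one node type, 8 = RPi3 + Pi Zeros


def npb_commands(benchmarks=BENCHMARKS, npb_class=NPB_CLASS, counts=PROCESS_COUNTS):
    """Return one launch command string per (benchmark, process count) pair."""
    commands = []
    for bench in benchmarks:
        for nprocs in counts:
            binary = f"bin/{bench}.{npb_class}.{nprocs}"
            commands.append(f"mpirun -np {nprocs} {binary}")
    return commands


if __name__ == "__main__":
    for cmd in npb_commands():
        print(cmd)
```

Generating the commands rather than hard-coding them keeps the benchmark list, class, and process counts in one place if you want to try other NPB tests or classes.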

For all four applications, performance, measured in millions of operations per second (MOPS), was recorded from the output for the entire MPI group and for each process in the MPI group. These results are tabulated in Table 1.
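If you want to collect these two numbers automatically, a small parser is enough. The sketch below assumes the results block labels the values `Mop/s total` and `Mop/s/process`; check the labels against your own NPB output before relying on it. The sample text is a hypothetical, abridged stand-in for real output, filled in with the CG class A, RPi3-only figures reported in this article.

```python
import re

# Hypothetical, abridged sample of an NPB results block. The labels are an
# assumption about NPB's output format; the numbers are the CG class A,
# RPi3-only figures reported in the article.
SAMPLE_OUTPUT = """
 CG Benchmark Completed.
 Class           =                        A
 Mop/s total     =                   198.98
 Mop/s/process   =                    49.75
"""

# Matches either "Mop/s total = <number>" or "Mop/s/process = <number>".
MOPS_PATTERN = re.compile(
    r"^\s*Mop/s(?P<kind> total|/process)\s*=\s*(?P<value>[0-9.]+)\s*$",
    re.MULTILINE,
)


def parse_mops(text):
    """Return (total, per_process) MOPS from NPB output; None for a missing value."""
    found = {}
    for match in MOPS_PATTERN.finditer(text):
        key = "total" if match.group("kind").strip() == "total" else "per_process"
        found[key] = float(match.group("value"))
    return found.get("total"), found.get("per_process")


if __name__ == "__main__":
    total, per_process = parse_mops(SAMPLE_OUTPUT)
    print(f"total={total} MOPS, per process={per_process} MOPS")
```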

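Table 1: NPB Results

| Test | Class | No. of Cores | Total MOPS (RPi3 Only) | MOPS/Process (RPi3 Only) | Total MOPS (Pi Zeros Only) | MOPS/Process (Pi Zeros Only) | Total MOPS (Pi Zeros + RPi3) | MOPS/Process (Pi Zeros + RPi3) |
|------|-------|--------------|------------------------|--------------------------|----------------------------|------------------------------|------------------------------|--------------------------------|
| CG | A | 4 | 198.98 | 49.75 | 38.77 | 9.69 | | |
| CG | A | 8 | | | | | 71.98 | 9 |
| EP | A | 4 | 25.8 | 6.45 | 6.93 | 1.73 | | |
| EP | A | 8 | | | | | 13.92 | 1.74 |
| IS | A | 4 | 43.85 | 10.96 | 3.99 | 1 | | |
| IS | A | 8 | | | | | 6.71 | 0.84 |
| LU | A | 4 | 425.36 | 106.34 | 197.88 | 49.47 | | |
| LU | A | 8 | | | | | 396.22 | 49.53 |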
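The arithmetic behind the per-process columns, and a quick comparison of the two four-process cases, can be sketched as follows. The totals are copied from Table 1; the dictionary keys (`rpi3`, `zeros`, `all`) and function names are illustrative choices of mine, and the ratios are plain arithmetic on the published totals, not additional measurements.

```python
# Total MOPS copied from Table 1 (class A).
# Each entry is (number of processes, total MOPS).
RESULTS = {
    "CG": {"rpi3": (4, 198.98), "zeros": (4, 38.77), "all": (8, 71.98)},
    "EP": {"rpi3": (4, 25.8), "zeros": (4, 6.93), "all": (8, 13.92)},
    "IS": {"rpi3": (4, 43.85), "zeros": (4, 3.99), "all": (8, 6.71)},
    "LU": {"rpi3": (4, 425.36), "zeros": (4, 197.88), "all": (8, 396.22)},
}


def mops_per_process(nprocs, total_mops):
    """Per-process MOPS is the group total divided by the process count."""
    return total_mops / nprocs


def rpi3_vs_zeros(bench):
    """Ratio of RPi3-only total MOPS to Pi-Zeros-only total MOPS (both 4 processes)."""
    return RESULTS[bench]["rpi3"][1] / RESULTS[bench]["zeros"][1]


if __name__ == "__main__":
    for bench, configs in RESULTS.items():
        per_proc = {
            name: mops_per_process(nprocs, total)
            for name, (nprocs, total) in configs.items()
        }
        print(
            f"{bench}: RPi3 {per_proc['rpi3']:.2f}, "
            f"Pi Zeros {per_proc['zeros']:.2f}, "
            f"Pi Zeros + RPi3 {per_proc['all']:.2f} MOPS/process; "
            f"RPi3/Pi Zeros total = {rpi3_vs_zeros(bench):.2f}x"
        )
```

Dividing each total by its process count reproduces the per-process column of Table 1 to within rounding. On these figures, the eight-process total is lower than the RPi3-only total for all four tests.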

Summary

The ClusterHAT is one of the most innovative clusters to come along in many years. It's very compact, uses a very small amount of power, runs Linux, and is fairly inexpensive (around $100 for the entire system).

Although it is obviously not designed to be a speed demon, you can use it to learn about common cluster tools and clustering and how to write parallel and distributed software. The ClusterHAT cluster can run Singularity [20] and Docker [21] containers, including resource managers. In a classroom, each student could have a small ClusterHAT cluster on which to run real applications, including AI [22].

The ClusterHAT is one of the coolest pieces of HPC hardware to come out in a long time.
