Scientific computing with a crypto mining rig
The Test Candidate
We purchased a mining rig with a backplane and separate motherboard at auction for EUR750. The system did not run reliably at first. The power supply worked, but it was too loud and gave off an unhealthy smell. The eight installed NVIDIA P106-090 mining cards from 2018, which connect via PCIe 1.1 x4, were fine. We treated them to a new case, memory, motherboard, processor, and, to be on the safe side, a new power supply for another EUR350.
We wanted to compare the performance of this used mining rig with a high-end professional system. For the comparison, we chose a 2020 system with eight NVIDIA A100 cards on PCIe 4.0 x16. This professional system cost more than EUR75,000, roughly 100 times the auction price of the mining rig.
GPU-focused systems are optimized for computation-intensive operations, so we wanted to stay with that basic scenario in our tests. We tested two different use cases:
- Scientific computing using the BOINC crowdsource computing platform [4]
- Machine learning with the PyTorch deep learning framework [5] and a well-known test dataset to teach the system to distinguish between dog and cat images
A cheap used mining rig that sells for one percent of the cost of an advanced computer system would be a big advantage, but we were realistic. We had no illusions that a EUR750 mining rig would outperform the high-end commercial system in absolute terms. We were more curious about whether it was competitive in computing power per cost. In other words, if an option delivers one tenth of the computing power but comes at only one hundredth of the cost, there are scenarios where it could be a viable alternative.
We were also aware that the different components of the design would affect performance in different ways. The two systems didn't just have two different GPUs. The difference between the PCIe 1.1 x4 bus and the PCIe 4.0 x16 bus also seemed significant, as well as the differences in the CPUs. For a few of the tests, we experimented with putting the GPUs from the mining rig into the newer system to isolate the GPU as a variable.
BOINC Benchmarks
We picked out three BOINC-based crowdsource projects that support GPUs. Einstein@Home [6] uses data from the LIGO gravitational-wave detectors, the MeerKAT radio telescope, the Fermi Gamma-ray Space Telescope, as well as archival data from the Arecibo radio telescope to look for spinning neutron stars (often called pulsars). The professional system with eight A100 cards needed 300 seconds in this test. The mining rig took 2,000 seconds per work unit, which is more than six times as long, but again, the professional system was 100 times more expensive.
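To put these numbers into the price-performance perspective described above, a rough back-of-the-envelope calculation using the figures from the text (2,000 versus 300 seconds per work unit, EUR750 plus EUR350 in parts versus more than EUR75,000) might look like the following sketch. It ignores details such as how many work units run in parallel, so treat the result as an order-of-magnitude estimate only.

```python
# Rough Einstein@Home throughput-per-euro comparison using the rounded
# figures from the text; an order-of-magnitude estimate, not a measurement,
# and it ignores how many work units run in parallel on each system.
seconds_per_workunit = {"mining rig (P106-090)": 2000, "A100 system": 300}
price_eur = {"mining rig (P106-090)": 750 + 350, "A100 system": 75000}

for system, seconds in seconds_per_workunit.items():
    workunits_per_day = 86400 / seconds
    per_1000_eur = workunits_per_day / price_eur[system] * 1000
    print(f"{system}: {workunits_per_day:.0f} work units/day, "
          f"{per_1000_eur:.1f} per day and EUR1,000")
```

By this crude measure, the mining rig delivers roughly ten times as many work units per euro, even though it is far slower in absolute terms.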
Was the superior performance of the professional computer due to the GPU or to the faster processor with its faster and wider PCIe bus? To find out, we installed the P106-090 cards from the mining rig in the professional system. Despite the faster processor and the PCIe 4.0 x16 slots instead of PCIe 1.1 x4, the P106-090 cards ran only one percent faster on the newer system. Einstein@Home allows multiple work units to share a GPU. We had expected that processing two work units at once would yield a performance advantage, but running two jobs on one card simply doubled the computing time, so there was no net gain.
The prime number search with PrimeGrid [7] requires virtually no CPU interaction with the cards (less than two percent CPU load). The P106-090s of our test system required between 916 and 925 seconds (CudaPPSsieve) and about 4,500 seconds (OCL_cuda_AP27). The A100s in the professional rig completed the task in about one tenth of the time in each case.
For the third BOINC test, we selected a benchmark program for the Folding@home biomedical project [8] and launched it simultaneously on several GPUs. The benchmark measures how many nanoseconds of a natural process the computer can model within one day. With single precision, the mining rig's P106 GPUs managed 59 ns/d when placed in the professional system, whereas the A100 achieved 259 ns/d. With double precision, which the P106 hardware barely supports, the A100 reached 159 ns/d, while the P106 achieved just 3 ns/d.
PyTorch
PyTorch is an open source machine learning framework. We put together a manageable script that uses a neural network to classify images on a varying number of graphics cards (or just on the CPU). To do this, the images must be transported to the graphics cards, and if the work is distributed over several cards, the results need to be merged again at the end. During training, the model copies on all cards also need to be kept up to date.
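The following is not the script we used, but a minimal sketch of the pattern it follows: a standard torchvision model classifies a batch of images on a configurable number of GPUs via torch.nn.DataParallel, which splits each batch across the cards and merges the partial results afterwards. The ResNet18 backbone and batch size are placeholders; only the two output classes (cat versus dog) come from our scenario.

```python
# Minimal sketch of a multi-GPU classification script (not the authors' code).
import torch
import torch.nn as nn
from torchvision import models

def build_model(num_gpus: int) -> nn.Module:
    # Two output classes (cat vs. dog) on top of a ResNet18 backbone;
    # the backbone is an assumption for this sketch.
    model = models.resnet18(num_classes=2)
    if num_gpus > 0 and torch.cuda.is_available():
        device_ids = list(range(num_gpus))
        # DataParallel splits each input batch across the listed GPUs and
        # gathers the partial results on the first device afterwards.
        model = nn.DataParallel(model.cuda(device_ids[0]), device_ids=device_ids)
    return model

# Run on however many GPUs are present, or fall back to the CPU.
model = build_model(num_gpus=torch.cuda.device_count())
images = torch.randn(64, 3, 224, 224)            # dummy batch of images
if torch.cuda.is_available():
    images = images.cuda(non_blocking=True)      # transport to the GPU(s)
logits = model(images)                           # forward pass, results merged
print(logits.shape)                              # torch.Size([64, 2])
```

For large multi-node setups, torch.nn.parallel.DistributedDataParallel is the usual choice, but DataParallel keeps this sketch short.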
The CPU is not something you can do without in machine learning projects with GPU support. On the contrary, it becomes more and more important as the number of compute cores increases. It first prepares the data for the GPUs and then summarizes the GPU results. If you distribute the workload over many GPUs, the processor can become a bottleneck that leaves the graphics units waiting. How much the communication between GPU and CPU can be reduced depends on the application. If the data can be represented as matrices and the application is based on operations on or between them, GPUs are hard to beat.
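As an illustration of the CPU's role as data preparer, the sketch below feeds a GPU from a torchvision ImageFolder dataset. The directory path, batch size, and number of worker processes are placeholders, but the pattern is the standard PyTorch one: CPU worker processes decode and transform the images, and the GPUs only receive finished tensors.

```python
# CPU-side pipeline: worker processes decode and resize the images so the
# GPUs only receive ready-made tensors. Path and parameters are placeholders.
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("data/cats_vs_dogs", transform=transform)
loader = torch.utils.data.DataLoader(
    dataset,
    batch_size=256,
    shuffle=True,
    num_workers=8,     # too few CPU workers and the GPUs sit idle waiting
    pin_memory=True,   # enables faster, asynchronous copies to the GPU
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for images, labels in loader:
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    # ... forward and backward pass on the GPU(s) ...
    break
```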
We assumed for our study that the number of cards used does not affect the quality of the predictions. In fact, we paid no further attention to prediction quality, because it depends on many factors, such as the quality of the training dataset or the batch size. We looked exclusively at the number of images per second the given hardware could train on or classify (evaluate).
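Measuring that throughput is straightforward; the sketch below shows one possible way to time classification in images per second (not our benchmark code). The explicit torch.cuda.synchronize() calls matter because GPU kernels run asynchronously, and the timer would otherwise stop before the work has actually finished.

```python
# One possible images-per-second measurement for inference (a sketch).
import time
import torch

def images_per_second(model, batch, iterations=50):
    """Time inference throughput for one model and one batch of images."""
    model.eval()
    with torch.no_grad():
        for _ in range(5):                   # warm-up runs, not timed
            model(batch)
        if torch.cuda.is_available():
            torch.cuda.synchronize()         # wait for queued GPU work
        start = time.perf_counter()
        for _ in range(iterations):
            model(batch)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
    return iterations * batch.size(0) / elapsed
```

The same loop, with a loss calculation, backward pass, and optimizer step inside it, yields the corresponding training figure.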