Application development for the Cell processor
Random
The PPE program contains a loop (Listing 2) that distributes the workload over the participating SPEs and derives a seed for generating pseudorandom numbers from the current system time. Launching a program on an SPE takes three steps. First, the spe_context_create() function (line 7) creates an SPE context. Second, the spe_program_load() function (line 8) specifies the program to execute; for this, the programmer needs to declare a spe_program_handle_t variable in the PPE program header. This variable is always declared as external, that is, outside of any function. Its name is identical to the name that the SPE program will be given later during the build.
Listing 2: For Loop for Controlling the SPEs
The third step is for the spe_context_run() function to launch the program you want to execute. Normally, this call blocks the PPE program for as long as the SPE program is running, which prevents any other SPE programs from being launched in parallel. A POSIX thread avoids this: the thread executes the spu_pthread() function (line 10), which in turn launches an SPE program without interrupting the PPE program flow.
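The pattern of moving a blocking call into its own thread can be sketched in plain C, without the libspe2 library. In this minimal sketch, blocking_run() is only a stand-in for a blocking call such as spe_context_run(), and the names worker_t, worker_thread(), and run_workers() are hypothetical; none of them come from the article's listings.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-worker record; in the article's setting this would
   carry the SPE context rather than plain integers. */
typedef struct {
    int input;
    int result;
} worker_t;

/* Stand-in for a blocking call such as spe_context_run(). */
static int blocking_run(int input)
{
    return input * input;
}

/* Thread entry point: performs the blocking call off the main thread. */
static void *worker_thread(void *arg)
{
    worker_t *w = arg;
    w->result = blocking_run(w->input);
    return NULL;
}

/* Start `count` workers in parallel, then wait for every started worker.
   Returns 0 on success, -1 if a thread could not be created. */
int run_workers(worker_t *workers, pthread_t *threads, size_t count)
{
    assert(workers != NULL && threads != NULL);

    size_t started = 0;
    int status = 0;

    for (; started < count; started++) {
        if (pthread_create(&threads[started], NULL,
                           worker_thread, &workers[started]) != 0) {
            status = -1;
            break;
        }
    }

    /* The caller stays free to do other work between create and join. */
    for (size_t i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return status;
}
```

Because each blocking call runs in its own thread, the main program can start all workers before any of them has finished, which is the same reason the PPE program wraps the SPE launch in a thread.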
Now the SPE program needs to know where the parameters for the forthcoming calculations are located. Each SPE has a mailbox for incoming messages (four 32-bit words) and a mailbox for outgoing messages (one 32-bit word). A second outgoing mailbox triggers an interrupt when data is available. In this case, the PPE program calls spe_in_mbox_write() (line 13) to pass in the start address of the array in which the parameters for the calculations are stored. The SPE context, which is the first function argument, determines which SPE receives the message.
When all SPE programs have terminated, the PPE program releases the memory for each SPE context. Finally, the PPE program outputs the SPEs' results on the console.
SPE Culture
The SPE's actual work is done by the compute_pi() function (Listing 3). compute_pi() expects two arguments: a seed, which it uses to generate random numbers, and the number of pairs of numbers to generate. It returns an approximation of pi. Before this can happen, the main() function (Listing 4) reads the main memory address at which the structure with the parameters for the current SPE program is located. This address is also referred to as an effective address.
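As a rough illustration (not the article's Listing 3), a function with this interface could use the usual Monte Carlo approach: draw pairs of random numbers as points in the unit square and count how many land inside the quarter circle. The sketch below assumes that method. The lcg_next() helper and the exact argument types are hypothetical; they are chosen so the example is deterministic and independent of any platform's rand().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a small linear congruential generator, so the
   sketch does not depend on a particular rand() implementation. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Estimate pi from `pairs` random points in the unit square.
   The fraction of points with x^2 + y^2 <= 1 approximates pi/4. */
double compute_pi(uint32_t seed, uint32_t pairs)
{
    assert(pairs > 0);

    uint32_t state = seed;
    uint32_t hits = 0;

    for (uint32_t i = 0; i < pairs; i++) {
        /* Use the upper 24 bits only; the low bits of an LCG are weak. */
        double x = (lcg_next(&state) >> 8) / 16777216.0;
        double y = (lcg_next(&state) >> 8) / 16777216.0;
        if (x * x + y * y <= 1.0)
            hits++;
    }
    return 4.0 * hits / pairs;
}
```

Giving each SPE a different seed matters here: with identical seeds, every SPE would sample the same points, and averaging their results would not improve the estimate.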
Listing 3: compute_pi Function
Listing 4: Main Function in the SPE Program
Because the spu_read_in_mbox() function can only read single 32-bit words, it must be called twice to retrieve the full 64-bit address (lines 7 and 8). The variables declared inside the SPE program all reside in the SPE's local memory, and pointers likewise reference addresses in local memory. Because the Cell processor uses a big-endian architecture, the first word contains the high-order 32 bits of the address, and the second word contains the low-order 32 bits.
Next, the SPE program must reserve a tag ID to distinguish DMA data transfers between main and local memory (line 10). An SPE can manage up to 32 tag IDs. Following this, the spu_mfcdma64() function transfers the parameter block located at the main memory address previously retrieved from the mailbox into the spe_par variable in local memory (line 12). This function can handle both read and write DMA transfers; the sixth argument defines the transfer direction, as a comparison with line 18 shows.
The spu_mfcdma64() function does not wait for the memory transfer to complete. To ensure data integrity, the SPE program must wait until the DMA controller (Memory Flow Controller, MFC) has finished; the mfc_read_tag_status_all() function (line 14) makes sure of this. The mfc_write_tag_mask() function (lines 19 and 20) tells the MFC which of the 32 possible tag IDs the SPE program is waiting for.
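The word handling can be sketched in portable C. The function names ea_split() and ea_join() are hypothetical: the first shows how a sender could break a 64-bit effective address into two mailbox words, the second how the receiver reassembles them, assuming the high-order word is sent first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sender side: split a 64-bit effective address into two 32-bit words. */
void ea_split(uint64_t ea, uint32_t *hi, uint32_t *lo)
{
    assert(hi != NULL && lo != NULL);
    *hi = (uint32_t)(ea >> 32);          /* high-order word, sent first */
    *lo = (uint32_t)(ea & 0xFFFFFFFFu);  /* low-order word, sent second */
}

/* Receiver side: rebuild the 64-bit address from the two words. */
uint64_t ea_join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

The shifts make the word order explicit, so the code behaves the same on any host; only the agreement between sender and receiver about which word comes first matters.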
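The tag mask is a 32-bit value with one bit per tag ID. A minimal sketch of how such a mask can be built follows; the helper names tag_mask_single() and tag_mask_add() and the MFC_TAG_COUNT constant are hypothetical and not part of the SPE headers. In an SPE program, the resulting value would be passed to mfc_write_tag_mask() before calling mfc_read_tag_status_all().

```c
#include <assert.h>
#include <stdint.h>

#define MFC_TAG_COUNT 32u  /* tag IDs per SPE, as stated in the text */

/* Mask that selects exactly one tag ID (bit n stands for tag n). */
uint32_t tag_mask_single(unsigned int tag_id)
{
    assert(tag_id < MFC_TAG_COUNT);
    return UINT32_C(1) << tag_id;
}

/* Add a further tag ID to an existing mask, so that a single wait
   covers the transfers of several tags. */
uint32_t tag_mask_add(uint32_t mask, unsigned int tag_id)
{
    return mask | tag_mask_single(tag_id);
}
```

Waiting on a mask with several bits set blocks until the transfers of all selected tags have completed, which is why a program with only one outstanding transfer needs just the single-bit form.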
Now the calculations can start, and the results, which are again stored in the spe_par structure, make their way back into main memory. Finally, line 22 releases the tag ID.
Instilling Life
Creating the object files is the next step. Because the PPE and SPE processor cores use different instruction sets, two different compilers must be used to build the source:
/opt/cell/toolchain/bin/spu-gcc -o pi_libspe_spe.spuo pi_libspe_spe.c
/opt/cell/toolchain/bin/ppu-gcc -c pi_libspe_ppe.c
The .spuo suffix indicates an object file based on the SPE instruction set. To create a single executable, the ppu-embedspu tool converts the SPE program's object code into a format that the PPE can read:
/opt/cell/toolchain/bin/ppu-embedspu pi_libspe_spe pi_libspe_spe.spuo pi_libspe_spe.o
The first parameter is the name used by the PPE to address the SPE program; it is identical to the name of the spe_program_handle_t type variable, which is declared in the pi_libspe_ppe.c source file.
The second parameter is the name of the file containing the SPE object code, and the third refers to the file where ppu-embedspu will write the PPE-readable object code. Finally, the developer must link the PPE and SPE programs with the libspe2 library to create an executable:
/opt/cell/toolchain/bin/ppu-gcc -o pi_libspe pi_libspe_ppe.o pi_libspe_spe.o -lspe2
If you have access to a computer with Cell hardware, you can simply copy the pi_libspe executable to it and execute the program. If you are using the simulator, you will need to take a small detour.