Application development for the Cell processor
Random
The PPE program contains a loop (Listing 2), which distributes the workload over the SPEs involved and sets a seed for creating pseudo random numbers from the current system time. To launch a program on an SPE, three steps are required. First, the spe_context_create() function (line 7) needs to create an SPE context. Second, the spe_program_load() function (line 8) needs to specify the program to execute; the programmer needs to declare the spe_program_handle_t variable in the PPE program header for this. This variable is always declared externally, that is, outside of the function. The name is identical to the name that the SPE program will be given later when you compile it.
Listing 2
For Loop for Controlling the SPEs
The third step is for the spe_context_run() function to launch the program you want to execute. Normally, this function would block the PPE program while the SPE program is running, thus preventing any other SPE programs from launching parallel to it. A Posix thread helps to avoid this by executing the spu_pthread() function (line 10), which in turn launches an SPE program without interrupting the PPE program flow.
Now the SPE program needs to know where the parameters for the forthcoming calculations are located. Each SPE has a mailbox for incoming messages (four 32-bit words) and a mailbox for outgoing messages (one 32-bit word). Another mailbox triggers a software interrupt when data is available. In this case, the PPE program calls spe_in_mbox_write() (line 13) to pass in the start address of the array in which the parameters for the calculations are stored. The SPE context defines which SPE receives the message; its start address is the first function argument.
When all SPE programs have terminated, the PPE program releases the memory for the SPE context in question. Finally, the PPE program outputs the SPE's results on the console.
SPE Culture
The SPE's work starts with the compute_pi() function (Listing 3). compute_pi() expects a seed as an argument, which it will use to generate random numbers, and the number of pairs of numbers to calculate. The function returns an approximate value for PI as a function value. To allow this to happen, the main() function (Listing 4) reads the main memory address at which the structure with the parameters for the current SPE program is located. This address is also referred to as an effective address.
Listing 3
compute_pi Function
Listing 4
Main Function in the SPE Program
Because the spu_read_in_mbox() function can only read single 32-bit words, it must be called twice to retrieve the full 64-bit address (lines 7 and 8). The variables declared inside the SPE program all lie within the SPE's local memory space. Pointers also reference memory addresses in the local memory. Because the Cell processor uses Big Endian architecture, the first word contains the higher value, and the second word contains the lower value bits.
Next, the SPE program must reserve a tag ID to distinguish DMA data transfers between main and local memory (line 10). An SPE can manage up to 32 tag IDs. Following this, the spu_mfcdma64() function transfers the parameter block that points to the main memory address previously retrieved from the mailbox to the spe_par variable in local memory (line 12). This function can handle both read and write DMA transfer. The sixth argument defines the transfer direction, as a comparison with line 18 shows.
The spu_mfcdma64() function does not wait for the memory transfer to complete. To ensure data integrity, the SPE program must wait until the DMA controller (Memory Flow Controller, MFC) has finished; the mfc_read_tag_status_all() (line 14) makes sure of this. The mfc_write_tag_mask() function (lines 19 and 20) tells us which of the 32 possible parallel DMA transfers it is waiting for.
Now the calculations can start, and the results, which are again stored in the spe_par structure, make their way back into main memory. Finally, line 22 releases the tag ID.
Instilling Life
Creating the object files is the next step. Because the PPE SPE processor cores use different instruction sets, two different compilers must be used to build the source:
/opt/cell/toolchain/bin/spu-gcc -o pi_libspe_spe.spuo pi_libspe_spe.c /opt/cell/toolchain/bin/ppu-gcc -c pi_libspe_ppe.c
The .spuo suffix indicates an object file based on the SPE instruction set. To create a single executable, the ppu-embedspu tool converts the SPE program's object code into a format that the PPE can read:
/opt/cell/toolchain/bin/ppu-embedspu pi_libspe_spe pi_libspe_spe.spuo pi_libspe_spe.o
The first parameter is the name used by the PPE to address the SPE program; it is identical to the name of the spe_program_handle_t type variable, which is declared in the pi_libspe_ppe.c source file.
The second parameter is the name of the file containing the SPE object code, and the third refers to the file where ppu-embedspu will write the PPE-readable object code. Finally, the developer must link the PPE and SPE programs with the libspe2 library to create an executable:
/opt/cell/toolchain/bin/ppu-gcc -o pi_libspe pi_libspe_ppe.o pi_libspe_spe.o -lspe2
If you have access to a computer with Cell hardware, you can simply copy the pi_libspe executable to it and execute the program. If you are using the simulator, you will need to take a small detour.
« Previous 1 2 3 Next »
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
Canonical Bumps LTS Support to 12 years
If you're worried that your Ubuntu LTS release won't be supported long enough to last, Canonical has a surprise for you in the form of 12 years of security coverage.
-
Fedora 40 Beta Released Soon
With the official release of Fedora 40 coming in April, it's almost time to download the beta and see what's new.
-
New Pentesting Distribution to Compete with Kali Linux
SnoopGod is now available for your testing needs
-
Juno Computers Launches Another Linux Laptop
If you're looking for a powerhouse laptop that runs Ubuntu, the Juno Computers Neptune 17 v6 should be on your radar.
-
ZorinOS 17.1 Released, Includes Improved Windows App Support
If you need or desire to run Windows applications on Linux, there's one distribution intent on making that easier for you and its new release further improves that feature.
-
Linux Market Share Surpasses 4% for the First Time
Look out Windows and macOS, Linux is on the rise and has even topped ChromeOS to become the fourth most widely used OS around the globe.
-
KDE’s Plasma 6 Officially Available
KDE’s Plasma 6.0 "Megarelease" has happened, and it's brimming with new features, polish, and performance.
-
Latest Version of Tails Unleashed
Tails 6.0 is based on Debian 12 and includes GNOME 43.
-
KDE Announces New Slimbook V with Plenty of Power and KDE’s Plasma 6
If you're a fan of KDE Plasma, you'll be thrilled to hear they've announced a new Slimbook with an AMD CPU and the latest version of KDE Plasma desktop.
-
Monthly Sponsorship Includes Early Access to elementary OS 8
If you want to get a glimpse of what's in the pipeline for elementary OS 8, just set up a monthly sponsorship to help fund its continued existence.