
The first Macintosh cluster developed for high-performance scientific computing was the Appleseed Cluster at UCLA. It took advantage of the tremendous single-processor computational power offered by Macintosh G3 and G4 computers, along with the easy networking capability of the Mac OS, to run high-performance plasma physics calculations in parallel. Since the advent of the PowerPC G4 processor with AltiVec, the performance of Macs has only gotten better. What is more, Dr. Dean Dauger has developed a program called Pooch that makes setting up and running a cluster of Macs very simple. A Macintosh cluster is therefore a proven platform for high-performance scientific computing at low cost, with little maintenance or setup involved, and with the opportunity to increase your processing power just by buying another Mac.
Dr. Timberlake, a Mac fan himself, could not resist the chance to create a
Mac cluster for his computing needs. In addition, he saw an opportunity to
provide Berry College with a high-performance computing platform that
would be housed in Berry's new state-of-the-art Science Building. The
cluster could be used by undergraduates at Berry to learn about parallel
computing and high-performance computational science. A
Linux-based Beowulf cluster is also being developed at Berry, under the
supervision of Dr. Kris Powers. The parallel (if you will) development of
these two clusters will provide an unprecedented opportunity to compare
the performance of these two computing platforms as well as to develop
truly portable parallel programs. And so, the Bifrost Cluster was born ...
Update 8/22/2002
The Bifrost Cluster has just grown by leaps and bounds. Thanks to the
generosity of Berry's Charter
School of Education and Human Sciences the Bifrost Cluster has
expanded to 29 nodes! The folks in the Charter School have agreed to
allow 25 iMacs, housed in a computer lab in the newly renovated
Cook Building, to be used as part of the cluster during times when the lab
is closed. New performance specs have now been posted for the 29-node
cluster (see below).
Update 7/22/2003
We recently added two new machines to the Bifrost Cluster. The new machines, dubbed
Hunnin and Munnin after Odin's ravens, are both Dual 1.42 GHz PowerMac G4 computers
with 2 GB of memory. We also upgraded the memory on the other PowerMac G4s. The new
machines and new memory should allow the core cluster (excluding the Cook iMacs) to
perform approximately three times faster than before. This gives us the ability to
run significant calculations all day, every day. When the Cook computer lab is closed
we can add the 25 iMacs for even faster performance on big jobs. On a less happy note,
this upgrade represents the last of my start-up funds from Berry, so it is unlikely
that there will be another upgrade soon. Someday, though, I'm going to get a grant and
get my hands on some PowerMac G5s!
We also would like to announce the publication of the first article to feature calculations performed with the Bifrost Cluster. The article, "Correlation of the Photodetachment Rate of a Scarred Resonance State with the Classical Lyapunov Exponent" by Todd Timberlake and John Foreman was published in Physical Review Letters 90, 103001. A pdf version of the article is available here.
Finally, we would like to point out that the Bifrost Cluster has been featured on the Pooch Users Page. We would like to thank Dean Dauger for featuring our cluster and, more importantly, for the excellent software he has created to help us run it.
| Name | Processor | Memory | Peak Performance |
|---|---|---|---|
| Fenrir | PowerPC G4 733 MHz | 1.13 GB | 2.8 GigaFlops |
| Garm | PowerPC G4 867 MHz | 1.13 GB | 3.3 GigaFlops |
| Jormungand & Nidhogg | 2 PowerPC G4 933 MHz | 1.25 GB | 3.6 GigaFlops |
| The Cook iMacs | 25 PowerPC G4 800 MHz | ??? MB | ?? GigaFlops |
| Hunnin & Munnin | 2 PowerPC G4 Dual 1.42 GHz | 2 GB | 11.0 GigaFlops for 1 machine with both processors |
| Titan | PowerPC G4 400 MHz | 384 MB | 1.4 GigaFlops |
| Software | Company | Function |
|---|---|---|
| Mac OS X | Apple Computer | All of the machines on the cluster run Mac OS X, although Titan often runs under Mac OS 9. Parallel jobs can be run on any computer running OS 9 or OS X. |
| Pooch | Dauger Research | Pooch handles the distribution of jobs to the various machines in the cluster. Without Pooch the various machines wouldn't be able to pass the necessary information back and forth in order to complete a job in parallel. |
| Absoft Pro Fortran for Mac OS X | Absoft Corporation | Pro Fortran is our compiler of choice for the FORTRAN code that we run on the Bifrost Cluster. The compiler is optimized for the G4 processor and has many common subroutines that are precompiled to take advantage of the G4's AltiVec vector processor. |
| IMSL | Visual Numerics | IMSL is a library of mathematical and statistical subroutines that are useful in scientific computation. Using IMSL routines saves us the time of developing our own code to perform basic mathematical functions like integrating systems of differential equations. In addition, the IMSL routines we run on the Bifrost Cluster have been optimized for the G4 processor and take advantage of AltiVec. |
The Bifrost Cluster will likely be used for many other projects once it is up and running. The cluster is readily available for the use of any faculty in Berry's School of Mathematical and Natural Sciences. We expect to use the Bifrost Cluster in tandem with Berry's new Beowulf cluster to give students practice creating parallel code, to compare performance of code on different platforms, to ensure the portability of our parallel code, and to get an overall sense of what is the best high-performance computing solution for a small liberal-arts college.
| CPUs | Peak Performance |
|---|---|
| Fenrir, Garm, Jormungand, Nidhogg | 11.3 GigaFlops |
| Fenrir, Garm, Jormungand, Nidhogg, 25 Cook iMacs | 74.52 GigaFlops |
| Hunnin, Munnin | 21.3 GigaFlops |
| Garm, Jormungand, Nidhogg, Hunnin, Munnin | 23.2 GigaFlops |
Note that AltiVec Fractal Carbon splits the job into equal-sized chunks and sends one chunk to each processor (so the dual-processor machines get two chunks). This means that the performance of the cluster is limited by how long the slowest processor takes to finish its chunk. This is why we don't use Titan at all when testing: it is so much slower than the other machines that it would dramatically reduce the performance of the whole cluster. Even without Titan the effect is noticeable. There is not much improvement between the third and fourth rows of the table. Although in the fourth row we are splitting the job into 7 pieces instead of 4, it takes Garm (867 MHz) almost as long to finish 1/7 of the job as it takes one of the 1.42 GHz processors to finish 1/4 of the job.
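The equal-chunk effect can be sketched with a few lines of Python. This is only a rough model, not a description of how AltiVec Fractal Carbon works internally: it assumes that a processor's speed is proportional to its clock rate, and the names `equal_chunk_time` and `relative_speedup` are made up for this illustration.

```python
# Clock speeds in MHz, used here only as a rough stand-in for processor speed
# (an assumption; real throughput depends on more than clock rate).
HUNNIN_MUNNIN = [1420.0] * 4  # two dual-processor machines, 4 processors
# Garm, Jormungand, Nidhogg, then Hunnin and Munnin (two processors each).
MIXED = [867.0, 933.0, 933.0] + [1420.0] * 4


def equal_chunk_time(speeds, total_work=1.0):
    """Time to finish a job split into one equal-sized chunk per processor."""
    chunk = total_work / len(speeds)
    # The job is finished only when the slowest processor finishes its chunk.
    return max(chunk / speed for speed in speeds)


def relative_speedup(baseline, candidate):
    """How many times faster the candidate set is than the baseline set."""
    return equal_chunk_time(baseline) / equal_chunk_time(candidate)


if __name__ == "__main__":
    print(f"{relative_speedup(HUNNIN_MUNNIN, MIXED):.3f}")
```

Under this crude model, adding three slower processors to the four fast ones gains only a few percent, because each fast processor sits idle while Garm finishes its chunk.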
The way around this dilemma is to split up the job differently. In our research work we use what is called Master-Slave parallelism. This means that one computer is responsible only for coordinating the jobs performed by all the other computers. In our work the master does very little computation, while the slaves work constantly to carry out the tasks assigned to them by the master (hence the names). In that situation Titan makes a great master, since the master node does not need to be fast. Also, since Titan is a portable computer, we can control the cluster from anywhere that has an internet connection. Not only does Master-Slave parallelism make Titan a useful part of the cluster, it also helps us deal with the fact that the cluster is inhomogeneous. The total program is broken down into a large number of pieces (say, 800). Titan then sends out one piece to each processor. As soon as a processor completes its piece and returns the results, Titan sends it another piece. This means that the fast processors are kept busy instead of sitting around waiting for the slow processors to finish. This type of parallelism really maximizes the contribution of all of the machines in the cluster.
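The hand-out-a-piece-when-free scheme described above can be simulated in a short Python sketch. It is not our research code: it assumes that processor speed is proportional to clock rate, it ignores communication time, and the function names are hypothetical. The master itself does no computation here, matching Titan's role.

```python
import heapq

# Clock speeds in MHz for the seven worker processors (Garm, Jormungand,
# Nidhogg, and both processors of Hunnin and Munnin). Treating clock rate as
# speed is an assumption made only for this illustration.
WORKER_SPEEDS = [867.0, 933.0, 933.0] + [1420.0] * 4


def equal_chunk_time(speeds, total_work=1.0):
    """Finish time when each processor gets one equal-sized chunk."""
    chunk = total_work / len(speeds)
    return max(chunk / speed for speed in speeds)


def master_worker_time(speeds, pieces, total_work=1.0):
    """Finish time when the master hands the next piece to whichever
    processor becomes free first (communication cost is ignored)."""
    piece = total_work / pieces
    # Heap of (time at which the processor becomes free, processor index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(pieces):
        now, i = heapq.heappop(free_at)   # earliest-free processor
        done = now + piece / speeds[i]    # it works on one more piece
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


if __name__ == "__main__":
    static = equal_chunk_time(WORKER_SPEEDS)
    dynamic = master_worker_time(WORKER_SPEEDS, pieces=800)
    print(f"equal chunks: {static:.3e}  master-worker: {dynamic:.3e}")
```

With many small pieces, the simulated finish time approaches the ideal of total work divided by the combined speed of all the processors, because no processor waits long for the others at the end of the run.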
We have not yet had a chance to test the cluster using the new dual-processor machines AND the Cook iMacs, but hopefully we will get to that soon. We also plan to do some more rigorous testing using standard testing packages so that we can compare the performance of the Bifrost Cluster to clusters using other operating systems. Once we have those results I will post them here.