The Bifrost Cluster
at the

Viking Center for Computational Physics

Berry College


The Idea

The idea for the Bifrost Cluster began when Dr. Todd Timberlake accepted a position as a physics professor at Berry College. Dr. Timberlake had worked in the area of computational quantum chaos (see below) as a graduate student and wanted to continue his research in this field after his move to Berry College. As a graduate student in the Physics Department at the University of Texas, Dr. Timberlake had run his FORTRAN programs on a Cray J90 supercomputer. When he moved to Berry he needed to find another platform on which he could run his high-performance calculations. After searching for a solution he finally landed on the perfect answer ... a Macintosh Cluster!

The first Macintosh cluster developed for high-performance scientific computing was the Appleseed Cluster at UCLA. This cluster took advantage of the tremendous single-processor computational power of Macintosh G3 and G4 computers, along with the easy networking capability of the Mac OS, to create a cluster of computers that could run high-performance plasma physics calculations in parallel. Since the advent of the PowerPC G4 processor with AltiVec, the performance of Macs has only gotten better. What is more, Dr. Dean Dauger has developed a program called Pooch that makes setting up and running a cluster of Macs very simple. So a Macintosh cluster is a proven platform for high-performance scientific computing at low cost, with little maintenance or setup involved, and with the option of increasing your processing power just by buying another Mac.

Dr. Timberlake, a Mac fan himself, could not resist the chance to create a Mac cluster for his computing needs. In addition, he saw an opportunity to provide Berry College with a high-performance computing platform that would be housed in Berry's new state-of-the-art Science Building. The cluster could be used by undergraduates at Berry to learn about parallel computing and high-performance computational science. A Linux-based Beowulf cluster is also being developed at Berry, under the supervision of Dr. Kris Powers. The parallel (if you will) development of these two clusters will provide an unprecedented opportunity to compare the performance of the two computing platforms as well as to develop truly portable parallel programs. And so, the Bifrost Cluster was born ...

Update 8/22/2002
The Bifrost Cluster has just grown by leaps and bounds. Thanks to the generosity of Berry's Charter School of Education and Human Sciences, the Bifrost Cluster has expanded to 29 nodes! The folks in the Charter School have agreed to allow 25 iMacs, housed in a computer lab in the newly renovated Cook Building, to be used as part of the cluster during times when the lab is closed. New performance specs have now been posted for the 29-node cluster (see below).

Update 7/22/2003
We recently added two new machines to the Bifrost Cluster. The new machines, dubbed Hunnin and Munnin after Odin's ravens, are both Dual 1.42 GHz PowerMac G4 computers with 2 GB of memory. We also upgraded the memory on the other PowerMac G4s. The new machines and new memory should allow the core cluster (excluding the Cook iMacs) to perform approximately three times faster than before. This gives us the ability to run significant calculations all day, every day. When the Cook computer lab is closed we can add the 25 iMacs for even faster performance on big jobs. On a less happy note, this upgrade represents the last of my start-up funds from Berry, so it is unlikely that there will be another upgrade soon. Someday, though, I'm going to get a grant and get my hands on some PowerMac G5s!

We also would like to announce the publication of the first article to feature calculations performed with the Bifrost Cluster. The article, "Correlation of the Photodetachment Rate of a Scarred Resonance State with the Classical Lyapunov Exponent" by Todd Timberlake and John Foreman was published in Physical Review Letters 90, 103001. A pdf version of the article is available here.

Finally, we would like to point out that the Bifrost Cluster has been featured on the Pooch Users Page. We would like to thank Dean Dauger for featuring our cluster and, more importantly, for the excellent software he has created to help us run it.


The Name

In Norse Mythology, Bifrost is the legendary Rainbow Bridge that connects Midgard (the world of men) to Asgard (the world of the gods). This name was chosen for our cluster because it ties together several themes: Vikings (Berry College's mascot), the Rainbow (the classic Apple emblem), and bridges (as in the connections between computers running an application in parallel). The Bifrost Cluster logo shown above attempts to illustrate the relationship of these themes.

The People

Currently there are three people working on the Bifrost Cluster project.

The Machines

These are the machines that make up the Bifrost Cluster. All of these machines are connected to the Berry College campus network. Peak performance values were determined using the AltiVec Fractal Carbon Demo developed by Dauger Research.

Name                   Number   Processor                   Memory    Peak Performance

Fenrir                 1        PowerPC G4, 733 MHz         1.13 GB   2.8 GigaFlops
Garm                   1        PowerPC G4, 867 MHz         1.13 GB   3.3 GigaFlops
Jormungand & Nidhogg   2        PowerPC G4, 933 MHz         1.25 GB   3.6 GigaFlops
The Cook iMacs         25       PowerPC G4, 800 MHz         ??? MB    ?? GigaFlops
Hunnin & Munnin        2        Dual PowerPC G4, 1.42 GHz   2 GB      11.0 GigaFlops (one machine, both processors)
Titan                  1        PowerPC G4, 400 MHz         384 MB    1.4 GigaFlops


The Software

Below is a listing of the software we are currently running on the Bifrost Cluster.
Mac OS X (Apple Computer): All of the machines in the cluster run Mac OS X, although Titan often runs under Mac OS 9. Parallel jobs can be run on any computer running OS 9 or OS X.

Pooch (Dauger Research): Pooch handles the distribution of jobs to the various machines in the cluster. Without Pooch the various machines wouldn't be able to pass the necessary information back and forth in order to complete a job in parallel.

Absoft Pro Fortran for Mac OS X (Absoft Corporation): Pro Fortran is our compiler of choice for the FORTRAN code that we run on the Bifrost Cluster. The compiler is optimized for the G4 processor and has many common subroutines that are precompiled to take advantage of the G4's AltiVec vector processor.

IMSL (Visual Numerics): IMSL is a library of mathematical and statistical subroutines that are useful in scientific computation. Using IMSL routines saves us the time of developing our own code to perform basic mathematical functions like integrating systems of differential equations. In addition, the IMSL routines we run on the Bifrost Cluster have been optimized for the G4 processor and take advantage of AltiVec.

The Research

The Bifrost Cluster is being developed for the purpose of computational physics research, although it will likely be used for other purposes as well. The initial jobs that will be run on the cluster are simulations of both quantum and classical periodically driven systems. These systems are of great interest in the field of Quantum Chaos. For details about the physics we are studying with the Bifrost Cluster, see Dr. Timberlake's Home Page. From a computational perspective, we will primarily be solving very large systems of ordinary differential equations using a variety of initial conditions. This code should parallelize well on a large scale, since each node can solve the system for a different initial condition. Once these systems of ODEs are solved, the results are put into a large square matrix, which is then diagonalized. This step is harder to parallelize, but currently the matrices we are using are small enough that it doesn't take long to run serially.
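
To give a concrete (and deliberately simplified) picture of this structure, here is a short Fortran sketch that assumes a standard MPI message-passing library is available on the cluster. It is an illustration rather than our production code, and the routine solve_column is a hypothetical stand-in for the IMSL-based integration we actually perform. Each process handles its own subset of initial conditions, and the columns are combined on one process for the serial diagonalization.

  ! Sketch only: one column of the matrix per initial condition.
  program initial_conditions
    implicit none
    include 'mpif.h'
    integer, parameter :: n = 100               ! number of initial conditions
    integer :: rank, nprocs, ierr, i
    double precision :: mycols(n,n), bigmat(n,n)

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

    mycols = 0.0d0
    do i = rank + 1, n, nprocs                  ! this process's initial conditions
       call solve_column(i, n, mycols(:, i))    ! integrate the ODEs for condition i
    end do

    ! Each column is nonzero on exactly one process, so summing the partial
    ! matrices onto process 0 assembles the full square matrix.
    call MPI_REDUCE(mycols, bigmat, n*n, MPI_DOUBLE_PRECISION, MPI_SUM, 0, &
                    MPI_COMM_WORLD, ierr)

    if (rank == 0) then
       write (*,*) 'matrix assembled, element (1,1) =', bigmat(1,1)
       ! ... diagonalize bigmat serially here ...
    end if

    call MPI_FINALIZE(ierr)

  contains

    subroutine solve_column(i, m, col)          ! placeholder for the real solver
      integer, intent(in) :: i, m
      double precision, intent(out) :: col(m)
      col = dble(i)                             ! dummy result
    end subroutine solve_column

  end program initial_conditions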

The Bifrost Cluster will likely be used for many other projects once it is up and running. The cluster is readily available for the use of any faculty in Berry's School of Mathematical and Natural Sciences. We expect to use the Bifrost Cluster in tandem with Berry's new Beowulf cluster to give students practice creating parallel code, to compare performance of code on different platforms, to ensure the portability of our parallel code, and to get an overall sense of what is the best high-performance computing solution for a small liberal-arts college.


The Performance

We have had great success using the small version of the cluster (without the 25 iMacs) for research purposes. Dr. Timberlake and Mr. Foreman completed a computational project using the cluster and published their results in Physical Review Letters 90, 103001. Since the 25 iMacs and the two dual-processor machines were added to the cluster we have only had time to do some preliminary testing, but the results are impressive. We tested the performance of the current configuration of the cluster using the AltiVec Fractal Carbon Demo, which gives us a useful benchmark of what can be done on the Bifrost Cluster. The results of our tests are shown below.

Machines                                             Peak Performance

Fenrir, Garm, Jormungand, Nidhogg                    11.3 GigaFlops
Fenrir, Garm, Jormungand, Nidhogg, 25 Cook iMacs     74.52 GigaFlops
Hunnin, Munnin                                       21.3 GigaFlops
Garm, Jormungand, Nidhogg, Hunnin, Munnin            23.2 GigaFlops

Note that the AltiVec Fractal Carbon Demo splits the job up into equal-sized chunks and sends one chunk to each processor (so the dual-processor machines get two chunks). This means that the performance of the cluster is limited by how long it takes the slowest processor to finish its chunk. This is why we leave Titan out of these tests: it is so much slower than the others that it would dramatically reduce the performance of the whole cluster. Even without Titan this effect is noticeable. Note that there is not much improvement between the third and fourth rows of the table. Although in the fourth row we are splitting the job into 7 pieces instead of 4, it takes Garm (867 MHz) almost as long to finish 1/7 of the job as it takes one of the 1.42 GHz processors to finish 1/4 of the job.
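
A rough back-of-the-envelope check, using the single-machine benchmark numbers from the table in The Machines section, shows how strong this effect is. With equal-sized chunks the effective speed of the cluster is roughly the number of chunks times the speed of the slowest processor, since everything waits on that processor. For the fourth row that gives about 7 x 3.3 GigaFlops = 23.1 GigaFlops, essentially the 23.2 GigaFlops we measure, even though the seven processors together are rated at roughly 32 GigaFlops. The first row works out the same way: 4 x 2.8 GigaFlops (Fenrir's rating) = 11.2 GigaFlops, compared to the measured 11.3.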

The way around this dilemma is to split up the job differently. In our research work we use what is called Master-Slave parallelism. This means that one computer is responsible only for coordinating the jobs performed by all the other computers. In our work the master does very little work, while the slaves work constantly to carry out the tasks assigned to them by the master (hence the names). In that situation Titan makes a great master, since the master node does not need to be fast. Also, since Titan is a portable computer, we can control the cluster from anywhere that has an internet connection. Not only does Master-Slave parallelism make Titan a useful part of the cluster, it also helps us deal with the fact that the cluster is inhomogeneous. The total program is broken down into a large number of pieces (say, 800). Titan then sends out one piece to each processor. As soon as a processor completes its piece and returns the results, Titan sends it another piece. This means that the fast processors are kept busy instead of sitting around waiting for the slow processors to finish. This type of parallelism really maximizes the contribution of all of the machines in the cluster.
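
For readers who want to see what this looks like in code, here is a bare-bones Fortran sketch of the Master-Slave scheme, again assuming a standard MPI message-passing library. It is an illustration rather than our production code, and do_piece is a hypothetical stand-in for the real per-piece calculation.

  ! Sketch of Master-Slave parallelism: the master hands out piece numbers
  ! one at a time, and whichever slave finishes first gets the next piece.
  program master_slave
    implicit none
    include 'mpif.h'
    integer, parameter :: npieces = 800          ! number of pieces, as in the text
    integer :: rank, nprocs, ierr, piece, next, p, stopsig
    integer :: status(MPI_STATUS_SIZE)
    double precision :: res

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
    stopsig = 0

    if (rank == 0) then                          ! master (the role Titan plays)
       next = 0
       do p = 1, nprocs - 1                      ! prime every slave with one piece
          next = next + 1                        ! (assumes npieces >= nprocs - 1)
          call MPI_SEND(next, 1, MPI_INTEGER, p, 1, MPI_COMM_WORLD, ierr)
       end do
       do piece = 1, npieces                     ! collect results one at a time
          call MPI_RECV(res, 1, MPI_DOUBLE_PRECISION, MPI_ANY_SOURCE, 2, &
                        MPI_COMM_WORLD, status, ierr)
          if (next < npieces) then               ! more work: send the next piece
             next = next + 1
             call MPI_SEND(next, 1, MPI_INTEGER, status(MPI_SOURCE), 1, &
                           MPI_COMM_WORLD, ierr)
          else                                   ! no more work: tell slave to stop
             call MPI_SEND(stopsig, 1, MPI_INTEGER, status(MPI_SOURCE), 1, &
                           MPI_COMM_WORLD, ierr)
          end if
       end do
    else                                         ! slave
       do
          call MPI_RECV(piece, 1, MPI_INTEGER, 0, 1, MPI_COMM_WORLD, status, ierr)
          if (piece == 0) exit                   ! stop signal
          res = do_piece(piece)                  ! do the assigned piece of work
          call MPI_SEND(res, 1, MPI_DOUBLE_PRECISION, 0, 2, MPI_COMM_WORLD, ierr)
       end do
    end if

    call MPI_FINALIZE(ierr)

  contains

    double precision function do_piece(i)        ! placeholder for the real work
      integer, intent(in) :: i
      do_piece = dble(i)
    end function do_piece

  end program master_slave

The key ingredient is the receive from MPI_ANY_SOURCE: whichever slave reports back first is immediately given the next piece, so the fast processors stay busy no matter how slow the slowest machine is.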

We have not yet had a chance to test the cluster using the new dual-processor machines AND the Cook iMacs, but hopefully we will get to that soon. We also plan to do some more rigorous testing using standard testing packages so that we can compare the performance of the Bifrost Cluster to clusters using other operating systems. Once we have those results I will post them here.






[TO DR T's RESEARCH WEBPAGE]

Todd K. Timberlake (ttimberlake@berry.edu)