CUC 2004 / New Frontiers / New Technologies for New Needs
Cluster Distributions Review / D3
Authors: Emir Imamagić, Damir Danijel Žagar, University Computing Centre – Srce, Croatia


Installation and configuration of a computer cluster is a demanding task. It starts with the deployment and configuration of hardware resources. The next step is Operating System (OS) installation on the server and compute nodes, followed by the installation of cluster middleware tools. Cluster middleware enables job execution on cluster computers and comprises a job management system, parallel libraries and runtime environments, and tools for node management and monitoring.
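To make the role of the central middleware component concrete, the following toy sketch shows what a job management system does at its core: queue submitted jobs and dispatch them to idle compute nodes. The class, the node names and the FIFO policy are our illustrative assumptions, not the design of any real scheduler:

```python
# Purely illustrative sketch of a job management system:
# jobs wait in a FIFO queue until a compute node becomes idle.
from collections import deque

class ToyJobManager:
    """Queues jobs and dispatches them to idle compute nodes."""

    def __init__(self, nodes):
        self.idle = deque(nodes)   # compute nodes with no job
        self.queue = deque()       # jobs waiting for a node
        self.running = {}          # job name -> node it runs on

    def submit(self, job):
        self.queue.append(job)
        self._dispatch()

    def finish(self, job):
        # Node becomes idle again; try to start a waiting job.
        self.idle.append(self.running.pop(job))
        self._dispatch()

    def _dispatch(self):
        while self.queue and self.idle:
            self.running[self.queue.popleft()] = self.idle.popleft()

mgr = ToyJobManager(["node01", "node02"])
for job in ("job-a", "job-b", "job-c"):
    mgr.submit(job)
# With two nodes, two jobs start immediately and "job-c" waits.
```

Real job management systems such as OpenPBS, SGE or Condor add priorities, resource requests and accounting on top of this basic queue-and-dispatch cycle.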

The term "cluster distribution" comes from an analogy with OS distributions. The idea of a cluster distribution is to provide an integrated set of software packages needed for cluster installation: cluster middleware tools together with an automatic installation tool. Some cluster distributions come with an OS, but most allow the user to choose one.

We have analyzed the following set of Linux-based cluster distributions: Rocks, OSCAR, OpenSCE, Scyld Beowulf and Clustermatic. Many other cluster distributions exist, among them the Extreme Cluster Administration Toolkit (xCAT) [8], ClusterKnoppix, Warewulf [9] and SCore [6]. Numerous enterprise solutions exist as well.

NPACI Rocks [2][5] is one of the most popular cluster distributions today. Rocks is based on the Red Hat Linux distribution and uses the Red Hat kickstart mechanism to deploy software packages on compute nodes. Software packages are arranged in Rolls, where each Roll contains a set of packages for a specific purpose. Rocks supports several job management systems: SGE, OpenPBS and Condor. Rocks is updated regularly, and the Roll system provides scalable installation. A disadvantage of Rocks is its dependency on the Red Hat Linux OS.
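For illustration, a compute node description in the kickstart format looks roughly like the fragment below. This is a hand-written sketch with placeholder package names, not an actual Rocks-generated file:

```
# Hypothetical kickstart fragment (placeholder package names)
install
lang en_US
%packages
@ Base
ganglia-gmond
%post
# site-specific commands run here after package installation
```

Rocks assembles such descriptions automatically from the Rolls selected for the cluster, so the administrator rarely edits kickstart files by hand.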

Open Source Cluster Application Resources (OSCAR) [4] consists of typical cluster middleware software. In contrast to Rocks, OSCAR does not come bundled with an OS: the user first installs Red Hat or Mandrake Linux on the server node. OSCAR then provides a set of cluster middleware RPMs and an automatic installation tool, the System Installation Suite. The cluster middleware packages include the OpenPBS job management system, the LAM/MPI, MPICH and PVM parallel libraries, and the Clumon and Ganglia node monitoring systems.

OpenSCE [3] takes a slightly different approach than the other cluster distributions. While the others try to integrate various existing solutions, OpenSCE provides its own set of tools: a parallel library, a job management system, a monitoring system and an automatic installation tool. It also provides a special cluster middleware tool, KSIX, which creates a global process space for the processes on all cluster nodes. A disadvantage of OpenSCE is that its new tools lack the evaluation and testing that the established tools have already passed.

Scyld Beowulf [7] is a commercial cluster solution. Scyld uses the BProc system to create a global process space. In contrast to Rocks and OSCAR, its automatic installation tool installs only a very limited set of software and OS packages on compute nodes. Scyld comes with a standard set of parallel libraries. The advantage of Scyld is the global process space, which ensures that no uncontrolled "runaway" processes are left on compute nodes. The disadvantage is that Scyld does not come with a job management system, although an existing one can be integrated.
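The idea behind a global process space can be sketched as follows. This toy model is our illustration of the concept, not BProc's actual implementation: every process started on a compute node is registered in a single table on the master, so nothing can run in the cluster without being visible from the front end:

```python
# Toy model of a BProc-style global process space (illustrative only).
import itertools

class GlobalProcessSpace:
    def __init__(self):
        self._next_pid = itertools.count(1)  # cluster-wide pid counter
        self._table = {}                     # global pid -> (node, command)

    def spawn(self, node, command):
        """Register a process started on a compute node; return its global pid."""
        pid = next(self._next_pid)
        self._table[pid] = (node, command)
        return pid

    def ps(self):
        """The master's view of every process in the cluster."""
        return dict(self._table)

    def reap(self, pid):
        """Remove a finished process from the global table."""
        del self._table[pid]

gps = GlobalProcessSpace()
p1 = gps.spawn("node01", "simulation")
p2 = gps.spawn("node02", "simulation")
gps.reap(p1)  # after p1 exits, only p2 remains visible
```

Because every process is created and reaped through the master's table, the "runaway" process problem mentioned above cannot arise in this model.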

Clustermatic [1] is very similar to Scyld Beowulf. It also uses BProc to achieve a global process space, and it provides a tool for automatic compute node installation that installs a very limited set of packages on compute nodes. Clustermatic comes with the ZPL programming language and the Supermon monitoring system. A disadvantage of Clustermatic is the lack of an advanced job management system.


[1] Clustermatic,
[2] NPACI Rocks,
[3] OpenSCE,
[4] OSCAR,
[5] P. M. Papadopoulos, M. J. Katz, G. Bruno: "NPACI Rocks: Tools and Techniques for Easily Deploying Manageable Linux Clusters"
[6] SCore,
[7] Scyld Beowulf,
[8] xCAT,
[9] The Warewulf Cluster Project,


Emir Imamagić graduated from the Department of Electronics, Microelectronics, Computer and Intelligent Systems, Faculty of Electrical Engineering and Computing, University of Zagreb in May 2004. His research interests are high performance computing, distributed computing, computer clusters and grid systems.

Before graduating, he worked on the AliEn Grid project at CERN, Switzerland, in the summer of 2003, and on the MidArc middleware project at Ericsson Nikola Tesla in the summer of 2002. He is currently working as a researcher on the CRO-GRID Infrastructure project at the University Computing Centre.
