CUC 2004 / New Frontiers / New Technologies for New Needs
Cluster Distributions Review / D3
Authors: Emir Imamagić, Damir Danijel Žagar, University Computing Centre – Srce, Croatia


Installation and configuration of a computer cluster is a demanding task. It starts with the deployment and configuration of hardware resources. The next step is Operating System (OS) installation on the server and compute nodes, followed by the installation of cluster middleware tools. Cluster middleware enables job execution on cluster computers and comprises a job management system, parallel libraries and runtime environments, and tools for node management and monitoring.
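To make the role of the central middleware component concrete, the following toy sketch shows what a job management system does at its core: queue submitted jobs and dispatch them to idle compute nodes. The class, the node names and the FIFO policy are our illustrative assumptions, not the design of any real scheduler:

```python
# Purely illustrative sketch of a job management system:
# jobs wait in a FIFO queue until a compute node becomes idle.
from collections import deque

class ToyJobManager:
    """Queues jobs and dispatches them to idle compute nodes."""

    def __init__(self, nodes):
        self.idle = deque(nodes)   # compute nodes with no job
        self.queue = deque()       # jobs waiting for a node
        self.running = {}          # job name -> node it runs on

    def submit(self, job):
        self.queue.append(job)
        self._dispatch()

    def finish(self, job):
        # Node becomes idle again; try to start a waiting job.
        self.idle.append(self.running.pop(job))
        self._dispatch()

    def _dispatch(self):
        while self.queue and self.idle:
            self.running[self.queue.popleft()] = self.idle.popleft()

mgr = ToyJobManager(["node01", "node02"])
for job in ("job-a", "job-b", "job-c"):
    mgr.submit(job)
# With two nodes, two jobs start immediately and "job-c" waits.
```

Real job management systems such as OpenPBS, SGE or Condor add priorities, resource requests and accounting on top of this basic queue-and-dispatch cycle.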

The term "cluster distribution" comes from an analogy with OS distributions. The idea of a cluster distribution is to provide an integrated set of software packages needed for cluster installation: cluster middleware tools together with an automatic installation tool. Some cluster distributions come with an OS, but most allow the user to choose one.

We have analyzed the following set of Linux-based cluster distributions: Rocks, OSCAR, OpenSCE, Scyld Beowulf and Clustermatic. Many other cluster distributions exist, among them the Extreme Cluster Administration Toolkit (xCAT) [8], ClusterKnoppix, Warewulf [9] and SCore [6]. Numerous enterprise solutions exist as well.

NPACI Rocks [2][5] is one of the most popular cluster distributions today. Rocks is based on the Red Hat Linux distribution and uses the Red Hat kickstart mechanism to deploy software packages on compute nodes. Software packages are arranged in Rolls, where each Roll contains a set of packages for a specific purpose. Rocks supports several job management systems: SGE, OpenPBS and Condor. Rocks is updated regularly, and the Roll system provides scalable installation. A disadvantage of Rocks is its dependency on the Red Hat Linux OS.
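For illustration, a compute node description in the kickstart format looks roughly like the fragment below. This is a hand-written sketch with placeholder package names, not an actual Rocks-generated file:

```
# Hypothetical kickstart fragment (placeholder package names)
install
lang en_US
%packages
@ Base
ganglia-gmond
%post
# site-specific commands run here after package installation
```

Rocks assembles such descriptions automatically from the Rolls selected for the cluster, so the administrator rarely edits kickstart files by hand.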

Open Source Cluster Application Resources (OSCAR) [4] consists of typical cluster middleware software. In contrast to Rocks, OSCAR does not come bundled with an OS: the user first installs Red Hat or Mandrake Linux on the server node. OSCAR then provides a set of cluster middleware RPMs and an automatic installation tool, the System Installation Suite. The cluster middleware packages include the OpenPBS job management system, the LAM/MPI, MPICH and PVM parallel libraries, and the Clumon and Ganglia node monitoring systems.

OpenSCE [3] takes a slightly different approach than the other cluster distributions. While the others try to integrate various existing solutions, OpenSCE provides its own set of tools: a parallel library, a job management system, a monitoring system and an automatic installation tool. It also provides a special cluster middleware tool, KSIX, which creates a global process space for the processes on all cluster nodes. A disadvantage of OpenSCE is that its new tools lack the evaluation and testing that the established tools have already passed.

Scyld Beowulf [7] is a commercial cluster solution. Scyld uses the BProc system to create a global process space. In contrast to Rocks and OSCAR, its automatic installation tool installs only a very limited set of software and OS packages on compute nodes. Scyld comes with a standard set of parallel libraries. The advantage of Scyld is the global process space, which ensures that no uncontrolled "runaway" processes are left on compute nodes. The disadvantage is that Scyld does not come with a job management system, although an existing one can be integrated.
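The idea behind a global process space can be sketched as follows. This toy model is our illustration of the concept, not BProc's actual implementation: every process started on a compute node is registered in a single table on the master, so nothing can run in the cluster without being visible from the front end:

```python
# Toy model of a BProc-style global process space (illustrative only).
import itertools

class GlobalProcessSpace:
    def __init__(self):
        self._next_pid = itertools.count(1)  # cluster-wide pid counter
        self._table = {}                     # global pid -> (node, command)

    def spawn(self, node, command):
        """Register a process started on a compute node; return its global pid."""
        pid = next(self._next_pid)
        self._table[pid] = (node, command)
        return pid

    def ps(self):
        """The master's view of every process in the cluster."""
        return dict(self._table)

    def reap(self, pid):
        """Remove a finished process from the global table."""
        del self._table[pid]

gps = GlobalProcessSpace()
p1 = gps.spawn("node01", "simulation")
p2 = gps.spawn("node02", "simulation")
gps.reap(p1)  # after p1 exits, only p2 remains visible
```

Because every process is created and reaped through the master's table, the "runaway" process problem mentioned above cannot arise in this model.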

Clustermatic [1] is very similar to Scyld Beowulf. It also uses BProc to achieve a global process space, and it provides a tool for automatic compute node installation that installs a very limited set of packages on compute nodes. Clustermatic comes with the ZPL programming language and the Supermon monitoring system. A disadvantage of Clustermatic is the lack of an advanced job management system.


[1] Clustermatic,
[2] NPACI Rocks,
[3] OpenSCE,
[4] OSCAR,
[5] P. M. Papadopoulos, M. J. Katz, G. Bruno: "NPACI Rocks: Tools and Techniques for Easily Deploying Manageable Linux Clusters"
[6] SCore,
[7] Scyld Beowulf,
[8] xCAT,
[9] The Warewulf Cluster Project,


Emir Imamagić graduated from the Department of Electronics, Microelectronics, Computer and Intelligent Systems, Faculty of Electrical Engineering and Computing, University of Zagreb in May 2004. His research interests are high performance computing, distributed computing, computer clusters and grid systems.

Before graduating, he worked on the AliEn Grid project at CERN, Switzerland, in the summer of 2003, and on the MidArc middleware project at Ericsson Nikola Tesla in the summer of 2002. He is currently working as a researcher on the CRO-GRID Infrastructure project at the University Computing Centre.
