Building Clusters with CompactPCI® Serial

Version 2

    Author: Manfred Schmitz, CTO MEN Mikro Elektronik

     

    A few years ago, supercomputers were still built from systems based on special-purpose technology; today, mostly standard computer technologies are used. A large number of individual, comparatively low-cost servers are combined to form computer clusters.

     

    A computer cluster usually consists of a large number of individual interconnected computers that process parts of an overall task in parallel. Seen from the outside, a computer cluster acts as a single computer. The nodes are interconnected by a fast network. Building such server farms considerably increases computing capacity and availability. In particular, the fault tolerance of a cluster compared with that of a single computer is a decisive advantage: if one system within a cluster fails, the other systems in the cluster are not directly affected. This is how redundancy is achieved.

    Two main kinds of computer clusters are distinguished:

    High-Availability Clusters

    High-availability clusters are intended to increase availability and improve fault tolerance. If an error occurs, the tasks of the defective host are automatically transferred to another host in the cluster. They are used in applications that tolerate no more than a few minutes of downtime per year.
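
    The takeover of a defective host's tasks can be sketched in a few lines. This is only an illustration under stated assumptions: the host and task names are hypothetical, and the least-loaded placement policy is one possible choice, not something prescribed by CompactPCI Serial or by any particular cluster manager.

```python
def fail_over(assignments, failed_host):
    """Return a new host -> tasks mapping with the failed host's tasks moved.

    assignments maps a host name to the list of tasks it currently runs.
    Each task of the failed host goes to the surviving host that currently
    runs the fewest tasks (ties are broken by host name). The input mapping
    is left unchanged.
    """
    survivors = {h: list(t) for h, t in assignments.items() if h != failed_host}
    if not survivors:
        raise RuntimeError("no surviving host left to take over")
    for task in assignments.get(failed_host, []):
        target = min(survivors, key=lambda h: (len(survivors[h]), h))
        survivors[target].append(task)
    return survivors


# Hypothetical three-node cluster; node1 is assumed to have failed.
cluster = {
    "node1": ["web", "db"],
    "node2": ["mail"],
    "node3": ["cache"],
}
after = fail_over(cluster, "node1")
print(after)
```

    A real high-availability setup additionally needs failure detection (for example heartbeats) and shared or replicated state, both of which are left out of this sketch.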

    High-Performance Computing Clusters

    High-performance computing clusters carry out calculations that are distributed over several hosts. From the user's point of view the cluster is a single central unit, although logically it consists of several networked systems. They are used mostly in science and the military, but server farms for rendering 3D computer graphics and computer animations are also built from this kind of cluster.
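
    The idea of distributing one calculation over several hosts and presenting a single result to the user can be illustrated as follows. This is a minimal sketch in which a thread pool merely stands in for the networked nodes; the function names and the sum-of-squares workload are assumptions chosen for illustration, and a real cluster would typically use a message-passing or job-scheduling layer instead.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Work done by one (simulated) node on its share of the data."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(data, nodes=8):
    """Scatter chunks to the workers, then gather and combine the results."""
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(partial_sum_of_squares, split(data, nodes)))
    return sum(partials)


# The caller sees a single function call, not the individual workers.
total = cluster_sum_of_squares(list(range(1000)), nodes=8)
print(total)
```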

     

    CompactPCI Serial (PICMG CPCI-S.0) is well suited to building high-availability clusters. Solutions based on CPCI-S.0 are also hard to beat for compact systems with high computing performance.

     

    CompactPCI Serial defines up to 9 slots on a 3U backplane. The distribution computer responsible for this part of the cluster is plugged into the system slot. It is connected to the 8 cluster nodes via a 1 Gb/s (optionally 10 Gb/s) full-mesh Ethernet network. Such a 9-slot unit is a typical sub-cluster in a cluster network. Based on modern Intel technology, it provides 9 x 4 = 36 cores with 4 GB of memory each. For availability reasons, the sub-cluster, which typically consumes 400 W, has its own PSU, which can be made redundant if required.

     

    Eight of these sub-clusters are connected via Ethernet to form one cluster. A CompactPCI Serial computer is used as NAS (network-attached storage) for central management tasks. In total, the system has 8 x 36 = 288 cores plus the management units. This cluster computer built from CompactPCI Serial components consumes only 3500 W and needs only 20 U in a 19" cabinet. The total volume of the CompactPCI Serial cluster is only 50% of that of a solution with 1U servers. If required, it is also suitable for operation in extreme temperature ranges and in mobile applications.
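
    The sizing figures above can be checked with a short calculation. All input values are taken from the text; the constant names are illustrative, the reading of "4 GB of memory each" as 4 GB per core is an assumption, and the share of the power budget left for the management units is simply the difference between the stated total and the eight sub-clusters.

```python
# Figures stated in the text
SLOTS_PER_SUBCLUSTER = 9       # one system slot plus 8 cluster nodes
CORES_PER_BOARD = 4
GB_PER_CORE = 4                # assumption: "4 GB each" is read as per core
WATTS_PER_SUBCLUSTER = 400     # typical power consumption
SUBCLUSTERS = 8
TOTAL_WATTS = 3500             # stated for the complete cluster

cores_per_subcluster = SLOTS_PER_SUBCLUSTER * CORES_PER_BOARD
total_cores = SUBCLUSTERS * cores_per_subcluster
total_memory_gb = total_cores * GB_PER_CORE
subcluster_watts = SUBCLUSTERS * WATTS_PER_SUBCLUSTER
# Derived, not stated in the text: budget left for management units and NAS
remaining_watts = TOTAL_WATTS - subcluster_watts

print(f"cores per sub-cluster: {cores_per_subcluster}")
print(f"total cores:           {total_cores}")
print(f"total memory:          {total_memory_gb} GB")
print(f"sub-cluster power:     {subcluster_watts} W")
print(f"remaining power:       {remaining_watts} W")
```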

     

    The declared aim in developing the CompactPCI Serial standard was to make it suitable for as wide a range of applications as possible, from a smart modular industrial PC up to a supercomputer.