QUonG initiative


Revision as of 11:39, 22 June 2011

The QUonG Proposal

The QUonG proposal is an initiative by INFN (Istituto Nazionale di Fisica Nucleare) aimed at nurturing a hardware-software ecosystem centered on Lattice QCD (LQCD).

Leveraging the know-how that the APE group has acquired in co-developing High Performance Computing systems dedicated (but not limited) to LQCD, QUonG aims to provide the scientific community with a comprehensive platform for developing multi-node parallel LQCD applications.

The QUonG hardware

The QUonG system is a massively parallel computing platform built as a cluster of hybrid elementary computing nodes. Every node is made of commodity Intel-based multi-core host processors, each coupled with a number of latest-generation NVIDIA GPUs acting as floating-point accelerators.

The QUonG communication mesh is built upon our proprietary component, the APEnet+ network card. Its design was driven by the requirements of typical LQCD algorithms, which call for a point-to-point, high-performance, low-latency network where the computing nodes are topologically arranged as vertices of a three-dimensional torus.
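
As an illustration of the torus arrangement described above, the following minimal Python sketch computes the six nearest neighbours of a node on a three-dimensional torus using wrap-around (modulo) arithmetic. The function name `torus_neighbors`, the coordinate scheme and the 4x4x4 lattice size are assumptions made for this example only; the page does not describe how APEnet+ actually addresses or routes between nodes.

```python
def torus_neighbors(coord, dims):
    """Return the six nearest neighbours of a node on a 3D torus.

    coord: (x, y, z) position of the node (hypothetical addressing scheme).
    dims:  (nx, ny, nz) number of nodes along each axis (illustrative values).
    """
    x, y, z = coord
    nx, ny, nz = dims
    # The modulo operation wraps each axis around, closing the mesh into a torus.
    return [
        ((x + 1) % nx, y, z),
        ((x - 1) % nx, y, z),
        (x, (y + 1) % ny, z),
        (x, (y - 1) % ny, z),
        (x, y, (z + 1) % nz),
        (x, y, (z - 1) % nz),
    ]


# Example: the corner node of an illustrative 4x4x4 torus.
corner_neighbors = torus_neighbors((0, 0, 0), (4, 4, 4))
```

Because of the wrap-around, every node has exactly six direct links regardless of its position, so a corner node such as (0, 0, 0) is connected to (3, 0, 0) just as it is to (1, 0, 0).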

Our reference system is a cluster of QUonG elementary computing units, each of which combines a multi-core CPU, a GPU accelerator and an APEnet+ card. This approach makes for a flexible and modular platform that can be tailored to different application requirements by tuning the GPU-to-CPU ratio per unit.

[Figure: QUonG elementary computing unit (Elemento QUonG.png)]

The QUonG Software Stack

(in progress)

The QUonG Prototype

The first prototype of the QUonG parallel system is expected to be delivered before the end of 2011. The chosen ratio for this first deliverable is one (multi-core) host per two GPUs.

As a consequence, the current target for the QUonG elementary mechanical assembly is a 3U "sandwich" made of two Intel-based servers plus an NVIDIA S2050/70/90 multi-GPU system, equipped with the APEnet+ network board.

The assembly collects two QUonG elementary computing units, each made of one server hosting the interface board to control 2 (out of 4) GPUs inside the S2050/70/90 and one APEnet+ board. In this way, a QUonG elementary computing unit is topologically equivalent to two vertices of the APEnet+ three-dimensional mesh.
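
The composition of the prototype assembly can be checked with a small Python sketch. The class `ComputingUnit` and the function `assembly_totals` are hypothetical names introduced here for illustration; only the counts (one host, two GPUs and one APEnet+ board per unit, two units per assembly) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputingUnit:
    """One QUonG elementary computing unit, with the prototype's counts as defaults."""
    hosts: int = 1         # one multi-core server
    gpus: int = 2          # 2 of the 4 GPUs inside the S2050/70/90
    apenet_cards: int = 1  # one APEnet+ board


def assembly_totals(units):
    """Sum the components of the units that make up one mechanical assembly."""
    return {
        "hosts": sum(u.hosts for u in units),
        "gpus": sum(u.gpus for u in units),
        "apenet_cards": sum(u.apenet_cards for u in units),
    }


# The 3U "sandwich" collects two elementary computing units.
sandwich = [ComputingUnit(), ComputingUnit()]
totals = assembly_totals(sandwich)
```

Changing the `gpus` default is one way to model a different GPU-to-CPU ratio per unit; the values used here only reflect the first deliverable.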


The envisioned QUonG Rack

Building on the configuration of the first QUonG prototype, equipped with current Fermi-architecture NVIDIA Tesla GPUs, we envision a deployed QUonG system as an assembly of fully populated, standard 42U QUonG racks. Each rack would deliver a peak performance of 60 TFlops in single precision (30 TFlops in double precision), at a cost of 5 kEuro/TFlops and with an estimated power consumption of 25 kW.

A typical PFlops-scale QUonG installation, composed of 16 QUonG racks interconnected via the APEnet+ three-dimensional torus network, is expected to have a power consumption of the order of 400 kW and a cost ratio of the order of 5 kEuro/TFlops.
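
The installation-level figures follow from the per-rack estimates by simple multiplication, as the Python sketch below shows. The constant and function names are assumptions introduced for this example; the numeric inputs are the per-rack estimates quoted above. The total cost is deliberately not computed, since the page does not state whether the 5 kEuro/TFlops ratio refers to single- or double-precision peak performance.

```python
# Per-rack estimates quoted in the text.
SP_TFLOPS_PER_RACK = 60  # peak, single precision
DP_TFLOPS_PER_RACK = 30  # peak, double precision
KW_PER_RACK = 25         # estimated power consumption


def installation_estimate(racks):
    """Scale the per-rack estimates linearly to an installation of `racks` racks."""
    return {
        "sp_tflops": racks * SP_TFLOPS_PER_RACK,
        "dp_tflops": racks * DP_TFLOPS_PER_RACK,
        "power_kw": racks * KW_PER_RACK,
    }


# The 16-rack installation described above.
estimate = installation_estimate(16)
```

For 16 racks this gives 960 TFlops of single-precision peak performance (roughly 1 PFlops), 480 TFlops in double precision, and 400 kW of power, consistent with the figures stated in the text.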

[Figure: QUonG rack (QUonG rack.png)]