Difference between revisions of "QUonG initiative"

From APEWiki
(Created page with ''''QUonG''' is our proposal for a new generation of hybrid GPU-CPU HPC cluster dedicated (but not limited) to LQCD simulations. ==Project Background== Many scientific computatio...')
 
Line 1: Line 1:
- '''QUonG''' is our proposal for a new generation of hybrid GPU-CPU HPC cluster dedicated (but not limited) to LQCD simulations.
+ ==The '''QUonG''' Project==

- ==Project Background==
- Many scientific computations need multi-node parallelism for
- matching up both space (memory) and time (speed) ever-increasing
- requirements. The use of GPUs as accelerators introduces yet another
- level of complexity for the programmer and may potentially result in
- large overheads due to bookkeeping of memory buffers. Additionally,
- top-notch problems may easily employ more than a PetaFlops of
- sustained computing power, requiring thousands of GPUs orchestrated
- via some parallel programming model, mainly Message Passing
- Interface (MPI).
+ '''QUonG''' is an INFN (Istituto Nazionale di Fisica Nucleare) initiative
+ targeted to develop a High Performance Computing system dedicated (but not
+ limited) to [http://en.wikipedia.org/wiki/Lattice_qcd Lattice QCD] computations.

- ==APEnet+ aim and features==
- The project target is the development of a low latency, high
- bandwidth direct network, supporting state-of-the-art wire speeds
- and PCIe X8 gen2 while improving the price/performance ratio on
- scaling the cluster size.
- The network interface provides hardware support for the RDMA
- programming model.
- A Linux kernel driver, a set of low-level RDMA APIs and an OpenMPI
- library driver are available; this allows for painless porting of standard
- applications.
+ '''QUonG''' is a massively parallel computing platform built up from
+ commodity multi-core processors coupled with last generation GPUs;
+ exploits the characteristics of LQCD algorithms, its network mesh is a
+ point-to-point, high performance, low latency 3 dimensional toroidal
+ network interconnecting the computing nodes.

- ===Highlights===
- * APEnet+ is a packet-based direct network of point-to-point links with 2D/3D toroidal topology.
- * Packets have a fixed size envelope (header+footer) and are auto-routed to their final destinations according to wormhole dimension-ordered static routing, with dead-lock avoidance.
- * Error detection is implemented via CRC at packet level.
- * Basic RDMA capabilities, PUT and GET, are implemented at the firmware level.
- * Fault-tolerance features (will be added from 2011).
+ The network is built upon the [[APEnet+_project|APEnet+]] project.

+ The final shape of a deployed '''QUonG''' system is an assembly of
+ standard 42U racks, each one capable of 60 TFlops/rack of peak
+ performance, at a cost of 5kEuro/TFlops and for an estimated power
+ consumption of 25 kW/rack.

- === GPU Cluster installation ===
- * Where: APE lab (INFN Roma 1)
- * What: [[GPUcluster]]
+ A first '''QUonG''' system prototype is expected to be delivered at the
+ end of the year 2011.

+ ===Highlights===
- ----
- Go to:
- Public info about APEnet+ <br>
- [[APEnet+_Pubblications|APEnet+ Pubblications]]
- ----
- ----
- Internal links (require login):<br>
- [[APEnet+_HW|APEnet+ HW]], [[APEnet+_SW|APEnet+ SW]], [[APEnet+_specs|APEnet+ specification]], [[NextDeadlinesForPubblication|Next Deadlines For Pubblication]]

Revision as of 10:15, 21 June 2011

The QUonG Project

QUonG is an INFN (Istituto Nazionale di Fisica Nucleare) initiative aimed at developing a High Performance Computing system dedicated (but not limited) to Lattice QCD (LQCD) computations.

QUonG is a massively parallel computing platform built from commodity multi-core processors coupled with latest-generation GPUs. To exploit the characteristics of LQCD algorithms, its interconnect is a point-to-point, high-performance, low-latency three-dimensional toroidal network linking the computing nodes.

The network is built upon the APEnet+ project.
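
As an illustration of what a three-dimensional toroidal topology implies for a computing node, the following minimal sketch computes a node's six nearest neighbours and the hop distance between two nodes. The function names, the coordinate convention and the 4x4x4 size are assumptions made for this example only; they are not taken from the APEnet+ software.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbours of `node` in a 3D torus of size `dims`.

    Each axis wraps around, so the node at the last position on an axis
    is directly linked to the node at position 0 on the same axis.
    """
    neighbors = []
    for axis in range(3):
        for step in (+1, -1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum number of link traversals between `a` and `b` on the torus."""
    # On each axis the shorter of the two directions around the ring is taken.
    return sum(min((x - y) % n, (y - x) % n) for x, y, n in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical 64-node system
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 2, 1), dims))
```

The wrap-around links are what make this topology a natural fit for lattice computations with periodic boundary conditions, where each sub-lattice exchanges boundary data only with its nearest neighbours. Note that on an axis of size 2 the two directions lead to the same node, so the neighbour list then contains duplicates.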

A deployed QUonG system will take the form of an assembly of standard 42U racks, each capable of 60 TFlops of peak performance, at a cost of 5 kEuro/TFlops and with an estimated power consumption of 25 kW per rack.
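
The per-rack figures above can be combined into a few derived quantities, such as cost per rack and peak energy efficiency. The sketch below does this arithmetic. Only the three constants come from the text; the function name, the dictionary keys and the assumption that the figures scale linearly with the number of racks are introduced here for illustration.

```python
# Figures quoted in the text (peak values and estimates, not measurements).
PEAK_TFLOPS_PER_RACK = 60.0
COST_KEURO_PER_TFLOPS = 5.0
POWER_KW_PER_RACK = 25.0


def rack_figures(racks: int) -> dict:
    """Derive aggregate peak performance, cost and power for `racks` racks.

    Assumes simple linear scaling with the number of racks.
    """
    peak_tflops = racks * PEAK_TFLOPS_PER_RACK
    return {
        "peak_tflops": peak_tflops,
        "cost_keuro": peak_tflops * COST_KEURO_PER_TFLOPS,
        "power_kw": racks * POWER_KW_PER_RACK,
        # 1 TFlops = 1000 GFlops and 1 kW = 1000 W
        "gflops_per_watt": (PEAK_TFLOPS_PER_RACK * 1000.0) / (POWER_KW_PER_RACK * 1000.0),
    }


if __name__ == "__main__":
    for name, value in rack_figures(1).items():
        print(f"{name}: {value}")
```

With these inputs a single rack works out to 300 kEuro and 2.4 GFlops/W at peak.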

A first QUonG system prototype is expected to be delivered at the end of 2011.

Highlights