Published on Feb 21, 2020
High performance computing (HPC) has come of age. No longer is it the preserve of computer scientists in research labs, plugging together printed circuit boards and writing new flavours of parallel operating systems.
HPC is now a stable, mature technology that enables an ever-increasing number of scientists and researchers to build and run computational models in their own disciplines. HPC has finally delivered on its promises.
Here we take a look at the current state of high performance computing from the perspective of the European user community, and assess the needs and aspirations of this community in terms of where HPC might be going, and where, perhaps, it should be going.
We aim to capture a snapshot of HPC activities, from the technology itself through related services to the direct views of its European user base, and to draw these together into a roadmap for large-scale computing in the twenty-first century.
Quadrics Supercomputer World (QSW) offer a PCI-compatible high-performance "fat tree" interconnect based on the original Meiko Computing Surface network. Called QsNet and built from QSW's Elan III network chips and Elite III switch chips, it offers some of the highest performance currently available in cluster networking systems.
The QsNet network is currently used inside the UltraSPARC-II-based QM-1, and QSW plans to produce systems in partnership with Compaq in the third quarter of 1999; the first of these will be the Compaq "Sierra".
It is as yet unclear whether QSW intend to make the QsNet technology available as an "off-the-shelf" networking product.
QsNet consists of two hardware building blocks: a programmable network interface called Elan and a high-bandwidth, low-latency communication switch called Elite. Elite switches can be interconnected in a fat-tree topology. With respect to software, QsNet provides several layers of communication libraries that trade off between performance and ease of use.
QsNet combines these hardware and software components to implement efficient and protected access to a global virtual memory via remote direct memory access (DMA) operations. It also enhances network fault tolerance via link-level and end-to-end protocols that detect faults and automatically retransmit packets.
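To make the idea of detecting faults and retransmitting packets concrete, the following Python sketch shows a sender that appends a checksum to each packet and resends it until the receiver's check passes. It is an illustration only and not QsNet's actual protocol: the use of CRC-32, the function names (make_packet, check_packet, send_reliably, flaky_channel) and the retry limit are all assumptions introduced here.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (assumed packet format)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_packet(packet: bytes):
    """Return the payload if its CRC matches, otherwise None."""
    payload, crc = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == crc:
        return payload
    return None


def send_reliably(payload: bytes, channel, max_attempts: int = 5):
    """Resend the packet until the receiver's check passes.

    `channel` is any callable that takes the sent bytes and returns
    the bytes that arrive. Returns (payload, attempts_used).
    """
    packet = make_packet(payload)
    for attempt in range(1, max_attempts + 1):
        received = check_packet(channel(packet))
        if received is not None:
            return received, attempt
    raise RuntimeError("no intact packet after %d attempts" % max_attempts)


def flaky_channel(bad_sends: int):
    """Hypothetical link that corrupts the first `bad_sends` packets."""
    state = {"left": bad_sends}

    def channel(packet: bytes) -> bytes:
        if state["left"] > 0:
            state["left"] -= 1
            # Invert the first byte to simulate a transmission fault.
            return bytes([packet[0] ^ 0xFF]) + packet[1:]
        return packet

    return channel


if __name__ == "__main__":
    data, attempts = send_reliably(b"hello", flaky_channel(2))
    print(data, attempts)
```

In this sketch the fault is detected at the receiving end and the recovery is invisible to the caller, which is the property the text attributes to QsNet's protocols; the real hardware mechanisms are not described in enough detail here to model them directly.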
The Elan network interface (we refer to the Elan3 version of Elan in this article) connects the Quadrics network to a processing node containing one or more CPUs. In addition to generating and accepting packets to and from the network, Elan provides substantial local processing power to implement high-level, message-passing protocols such as the Message-Passing Interface (MPI). The internal functional structure of Elan, shown in Figure 1, centers around two primary processing engines: the microcode processor and the thread processor.
The 32-bit microcode processor supports four hardware threads. Each thread can independently issue pipelined memory requests to the memory system. Up to eight requests can be outstanding at any given time. Scheduling for the microcode processor permits a thread to wake up, schedule a new memory access based on the result of a previous memory access, and go back to sleep in as few as two system clock cycles.
Elan contains routing tables that translate every virtual processor number into a sequence of tags that determine the network route. The system software can load several routing tables to provide different routing strategies. Elan has an 8-Kbyte memory cache (organized as four sets of 2 Kbytes) and a 64-Mbyte SDRAM.
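The routing-table idea can be sketched in Python as a mapping from each virtual processor number to a list of route tags. This is a minimal illustration under stated assumptions, not the Elan table format: it assumes a fat tree with four down-links per switch and three switch levels, treats the virtual processor number as the node number, and invents the ("up", port) and ("down", port) tag notation. The choice of up-link is also arbitrary here; a second table could use a different choice to give a different routing strategy.

```python
def digits(node, arity, levels):
    """Return the base-`arity` digits of `node`, most significant first."""
    out = []
    for _ in range(levels):
        out.append(node % arity)
        node //= arity
    return out[::-1]


def route_tags(src, dst, arity=4, levels=3):
    """Tags for a route from node `src` to node `dst` (assumed tag format)."""
    size = arity ** levels
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("node number out of range")
    if src == dst:
        return []  # local delivery, no network route needed
    s = digits(src, arity, levels)
    d = digits(dst, arity, levels)
    # Leading digits shared by both nodes identify their common subtree.
    common = 0
    while common < levels and s[common] == d[common]:
        common += 1
    # Climb until the packet reaches a switch that covers both nodes.
    # The up-link picked at each switch is an arbitrary deterministic
    # choice (a digit of the source), made only for this sketch.
    up_count = levels - common - 1
    ups = [("up", s[levels - 1 - i]) for i in range(up_count)]
    # Descend: each remaining destination digit selects one down-link.
    downs = [("down", port) for port in d[common:]]
    return ups + downs


def build_routing_table(my_node, arity=4, levels=3):
    """Map every virtual processor number to its sequence of route tags."""
    return {vp: route_tags(my_node, vp, arity, levels)
            for vp in range(arity ** levels)}


if __name__ == "__main__":
    table = build_routing_table(my_node=5)
    print(table[63])
```

Nodes that share a leaf switch need a single "down" tag, while nodes in different top-level subtrees need the full climb and descent, so route length grows with the distance between the two nodes in the tree.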