What acceleration technologies are used by the world’s fastest supercomputers?

From scientific simulation and visualization to data analysis and machine learning, modern computing workloads are driving supercomputing centers, cloud service providers, and enterprises to rethink their computing architectures.

Relying on processor, network, or software optimization alone can no longer meet the latest needs of researchers, engineers, and data scientists.

The data center has become the new unit of computing, so companies must consider the entire technology stack.

The latest ranking of the world’s most powerful systems shows that the trend toward full-stack approaches in the new generation of supercomputers continues.

In the latest TOP500 list, released this week at the SC21 high-performance computing conference, NVIDIA technology accelerates 355 supercomputer systems, more than 70% of the list, and more than 90% of the new systems use NVIDIA technology. That is up from 342 systems (68% of the list) on the TOP500 list released in June.
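The percentages quoted above follow directly from the system counts. As a quick check, here is a minimal sketch that assumes only that the TOP500 list ranks exactly 500 systems; the variable names are illustrative.

```python
# Share of the TOP500 list accelerated by NVIDIA technology,
# computed from the system counts quoted in the article.
TOP500_SIZE = 500  # assumption: the list ranks exactly 500 systems

june_systems = 342
november_systems = 355

june_share = 100 * june_systems / TOP500_SIZE
november_share = 100 * november_systems / TOP500_SIZE

print(f"June:     {june_share:.1f}% of the list")
print(f"November: {november_share:.1f}% of the list")
print(f"Change:   +{november_systems - june_systems} systems")
```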

NVIDIA also maintains its lead on the Green500 list of the world’s most energy-efficient systems, powering 23 of the top 25 systems, the same number as in June. On average, systems using NVIDIA GPUs are 3.5 times more energy efficient than non-GPU systems.

Microsoft’s GPU-accelerated Azure supercomputer ranks tenth on the TOP500 list. This is the first time a cloud-based system has ranked in the top 10, marking the emergence of a new generation of cloud-native systems.

Artificial intelligence is bringing about a revolution in scientific computing. In recent years, the number of papers on high-performance computing and machine learning has increased sharply, from about 600 in 2018 to nearly 5,000 in 2020.


New benchmarks including HPL-AI and MLPerf HPC also emphasize the continued convergence of high-performance computing and AI workloads.

As a new benchmark that combines high-performance computing and artificial intelligence workloads, HPL-AI uses mixed-precision computing, the foundation of deep learning and of many scientific and commercial workloads, while still delivering the full accuracy of double-precision calculations, the traditional yardstick of high-performance computing benchmarks.
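The idea of doing most of the work in low precision while still reaching double-precision accuracy can be illustrated with iterative refinement. The following is a minimal NumPy sketch, not the HPL-AI reference implementation: the function name, matrix size, and iteration count are illustrative assumptions, and for simplicity it repeats the float32 solve on every pass instead of reusing a single low-precision factorization as a production solver would.

```python
import numpy as np

def solve_mixed_precision(A, b, iterations=5):
    """Solve A x = b using float32 solves refined with float64 residuals."""
    A32 = A.astype(np.float32)
    # Low-precision solve gives a cheap first guess.
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iterations):
        # The residual is computed in float64 to expose the remaining error.
        r = b - A @ x
        # The correction is again computed in float32.
        d = np.linalg.solve(A32, r.astype(np.float32))
        x += d.astype(np.float64)
    return x

rng = np.random.default_rng(0)
n = 200
# Adding n to the diagonal keeps this test matrix well conditioned.
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

x_low = np.linalg.solve(A.astype(np.float32), b.astype(np.float32)).astype(np.float64)
x_mixed = solve_mixed_precision(A, b)

# Relative residuals: pure float32 solve versus refined solve.
err_low = np.linalg.norm(A @ x_low - b) / np.linalg.norm(b)
err_mixed = np.linalg.norm(A @ x_mixed - b) / np.linalg.norm(b)
print(f"float32 only: {err_low:.2e}")
print(f"refined:      {err_mixed:.2e}")
```

On a well-conditioned system like this one, the refined residual should fall to roughly double-precision level even though every solve runs in single precision.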

The MLPerf HPC benchmark targets a style of computing that uses artificial intelligence to accelerate and enhance simulations on supercomputers. It measures performance on three key workloads for high-performance computing centers: astrophysics (Cosmoflow), weather (Deepcam) and molecular dynamics (Opencatalyst).

NVIDIA addresses the entire stack with GPU-accelerated processing, intelligent networking, GPU-optimized applications, and libraries that support the convergence of AI and high-performance computing. This approach improves workload performance and drives scientific breakthroughs.

Let’s take a look at how NVIDIA helps supercomputers achieve performance improvements.

Accelerated computing
The parallel processing capability of GPUs, coupled with more than 2,500 GPU-optimized applications, lets users cut the run time of high-performance computing jobs from weeks to hours in most cases.

NVIDIA continually optimizes the CUDA-X libraries and GPU-accelerated applications, so it is not unusual for users to see a sudden performance gain on the GPU architecture they already have.

As a result, the performance of the most widely used scientific applications (which we call the “golden suite”) has increased 16-fold over the past six years, and it continues to improve.


Caption: Full-stack innovation has delivered a 16x performance improvement for top high-performance computing, artificial intelligence and machine learning applications.

To help users quickly improve performance, NVIDIA provides the latest versions of artificial intelligence and high-performance computing software as containers in the NGC catalog. Users simply pull a container and run the application on a supercomputer, in the data center, or in the cloud.

Fusion of high-performance computing and artificial intelligence
The application of artificial intelligence in high-performance computing can help researchers speed up simulation while maintaining the accuracy of traditional simulation methods.

For this reason, more and more researchers are using artificial intelligence to speed up their research, including the finalists for this year’s Gordon Bell Prize, the most prestigious award in supercomputing. Major companies are racing to build exascale artificial intelligence computers to support this new model that combines high-performance computing and artificial intelligence.

Some relatively new benchmarks (such as HPL-AI and MLPerf HPC) also confirm this trend, emphasizing the continued integration of high-performance computing and AI workloads.

To promote this trend, last week NVIDIA launched a series of advanced new libraries and software development kits for high-performance computing.

Graphs are a key data structure in modern data science. With a new Python package called the Deep Graph Library (DGL), users can now project graphs into deep neural network frameworks.

NVIDIA Modulus builds and trains physics-informed machine learning models that can learn and obey the laws of physics.

NVIDIA introduced three new libraries:

ReOpt: improves operational efficiency for the logistics industry, which is worth up to 10 trillion US dollars.
cuQuantum: accelerates quantum computing research.
cuNumeric: accelerates NumPy for scientists, data scientists, and machine learning and artificial intelligence researchers in the Python community.
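
To illustrate what accelerating NumPy can look like in practice, here is a minimal sketch. It assumes cuNumeric is used as a drop-in replacement by swapping the import, and falls back to plain NumPy when cuNumeric is not installed; the `smooth` function and the grid values are hypothetical examples, not taken from NVIDIA's documentation.

```python
# Assumption: cuNumeric mirrors the NumPy API, so only the import changes.
# The fallback keeps this sketch runnable on machines without cuNumeric.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

def smooth(grid, steps):
    """Repeatedly replace each interior cell with the mean of its 4 neighbors."""
    for _ in range(steps):
        # The right-hand side is fully evaluated before the assignment.
        grid[1:-1, 1:-1] = 0.25 * (
            grid[:-2, 1:-1] + grid[2:, 1:-1] + grid[1:-1, :-2] + grid[1:-1, 2:]
        )
    return grid

grid = np.zeros((64, 64))
grid[0, :] = 100.0  # hold the top edge at a fixed "hot" value
result = smooth(grid, steps=50)
print(float(result[1, 32]), float(result[32, 32]))
```

The array code itself is ordinary NumPy, which is the point of the design: existing scripts need no restructuring to try a different backend.
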

NVIDIA Omniverse, NVIDIA’s platform for virtual world simulation and 3D workflow collaboration, ties everything together.

Omniverse can be used to simulate digital twins of warehouses, factories, physical and biological systems, the 5G edge, robots, self-driving cars and even avatars.

NVIDIA announced last week that it will build a supercomputer called Earth-2 and use Omniverse to create a digital twin of Earth to predict climate change.

Cloud-native supercomputing

As supercomputers take on more and more workloads in data analysis, artificial intelligence, simulation, and visualization, CPUs have to support more communication tasks on large and complex systems.

A data processing unit (DPU) can offload many of these operations, effectively relieving this pressure.

As a fully integrated data-center-on-a-chip platform, the NVIDIA BlueField DPU offloads and manages the data center’s infrastructure tasks, freeing the host’s processor resources and enabling stronger security and more efficient supercomputing orchestration.

Combined with the NVIDIA Quantum InfiniBand platform, this architecture provides the best bare metal performance while natively supporting multi-node tenant isolation.

NVIDIA’s Quantum InfiniBand platform provides predictable, isolated bare-metal performance. And with zero-trust security protection, these new systems are also more secure.

The BlueField DPU isolates user applications from infrastructure tasks. The latest BlueField software platform, NVIDIA DOCA 1.2, supports next-generation distributed firewalls and broader line-rate data encryption. NVIDIA Morpheus assumes that an intruder is already inside the data center and uses deep-learning-based data science to detect the intruder’s activities in real time.
