High Power Computing – HPC
Blog Credit: Trupti Thakur
Image Courtesy: Google
High Performance Computing (HPC)
High Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business.
The existing cloud environment is not best suited to handle the complex compute-intensive technology which is increasingly paving its way for mainstream and commercial use. With the increased adoption of HPC across cloud and on-prem deployments in nearly every industry including space exploration, drug discovery, financial modeling, automotive design, and systems engineering and usage that can span Clouds and On-prem deployments, security will become a great concern at an accelerating rate.
HPC is technology that uses clusters of powerful processors, working in parallel, to process massive multi-dimensional datasets (big data) and solve complex problems at extremely high speeds. HPC systems typically perform at speeds more than one million times faster than the fastest commodity desktop, laptop or server systems.
For decades the HPC system paradigm was the supercomputer, a purpose-built computer that embodies millions of processors or processor cores. Supercomputers are still with us; at this writing, the fastest supercomputer is the US-based Frontier (link resides outside ibm.com), with a processing speed of 1.102 exaflops, or quintillion floating point operations per second (flops). But today, more and more organizations are running HPC solutions on clusters of high-speed computers servers, hosted on premises or in the cloud.
HPC workloads uncover important new insights that advance human knowledge and create significant competitive advantage. For example, HPC is used to sequence DNA, automate stock trading, and run artificial Intelligence (AI) algorithms and simulations—like those enabling self-driving automobiles—that analyze terabytes of data streaming from IoT sensors, radar and GPS systems in real time to make split-second decisions.
How does HPC work?
A standard computing system solves problems primarily using serial computing—it divides the workload into a sequence of tasks, and then executes the tasks one after the other on the same processor.
In contrast, HPC leverages
- Massively parallel computing. Parallel computing runs multiple tasks simultaneously on multiple computer servers or processors. Massively parallel computing is parallel computing using tens of thousands to millions of processors or processor cores.
- Computer clusters (also called HPC clusters). An HPC cluster consists of multiple high-speed computer servers networked together, with a centralized scheduler that manages the parallel computing workload. The computers, called nodes, use either high-performance multi-core CPUs or, more likely today, GPUs (graphical processing units), which are well suited for rigorous mathematical calculations, machine learning models and graphics-intensive tasks. A single HPC cluster can include 100,000 or more nodes.
- High-performance components: All the other computing resources in an HPC cluster—networking, memory, storage and file systems—are high-speed, high-throughput and low-latency components that can keep pace with the nodes and optimize the computing power and performance of the cluster.
HPC and cloud computing
As recently as a decade ago, the high cost of HPC—which involved owning or leasing a supercomputer or building and hosting an HPC cluster in an on-premises data center—put HPC out of reach for most organizations.
Today HPC in the cloud—sometimes called HPC as a service, or HPCaaS—offers a significantly faster, more scalable and more affordable way for companies to take advantage of HPC. HPCaaS typically includes access to HPC clusters and infrastructure hosted in a cloud service provider’s data center, plus ecosystem capabilities (such as AI and data analytics) and HPC expertise.
Today HPC in the cloud is driven by three converging trends:
- Surging demand. Organizations across all industries are becoming increasingly dependent on the real-time insights and competitive advantage that results from solving the complex problems only HPC apps can solve. For example, credit card fraud detection—something virtually all of us rely on and most of us have experienced at one time or another—relies increasingly on HPC to identify fraud faster and reduce annoying false positives, even as fraud activity expands and fraudsters’ tactics change constantly.
- Prevalence of lower-latency, higher-throughput RDMA networking. RDMA—remote direct memory access—enables one networked computer to access another networked computer’s memory without involving either computer’s operating system or interrupting either computer’s processing. This helps minimize latency and maximize throughput. Emerging high-performance RDMA fabrics—including Infiniband, Virtual Interface Architecture, and RDMA over converged ethernet (RoCE)—are essentially making cloud-based HPC possible.
- Widespread public-cloud and private-cloud HPCaaS availability. Today every leading public cloud service provider offers HPC services. And while some organizations continue to run highly regulated or sensitive HPC workloads on-premises, many are adopting or migrating to private-cloud HPC solutions offered by hardware and solution vendors.
HPC use cases
HPC applications have become synonymous with AI apps in general, and with machine learning and deep learning apps in particular; today most HPC systems are created with these workloads in mind. These HPC applications are driving continuous innovation in:
Healthcare, genomics and life sciences. The first attempt to sequence a human genome took 13 years; today, HPC systems can do the job in less than a day. Other HPC applications in healthcare and life sciences include drug discovery and design, rapid cancer diagnosis, and molecular modeling.
Financial services. In addition to automated trading and fraud detection (noted above), HPC powers applications in Monte Carlo simulation and other risk analysis methods.
Government and defense. Two growing HPC use cases in this area are weather forecasting and climate modeling, both of which involve processing vast amounts of historical meteorological data and millions of daily changes in climate-related data points. Other government and defense applications include energy research and intelligence work.
Energy. In some cases overlapping with government and defense, energy-related HPC applications include seismic data processing, reservoir simulation and modeling, geospatial analytics, wind simulation and terrain mapping.
High-performance computing (HPC) uses supercomputers and computer clusters to solve advanced computation problems.
Overview
HPC integrates systems administration (including network and security knowledge) and parallel programming into a multidisciplinary field that combines digital electronics, computer architecture, system software, programming languages, algorithms and computational techniques. HPC technologies are the tools and systems used to implement and create high performance computing systems. Recently[, HPC systems have shifted from supercomputing to computing clusters and grids. Because of the need of networking in clusters and grids, High Performance Computing Technologies are being promoted[ by the use of a backbone, because the collapsed backbone architecture is simple to troubleshoot and upgrades can be applied to a single router as opposed to multiple ones.
The term is most commonly associated with computing used for scientific research or computational science. A related term, high-performance technical computing (HPTC), generally refers to the engineering applications of cluster-based computing (such as computational fluid dynamics and the building and testing of virtual prototypes). HPC has also been applied to business uses such as data warehouses, line-of-business (LOB) applications, and transaction processing.
High-performance computing (HPC) as a term arose after the term “supercomputing”. HPC is sometimes used as a synonym for supercomputing; but, in other contexts, “supercomputer” is used to refer to a more powerful subset of “high-performance computers”, and the term “supercomputing” becomes a subset of “high-performance computing”. The potential for confusion over the use of these terms is apparent.
Because most current applications are not designed for HPC technologies but are retrofitted, they are not designed or tested for scaling to more powerful processors or machines. Since networking clusters and grids use multiple processors and computers, these scaling problems can cripple critical systems in future supercomputing systems. Therefore, either the existing tools do not address the needs of the high performance computing community or the HPC community is unaware of these tools. A few examples of commercial HPC technologies include:
- the simulation of car crashes for structural design
- molecular interaction for new drug design
- the airflow over automobiles or airplanes
In government and research institutions, scientists simulate galaxy creation, fusion energy, and global warming, as well as work to create more accurate short- and long-term weather forecasts. The world’s tenth most powerful supercomputer in 2008, IBM Roadrunner (located at the United States Department of Energy’s Los Almos National Laboratory simulated the performance, safety, and reliability of nuclear weapons and certifies their functionality.
TOP500
A list of the most powerful high-performance computers can be found on the TOP500 list. The TOP500 list ranks the world’s 500 fastest high-performance computers, as measured by the High Performance LINPACK (HPL) benchmark. Not all computers are listed, either because they are ineligible (e.g., they cannot run the HPL benchmark) or because their owners have not submitted an HPL score (e.g., because they do not wish the size of their system to become public information for defense reasons). In addition, the use of the single LINPACK benchmark is controversial, in that no single measure can test all aspects of a high-performance computer. To help overcome the limitations of the LINPACK test, the U.S. government commissioned one of its originators, Jack Dongarra of the University of Tennessee, to create a suite of benchmark tests, including LINPACK and others, called the HPC Challenge benchmark suite. This evolving suite has been used in some HPC procurements, but, because it is not reducible to a single number, it has been unable to overcome the publicity advantage of the less useful TOP500 LINPACK test. The TOP500 list is updated twice a year, once in June at the ISC European Supercomputing Conference and again at a US Supercomputing Conference in November.
Many ideas for the new wave of grid computing were originally borrowed from HPC.
Blog By : Trupti Thakur