High-performance computing in the next decade
Prof. Marc Snir says that systems with exascale performance are likely to be deployed in 2020-2023. He is involved in several efforts to define the software research agenda for exascale and to start executing on it. In his opinion, exascale may be the end of the road, at least for systems using silicon technology. We should, however, start thinking seriously about alternative technologies and about a computer science agenda that is not predicated on Moore's Law. Prof. Hoefler asks Prof. Snir about his views on high-performance computing in the next decade.
Torsten Hoefler: Dear Prof. Snir, as one of the pioneers in parallel computing, what do you believe was the most important ingredient in its success today?
Marc Snir: "Necessity is the mother of invention": we have plenty of computational problems that require more performance than serial execution can provide. Parallelism happened because it was needed and because it was relatively simple to build parallel hardware by replicating serial hardware.
TH: What role did high-performance computing (HPC) play in the development of parallel programming as we know it today?
MS: Both MPI and OpenMP were developed in support of high-performance computing. They are finding their way into a broader range of applications, including data analytics, but the bulk of data analytics applications use frameworks that were developed for cloud computing. It will be interesting to see what happens when these two environments meld.
TH: The recent success of machine learning, especially deep neural networks in combination with massive data, has been enabled by parallelism and novel compute devices such as GPUs. Do you see this trend continuing, with HPC technologies driving large markets?
MS: I do not think that HPC technologies drive large markets. On the contrary, HPC finds ingenious ways of using components that have broad markets: workstations, PCs and blade servers have provided, in turn, the node technology for supercomputers; GPUs come from the graphics market; InfiniBand has a big market in shared storage. The examples can be multiplied, both in hardware and in software.
The large emerging market for "deep learning" is both a blessing and a curse: it motivates vendors to develop components that can be used in HPC, such as high-performance GPUs; it also motivates vendors to jack up the price of these components, since many users of "deep learning" are less price sensitive than HPC users. The question that is still unresolved is to what extent components developed for deep neural networks will be general-purpose enough to be useful for HPC. A GPU developed for the ML market emphasizes half-precision and single-precision, not double-precision. Perhaps not ideal, but useful: the use of lower precision is a good way of saving energy in HPC computations. The IBM TrueNorth neuromorphic chip, on the other hand, is unlikely to accelerate most HPC applications.
One set of technologies that is likely to help both large-scale ML and traditional HPC is the set that will facilitate the design and exploitation of heterogeneous nodes, which combine multiple types of compute units, specialized accelerators (including FPGAs), and storage devices.
These technologies include packaging solutions (e.g., for integrating multiple chiplets onto an interposer), standard hardware interfaces for connecting these components, and software APIs and programming models to handle heterogeneity.
TH: Moore's law is predicted to end soon; one could argue it has already ended with the recent slowdown of integration. How will HPC, as a forerunner of the computing industry, react?
MS: Moore's law has slowed and is ending soon. Furthermore, there is no proven logic technology that is ready to replace CMOS, so none will be used industrially in the coming decade: it takes a decade to go from laboratory demonstration to broad use in manufacturing. Even worse, none of the proposed replacement technologies (with the possible exception of tunneling devices) promises a better delay × power product; it is possible that there will be no progress in device performance beyond 2024. Additional advances will come from packaging, microarchitecture, software and algorithms. In the past, the best way to improve performance was to use the latest advance in silicon technology as quickly as possible. As a result, we have layered, inefficient software stacks, and we maintain backward compatibility with decades-old ISAs. So there are one to two orders of magnitude of performance improvement to be gained by "reducing friction" in our systems.
HPC has "performance" in its name; performance is a key concern for HPC, so the HPC community will be at the forefront of the effort to get more performance from barely improving device technology.
TH: What will be the "next big thing" to be adopted in the community? FPGAs, neuromorphic computing, quantum computing?
MS: I did a search on Google for "Computer-aided Programming": nothing has been published with this title in the last 20 years. It is ripe for being the "next big thing": can we use supercomputers to speed up the development of supercomputing codes?