Age of Mixed-Precision: Algorithms, Libraries, and Applications

Organizer: Piotr Luszczek (MIT LLSC & UTK ICL)

This is a special session of IEEE HPEC 2025.

AI has triggered an explosion of new capabilities for computing with mixed-precision number formats. Algorithms that can take advantage of these innovations can be accelerated by orders of magnitude, and future progress in computational methods depends on the development of these new approaches. This session will focus on the use of mixed-precision formats across a variety of hardware platforms, highlighting advancements in algorithms, software, libraries, and applications that have incorporated these new number formats and leveraged their potential for improvements in performance, bandwidth, and energy consumption.

Invited talks:

Abstracts and Bios

Why Exact Summation and Dot Products Obviate the Need for Mixed Precision
Abstract
In a stable computation, the relative accuracy of outputs is similar to that of inputs. It is wasteful to use precision that exceeds the input relative accuracy (call that the working precision) or the discretization error of an algorithm, with two major exceptions: accumulation of many small quantities into a large quantity, and dot products. Both expose the non-associativity of floating-point addition. Monte Carlo methods work poorly with 32-bit floats because accumulations stop changing after billions of quantities are summed. Linear algebra, computer graphics, and AI training and inference rely heavily on dot products as their kernel operation, which drives programmers to use twice the working precision for the accumulator.
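A minimal NumPy sketch of the stagnation effect described above (an editorial illustration, not material from the talk; the constants are chosen only to expose the rounding):

import numpy as np

# Every integer up to 2**24 is exactly representable in float32; beyond that
# point adjacent float32 values are 2 apart, so adding 1.0 rounds back to the
# same value and the running sum silently stops changing.
acc32 = np.float32(2**24 - 50)    # float32 accumulator, just below the cliff
acc64 = np.float64(2**24 - 50)    # the same sum with twice the working precision
for _ in range(100):
    acc32 += np.float32(1.0)
    acc64 += np.float64(1.0)

print(acc32)   # 16777216.0: the last 50 additions were lost
print(acc64)   # 16777266.0: exact, because the wider accumulator has headroom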
The posit real number format offers the ability to increase or reduce precision one bit at a time without decoding the number (that is, without extracting the exponent and significand). More importantly, it unburdens the programmer from having to make such precision decisions tediously, by instead providing a means for computing sums and dot products with no rounding or overflow. This allows many 64-bit floating-point workloads to instead use 16-bit posits as the working precision, saving memory, bandwidth, energy, and execution time without reducing programmer productivity.
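The rounding-free accumulation described above can be emulated in ordinary Python. The sketch below is an editorial illustration of the idea, using exact rational arithmetic rather than an actual posit quire, with made-up example vectors; it shows how deferring all rounding to a single final step rescues an ill-conditioned dot product:

from fractions import Fraction
import numpy as np

def exact_dot(x, y):
    """Dot product with a rounding-free accumulator and one final rounding.

    Every binary float is an exact rational, so converting inputs to Fraction,
    multiplying, and summing introduces no error; rounding happens exactly
    once, when the result is converted back to a float. A posit quire plays
    the same role with a fixed-width accumulator in hardware.
    """
    acc = Fraction(0)
    for xi, yi in zip(x, y):
        acc += Fraction(float(xi)) * Fraction(float(yi))
    return float(acc)   # the only rounding in the entire computation

x = np.array([1e8, 1.0, -1e8], dtype=np.float32)
y = np.array([1.0, 1.0,  1.0], dtype=np.float32)
print(np.dot(x, y))      # typically 0.0: the 1.0 is lost in the float32 accumulator
print(exact_dot(x, y))   # 1.0, the exact answer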
Short Bio
Prof. John L. Gustafson (www.johngustafson.net) is Chief Scientist of Vq Research and a Visiting Scholar at Arizona State University. He is the inventor of several novel forms of computer arithmetic, first introduced in his 2015 book, The End of Error: Unum Computing. He is best known for his 1988 argument showing that parallel processing performance need not be limited by "Amdahl's law," a result now generally known as Gustafson's law. He previously served as Senior Fellow and Chief Product Architect at AMD and as a Director of Intel Labs. He is a recipient of the inaugural Gordon Bell Prize and a Golden Core member of IEEE.
Mixed Feelings about Mixed Precisions
Abstract
The future of simulations lies in leveraging hardware features designed for the AI market, particularly in low-precision computations. Modern NVIDIA GPUs exemplify this trend, offering significant performance gains through low-precision computations, resulting in reduced elapsed time, smaller memory footprints, and energy savings. We harness these capabilities to develop fast mixed-precision linear algebra algorithms. Our adaptive precision conversion strategy dynamically adjusts computation accuracy, maintaining high precision only where necessary within the matrix operator while still meeting the accuracy requirements of the application. This talk will illustrate how these algorithms revolutionize computational efficiency for geospatial statisticians, bioinformaticians, and geophysicists, with significant implications for environmental computational statistics, genome-wide association studies in computational biology, and seismic processing.
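As a rough sketch of the adaptive precision conversion idea described in the abstract (an editorial illustration only; the tile size, the norm-based criterion, and the threshold are assumptions, not the speaker's actual algorithm), each tile of the operator can be stored in the cheapest format whose rounding error, scaled by the tile's norm, stays below the accuracy the application requires:

import numpy as np

def plan_tile_precisions(A, tile=256, tol=1e-6):
    """Choose a storage precision for each tile of A.

    Storing a tile in a format with unit roundoff u perturbs the operator by
    roughly u * ||tile||, so tiles with small norms tolerate low precision
    while the overall perturbation stays below tol * ||A||.
    """
    unit_roundoff = {np.float16: 2.0**-11, np.float32: 2.0**-24, np.float64: 2.0**-53}
    norm_A = np.linalg.norm(A)
    plan = {}
    n = A.shape[0]
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            block_norm = np.linalg.norm(A[i:i+tile, j:j+tile])
            # pick the cheapest format whose rounding stays under the budget
            for dtype in (np.float16, np.float32, np.float64):
                if unit_roundoff[dtype] * block_norm <= tol * norm_A:
                    plan[(i, j)] = dtype
                    break
            else:
                plan[(i, j)] = np.float64
    return plan

# Hypothetical exponential covariance matrix: off-diagonal tiles decay rapidly,
# so most of the operator ends up in float16 or float32 storage.
x = np.linspace(0.0, 1.0, 1024)
A = np.exp(-50.0 * np.abs(x[:, None] - x[None, :]))
plan = plan_tile_precisions(A)
print({d.__name__: sum(v is d for v in plan.values()) for d in (np.float16, np.float32, np.float64)})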
Short Bio
Hatem holds the position of Principal Research Scientist at KAUST, where he also advises several KAUST students in their MS and PhD research. His research interests include parallel numerical algorithms, parallel programming models, mixed-precision computations, low-rank matrix approximations, performance optimizations for manycore architectures, and high performance computing. He has contributed to the integration of numerical algorithms into mainstream vendors’ scientific libraries such as NVIDIA cuBLAS and HPE/Cray LibSci. He collaborates with domain scientists, including astronomers, statisticians, computational chemists, bioinformaticians, and geophysicists, on preparing their applications to meet the challenges of exascale. He has received best paper awards at the EuroPar, ACM PASC, and ISC conferences. He has co-authored all four of KAUST's Gordon Bell finalist papers since 2022. In November 2024, he received the prestigious ACM Gordon Bell Prize (shared) in climate modeling for his contributions to developing an exascale climate emulator.