Numerical Behavior of the NVIDIA Tensor Cores

Fasi, Massimiliano and Higham, Nicholas J. and Mikaitis, Mantas and Pranesh, Srikara (2020) Numerical Behavior of the NVIDIA Tensor Cores. [MIMS Preprint]

Warning
There is a more recent version of this item available.

Abstract

We explore the floating-point arithmetic used by the NVIDIA Volta tensor cores, which are hardware accelerators for mixed-precision matrix multiplication. We investigate what precision is used for intermediate results, whether subnormal numbers are supported, what rounding mode is used, in which order the operations in the dot products arising in the matrix multiplication are performed, and whether partial sums are normalized. These aspects are not documented by NVIDIA, and we gain insight by running carefully designed numerical experiments on these hardware accelerators. Knowing the answers to these questions is important if one wishes to: 1) build hardware that computes a matrix-matrix product matching the results of NVIDIA tensor cores; 2) achieve bit-reproducible results when designing, on conventional hardware with IEEE 754 floating-point arithmetic, code meant to run on NVIDIA tensor cores; and 3) understand the differences between results produced by code that utilizes tensor cores and code that uses only IEEE 754-compliant arithmetic operations. As an additional result, we point out a non-monotonicity issue that arises in floating-point multi-operand addition when the intermediate results are not normalized.
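As a rough illustration of the kind of numerical experiment the abstract describes, the following sketch shows how a carefully chosen input can reveal the precision of intermediate results. It is a CPU simulation using NumPy, not the authors' code, and it does not run on tensor cores. The function name dot_with_accumulator and the input values are assumptions made for this example only.

```python
import numpy as np

def dot_with_accumulator(a, b, acc_dtype):
    """Dot product of binary16 vectors with partial sums rounded to acc_dtype.

    Hypothetical helper for illustration: each partial sum is formed in
    binary64 and then rounded to acc_dtype (np.float16 or np.float32).
    """
    acc = acc_dtype(0)
    for x, y in zip(a, b):
        # The product of two binary16 values is exact in binary64.
        prod = np.float64(x) * np.float64(y)
        # Round the updated partial sum to the accumulator format.
        acc = acc_dtype(np.float64(acc) + prod)
    return acc

# Probe input: 2**-12 is representable in binary16, but 1 + 2**-12 is not,
# because the spacing of binary16 numbers just above 1 is 2**-10.
a = np.array([1.0, 2.0**-12], dtype=np.float16)
b = np.array([1.0, 1.0], dtype=np.float16)

low = dot_with_accumulator(a, b, np.float16)   # small term is rounded away
high = dot_with_accumulator(a, b, np.float32)  # small term is retained

print(float(low))   # 1.0
print(float(high))  # 1.000244140625
```

If a device returned the second value for this input, its partial sums would have to carry more precision than binary16. The same idea of choosing inputs whose exact result separates two candidate behaviours can be applied to rounding modes, subnormal support, and summation order.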

Item Type: MIMS Preprint
Subjects: MSC 2010, the AMS's Mathematics Subject Classification > 65 Numerical analysis
MSC 2010, the AMS's Mathematics Subject Classification > 68 Computer science
Depositing User: Mr Mantas Mikaitis
Date Deposited: 23 Apr 2020 19:33
Last Modified: 23 Apr 2020 19:33
URI: https://eprints.maths.manchester.ac.uk/id/eprint/2761
