CPFloat: A C library for emulating low-precision arithmetic

Fasi, Massimiliano and Mikaitis, Mantas (2020) CPFloat: A C library for emulating low-precision arithmetic. [MIMS Preprint]

There is a more recent version of this item available.

Abstract

Low-precision floating-point arithmetic can be simulated in software by executing each arithmetic operation in hardware and rounding the result to the desired number of significant bits. For IEEE-compliant formats, rounding requires only standard mathematical library functions, but handling subnormals, underflow, and overflow demands special attention, and numerical errors can cause mathematically correct formulae to behave incorrectly in finite-precision arithmetic. Moreover, the ensuing algorithms are not necessarily efficient, as the library functions these techniques build upon are typically designed to handle a broad range of cases and may not be optimized for the specific needs of rounding algorithms. CPFloat is a C library that offers efficient routines for rounding arrays of binary32 and binary64 numbers to lower precision. The software exploits the bit-level representation of the underlying formats and performs only low-level bit manipulation and integer arithmetic, without relying on costly library calls. In numerical experiments, the new techniques bring a considerable speedup (typically one order of magnitude or more) over existing alternatives in C, C++, and MATLAB. To the best of our knowledge, CPFloat is currently the most efficient and complete library for experimenting with custom low-precision floating-point arithmetic available in any language.
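
As an illustration of the kind of bit-level rounding the abstract describes, the following minimal sketch rounds a binary32 value to p significant bits using only integer operations. It is not CPFloat's API or its actual algorithm: the function name round_to_precision is hypothetical, the rounding mode is fixed to round-to-nearest with ties to even, and the exponent range of binary32 is kept unchanged, so overflow, underflow, and subnormals of a narrower target format are deliberately not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of CPFloat): round a binary32 value to
 * p significant bits, counting the implicit leading bit, with
 * round-to-nearest, ties-to-even. Assumes 2 <= p <= 24. The exponent
 * range is left as in binary32, so range reduction is out of scope. */
static float round_to_precision(float x, int p)
{
    assert(p >= 2 && p <= 24);

    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);          /* well-defined type punning */

    /* Infinities and NaNs pass through; p == 24 means nothing to discard. */
    if ((bits & 0x7F800000u) == 0x7F800000u || p == 24)
        return x;

    int drop = 24 - p;                       /* trailing bits to discard */
    uint32_t mask = (1u << drop) - 1u;       /* selects the discarded bits */
    uint32_t half = 1u << (drop - 1);        /* half a unit in the last kept place */
    uint32_t lsb  = (bits >> drop) & 1u;     /* parity of the last kept bit */

    /* Adding half - 1 + lsb rounds up when the discarded part exceeds half,
     * or equals half and the kept part is odd. A carry out of the fraction
     * propagates into the exponent field, which yields the correct result. */
    bits += (half - 1u) + lsb;
    bits &= ~mask;                           /* clear the discarded bits */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, with p = 8 (the precision of bfloat16), round_to_precision(3.14159274f, 8) evaluates to 3.140625f, and the tie 1.00390625f (that is, 1 + 2^-8) rounds to the even neighbour 1.0f. Operating on the integer encoding avoids calls to mathematical library functions altogether, which is the general idea behind the speedups reported in the abstract.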

Item Type:	MIMS Preprint
Subjects:	MSC 2010, the AMS's Mathematics Subject Classification > 65 Numerical analysis
Depositing User:	Mr Massimiliano Fasi
Date Deposited:	20 Oct 2020 10:59
Last Modified:	20 Oct 2020 10:59
URI:	https://eprints.maths.manchester.ac.uk/id/eprint/2785

Available Versions of this Item

CPFloat: A C library for emulating low-precision arithmetic. (deposited 20 Oct 2020 10:59) [Currently Displayed]
- CPFloat: A C library for simulating low-precision arithmetic. (deposited 06 Mar 2022 09:32)
  - CPFloat: A C library for simulating low-precision arithmetic. (deposited 22 May 2022 06:54)
    - CPFloat: A C library for simulating low-precision arithmetic. (deposited 23 Oct 2022 14:44)
