Blanchard, Pierre and Higham, Desmond J. and Higham, Nicholas J. (2019) Accurately Computing the LogSumExp and Softmax Functions. [MIMS Preprint]
This is the latest version of this item.
Abstract
Evaluating the logsumexp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic, and shifting can introduce subtractive cancellation. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones, so can safely be used, but that a division-free variant of softmax can suffer from loss of accuracy.
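The shifted formulas mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, the shift is assumed to be the largest input, a = max(x), and the "division-free" softmax is assumed to mean exponentiating x_j minus the computed logsumexp value rather than dividing by the sum. Inputs are assumed to be finite, non-empty lists of floats.

```python
import math


def logsumexp_shifted(x):
    """Shifted log-sum-exp: a + log(sum_i exp(x_i - a)), with a = max(x).

    Every exponent x_i - a is <= 0, so exp cannot overflow.
    """
    a = max(x)
    return a + math.log(sum(math.exp(xi - a) for xi in x))


def softmax_shifted(x):
    """Shifted softmax: exp(x_j - a) / sum_i exp(x_i - a), with a = max(x)."""
    a = max(x)
    e = [math.exp(xi - a) for xi in x]
    s = sum(e)
    return [ej / s for ej in e]


def softmax_division_free(x):
    """Assumed division-free form: exp(x_j - logsumexp(x)).

    The subtraction x_j - logsumexp(x) happens before the exponential,
    so any rounding error in it is carried into the result.
    """
    lse = logsumexp_shifted(x)
    return [math.exp(xj - lse) for xj in x]


if __name__ == "__main__":
    x = [1000.0, 1000.0]  # math.exp(1000.0) alone would raise OverflowError
    print(logsumexp_shifted(x))
    print(softmax_shifted(x))
    print(softmax_division_free(x))
```

For inputs such as [1000.0, 1000.0], evaluating math.exp(1000.0) directly raises OverflowError in Python, whereas the shifted forms only ever exponentiate non-positive numbers.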
Item Type:  MIMS Preprint 

Uncontrolled Keywords:  logsumexp, softmax, floating-point arithmetic, rounding error analysis, overflow, underflow, condition number 
Subjects:  MSC 2010, the AMS's Mathematics Subject Classification > 65 Numerical analysis 
Depositing User:  Nick Higham 
Date Deposited:  31 Jan 2020 09:13 
Last Modified:  31 Jan 2020 09:13 
URI:  http://eprints.maths.manchester.ac.uk/id/eprint/2744 
Available Versions of this Item

Accurate Computation of the LogSumExp and Softmax Functions. (deposited 08 Sep 2019 11:10)
 Accurately Computing the LogSumExp and Softmax Functions. (deposited 31 Jan 2020 09:13) [Currently Displayed]