Howell, G. W. and Demmel, J. W. and Fulton, C. T. and Hammarling, S. and Marmol, K. (2006) Cache Efficient Bidiagonalization Using BLAS 2.5 Operators. [MIMS Preprint]
Abstract
On cache-based computer architectures using current standard algorithms, Householder bidiagonalization requires a significant portion of the execution time for computing matrix singular values and vectors. In this paper we reorganize the sequence of operations for Householder bidiagonalization of a general m × n matrix, so that two (_GEMV) vector-matrix multiplications can be done with one pass of the unreduced trailing part of the matrix through cache. Two new BLAS 2.5 operations approximately cut in half the transfer of data from main memory to cache. We give detailed algorithm descriptions and compare timings with the current LAPACK bidiagonalization algorithm.
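The idea of doing two matrix-vector products in one pass over the matrix can be illustrated with a minimal sketch. The abstract does not define the two BLAS 2.5 operators, so the form used here (x = beta·Aᵀy + z followed by w = alpha·A·x) is an assumption, and the function name `fused_gemvt` and its parameters are hypothetical. This is plain Python for readability, not a tuned kernel: the point is only that each column of A is read once and used for both products.

```python
def fused_gemvt(A, y, z, alpha=1.0, beta=1.0):
    """Illustrative fused pair of matrix-vector products (assumed form):
         x = beta * A^T y + z
         w = alpha * A x
    A is a list of m rows, each of length n; y has length m, z has length n.
    """
    m = len(A)
    n = len(A[0]) if m else 0
    x = [0.0] * n
    w = [0.0] * m
    for j in range(n):
        # Read column j once (in a real kernel, a block of columns sized to fit cache).
        col = [A[i][j] for i in range(m)]
        # First product: entry j of x needs only column j of A.
        x[j] = beta * sum(c * yi for c, yi in zip(col, y)) + z[j]
        # Second product: accumulate column j's contribution to w while it is still at hand.
        for i in range(m):
            w[i] += alpha * col[i] * x[j]
    return x, w


x, w = fused_gemvt([[1, 2], [3, 4], [5, 6]], y=[1, 0, 1], z=[0, 1])
print(x, w)
```

Computing the two products separately would stream A from main memory twice; in this arrangement the second product reuses each column immediately after the first, which is the kind of saving the abstract describes.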
Item Type: | MIMS Preprint |
---|---|
Subjects: | MSC 2010, the AMS's Mathematics Subject Classification > 65 Numerical analysis |
Depositing User: | Sven Hammarling |
Date Deposited: | 07 Apr 2006 |
Last Modified: | 08 Nov 2017 18:18 |
URI: | https://eprints.maths.manchester.ac.uk/id/eprint/210 |