130 Handbook of Big Data
where
$$
\tilde{B}_{k+1} :=
\begin{bmatrix}
\sigma_{1}^{(B)} &        &                  & \rho_{1} \\
                 & \ddots &                  & \vdots   \\
                 &        & \sigma_{k}^{(B)} & \rho_{k} \\
0                & \cdots & 0                & \alpha_{k+1}
\end{bmatrix}
\in \mathbb{R}^{(k+1)\times(k+1)},
\tag{8.20}
$$
which can be thought of as augmenting $K_{m-k}(A^{T}A, p_{k+1})$ with Equation 8.12 and $K_{m-k}(AA^{T}, q_{k+1})$ with Equation 8.15. The next vector $p_{k+2}$ can now be obtained normally from the GKLB algorithm, since
$$
A^{T}\tilde{Q}_{k+1}
= \big[\tilde{P}_{k+1},\; A^{T}q_{k+1}-\alpha_{k+1}p_{k+1}\big]
\begin{bmatrix} \tilde{B}_{k+1}^{T} \\ e_{k+1}^{T} \end{bmatrix}
\tag{8.21}
$$
and $(A^{T}q_{k+1})^{T}P_{m}v_{i}^{(B)} = 0$ for $1 \le i \le k$. The GKLB algorithm will set $p_{k+2} = (A^{T}q_{k+1} - \alpha_{k+1}p_{k+1})/\beta_{k+1}$ with $\beta_{k+1} = \|A^{T}q_{k+1} - \alpha_{k+1}p_{k+1}\|$, yielding
$$
A^{T}\tilde{Q}_{k+1}
= [\tilde{P}_{k+1},\; p_{k+2}]
\begin{bmatrix} \tilde{B}_{k+1}^{T} \\ \beta_{k+1}e_{k+1}^{T} \end{bmatrix}.
\tag{8.22}
$$
Thus, after continuing $m-k$ steps of the GKLB algorithm, we have, analogous to Equation 8.5, the modified GKLB (MGKLB) decomposition:
$$
A\tilde{P}_{m} = \tilde{Q}_{m}\tilde{B}_{m}, \qquad
A^{T}\tilde{Q}_{m} = \tilde{P}_{m}\tilde{B}_{m}^{T} + r_{m}e_{m}^{T},
\tag{8.23}
$$
where
$$
\tilde{B}_{m} =
\begin{bmatrix}
\tilde{B}_{k+1} & \beta_{k+1}e_{k+1} &        &             \\
                & \alpha_{k+2}       & \ddots &             \\
                &                    & \ddots & \beta_{m-1} \\
0               &                    &        & \alpha_{m}
\end{bmatrix}
\in \mathbb{R}^{m\times m},
\tag{8.24}
$$
$\tilde{P}_{m} = [\tilde{P}_{k},\, p_{k+1}, \ldots, p_{m}] \in \mathbb{R}^{n\times m}$ and $\tilde{Q}_{m} = [\tilde{Q}_{k},\, q_{k+1}, \ldots, q_{m}] \in \mathbb{R}^{\ell\times m}$ have orthonormal columns, and the residual vector $r_{m} \in \mathbb{R}^{n}$ satisfies $\tilde{P}_{m}^{T}r_{m} = 0$. Notice that the first $k$ columns
of Equation 8.23 satisfy Equation 8.8. The following MGKLB algorithm implements the
modification of the GKLB algorithm that was described above.
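Before turning to the listing, the augmented relation $A\tilde{P}_{k+1} = \tilde{Q}_{k+1}\tilde{B}_{k+1}$ encoded by (8.12), (8.13), (8.14), and (8.20) can be checked numerically. The NumPy sketch below is this chapter summary's own illustration, not the authors' code: the plain Golub–Kahan–Lanczos loop, variable names, and dimensions are all assumptions made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
l, n, m, k = 60, 40, 10, 3
A = rng.standard_normal((l, n))

# m steps of plain GKLB: A P_m = Q_m B_m,  A^T Q_m = P_m B_m^T + r_m e_m^T
P = np.zeros((n, m)); Q = np.zeros((l, m)); B = np.zeros((m, m))
p = rng.standard_normal(n); p /= np.linalg.norm(p)
for j in range(m):
    P[:, j] = p
    q = A @ p
    if j > 0:
        q -= B[j-1, j] * Q[:, j-1]
    q -= Q[:, :j] @ (Q[:, :j].T @ q)          # reorthogonalization
    B[j, j] = np.linalg.norm(q); Q[:, j] = q / B[j, j]
    r = A.T @ Q[:, j] - B[j, j] * P[:, j]
    r -= P[:, :j+1] @ (P[:, :j+1].T @ r)      # reorthogonalization
    if j < m - 1:
        B[j, j+1] = np.linalg.norm(r)
        p = r / B[j, j+1]

# Ritz augmentation, following (8.13), (8.14), and (8.20)
U, s, Vt = np.linalg.svd(B)                   # B_m = U Sigma V^T
rnorm = np.linalg.norm(r)
p_new = r / rnorm                             # p_{k+1}
rho = rnorm * U[m-1, :k]                      # rho_i = ||r_m|| e_m^T u_i^{(B)}
Pt = np.column_stack([P @ Vt[:k].T, p_new])   # tilde P_{k+1}
q_new = A @ p_new - (Q @ U[:, :k]) @ rho      # (8.12) before normalization
alpha = np.linalg.norm(q_new); q_new /= alpha
Qt = np.column_stack([Q @ U[:, :k], q_new])   # tilde Q_{k+1}
Bt = np.zeros((k+1, k+1))                     # tilde B_{k+1} of (8.20)
Bt[:k, :k] = np.diag(s[:k]); Bt[:k, k] = rho; Bt[k, k] = alpha

print(np.linalg.norm(A @ Pt - Qt @ Bt))       # close to machine precision
```

The final norm is small because column $k+1$ of the identity holds by construction and the first $k$ columns inherit $AP_m = Q_mB_m$.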
Algorithm 8.2 Modified GKLB
Input: $A \in \mathbb{R}^{\ell\times n}$ : large rectangular matrix,
$k$ : number of augmenting vectors,
$m$ : maximum size of MGKLB decomposition (8.23),
$\tilde{P}_{k+1} := [P_{m}v_{1}^{(B)}, \ldots, P_{m}v_{k}^{(B)}, p_{k+1}] \in \mathbb{R}^{n\times(k+1)}$ : orthogonal matrix (8.13),
$\tilde{Q}_{k} := [Q_{m}u_{1}^{(B)}, \ldots, Q_{m}u_{k}^{(B)}] \in \mathbb{R}^{\ell\times k}$ : orthogonal matrix (8.14),
$\tilde{B}_{k+1} \in \mathbb{R}^{(k+1)\times(k+1)}$ : matrix (8.20).
Output: $\tilde{P}_{m} := [\tilde{P}_{k+1}, p_{k+2}, \ldots, p_{m}] \in \mathbb{R}^{n\times m}$ : matrix with orthonormal columns,
$\tilde{Q}_{m} := [\tilde{Q}_{k}, q_{k+1}, \ldots, q_{m}] \in \mathbb{R}^{\ell\times m}$ : matrix with orthonormal columns,
$\tilde{B}_{m} \in \mathbb{R}^{m\times m}$ : matrix (8.24),
$r_{m} \in \mathbb{R}^{n}$ : residual vector.
1. $q_{k+1} := Ap_{k+1}$;
2. $q_{k+1} := q_{k+1} - \sum_{i=1}^{k}\rho_{i}Q_{m}u_{i}^{(B)}$; $\alpha_{k+1} := \|q_{k+1}\|$; $q_{k+1} := q_{k+1}/\alpha_{k+1}$; $\tilde{Q}_{k+1} := [\tilde{Q}_{k}, q_{k+1}]$;
3. for $j = k+1 : m$
4. $r_{j} := A^{T}q_{j} - \alpha_{j}p_{j}$;
5. Reorthogonalization: $r_{j} := r_{j} - \tilde{P}_{j}(\tilde{P}_{j}^{T}r_{j})$;
6. if $j < m$ then
7. $\beta_{j} := \|r_{j}\|$; $p_{j+1} := r_{j}/\beta_{j}$; $\tilde{P}_{j+1} := [\tilde{P}_{j}, p_{j+1}]$;
8. $q_{j+1} := Ap_{j+1} - \beta_{j}q_{j}$;
9. Reorthogonalization: $q_{j+1} := q_{j+1} - \tilde{Q}_{j}(\tilde{Q}_{j}^{T}q_{j+1})$;
10. $\alpha_{j+1} := \|q_{j+1}\|$; $q_{j+1} := q_{j+1}/\alpha_{j+1}$; $\tilde{Q}_{j+1} := [\tilde{Q}_{j}, q_{j+1}]$;
11. endif
12. endfor
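A direct NumPy transcription of the steps above can make the bookkeeping concrete. The sketch below is an illustration (function and variable names are assumptions, not the authors' code): it runs Algorithm 8.2 in the $k = 0$ case, where $\tilde{P}_{k+1}$ reduces to a single unit starting vector and $\tilde{Q}_{k}$ is empty, and checks both identities of the MGKLB decomposition (8.23).

```python
import numpy as np

def mgklb(A, Pt, Qt, Bt, m):
    """Hypothetical NumPy sketch of Algorithm 8.2 (MGKLB).

    Pt: tilde P_{k+1} (n x (k+1));  Qt: tilde Q_k (ell x k);
    Bt: tilde B_{k+1} of (8.20), with the alpha_{k+1} entry still unset.
    Returns tilde P_m, tilde Q_m, tilde B_m, and the residual r_m of (8.23).
    """
    ell, n = A.shape
    k = Qt.shape[1]
    P = np.zeros((n, m));   P[:, :k+1] = Pt
    Q = np.zeros((ell, m)); Q[:, :k] = Qt
    B = np.zeros((m, m));   B[:k+1, :k+1] = Bt
    # Steps 1-2: q_{k+1} := A p_{k+1} - sum_i rho_i Q_m u_i^{(B)}, normalized
    q = A @ P[:, k] - Q[:, :k] @ B[:k, k]
    B[k, k] = np.linalg.norm(q)
    Q[:, k] = q / B[k, k]
    # Steps 3-12
    for j in range(k, m):
        r = A.T @ Q[:, j] - B[j, j] * P[:, j]
        r -= P[:, :j+1] @ (P[:, :j+1].T @ r)          # step 5: reorthogonalize
        if j < m - 1:
            B[j, j+1] = np.linalg.norm(r)
            P[:, j+1] = r / B[j, j+1]
            q = A @ P[:, j+1] - B[j, j+1] * Q[:, j]
            q -= Q[:, :j+1] @ (Q[:, :j+1].T @ q)      # step 9: reorthogonalize
            B[j+1, j+1] = np.linalg.norm(q)
            Q[:, j+1] = q / B[j+1, j+1]
    return P, Q, B, r

# k = 0 case: tilde P_{k+1} is just a unit starting vector p_1
rng = np.random.default_rng(2)
A = rng.standard_normal((50, 30))
p1 = rng.standard_normal(30); p1 /= np.linalg.norm(p1)
P, Q, B, r = mgklb(A, p1[:, None], np.zeros((50, 0)), np.zeros((1, 1)), m=8)

e_m = np.zeros(8); e_m[-1] = 1.0
print(np.linalg.norm(A @ P - Q @ B))                         # ~ machine precision
print(np.linalg.norm(A.T @ Q - P @ B.T - np.outer(r, e_m)))  # ~ machine precision
```

Calling the same function with a populated $\tilde{Q}_{k}$ and the $\rho_i$ entries in the last column of Bt exercises the augmented restart path.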
The MGKLB algorithm can also be used in place of the GKLB algorithm when $k = 0$, in which case the input matrix $\tilde{P}_{k+1} := p_{k+1}$. We now describe the algorithm for computing the largest singular triplets by augmenting the GKLB method with Ritz vectors: the implicitly restarted Lanczos bidiagonalization algorithm augmented with Ritz vectors (IRLBA). The IRLBA presented below is a simplification of the actual computations carried out. For instance, the number of augmenting vectors used at each restart is typically larger than the number of desired singular triplets. It was observed in [5,20,21] that a shift too close to a desired singular value caused dampening of the associated desired singular vector and hence slowed down convergence considerably. To overcome this problem in this context, we can augment by $k + a_k$ (instead of $k$) singular triplets, where $a_k$ is chosen by the user, typically 2 or 3, such that $k + a_k \le m - 3$. The bound of 3 ensures that at least 3 orthogonalization steps can be carried out between restarts. See [5] for a heuristic strategy for adjusting $a_k$ in the context of the implicitly shifted GKLB method for least-squares problems. A similar heuristic strategy can be used for augmentation.
Algorithm 8.3 Implicitly Restarted Lanczos Bidiagonalization Algorithm Augmented with Ritz Vectors
Input: $A \in \mathbb{R}^{\ell\times n}$,
$p_{1} \in \mathbb{R}^{n}$ : initial vector of unit length,
$m$ : number of bidiagonalization steps,
$k$ : number of desired singular triplets,
$\delta$ : tolerance for accepting a computed approximate singular triplet, cf. (8.10).
Output: Computed set of approximate singular triplets $\{\sigma_{j}, u_{j}, v_{j}\}_{j=1}^{k}$ of $A$.
1. Compute the partial GKLB decomposition (8.5) using the GKLB Algorithm or the MGKLB Algorithm with $k = 0$.
2. Compute the singular value decomposition (8.7) of $B_{m}$.
3. Check convergence: if all $k$ desired singular triplets satisfy (8.10), then exit.
4. Determine the matrices $\tilde{P}_{k+1}$, $\tilde{Q}_{k}$, and $\tilde{B}_{k+1}$ by (8.13), (8.14), and (8.20), respectively.
5. Compute the partial MGKLB decomposition (8.23) using the MGKLB Algorithm.
6. Goto 2.
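A compact NumPy rendering of Algorithm 8.3, patterned after the appendix MATLAB code irlbar, may help connect the restart loop to the formulas. This is a sketch under stated assumptions: the acceptance test $\|r_m\|\,|e_m^{T}u_j^{(B)}| < \delta\,\sigma_{\max}$ follows the appendix code as a stand-in for (8.10), and all names, defaults, and the synthetic test matrix are this illustration's own.

```python
import numpy as np

def irlba(A, k, m=20, ak=3, delta=1e-5, maxit=1000, seed=0):
    # Hypothetical NumPy sketch of Algorithm 8.3 (IRLBA with Ritz augmentation).
    rng = np.random.default_rng(seed)
    ell, n = A.shape
    P = np.zeros((n, m)); Q = np.zeros((ell, m)); B = np.zeros((m, m))
    P[:, 0] = rng.standard_normal(n)
    J = 0                                   # first column to (re)build, 0-based
    smax = 1.0
    for _ in range(maxit):
        P[:, J] /= np.linalg.norm(P[:, J])
        q = A @ P[:, J]
        if J > 0:                           # MGKLB step 2: subtract rho_i Q u_i
            q -= Q[:, :J] @ B[:J, J]
        B[J, J] = np.linalg.norm(q)
        Q[:, J] = q / B[J, J]
        for i in range(J, m):               # GKLB continuation, reorthogonalized
            r = A.T @ Q[:, i] - B[i, i] * P[:, i]
            r -= P[:, :i+1] @ (P[:, :i+1].T @ r)
            if i < m - 1:
                B[i, i+1] = np.linalg.norm(r)
                P[:, i+1] = r / B[i, i+1]
                q = A @ P[:, i+1] - B[i, i+1] * Q[:, i]
                q -= Q[:, :i+1] @ (Q[:, :i+1].T @ q)
                B[i+1, i+1] = np.linalg.norm(q)
                Q[:, i+1] = q / B[i+1, i+1]
        U, s, Vt = np.linalg.svd(B)         # SVD of the projected matrix
        smax = max(smax, s[0])
        rnorm = np.linalg.norm(r)
        rho = rnorm * U[m-1, :k+ak]         # residual coefficients
        if np.max(np.abs(rho[:k])) < smax * delta:
            break                           # accept the k leading triplets
        # Restart: augment with k + ak Ritz vectors, cf. (8.13), (8.14), (8.20)
        P[:, :k+ak] = P @ Vt[:k+ak].T
        Q[:, :k+ak] = Q @ U[:, :k+ak]
        B = np.zeros((m, m))
        B[:k+ak, :k+ak] = np.diag(s[:k+ak])
        B[:k+ak, k+ak] = rho
        P[:, k+ak] = r / rnorm
        J = k + ak
    return Q @ U[:, :k], s[:k], P @ Vt[:k].T

# Demo on a matrix with known leading singular values 100, 50, 25
rng = np.random.default_rng(3)
Ul, _ = np.linalg.qr(rng.standard_normal((80, 60)))
Vr, _ = np.linalg.qr(rng.standard_normal((60, 60)))
s_true = np.concatenate([[100.0, 50.0, 25.0], rng.uniform(0.0, 1.0, 57)])
M = Ul @ np.diag(s_true) @ Vr.T
Uk, sk, Vk = irlba(M, k=3, m=12)
```

With the large gaps in this spectrum, the loop terminates after a handful of restarts and sk approximates the three leading singular values.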
8.4 Numerical Performance
A simple MATLAB code, irlbar, that implements IRLBA is given in the Appendix. For a full MATLAB implementation of IRLBA, see [2]. The speed of IRLBA is significant, as the following straightforward example shows. Consider the 1,977,885 × 109,900 Rucci1 matrix in the Florida Sparse Matrix Collection [8]. The matrix group Rucci contains matrices for least-squares problems and can be downloaded using the University of Florida's Sparse Matrix Collection program UFget. The MATLAB code irlbar was more than three times as fast as MATLAB's internal svds for computing the six largest singular triplets of the Rucci1 matrix within the same tolerance, $10^{-5}$. Speed was measured with MATLAB's tic and toc in version 8.3.0.532 (R2014a) on an iMac with 32 GB memory and a 3.5 GHz Intel processor.
Anecdotal evidence of the performance of the IRLBA in the statistical programming language R was provided by Bryan Lewis [23]. Lewis used IRLBA to compute the five largest singular triplets of the Netflix training dataset (a 480,189 × 17,770 matrix) in a few minutes on a laptop; see http://illposed.net/irlba.html for details.
The software implementations of IRLBA have existed for some time now in both MATLAB [2] and R [23]. Recently, Kane and Lewis [17] created the irlbpy package for Python, a pip-installable open-source implementation of the IRLBA that is available from GitHub at https://github.com/bwlewis/irlbpy. The irlbpy package is compatible with dense and sparse data, accepting either numpy 2D arrays or matrices, or scipy sparse matrices as input. The performance of irlbpy, the IRLBA augmented with Ritz vectors, is demonstrated in the graphs in Figure 8.1. The benchmarks in Figure 8.1 were performed on a MacBook Pro with a quad-core 2.7 GHz Intel Core i7 with 16 GB of 1600 MHz DDR3 RAM running Python version 2.7.3, Numpy version 1.7.0, and SciPy version 0.12.0. All matrices were square and randomly generated. The Elements values in the graphs in Figure 8.1 give the number of elements of the matrices. These graphs show that in practice, when searching for the largest singular triplets, IRLBA scales linearly with the size of the data, which gives it a tremendous advantage over traditional SVD methods. The IRLBA is particularly well suited to problems in which the two matrix dimensions are not too different, which are often the most computationally challenging ones. All examples in Figure 8.1 can be found at https://github.com/bwlewis/irlbpy.
8.5 Conclusion
This chapter provides an overview of the IRLBA method augmented with Ritz vectors
developed in 2005 by Baglama and Reichel [1] and shows that the method is well suited
for finding the largest singular triplets of very large matrices. The method can easily be
implemented in a variety of programming languages and is computationally fast when
compared to similar methods. We thank Michael Kane and Bryan Lewis for all of their
work in this book and with the software implementations of IRLBA.
[Figure 8.1: three panels plotting Seconds against Elements (0 to 1.0e+07); panel (a) compares the algorithms IRLB and SVD; panels (b) and (c) show curves for nu = 1, 5, 10, 15, and 20.]
FIGURE 8.1
(a) Performance comparison of the irlbpy and the Python numpy implementation of the SVD calculating the 10 largest singular triplets of square random matrices. (b) The CPU time required for irlbpy on dense square random matrices for specified values of nu (the number of singular vectors). (c) The CPU time required for irlbpy on sparse square random matrices for specified values of nu (the number of singular vectors).
Appendix
function [U,S,V] = irlbar(A)
% Sample MATLAB code for IRLBA Augmentation Ritz.
% Complete MATLAB code: http://www.netlib.org/numeralgo/na26.tgz
% Handbook of Big Data
% IRLBA: Fast Partial Singular Value Decomposition
% Author: James Baglama
m=20;k=6;ak=3;Smax = 1; J = 1; delta = 1d-5;
P = randn(size(A,2),1);
for iter=1:1000
  P(:,J) = P(:,J)/norm(P(:,J));
  Q(:,J) = A*P(:,J);
  if J > 1, Q(:,J) = Q(:,J) - Q(:,1:J-1)*B(1:J-1,J); end
  B(J,J) = norm(Q(:,J));
  Q(:,J) = Q(:,J)/B(J,J);
  for i=J:m
    r = (Q(:,i)'*A)' - B(i,i)*P(:,i);
    r = r - P(:,1:i)*(r'*P(:,1:i))';   % reorthogonalization
    if i < m
      B(i,i+1) = norm(r);
      P(:,i+1) = r/B(i,i+1);
      Q(:,i+1) = A*P(:,i+1) - B(i,i+1)*Q(:,i);
      B(i+1,i+1) = norm(Q(:,i+1));
      Q(:,i+1) = Q(:,i+1)/B(i+1,i+1);
    end
  end
  [U,S,V] = svd(B); Smax = max(Smax,S(1,1));
  rnorm = norm(r);
  P(:,1:k+ak) = P*V(:,1:k+ak);
  Q(:,1:k+ak) = Q*U(:,1:k+ak);
  S = S(1:k+ak,1:k+ak);
  rho = rnorm*U(m,1:k+ak);
  if max(abs(rho(1:k))) < Smax*delta || iter == 1000
    V = P(:,1:k); U = Q(:,1:k); S = S(1:k,1:k);
    return;
  end
  B = [S, rho'];
  J = k+ak+1;
  P(:,J) = r/rnorm;
end
References
1. James Baglama and Lothar Reichel. Augmented implicitly restarted Lanczos bidiagonalization methods. SIAM Journal on Scientific Computing, 27(1):19–42, 2005.
2. James Baglama and Lothar Reichel. Primer for the Matlab functions IRLBA and
IRLBABLK. Available online. http://www.netlib.org/numeralgo/na26.tgz, 2006.
3. James Baglama and Lothar Reichel. Restarted block Lanczos bidiagonalization
methods. Numerical Algorithms, 43(3):251–272, 2006.
4. James Baglama and Lothar Reichel. An implicitly restarted block Lanczos bidiagonalization method using Leja shifts. BIT Numerical Mathematics, 53(2):285–310, 2013.
5. James Baglama and Daniel Richmond. Implicitly restarting the LSQR algorithm. Electronic Transactions on Numerical Analysis, 42:85–105, 2014.