Parallel Global and Local Convergent Algorithms for Solving the Inverse Additive Singular Value Problem



Download 0.58 Mb.
Page3/3
Date09.06.2018
Size0.58 Mb.
#53782
1   2   3

4.2 Parallel LP algorithm

Each processor execute the next parallel algorithm of LP with the portion of data it contains.


Parallel LP algorithm for IASVP
1. T = [ trace(AitAr) ]r,i=1,l

(* using ddot of BLAS,MPI_REDUCE of MPI *)

2. [L,U] = lu(T)

(* using pdgetrf, pdgetrs of ScaLAPACK *)

3. Compute T-1

(* using pdgetri of ScaLAPACK *)

4. stop = 0

5. For k = 0,1,..., While stop = 0

5.1 A(k) = A0 + c1(k)A1 + ... + cl(k)Al

(* using daxpy of BLAS *)

5.2 [P(k),S(k),Q(k)] = svd(A(k))

(* using pdgesvd of ScaLAPACK *)

5.3 X(k) = P(k)S*Q(k)t

(* using dscal of BLAS,pdgemm of PBLAS *)

5.4 d(k)=[trace(Art (X(k) - A0t)]r =1,l

(* using ddot of BLAS,

MPI_REDUCE of MPI,*)

5.5 Compute c(k+1) solving Tc(k+1) = d(k)

(* using pdgemv of PBLAS *)

5.6 Broadcast c(k+1) to all processors

(* using dgebs2d,dgebr2d of BLACS *)

5.7 error = || c(k+1)- c(k)||2

(* using daxpy, dnrm of BLAS *)

5.8 If error < tol Then stop = 1


The theoretical computational cost is:

TA = O() + k O() Flops

TC=O(Prc m2+Prc m2) + k O(Prc m).

4.3 Experimental results

We present some results from numerical experiences carried out with the two methods proposed in this work for solving the IASVP. The algorithms have been implemented in FORTRAN90 language and have been tested and validated on a cluster of PCs, formed by six PCs, each one with Pentium Xeon processors, working at 2 Ghz, with 1 Gbytes of RAM, and connected through a SCI network. For the sake of simplicity we only show a few results. A more detailed analysis and figures can be found in [Flo04].



First, we analyse the convergence of both algorithms. In Table 1 the convergence of MI and LP are tested with different c(0), with m=n=l=10. MI shows a fast convergence when c(0) is close to c*, and give good approximations to S*; LP always converges to S* although with a high number of iterations and with less precision.
Table 1. Convergence of MI and EP algorithms, with differents initial points c(0).

c(0) = c* +  

MI

i

K

||S(k)-S*||2

Convergence

1e-3

2

2e-9

yes

1e-2

3

4e-9

yes

1e-1

14000

5e2

no

1e0

14000

3e1

no

1e1

14000

2e3

no

1e3

14000

5e0

no

c(0) = c* +  

LP

i

K

||S(k)-S*||2

Convergence

1e-3

117

2e-5

yes

1e-2

395

3e-5

yes

1e-1

757

8e-5

yes

1e0

6872

1e-4

yes

1e1

2234

4e-5

yes

1e3

2183

4e-5

yes

We can take advantage of the global convergence properties of LP algorithm to transform the MI into a global convergent algorithm. In this sense we have used as initial guest for the MI algorithm a value c(k) obtained from the LP algorithm with a small accuracy. The idea was first proposed by Chen and Chu in [11] and the objetive is to find a c(0) for MI which is close enough to c*, to guarantee the quadratic convergence of Newton’s method.

In Table 2 we present a typical test, for this case showing the convergence of MI algorithm when c(0) is chosen from LP iteration. The test has been carried out in a problem with size m=n=l=10. Similar results have been found with other sizes.
Table 2. Convergence of MI taking c(0) = c(k) , where c(k) is LP result.


c(0)

K

||S(k)-S*||2




c* + 1e-1

23

1e-2

LP

c* + 1e1

950

4e-3

c* + 1e3

899

4e-3

c(23) of LP

3

1e-10

MI

c(950) of LP

3

6e-10

c(899) of LP

3

6e-10

Table 3. Runtime and speedup of MI.



 

Initial part, steps 1,2,3

Prc

Runtime (seconds)

Speedup

1

14

46

107

312

1

1

1

1

2

8

26

58

177

1.8

1.8

1.8

1.8

4

4

187

32

58

3.5

0.2

3.3

5.4

6

3

7

21

37

4.7

6.6

5.1

8.4

m

512

768

1024

1408

512

768

1024

1408

 

One step of loop 4. For k=0,1,2,...

Prc

Runtime (seconds)

Speedup

1

355

1741

5458

19075

1

1

1

1

2

184

893

2784

9656

1.9

1.9

2.0

2.0

4

90

471

1434

4994

3.9

3.7

3.8

3.8

6

60

323

1010

3471

5.9

5.4

5.4

5.5

m

512

768

1024

1408

512

768

1024

1408

Now, we analyze the performance of the parallel algorithms. We use parallel runtime and speedup [8] to assess the MI and LP. Both methods have a different parallel behaviour of initial part and iterative part of their algorithms. Besides, the total time depends on the number of iterations and this depends on how far is the initial point from the solution. Thus in Table 3, we show the runtime and speedup of MI separately for both parts. Also, in Table 4 we show the runtime and speedup of LP for both parts. The results are shown for different sizes of problem, with m=n=l, and different number of processors.


Table 4. Runtime and speedup of LP.

 

Initial part, steps 1,2,3,4

Prc

Runtime (seconds)

Speedup

1

260

1259

4068

13672

1

1

1

1

2

128

630

1967

6843

2.0

2.0

2.1

2.0

4

69

339

1053

3531

3.8

3.7

3.9

3.9

6

52

225

706

2376

5.0

5.6

5.8

5.8

m

512

768

1024

1408

512

768

1024

1408

 

One step of loop 5. For k=0,1,2,...

Prc

Runtime (seconds)

Speedup

1

23

79

182

552

1

1

1

1

2

17

54

126

359

1.4

1.5

1.4

1.5

4

7

24

65

130

3.3

3.3

2.8

4.2

6

5

12

45

64

4.6

6.6

4.0

8.6

m

512

768

1024

1408

512

768

1024

1408

As it can be appreciated, parallel performance of MI and LP algorithms are good enough, and better as the problem size increases, alleviating the high computational cost of the sequential algorithms.



5 Conclusions

We have explored two parallel algorithms to solve the IASVP, by adapting some well-know methods for solving the IAEP. Convergence features of both algorithms have been studied and contrasted with numerical experiences.



LP algorithm is a good alternative to solve the problem of the local convergence of MI algorithm if it is used in the estimate of the initial value, c(0), for MI.

The algorithms developed have been parallelized with good performance on a cluster of PCs, by using standard public domain software: MPI, BLAS, LAPACK, ScaLAPACK, thus guaranteeing the portability of the code.

Performances of both parallel algorithms are fairly good; the speedups reached are near to the optimum, especially in the case of large size problems.


Acknowledgement


This work has been supported by Spanish CICYT Grant TIC2003-08238-C02-02

References:

[1] Anderson, E. et al., LAPACK User Guide, SIAM, 1995.

[2] Blackford, L. et al., ScaLAPACK User’s Guide, SIAM, 1997.

[3] Choi, J et al., A Proposoal for a Set of Parallel Basic Linear Algebra Subprograms, Tecnical report ut-cs-95-292, Department of Computer Science, University of Tenessee, 1995.

[4] Groupp, W. et al., Using MPI: Portable Parallel Programming with Message Passing Interface. MIT Press, 1994.

[5] Dennis, J. and Schnabel, R., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice Hall, 1983.

[6] Dongarra, J. and Van de Geijn A., Two Dimensional Basic Linear Algebra Comunications Subprograms, Tecnical report st-cs-91-138, Department of Computer Science, Univertisy of Tenessee, 1991.

[7] Hammarling, S. et al., An Extended Set of Fortran Basic Linear Algebra Subroutines, ACM Trans. Methemathical Software, 1988.

[8] Kumar, V. et al., Introduction to Parallel Computing. Design and Analysis Algorithms, The Benjamin/Cummings Publishing Company, 1994.

[9] Bai, Z. et al., The Intercept Method for Inverse Singular Value Problems, 2003.

[10] Chen, X. and Chu, T., On the Least Squares Solution of Inverse Eigenvalue Problems, SIAM, Journal on Numerical Analysis, Vol. 33, No. 6, 1996, pp 2417-2430.

[11] Chu M., Numerical Methods for Inverse Singular Value Problems. SIAM, Journal Numerical Analysis, Vol. 29, 1992, pp 885-903.

[12] Friedland, S. et al., The Formulation and Analysis of Numerical Methods for Inverse Eigenvalue Problems, SIAM, Journal on NumericalAnalysis, Vol. 4, No. 3, 1987, pp 634-667.

[13] Flores, G., Un conjunto de métodos para la resolución del PIAVS, Technical Report, DSIC/04, Departamento de Sistemas y Computación, Universidad Politécnica de Valencia, 2004.



[14] Engl, H. and Kügler, P. Nonlinear Inverse Problems: Theoretical Aspects and Some Industrial Applications. Institute for Pure&Applied Mathematics. University of California, Los Angeles.

http://www.ipam.ucla.edu/programs/invtut/, 2003.



Download 0.58 Mb.

Share with your friends:
1   2   3




The database is protected by copyright ©ininet.org 2024
send message

    Main page