| Metric      | Value |
| # lines     | 205   |
| # code      | 205   |
| # comment   | 0     |
| # blank     | 0     |
| # Variables | 33    |
| # Callers   | 1     |
| # Callings  | 0     |
| # Words     | 55    |
| # Keywords  | 35    |

Purpose
=======
PDLAUU2 computes the product U * U' or L' * L, where the triangular
factor U or L is stored in the upper or lower triangular part of
the matrix sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
If UPLO = 'U' or 'u' then the upper triangle of the result is stored,
overwriting the factor U in sub( A ).
If UPLO = 'L' or 'l' then the lower triangle of the result is stored,
overwriting the factor L in sub( A ).
This is the unblocked form of the algorithm, calling Level 2 BLAS.
No communication is performed by this routine; the matrix to operate
on should be strictly local to one process.
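
As a small worked illustration of the two products described above, the following Python sketch forms U * U' and L' * L directly from their definitions and keeps only the triangle that the routine stores. The function names `uut_upper` and `ltl_lower` are hypothetical and are not part of ScaLAPACK; the matrices are plain lists of rows rather than distributed arrays.

```python
def uut_upper(u):
    """Upper triangle of U * U' for an upper triangular U (list of rows)."""
    n = len(u)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # (U U')[i][j] = sum_k U[i][k] * U[j][k]; U[j][k] is zero for k < j.
            out[i][j] = sum(u[i][k] * u[j][k] for k in range(j, n))
    return out


def ltl_lower(lo):
    """Lower triangle of L' * L for a lower triangular L (list of rows)."""
    n = len(lo)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # (L' L)[i][j] = sum_k L[k][i] * L[k][j]; L[k][i] is zero for k < i.
            out[i][j] = sum(lo[k][i] * lo[k][j] for k in range(i, n))
    return out


U = [[1.0, 2.0], [0.0, 3.0]]
L = [[1.0, 0.0], [2.0, 3.0]]
print(uut_upper(U))  # [[5.0, 6.0], [0.0, 9.0]]
print(ltl_lower(L))  # [[5.0, 0.0], [6.0, 9.0]]
```

The routine itself produces the same values in place, overwriting the factor instead of allocating a second matrix.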
Notes
=====
Each global data object is described by an associated description
vector. This vector stores the information required to establish
the mapping between an object element and its corresponding process
and memory location.
Let A be a generic term for any 2D block cyclically distributed array.
Such a global array has an associated description vector DESCA.
In the following comments, the character _ should be read as
"of the global array".
NOTATION        STORED IN       EXPLANATION
--------------- --------------- --------------------------------------
DTYPE_A(global) DESCA( DTYPE_ ) The descriptor type. In this case,
                                DTYPE_A = 1.
CTXT_A (global) DESCA( CTXT_ )  The BLACS context handle, indicating
                                the BLACS process grid A is distributed
                                over. The context itself is global,
                                but the handle (the integer value) may
                                vary.
M_A    (global) DESCA( M_ )     The number of rows in the global
                                array A.
N_A    (global) DESCA( N_ )     The number of columns in the global
                                array A.
MB_A   (global) DESCA( MB_ )    The blocking factor used to distribute
                                the rows of the array.
NB_A   (global) DESCA( NB_ )    The blocking factor used to distribute
                                the columns of the array.
RSRC_A (global) DESCA( RSRC_ )  The process row over which the first
                                row of the array A is distributed.
CSRC_A (global) DESCA( CSRC_ )  The process column over which the
                                first column of the array A is
                                distributed.
LLD_A  (local)  DESCA( LLD_ )   The leading dimension of the local
                                array. LLD_A >= MAX(1,LOCr(M_A)).
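
To make the descriptor layout concrete, here is a minimal Python sketch that packs the nine entries into a list. The 1-based positions are taken from the PARAMETER statement in the routine's source (DTYPE_ = 1 through LLD_ = 9); the helper `make_desc` and the sample values are hypothetical and only illustrate the layout, they are not a ScaLAPACK interface.

```python
# 1-based positions of the descriptor entries, as in the routine's PARAMETER statement.
DTYPE_, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_ = range(1, 10)
DLEN_ = 9
BLOCK_CYCLIC_2D = 1


def make_desc(ctxt, m, n, mb, nb, rsrc, csrc, lld):
    """Hypothetical helper: pack a 9-entry array descriptor as a Python list."""
    values = {
        DTYPE_: BLOCK_CYCLIC_2D, CTXT_: ctxt, M_: m, N_: n,
        MB_: mb, NB_: nb, RSRC_: rsrc, CSRC_: csrc, LLD_: lld,
    }
    desc = [0] * DLEN_
    for pos, value in values.items():
        desc[pos - 1] = value  # shift the 1-based Fortran position to a 0-based index
    return desc


# Sample values chosen for illustration only: an 8 x 8 global array in 2 x 2 blocks.
desca = make_desc(ctxt=0, m=8, n=8, mb=2, nb=2, rsrc=0, csrc=0, lld=4)
print(desca)            # [1, 0, 8, 8, 2, 2, 0, 0, 4]
print(desca[LLD_ - 1])  # 4
```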
Let K be the number of rows or columns of a distributed matrix,
and assume that its process grid has dimension p x q.
LOCr( K ) denotes the number of elements of K that a process
would receive if K were distributed over the p processes of its
process column.
Similarly, LOCc( K ) denotes the number of elements of K that a
process would receive if K were distributed over the q processes of
its process row.
The values of LOCr() and LOCc() may be determined via a call to the
ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
An upper bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
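
The counting behind LOCr() and LOCc() can be sketched in a few lines of Python. This is an illustrative reimplementation written from the block-cyclic definition, with the same argument order as the NUMROC calls above; it is not the library source, and the function names `numroc` and `loc_upper_bound` are used here only for the sketch.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Rows or columns of an n-long dimension owned by process iproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from the source process
    nblocks = n // nb                              # number of complete blocks
    count = (nblocks // nprocs) * nb               # full rounds every process receives
    extrablks = nblocks % nprocs                   # complete blocks left after those rounds
    if mydist < extrablks:
        count += nb                                # one extra complete block
    elif mydist == extrablks:
        count += n % nb                            # the trailing partial block, if any
    return count


def loc_upper_bound(n, nb, nprocs):
    """ceil( ceil(n/nb)/nprocs )*nb, computed with integer arithmetic."""
    blocks = -(-n // nb)
    return -(-blocks // nprocs) * nb


# M = 10 rows, MB_A = 3, NPROW = 2, RSRC_A = 0 (values chosen for illustration).
print(numroc(10, 3, 0, 0, 2))     # 6
print(numroc(10, 3, 1, 0, 2))     # 4
print(loc_upper_bound(10, 3, 2))  # 6
```

In this example the blocks of rows 1-3 and 7-9 go to process row 0 and the blocks of rows 4-6 and 10 go to process row 1, so the local counts are 6 and 4, and both respect the bound of 6.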
Arguments
=========
UPLO    (global input) CHARACTER*1
        Specifies whether the triangular factor stored in the matrix
        sub( A ) is upper or lower triangular:
        = 'U': Upper triangular,
        = 'L': Lower triangular.
N       (global input) INTEGER
        The number of rows and columns to be operated on, i.e. the
        order of the triangular factor U or L. N >= 0.
A       (local input/local output) DOUBLE PRECISION pointer into the
        local memory to an array of dimension (LLD_A, LOCc(JA+N-1)).
        On entry, the local pieces of the triangular factor L or U.
        On exit, if UPLO = 'U', the upper triangle of the distributed
        matrix sub( A ) is overwritten with the upper triangle of the
        product U * U'; if UPLO = 'L', the lower triangle of sub( A )
        is overwritten with the lower triangle of the product L' * L.
IA      (global input) INTEGER
        The row index in the global array A indicating the first
        row of sub( A ).
JA      (global input) INTEGER
        The column index in the global array A indicating the
        first column of sub( A ).
DESCA   (global and local input) INTEGER array of dimension DLEN_.
        The array descriptor for the distributed matrix A.
=====================================================================

      SUBROUTINE PDLAUU2( UPLO, N, A, IA, JA, DESCA )
*
*  -- ScaLAPACK auxiliary routine (version 1.7) --
*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
*     and University of California, Berkeley.
*     May 1, 1997
*
*     .. Scalar Arguments ..
      CHARACTER          UPLO
      INTEGER            IA, JA, N
*     ..
*     .. Array Arguments ..
      INTEGER            DESCA( * )
      DOUBLE PRECISION   A( * )
*     ..
*     .. Parameters ..
      INTEGER            BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
     $                   LLD_, MB_, M_, NB_, N_, RSRC_
      PARAMETER          ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1,
     $                     CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6,
     $                     RSRC_ = 7, CSRC_ = 8, LLD_ = 9 )
      DOUBLE PRECISION   ONE
      PARAMETER          ( ONE = 1.0D+0 )
*     ..
*     .. Local Scalars ..
      INTEGER            IACOL, IAROW, ICURR, IDIAG, IIA, IOFFA, JJA,
     $                   LDA, MYCOL, MYROW, NA, NPCOL, NPROW
      DOUBLE PRECISION   AII
*     ..
*     .. External Subroutines ..
      EXTERNAL           BLACS_GRIDINFO, INFOG2L, DGEMV, DSCAL
*     ..
*     .. External Functions ..
      LOGICAL            LSAME
      DOUBLE PRECISION   DDOT
      EXTERNAL           DDOT, LSAME
*     ..
*     .. Executable Statements ..
*
*     Quick return if possible
*
      IF( N.EQ.0 )
     $   RETURN
*
*     Get grid parameters and compute local indexes
*
      CALL BLACS_GRIDINFO( DESCA( CTXT_ ), NPROW, NPCOL, MYROW, MYCOL )
      CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL, IIA, JJA,
     $              IAROW, IACOL )
*
*     Only the process that owns sub( A ) does any work
*
      IF( MYROW.EQ.IAROW .AND. MYCOL.EQ.IACOL ) THEN
*
         LDA = DESCA( LLD_ )
         IDIAG = IIA + ( JJA - 1 ) * LDA
         IOFFA = IDIAG
*
         IF( LSAME( UPLO, 'U' ) ) THEN
*
*           Compute the product U * U'.
*
            DO 10 NA = N-1, 1, -1
               AII = A( IDIAG )
               ICURR = IDIAG + LDA
               A( IDIAG ) = AII*AII + DDOT( NA, A( ICURR ), LDA,
     $                                     A( ICURR ), LDA )
               CALL DGEMV( 'No transpose', N-NA-1, NA, ONE,
     $                     A( IOFFA+LDA ), LDA, A( ICURR ), LDA, AII,
     $                     A( IOFFA ), 1 )
               IDIAG = IDIAG + LDA + 1
               IOFFA = IOFFA + LDA
   10       CONTINUE
            AII = A( IDIAG )
            CALL DSCAL( N, AII, A( IOFFA ), 1 )
*
         ELSE
*
*           Compute the product L' * L.
*
            DO 20 NA = 1, N-1
               AII = A( IDIAG )
               ICURR = IDIAG + 1
               A( IDIAG ) = AII*AII + DDOT( N-NA, A( ICURR ), 1,
     $                                     A( ICURR ), 1 )
               CALL DGEMV( 'Transpose', N-NA, NA-1, ONE, A( IOFFA+1 ),
     $                     LDA, A( ICURR ), 1, AII, A( IOFFA ), LDA )
               IDIAG = IDIAG + LDA + 1
               IOFFA = IOFFA + 1
   20       CONTINUE
            AII = A( IDIAG )
            CALL DSCAL( N, AII, A( IOFFA ), LDA )
*
         END IF
*
      END IF
*
      RETURN
*
*     End of PDLAUU2
*
      END

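
The index arithmetic of the listing can be traced with a small Python transliteration. This is a sketch under stated assumptions: the local array is a flat, column-major Python list with an unused slot at index 0 so that indices match the 1-based Fortran ones, the local indices IIA and JJA are passed in directly instead of being obtained from INFOG2L, and `ddot`, `dscal`, and `dgemv` are simplified stand-ins for the BLAS routines, not the real interfaces. The name `lauu2_local` is hypothetical.

```python
def ddot(n, a, ix, incx, iy, incy):
    """Dot product of two strided vectors stored in the flat list a."""
    return sum(a[ix + k * incx] * a[iy + k * incy] for k in range(n))


def dscal(n, alpha, a, ix, incx):
    """Scale a strided vector in place."""
    for k in range(n):
        a[ix + k * incx] *= alpha


def dgemv(trans, m, n, alpha, a, im, lda, ix, incx, beta, iy, incy):
    """y := alpha*op(M)*x + beta*y for an m x n column-major M starting at a[im]."""
    leny, lenx = (m, n) if trans == "N" else (n, m)
    new_y = []
    for i in range(leny):
        acc = 0.0
        for j in range(lenx):
            mij = a[im + i + j * lda] if trans == "N" else a[im + j + i * lda]
            acc += mij * a[ix + j * incx]
        new_y.append(alpha * acc + beta * a[iy + i * incy])
    for i in range(leny):
        a[iy + i * incy] = new_y[i]


def lauu2_local(uplo, n, a, iia, jja, lda):
    """Mirror of the local computation in the listing, on a 1-based flat list."""
    if n == 0:
        return
    idiag = iia + (jja - 1) * lda  # linear index of the first diagonal element
    ioffa = idiag
    if uplo.upper() == "U":
        for na in range(n - 1, 0, -1):
            aii = a[idiag]
            icurr = idiag + lda
            a[idiag] = aii * aii + ddot(na, a, icurr, lda, icurr, lda)
            dgemv("N", n - na - 1, na, 1.0, a, ioffa + lda, lda,
                  icurr, lda, aii, ioffa, 1)
            idiag += lda + 1
            ioffa += lda
        dscal(n, a[idiag], a, ioffa, 1)
    else:
        for na in range(1, n):
            aii = a[idiag]
            icurr = idiag + 1
            a[idiag] = aii * aii + ddot(n - na, a, icurr, 1, icurr, 1)
            dgemv("T", n - na, na - 1, 1.0, a, ioffa + 1, lda,
                  icurr, 1, aii, ioffa, lda)
            idiag += lda + 1
            ioffa += 1
        dscal(n, a[idiag], a, ioffa, lda)


# U = [[1, 2, 3], [0, 4, 5], [0, 0, 6]] stored column by column, LDA = 3.
a = [None, 1.0, 0.0, 0.0, 2.0, 4.0, 0.0, 3.0, 5.0, 6.0]
lauu2_local("U", 3, a, 1, 1, 3)
print(a[1:])  # [14.0, 0.0, 0.0, 23.0, 41.0, 0.0, 18.0, 30.0, 36.0]
```

Read column by column, the result is the upper triangle of U * U' (14, 23, 18 in the first row, 41 and 30 in the second, 36 in the third), while the strictly lower part of the array is left untouched.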
Variables in Routine PDLAUU2()
| Summary Report |
| Data Type | Quantity | Size(byte) |
| CHARACTER | 1 | 1 |
| DOUBLE PRECISION | 3 | 12 |
| INTEGER | 27 | 108 |
| LOGICAL | 1 | 1 |
| REAL | 1 | 4 |
| TOTAL | 33 | 126 |
List of Variables
CHARACTER
DOUBLE PRECISION
INTEGER
| BLOCK_CYCLIC_2D | CSRC_ | CTXT_ | DLEN_ | DTYPE_ |
| IA | IACOL | IAROW | ICURR | IDIAG |
| IIA | IOFFA | JA | JJA | LDA |
| LLD_ | M_ | MB_ | MYCOL | MYROW |
| N | N_ | NA | NB_ | NPCOL |
| NPROW | RSRC_ | | | |
LOGICAL
REAL
Variables Dependence Graph
| Variable | <--- | Depends on | Defining statements |
| A | <--- | A, ICURR, LDA, AII, N, NA, DDOT | A( IDIAG ) = AII*AII + DDOT( NA, A( ICURR ), LDA, ; A(IDIAG) = AII*AII + DDOT( N-NA, A( ICURR ), 1, |
| AII | <--- | A, IDIAG | AII = A( IDIAG ) (4 occurrences) |
| ICURR | <--- | IDIAG, LDA | ICURR = IDIAG + LDA ; ICURR = IDIAG + 1 |
| IDIAG | <--- | IDIAG, IIA, JJA, LDA | IDIAG = IIA + ( JJA - 1 ) * LDA ; IDIAG = IDIAG + LDA + 1 (2 occurrences) |
| IOFFA | <--- | IDIAG, IOFFA, LDA | IOFFA = IDIAG ; IOFFA = IOFFA + LDA ; IOFFA = IOFFA + 1 |
| LDA | <--- | LLD_ | LDA = DESCA( LLD_ ) |
| NA | <--- | N | DO 10 NA = N-1, 1, -1 ; DO 20 NA = 1, N-1 |

Analysis elements of the routine PDLAUU2()
Assigned variables
  AII , BLOCK_CYCLIC_2D , CSRC_ , CTXT_ , DLEN_ , DTYPE_ , ICURR , IDIAG , IOFFA , LDA , LLD_ , M_ , MB_ , N_ , NA , NB_ , ONE , RSRC_
Active variables
  A , AII , BLOCK_CYCLIC_2D , CSRC_ , CTXT_ , DDOT , DESCA , DLEN_ , DTYPE_ , IA , IACOL , IAROW , ICURR , IDIAG , IIA , IOFFA , JA , JJA , LDA , LLD_ , LSAME , M_ , MB_ , MYCOL , MYROW , N , N_ , NA , NB_ , NPCOL , NPROW , ONE , RSRC_ , UPLO
Accessed arrays [ array name : associated index ]
  A     : ICURR , ICURR , ICURR , ICURR , ICURR , ICURR , IDIAG , IDIAG , IDIAG , IDIAG , IDIAG , IDIAG , IOFFA , IOFFA , IOFFA , IOFFA , IOFFA+1 , IOFFA+LDA
  DESCA : CTXT_ , LLD_
  LSAME : UPLO , 'U'
Conditional statements [ statement : associated predicate ]
  do : ( 10 NA = N - 1 , 1 , - 1 ) , ( 20 NA = 1 , N - 1 )
  if : ( possible ) , ( N.EQ.0 ) , ( MYROW.EQ.IAROW .AND. MYCOL.EQ.IACOL ) , ( (LSAME( UPLO , 'U' ) ) )