|
|
| |
| # lines: |
587 | | # code: |
587 | | # comment: | 0 | |
# blank: | 0 |
| # Variables: | 77 |
| # Callers: | 1 |
| # Callings: | 6 |
| # Words: | 278 |
| # Keywords: | 149 |
|
|
|
|
|
..
.. Array Arguments ..
..
Bugs
====
Support for UPLO='U' is limited to calling the old, slow, PZHETRD
code.
Purpose
=======
PZHENTRD is a prototype version of PZHETRD which uses tailored
codes (either the serial, ZHETRD, or the parallel code, PZHETTRD)
when the workspace provided by the user is adequate.
PZHENTRD reduces a complex Hermitian matrix sub( A ) to Hermitian
tridiagonal form T by an unitary similarity transformation:
Q' * sub( A ) * Q = T, where sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
Features
========
PZHENTRD is faster than PZHETRD on almost all matrices,
particularly small ones (i.e. N < 500 * sqrt(P) ), provided that
enough workspace is available to use the tailored codes.
The tailored codes provide performance that is essentially
independent of the input data layout.
The tailored codes place no restrictions on IA, JA, MB or NB.
At present, IA, JA, MB and NB are restricted to those values allowed
by PZHETRD to keep the interface simple. These restrictions are
documented below. (Search for "restrictions".)
Notes
=====
Each global data object is described by an associated description
vector. This vector stores the information required to establish
the mapping between an object element and its corresponding process
and memory location.
Let A be a generic term for any 2D block cyclicly distributed array.
Such a global array has an associated description vector DESCA.
In the following comments, the character _ should be read as
"of the global array".
NOTATION STORED IN EXPLANATION
--------------- -------------- --------------------------------------
DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
DTYPE_A = 1.
CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
the BLACS process grid A is distribu-
ted over. The context itself is glo-
bal, but the handle (the integer
value) may vary.
M_A (global) DESCA( M_ ) The number of rows in the global
array A.
N_A (global) DESCA( N_ ) The number of columns in the global
array A.
MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
the rows of the array.
NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
the columns of the array.
RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
row of the array A is distributed.
CSRC_A (global) DESCA( CSRC_ ) The process column over which the
first column of the array A is
distributed.
LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
array. LLD_A >= MAX(1,LOCr(M_A)).
Let K be the number of rows or columns of a distributed matrix,
and assume that its process grid has dimension p x q.
LOCr( K ) denotes the number of elements of K that a process
would receive if K were distributed over the p processes of its
process column.
Similarly, LOCc( K ) denotes the number of elements of K that a
process would receive if K were distributed over the q processes of
its process row.
The values of LOCr() and LOCc() may be determined via a call to the
ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
An upper bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
Arguments
=========
UPLO (global input) CHARACTER
Specifies whether the upper or lower triangular part of the
Hermitian matrix sub( A ) is stored:
= 'U': Upper triangular
= 'L': Lower triangular
N (global input) INTEGER
The number of rows and columns to be operated on, i.e. the
order of the distributed submatrix sub( A ). N >= 0.
A (local input/local output) COMPLEX*16 pointer into the
local memory to an array of dimension (LLD_A,LOCc(JA+N-1)).
On entry, this array contains the local pieces of the
Hermitian distributed matrix sub( A ). If UPLO = 'U', the
leading N-by-N upper triangular part of sub( A ) contains
the upper triangular part of the matrix, and its strictly
lower triangular part is not referenced. If UPLO = 'L', the
leading N-by-N lower triangular part of sub( A ) contains the
lower triangular part of the matrix, and its strictly upper
triangular part is not referenced. On exit, if UPLO = 'U',
the diagonal and first superdiagonal of sub( A ) are over-
written by the corresponding elements of the tridiagonal
matrix T, and the elements above the first superdiagonal,
with the array TAU, represent the unitary matrix Q as a
product of elementary reflectors; if UPLO = 'L', the diagonal
and first subdiagonal of sub( A ) are overwritten by the
corresponding elements of the tridiagonal matrix T, and the
elements below the first subdiagonal, with the array TAU,
represent the unitary matrix Q as a product of elementary
reflectors. See Further Details.
IA (global input) INTEGER
The row index in the global array A indicating the first
row of sub( A ).
JA (global input) INTEGER
The column index in the global array A indicating the
first column of sub( A ).
DESCA (global and local input) INTEGER array of dimension DLEN_.
The array descriptor for the distributed matrix A.
D (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
The diagonal elements of the tridiagonal matrix T:
D(i) = A(i,i). D is tied to the distributed matrix A.
E (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
if UPLO = 'U', LOCc(JA+N-2) otherwise. The off-diagonal
elements of the tridiagonal matrix T: E(i) = A(i,i+1) if
UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. E is tied to the
distributed matrix A.
TAU (local output) COMPLEX*16, array, dimension
LOCc(JA+N-1). This array contains the scalar factors TAU of
the elementary reflectors. TAU is tied to the distributed
matrix A.
WORK (local workspace/local output) COMPLEX*16 array,
dimension (LWORK)
On exit, WORK( 1 ) returns the optimal LWORK.
LWORK (local or global input) INTEGER
The dimension of the array WORK.
LWORK is local input and must be at least
LWORK >= MAX( NB * ( NP +1 ), 3 * NB )
For optimal performance, greater workspace is needed, i.e.
LWORK >= 2*( ANB+1 )*( 4*NPS+2 ) + ( NPS + 4 ) * NPS
ICTXT = DESCA( CTXT_ )
ANB = PJLAENV( ICTXT, 3, 'PZHETTRD', 'L', 0, 0, 0, 0 )
SQNPC = INT( SQRT( DBLE( NPROW * NPCOL ) ) )
NPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB )
NUMROC is a ScaLAPACK tool functions;
PJLAENV is a ScaLAPACK envionmental inquiry function
MYROW, MYCOL, NPROW and NPCOL can be determined by calling
the subroutine BLACS_GRIDINFO.
RWORK (local workspace/local output) COMPLEX*16 array,
dimension (LRWORK)
On exit, RWORK( 1 ) returns the optimal LRWORK.
LRWORK (local or global input) INTEGER
The dimension of the array RWORK.
LRWORK is local input and must be at least
LRWORK >= 1
For optimal performance, greater workspace is needed, i.e.
LRWORK >= MAX( 2 * N )
INFO (global output) INTEGER
= 0: successful exit
< 0: If the i-th argument is an array and the j-entry had
an illegal value, then INFO = -(i*100+j), if the i-th
argument is a scalar and had an illegal value, then
INFO = -i.
Further Details
===============
If UPLO = 'U', the matrix Q is represented as a product of elementary
reflectors
Q = H(n-1) . . . H(2) H(1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with
v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
A(ia:ia+i-2,ja+i), and tau in TAU(ja+i-1).
If UPLO = 'L', the matrix Q is represented as a product of elementary
reflectors
Q = H(1) H(2) . . . H(n-1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with
v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
A(ia+i+1:ia+n-1,ja+i-1), and tau in TAU(ja+i-1).
The contents of sub( A ) on exit are illustrated by the following
examples with n = 5:
if UPLO = 'U': if UPLO = 'L':
( d e v2 v3 v4 ) ( d )
( d e v3 v4 ) ( e d )
( d e v4 ) ( v1 e d )
( d e ) ( v1 v2 e d )
( d ) ( v1 v2 v3 e d )
where d and e denote diagonal and off-diagonal elements of T, and vi
denotes an element of the vector defining H(i).
Alignment requirements
======================
The distributed submatrix sub( A ) must verify some alignment proper-
ties, namely the following expression should be true:
( MB_A.EQ.NB_A .AND. IROFFA.EQ.ICOFFA .AND. IROFFA.EQ.0 ) with
IROFFA = MOD( IA-1, MB_A ) and ICOFFA = MOD( JA-1, NB_A ).
=====================================================================
.. Parameters ..
|
|
|
|
001 SUBROUTINE PZHENTRD( UPLO , N , A , IA , JA , DESCA , D , E , TAU , WORK ,
002 $LWORK , RWORK , LRWORK , INFO )
003
004 * -- ScaLAPACK routine(version 1.7) --
005 * University of Tennessee , Knoxville , Oak Ridge National Laboratory ,
006 * and University of California , Berkeley.
007 * October 15 , 1999
008
009 * .. Scalar Arguments ..
010 CHARACTER UPLO
011 INTEGER IA , INFO , JA , LRWORK , LWORK , N
012 INTEGER BLOCK_CYCLIC_2D , DLEN_ , DTYPE_ , CTXT_ , M_ , N_ ,
013 $MB_ , NB_ , RSRC_ , CSRC_ , LLD_
014 PARAMETER( BLOCK_CYCLIC_2D = 1 , DLEN_ = 9 , DTYPE_ = 1 ,
015 $CTXT_ = 2 , M_ = 3 , N_ = 4 , MB_ = 5 , NB_ = 6 ,
016 $RSRC_ = 7 , CSRC_ = 8 , LLD_ = 9 )
017 DOUBLE PRECISION ONE
018 PARAMETER( ONE = 1.0D + 0 )
019 COMPLEX*16 CONE
020 PARAMETER( CONE =( 1.0D + 0 , 0.0D + 0 ) )
021 * ..
022 * .. Local Scalars ..
023 LOGICAL LQUERY , UPPER
024 CHARACTER COLCTOP , ROWCTOP
025 INTEGER ANB , CTXTB , I , IACOL , IAROW , ICOFFA , ICTXT ,
026 $IINFO , INDB , INDRD , INDRE , INDTAU , INDW , IPW ,
027 $IROFFA , J , JB , JX , K , KK , LLRWORK , LLWORK ,
028 $LRWMIN , LWMIN , MINSZ , MYCOL , MYCOLB , MYROW ,
029 $MYROWB , NB , NP , NPCOL , NPCOLB , NPROW , NPROWB ,
030 $NPS , NQ , ONEPMIN , ONEPRMIN , SQNPC , TTLRWMIN ,
031 $TTLWMIN
032 * ..
033 * .. Local Arrays ..
034 INTEGER DESCB( DLEN_ ) , DESCW( DLEN_ ) , IDUM1( 3 ) ,
035 $IDUM2( 3 )
036 * ..
037 * .. External Subroutines ..
038 EXTERNAL BLACS_GET , BLACS_GRIDEXIT , BLACS_GRIDINFO ,
039 $BLACS_GRIDINIT , CHK1MAT , DESCSET , IGAMN2D ,
040 $PCHK1MAT , PDLAMR1D , PB_TOPGET , PB_TOPSET ,
041 $PXERBLA , PZELSET , PZHER2K , PZHETD2 , PZHETTRD ,
042 $PZLAMR1D , PZLATRD , PZTRMR2D , ZHETRD
043 * ..
044 * .. External Functions ..
045 LOGICAL LSAME
046 INTEGER INDXG2L , INDXG2P , NUMROC , PJLAENV
047 EXTERNAL LSAME , INDXG2L , INDXG2P , NUMROC , PJLAENV
048 * ..
049 * .. Intrinsic Functions ..
050 INTRINSIC DBLE , DCMPLX , ICHAR , INT , MAX , MIN , MOD , SQRT
051 * ..
052 * .. Executable Statements ..
053
054 * This is just to keep ftnchek and toolpack / 1 happy
055 IF( BLOCK_CYCLIC_2D*CSRC_*CTXT_*DLEN_*DTYPE_*LLD_*MB_*M_*NB_*N_*
055
056 $ RSRC_.LT.0 )RETURN
057 * Get grid parameters
058
059 ICTXT = DESCA( CTXT_ )
060 CALL BLACS_GRIDINFO( ICTXT , NPROW , NPCOL , MYROW , MYCOL )
061
062 * Test the input parameters
063
064 INFO = 0
065 IF( NPROW.EQ. - 1 ) THEN
065
066 INFO = - ( 600 + CTXT_ )
067 ELSE
067
068 CALL CHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , INFO )
069 UPPER = LSAME( UPLO , 'U' )
070 IF( INFO.EQ.0 ) THEN
070
071 NB = DESCA( NB_ )
072 IROFFA = MOD( IA - 1 , DESCA( MB_ ) )
073 ICOFFA = MOD( JA - 1 , DESCA( NB_ ) )
074 IAROW = INDXG2P( IA , NB , MYROW , DESCA( RSRC_ ) , NPROW )
075 IACOL = INDXG2P( JA , NB , MYCOL , DESCA( CSRC_ ) , NPCOL )
076 NP = NUMROC( N , NB , MYROW , IAROW , NPROW )
077 NQ = MAX( 1 , NUMROC( N + JA - 1 , NB , MYCOL , DESCA( CSRC_ ) ,
078 $ NPCOL ) )
079 LWMIN = MAX(( NP + 1 )*NB , 3*NB )
080 ANB = PJLAENV( ICTXT , 3 , 'PZHETTRD' , 'L' , 0 , 0 , 0 , 0 )
081 MINSZ = PJLAENV( ICTXT , 5 , 'PZHETTRD' , 'L' , 0 , 0 , 0 , 0 )
082 SQNPC = INT( SQRT( DBLE( NPROW*NPCOL ) ) )
083 NPS = MAX( NUMROC( N , 1 , 0 , 0 , SQNPC ) , 2*ANB )
084 TTLWMIN = 2*( ANB + 1 )*( 4*NPS + 2 ) + ( NPS + 2 )*NPS
085 LRWMIN = 1
086 TTLRWMIN = 2*NPS
087
088 WORK( 1 ) = DCMPLX( DBLE( TTLWMIN ) )
089 RWORK( 1 ) = DBLE( TTLRWMIN )
090 LQUERY =( LWORK.EQ. - 1 .OR. LRWORK.EQ. - 1 )
091 IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO , 'L' ) ) THEN
091
092 INFO = - 1
093
094 * The following two restrictions are not necessary provided
095 * that either of the tailored codes are used.
096
097 ELSE IF( IROFFA.NE.ICOFFA .OR. ICOFFA.NE.0 ) THEN
097
098 INFO = - 5
099 ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN
099
100 INFO = - ( 600 + NB_ )
101 ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN
101
102 INFO = - 11
103 ELSE IF( LRWORK.LT.LRWMIN .AND. .NOT.LQUERY ) THEN
103
104 INFO = - 13
105 END IF
106 END IF
107 IF( UPPER ) THEN
107
108 IDUM1( 1 ) = ICHAR( 'U' )
109 ELSE
109
110 IDUM1( 1 ) = ICHAR( 'L' )
111 END IF
112 IDUM2( 1 ) = 1
113 IF( LWORK.EQ. - 1 ) THEN
113
114 IDUM1( 2 ) = - 1
115 ELSE
115
116 IDUM1( 2 ) = 1
117 END IF
118 IDUM2( 2 ) = 11
119 IF( LRWORK.EQ. - 1 ) THEN
119
120 IDUM1( 3 ) = - 1
121 ELSE
121
122 IDUM1( 3 ) = 1
123 END IF
124 IDUM2( 3 ) = 13
125 CALL PCHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , 3 , IDUM1 , IDUM2 ,
126 $ INFO )
127 END IF
128
129 IF( INFO.NE.0 ) THEN
129
130 CALL PXERBLA( ICTXT , 'PZHENTRD' , - INFO )
131 RETURN
132 ELSE IF( LQUERY ) THEN
132
133 RETURN
134 END IF
135
136 * Quick return if possible
137
138 IF( N.EQ.0 )
138
139 $ RETURN
140
141 ONEPMIN = N*N + 3*N + 1
142 LLWORK = LWORK
143 CALL IGAMN2D( ICTXT , 'A' , ' ' , 1 , 1 , LLWORK , 1 , 1 , - 1 , - 1 , - 1 ,
144 $ - 1 )
145
146 ONEPRMIN = 2*N
147 LLRWORK = LRWORK
148 CALL IGAMN2D( ICTXT , 'A' , ' ' , 1 , 1 , LLRWORK , 1 , 1 , - 1 , - 1 , - 1 ,
149 $ - 1 )
150
151 * Use the serial , LAPACK , code : ZTRD on small matrices if we
152 * we have enough space.
153
154 NPROWB = 0
155 IF(( N.LT.MINSZ .OR. SQNPC.EQ.1 ) .AND. LLWORK.GE.ONEPMIN .AND.
156 $ LLRWORK.GE.ONEPRMIN .AND. .NOT.UPPER ) THEN
157 NPROWB = 1
158 NPS = N
159 ELSE
159
160 IF( LLWORK.GE.TTLWMIN .AND. LLRWORK.GE.TTLRWMIN .AND. .NOT.
160
161 $ UPPER ) THEN
162 NPROWB = SQNPC
163 END IF
164 END IF
165
166 IF( NPROWB.GE.1 ) THEN
166
167 NPCOLB = NPROWB
168 SQNPC = NPROWB
169 INDB = 1
170 INDRD = 1
171 INDRE = INDRD + NPS
172 INDTAU = INDB + NPS*NPS
173 INDW = INDTAU + NPS
174 LLWORK = LLWORK - INDW + 1
175
176 CALL BLACS_GET( ICTXT , 10 , CTXTB )
177 CALL BLACS_GRIDINIT( CTXTB , 'Row major' , SQNPC , SQNPC )
178 CALL BLACS_GRIDINFO( CTXTB , NPROWB , NPCOLB , MYROWB , MYCOLB )
179 CALL DESCSET( DESCB , N , N , 1 , 1 , 0 , 0 , CTXTB , NPS )
180
181 CALL PZTRMR2D( UPLO , 'N' , N , N , A , IA , JA , DESCA , WORK( INDB ) ,
182 $ 1 , 1 , DESCB , ICTXT )
183
184 * Only those processors in context CTXTB are needed for a while
185
186 IF( NPROWB.GT.0 ) THEN
187
187
188 IF( NPROWB.EQ.1 ) THEN
188
189 CALL ZHETRD( UPLO , N , WORK( INDB ) , NPS , RWORK( INDRD ) ,
190 $ RWORK( INDRE ) , WORK( INDTAU ) ,
191 $ WORK( INDW ) , LLWORK , INFO )
192 ELSE
193
193
194 CALL PZHETTRD ( 'L' , N , WORK( INDB ) , 1 , 1 , DESCB ,
195 $ RWORK( INDRD ) , RWORK( INDRE ) ,
196 $ WORK( INDTAU ) , WORK( INDW ) , LLWORK ,
197 $ INFO )
198
199 END IF
200 END IF
201
202 * All processors participate in moving the data back to the
203 * way that PZHENTRD expects it.
204
205 CALL PDLAMR1D ( N - 1 , RWORK( INDRE ) , 1 , 1 , DESCB , E , 1 , JA ,
206 $ DESCA )
207
208 CALL PDLAMR1D ( N , RWORK( INDRD ) , 1 , 1 , DESCB , D , 1 , JA ,
209 $ DESCA )
210
211 CALL PZLAMR1D ( N , WORK( INDTAU ) , 1 , 1 , DESCB , TAU , 1 , JA ,
212 $ DESCA )
213
214 CALL PZTRMR2D( UPLO , 'N' , N , N , WORK( INDB ) , 1 , 1 , DESCB , A ,
215 $ IA , JA , DESCA , ICTXT )
216
217 IF( MYROWB.GE.0 )
217
218 $ CALL BLACS_GRIDEXIT( CTXTB )
219
220 ELSE
221
221
222 CALL PB_TOPGET( ICTXT , 'Combine' , 'Columnwise' , COLCTOP )
223 CALL PB_TOPGET( ICTXT , 'Combine' , 'Rowwise' , ROWCTOP )
224 CALL PB_TOPSET( ICTXT , 'Combine' , 'Columnwise' , '1 - tree' )
225 CALL PB_TOPSET( ICTXT , 'Combine' , 'Rowwise' , '1 - tree' )
226
227 IPW = NP*NB + 1
228
229 IF( UPPER ) THEN
230
231 * Reduce the upper triangle of sub( A ).
232
232
233 KK = MOD( JA + N - 1 , NB )
234 IF( KK.EQ.0 )
234
235 $ KK = NB
236 CALL DESCSET( DESCW , N , NB , NB , NB , IAROW ,
237 $ INDXG2P( JA + N - KK , NB , MYCOL , DESCA( CSRC_ ) ,
238 $ NPCOL ) , ICTXT , MAX( 1 , NP ) )
239
240 DO 10 K = N - KK + 1 , NB + 1 , - NB
240
241 JB = MIN( N - K + 1 , NB )
242 I = IA + K - 1
243 J = JA + K - 1
244
245 * Reduce columns I : I + NB - 1 to tridiagonal form and form
246 * the matrix W which is needed to update the unreduced part of
247 * the matrix
248
249 CALL PZLATRD ( UPLO , K + JB - 1 , JB , A , IA , JA , DESCA , D , E ,
250 $ TAU , WORK , 1 , 1 , DESCW , WORK( IPW ) )
251
252 * Update the unreduced submatrix A(IA : I - 1 , JA : J - 1) , using an
253 * update of the form :
254 * A(IA : I - 1 , JA : J - 1) := A(IA : I - 1 , JA : J - 1) - V*W' - W*V'
255
256 CALL PZHER2K( UPLO , 'No transpose' , K - 1 , JB , - CONE , A ,
257 $ IA , J , DESCA , WORK , 1 , 1 , DESCW , ONE , A ,
258 $ IA , JA , DESCA )
259
260 * Copy last superdiagonal element back into sub( A )
261
262 JX = MIN( INDXG2L( J , NB , 0 , IACOL , NPCOL ) , NQ )
263 CALL PZELSET( A , I - 1 , J , DESCA , DCMPLX( E( JX ) ) )
264
265 DESCW( CSRC_ ) = MOD( DESCW( CSRC_ ) + NPCOL - 1 , NPCOL )
266
267 10 CONTINUE
268
269 * Use unblocked code to reduce the last or only block
270
270
271 CALL PZHETD2 ( UPLO , MIN( N , NB ) , A , IA , JA , DESCA , D , E ,
272 $ TAU , WORK , LWORK , IINFO )
273
274 ELSE
275
276 * Reduce the lower triangle of sub( A )
277
277
278 KK = MOD( JA + N - 1 , NB )
279 IF( KK.EQ.0 )
279
280 $ KK = NB
281 CALL DESCSET( DESCW , N , NB , NB , NB , IAROW , IACOL , ICTXT ,
282 $ MAX( 1 , NP ) )
283
284 DO 20 K = 1 , N - NB , NB
284
285 I = IA + K - 1
286 J = JA + K - 1
287
288 * Reduce columns I : I + NB - 1 to tridiagonal form and form
289 * the matrix W which is needed to update the unreduced part
290 * of the matrix
291
292 CALL PZLATRD ( UPLO , N - K + 1 , NB , A , I , J , DESCA , D , E , TAU ,
293 $ WORK , K , 1 , DESCW , WORK( IPW ) )
294
295 * Update the unreduced submatrix A(I + NB : IA + N - 1 , I + NB : IA + N - 1) ,
296 * using an update of the form : A(I + NB : IA + N - 1 , I + NB : IA + N - 1) :=
297 * A(I + NB : IA + N - 1 , I + NB : IA + N - 1) - V*W' - W*V'
298
299 CALL PZHER2K( UPLO , 'No transpose' , N - K - NB + 1 , NB , - CONE ,
300 $ A , I + NB , J , DESCA , WORK , K + NB , 1 , DESCW ,
301 $ ONE , A , I + NB , J + NB , DESCA )
302
303 * Copy last subdiagonal element back into sub( A )
304
305 JX = MIN( INDXG2L( J + NB - 1 , NB , 0 , IACOL , NPCOL ) , NQ )
306 CALL PZELSET( A , I + NB , J + NB - 1 , DESCA , DCMPLX( E( JX ) ) )
307
308 DESCW( CSRC_ ) = MOD( DESCW( CSRC_ ) + 1 , NPCOL )
309
310 20 CONTINUE
311
312 * Use unblocked code to reduce the last or only block
313
313
314 CALL PZHETD2 ( UPLO , KK , A , IA + K - 1 , JA + K - 1 , DESCA , D , E , TAU ,
315 $ WORK , LWORK , IINFO )
316 END IF
317
318 CALL PB_TOPSET( ICTXT , 'Combine' , 'Columnwise' , COLCTOP )
319 CALL PB_TOPSET( ICTXT , 'Combine' , 'Rowwise' , ROWCTOP )
320
321 END IF
322
323 WORK( 1 ) = DCMPLX( DBLE( TTLWMIN ) )
324 RWORK( 1 ) = DBLE( TTLRWMIN )
325
326 RETURN
327
328 * End of PZHENTRD
329
330 END67
34
|
|
Variables in Routine PZHENTRD()
| Summary Report |
| Data Type | Quantity | Size(byte) |
| CHARACTER | 3 | 3 |
| COMPLEX*16 | 1 | ? |
| DOUBLE PRECISION | 1 | 4 |
| INTEGER | 67 | 288 |
| LOGICAL | 3 | 3 |
| REAL | 2 | 8 |
| TOTAL | 77 | 306 |
List of Variables
CHARACTER
COMPLEX*16
DOUBLE PRECISION
INTEGER
| ANB | BLOCK_CYCLIC_2D | CSRC_ | CTXT_ | CTXTB |
| DESCB( DLEN_ ) | DESCW( DLEN_ ) | DLEN_ | DTYPE_ | I |
| IA | IACOL | IAROW | ICOFFA | ICTXT |
| IDUM1( 3 ) | IDUM2( 3 ) | IINFO | INDB | INDRD |
| INDRE | INDTAU | INDW | INDXG2L | INDXG2P |
| INFO | IPW | IROFFA | J | JA |
| JB | JX | K | KK | LLD_ |
| LLRWORK | LLWORK | LRWMIN | LRWORK | LWMIN |
| LWORK | M_ | MB_ | MINSZ | MYCOL |
| MYCOLB | MYROW | MYROWB | N | N_ |
| NB | NB_ | NP | NPCOL | NPCOLB |
| NPROW | NPROWB | NPS | NQ | NUMROC |
| ONEPMIN | ONEPRMIN | PJLAENV | RSRC_ | SQNPC |
| TTLRWMIN | TTLWMIN | | | |
LOGICAL
REAL
Variables Dependence Graph Put the mouse over a right hand side variable to display the corresponding line of the dependence | | - | | - | - | | ANB | <--- | ICTXTANB = PJLAENV( ICTXT, 3, 'PZHETTRD', 'L', 0, 0, 0, 0 ), PJLAENVANB = PJLAENV( ICTXT, 3, 'PZHETTRD', 'L', 0, 0, 0, 0 ) |
| DESCW | <--- | CSRC_DESCW( CSRC_ ) = MOD( DESCW( CSRC_ )+NPCOL-1, NPCOL ){2DESCW( CSRC_ ) = MOD( DESCW( CSRC_ )+1, NPCOL )}, NPCOLDESCW( CSRC_ ) = MOD( DESCW( CSRC_ )+NPCOL-1, NPCOL ){2DESCW( CSRC_ ) = MOD( DESCW( CSRC_ )+1, NPCOL )}, DESCWDESCW( CSRC_ ) = MOD( DESCW( CSRC_ )+NPCOL-1, NPCOL ){2DESCW( CSRC_ ) = MOD( DESCW( CSRC_ )+1, NPCOL )} |
| I | <--- | IAI = IA + K - 1{2I = IA + K - 1}, KI = IA + K - 1{2I = IA + K - 1} |
| IACOL | <--- | INDXG2PIACOL = INDXG2P( JA, NB, MYCOL, DESCA( CSRC_ ), NPCOL ), JAIACOL = INDXG2P( JA, NB, MYCOL, DESCA( CSRC_ ), NPCOL ), MYCOLIACOL = INDXG2P( JA, NB, MYCOL, DESCA( CSRC_ ), NPCOL ), CSRC_IACOL = INDXG2P( JA, NB, MYCOL, DESCA( CSRC_ ), NPCOL ), NBIACOL = INDXG2P( JA, NB, MYCOL, DESCA( CSRC_ ), NPCOL ), NPCOLIACOL = INDXG2P( JA, NB, MYCOL, DESCA( CSRC_ ), NPCOL ) |
| IAROW | <--- | IAIAROW = INDXG2P( IA, NB, MYROW, DESCA( RSRC_ ), NPROW ), INDXG2PIAROW = INDXG2P( IA, NB, MYROW, DESCA( RSRC_ ), NPROW ), MYROWIAROW = INDXG2P( IA, NB, MYROW, DESCA( RSRC_ ), NPROW ), NBIAROW = INDXG2P( IA, NB, MYROW, DESCA( RSRC_ ), NPROW ), NPROWIAROW = INDXG2P( IA, NB, MYROW, DESCA( RSRC_ ), NPROW ), RSRC_IAROW = INDXG2P( IA, NB, MYROW, DESCA( RSRC_ ), NPROW ) |
| ICOFFA | <--- | JAICOFFA = MOD( JA-1, DESCA( NB_ ) ), NB_ICOFFA = MOD( JA-1, DESCA( NB_ ) ) |
| ICTXT | <--- | CTXT_ICTXT = DESCA( CTXT_ ) |
| INDRE | <--- | INDRDINDRE = INDRD + NPS, NPSINDRE = INDRD + NPS |
| INDTAU | <--- | INDBINDTAU = INDB + NPS*NPS, NPSINDTAU = INDB + NPS*NPS |
| INDW | <--- | INDTAUINDW = INDTAU + NPS, NPSINDW = INDTAU + NPS |
| INFO | <--- | NB_INFO = -( 600+NB_ ), CTXT_INFO = -( 600+CTXT_ ) |
| IPW | <--- | NBIPW = NP*NB + 1, NPIPW = NP*NB + 1 |
| IROFFA | <--- | IAIROFFA = MOD( IA-1, DESCA( MB_ ) ), MB_IROFFA = MOD( IA-1, DESCA( MB_ ) ) |
| J | <--- | JAJ = JA + K - 1{2J = JA + K - 1}, KJ = JA + K - 1{2J = JA + K - 1} |
| JB | <--- | KJB = MIN( N-K+1, NB ), NJB = MIN( N-K+1, NB ), NBJB = MIN( N-K+1, NB ) |
| JX | <--- | IACOLJX = MIN( INDXG2L( J, NB, 0, IACOL, NPCOL ), NQ ){2JX = MIN( INDXG2L( J+NB-1, NB, 0, IACOL, NPCOL ), NQ )}, INDXG2LJX = MIN( INDXG2L( J, NB, 0, IACOL, NPCOL ), NQ ){2JX = MIN( INDXG2L( J+NB-1, NB, 0, IACOL, NPCOL ), NQ )}, JJX = MIN( INDXG2L( J, NB, 0, IACOL, NPCOL ), NQ ){2JX = MIN( INDXG2L( J+NB-1, NB, 0, IACOL, NPCOL ), NQ )}, NBJX = MIN( INDXG2L( J, NB, 0, IACOL, NPCOL ), NQ ){2JX = MIN( INDXG2L( J+NB-1, NB, 0, IACOL, NPCOL ), NQ )}, NPCOLJX = MIN( INDXG2L( J, NB, 0, IACOL, NPCOL ), NQ ){2JX = MIN( INDXG2L( J+NB-1, NB, 0, IACOL, NPCOL ), NQ )}, NQJX = MIN( INDXG2L( J, NB, 0, IACOL, NPCOL ), NQ ){2JX = MIN( INDXG2L( J+NB-1, NB, 0, IACOL, NPCOL ), NQ )} |
| K | <--- | KKDO 10 K = N - KK + 1, NB + 1, -NB, NDO 10 K = N - KK + 1, NB + 1, -NB{2DO 20 K = 1, N - NB, NB}, NBDO 10 K = N - KK + 1, NB + 1, -NB{2DO 20 K = 1, N - NB, NB} |
| KK | <--- | JAKK = MOD( JA+N-1, NB ){2KK = MOD( JA+N-1, NB )}, NKK = MOD( JA+N-1, NB ){2KK = MOD( JA+N-1, NB )}, NBKK = MOD( JA+N-1, NB ){2KK = MOD( JA+N-1, NB )} |
| LLRWORK | <--- | LRWORKLLRWORK = LRWORK |
| LLWORK | <--- | INDWLLWORK = LLWORK - INDW + 1, LLWORKLLWORK = LLWORK - INDW + 1, LWORKLLWORK = LWORK |
| LWMIN | <--- | NBLWMIN = MAX( ( NP+1 )*NB, 3*NB ), NPLWMIN = MAX( ( NP+1 )*NB, 3*NB ) |
| MINSZ | <--- | ICTXTMINSZ = PJLAENV( ICTXT, 5, 'PZHETTRD', 'L', 0, 0, 0, 0 ), PJLAENVMINSZ = PJLAENV( ICTXT, 5, 'PZHETTRD', 'L', 0, 0, 0, 0 ) |
| NB | <--- | NB_NB = DESCA( NB_ ) |
| NP | <--- | IAROWNP = NUMROC( N, NB, MYROW, IAROW, NPROW ), MYROWNP = NUMROC( N, NB, MYROW, IAROW, NPROW ), NNP = NUMROC( N, NB, MYROW, IAROW, NPROW ), NBNP = NUMROC( N, NB, MYROW, IAROW, NPROW ), NPROWNP = NUMROC( N, NB, MYROW, IAROW, NPROW ), NUMROCNP = NUMROC( N, NB, MYROW, IAROW, NPROW ) |
| NPCOLB | <--- | NPROWBNPCOLB = NPROWB |
| NPROWB | <--- | SQNPCNPROWB = SQNPC |
| NPS | <--- | ANBNPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB ), NNPS = N{2NPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB )}, NUMROCNPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB ), SQNPCNPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB ) |
| NQ | <--- | JANQ = MAX( 1, NUMROC( N+JA-1, NB, MYCOL, DESCA( CSRC_ ),, MYCOLNQ = MAX( 1, NUMROC( N+JA-1, NB, MYCOL, DESCA( CSRC_ ),, CSRC_NQ = MAX( 1, NUMROC( N+JA-1, NB, MYCOL, DESCA( CSRC_ ),, NNQ = MAX( 1, NUMROC( N+JA-1, NB, MYCOL, DESCA( CSRC_ ),, NBNQ = MAX( 1, NUMROC( N+JA-1, NB, MYCOL, DESCA( CSRC_ ),, NPCOLNQ = MAX( 1, NUMROC( N+JA-1, NB, MYCOL, DESCA( CSRC_ ),, NUMROCNQ = MAX( 1, NUMROC( N+JA-1, NB, MYCOL, DESCA( CSRC_ ), |
| ONEPMIN | <--- | NONEPMIN = N*N + 3*N + 1 |
| ONEPRMIN | <--- | NONEPRMIN = 2*N |
| RWORK | <--- | TTLRWMINRWORK( 1 ) = DBLE( TTLRWMIN ){2RWORK( 1 ) = DBLE( TTLRWMIN )} |
| SQNPC | <--- | NPCOLSQNPC = INT( SQRT( DBLE( NPROW*NPCOL ) ) ), NPROWSQNPC = INT( SQRT( DBLE( NPROW*NPCOL ) ) ), NPROWBSQNPC = NPROWB |
| TTLRWMIN | <--- | NPSTTLRWMIN = 2*NPS |
| TTLWMIN | <--- | ANBTTLWMIN = 2*( ANB+1 )*( 4*NPS+2 ) + ( NPS+2 )*NPS, NPSTTLWMIN = 2*( ANB+1 )*( 4*NPS+2 ) + ( NPS+2 )*NPS |
| UPPER | <--- | LSAMEUPPER = LSAME( UPLO, 'U' ), UPLOUPPER = LSAME( UPLO, 'U' ) |
| WORK | <--- | TTLWMINWORK( 1 ) = DCMPLX( DBLE( TTLWMIN ) ){2WORK( 1 ) = DCMPLX( DBLE( TTLWMIN ) )} |
|
|
Analysis elements of the routine PZHENTRD() Put the mouse over each element to display detailed matching information
Assigned variables |
| | | ANB , BLOCK_CYCLIC_2D , CONE , CSRC_ , CTXT_ , DLEN_ , DTYPE_ , I , IACOL , IAROW , ICOFFA , ICTXT , IDUM1 , IDUM2 , INDB , INDRD , INDRE , INDTAU , INDW , INFO , IPW , IROFFA , J , JB , JX , K , KK , LLD_ , LLRWORK , LLWORK , LQUERY , LRWMIN , LWMIN , M_ , MB_ , MINSZ , N , N_ , NB , NB_ , NP , NPCOLB , NPROWB , NPS , NQ , ONE , ONEPMIN , ONEPRMIN , RSRC_ , RWORK , SQNPC , TTLRWMIN , TTLWMIN , UPPER , WORK |
|
Active variables |
| | | A , ANB , BLOCK_CYCLIC_2D , COLCTOP , CONE , CSRC_ , CTXT_ , CTXTB , D , DESCA , DESCB , DESCW , DLEN_ , DTYPE_ , E , I , IA , IACOL , IAROW , ICOFFA , ICTXT , IDUM1 , IDUM2 , IINFO , INDB , INDRD , INDRE , INDTAU , INDW , INDXG2L , INDXG2P , INFO , IPW , IROFFA , J , JA , JB , JX , K , KK , LLD_ , LLRWORK , LLWORK , LQUERY , LRWMIN , LRWORK , LSAME , LWMIN , LWORK , M_ , MB_ , MINSZ , MYCOL , MYCOLB , MYROW , MYROWB , N , N_ , NB , NB_ , NP , NPCOL , NPCOLB , NPROW , NPROWB , NPS , NQ , NUMROC , ONE , ONEPMIN , ONEPRMIN , PJLAENV , ROWCTOP , RSRC_ , RWORK , SQNPC , TAU , TTLRWMIN , TTLWMIN , UPLO , UPPER , WORK |
|
Accessed arrays [ array name : associated index ] |
| | A | : I+NB:IA+N-1,I+NB:IA+N-1 , I+NB:IA+N-1,I+NB:IA+N-1 , I+NB:IA+N-1,I+NB:IA+N-1 , IA:I-1,JA:J-1 , IA:I-1,JA:J-1 |
| | DESCA | : CSRC_ , CSRC_ , CSRC_ , CTXT_ , MB_ , MB_ , NB_ , NB_ , NB_ , RSRC_ |
| | DESCB | : DLEN_ |
| | DESCW | : CSRC_ , CSRC_ , DLEN_ |
| | E | : JX , JX |
| | IDUM1 | : 1 , 1 , 2 , 2 , 3 , 3 , 3 |
| | IDUM2 | : 1 , 2 , 3 , 3 |
| | INDXG2L | : J, NB, 0, IACOL, NPCOL , J+NB-1, NB, 0, IACOL, NPCOL |
| | INDXG2P | : IA, NB, MYROW, DESCA( RSRC_ ), NPROW , JA, NB, MYCOL, DESCA( CSRC_ ), NPCOL |
| | LSAME | : UPLO, 'L' , UPLO, 'U' |
| | NUMROC | : N, 1, 0, 0, SQNPC , N, NB, MYROW, IAROW, NPROW |
| | PJLAENV | : ICTXT, 3, 'PZHETTRD', 'L', 0, 0, 0, 0 , ICTXT, 5, 'PZHETTRD', 'L', 0, 0, 0, 0 |
| | RWORK | : 1 , 1 , INDRD , INDRD , INDRD , INDRE , INDRE , INDRE |
| | WORK | : 1 , 1 , INDB , INDB , INDB , INDB , INDTAU , INDTAU , INDTAU , INDW , INDW , IPW , IPW |
|
Conditional statements [ statement : associated predicate ] |
| | do | : ( 10 K = N - KK + 1 , NB + 1 , - NB ) , ( 20 K = 1 , N - NB , NB ) |
| | for | : ( a while ) |
| | if | : ( BLOCK_CYCLIC_2D*CSRC_*CTXT_*DLEN_*DTYPE_*LLD_*MB_*M_*NB_*N_* ) , ( NPROW.EQ. - 1 ) , ( INFO.EQ.0 ) , ( (.NOT.UPPER .AND. .NOT.LSAME( UPLO , 'L' ) ) ) , ( IROFFA.NE.ICOFFA .OR. ICOFFA.NE.0 ) , ( (DESCA( MB_ ).NE.DESCA( NB_ ) ) ) , ( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) , ( LRWORK.LT.LRWMIN .AND. .NOT.LQUERY ) , ( UPPER ) , ( LWORK.EQ. - 1 ) , ( LRWORK.EQ. - 1 ) , ( INFO.NE.0 ) , ( LQUERY ) , ( possible ) , ( N.EQ.0 ) , ( we ) , ( (( N.LT.MINSZ .OR. SQNPC.EQ.1 ) .AND. LLWORK.GE.ONEPMIN .AND. ) , ( LLWORK.GE.TTLWMIN .AND. LLRWORK.GE.TTLRWMIN .AND. .NOT. ) , ( NPROWB.GE.1 ) , ( NPROWB.GT.0 ) , ( NPROWB.EQ.1 ) , ( MYROWB.GE.0 ) , ( UPPER ) , ( KK.EQ.0 ) , ( KK.EQ.0 ) |
|
| List of variables | ANB BLOCK_CYCLIC_2D COLCTOP CONE CSRC_ CTXT_ CTXTB
| DESCB( DLEN_ ) DESCW( DLEN_ ) DLEN_ DTYPE_ I IA IACOL IAROW
| ICOFFA ICTXT IDUM1( 3 ) IDUM2( 3 ) IINFO INDB INDRD INDRE
| INDTAU INDW INDXG2L INDXG2P INFO IPW IROFFA J
| JA JB JX K KK LLD_ LLRWORK LLWORK
| LQUERY LRWMIN LRWORK LSAME LWMIN LWORK M_ MB_
| MINSZ MYCOL MYCOLB MYROW MYROWB N N_ NB
| NB_ NP NPCOL NPCOLB NPROW NPROWB NPS NQ
| NUMROC ONE ONEPMIN ONEPRMIN PJLAENV ROWCTOP RSRC_ RWORK
| SQNPC TTLRWMIN TTLWMIN UPLO UPPER WORK | | close
| |
ANB
BLOCK_CYCLIC_2D
COLCTOP
CONE
CSRC_
CTXT_
CTXTB
DESCB( DLEN_ )
DESCW( DLEN_ )
DLEN_
DTYPE_
I
IA
IACOL
IAROW
ICOFFA
ICTXT
IDUM1( 3 )
IDUM2( 3 )
IINFO
INDB
INDRD
INDRE
INDTAU
INDW
INDXG2L
INDXG2P
INFO
IPW
IROFFA
J
JA
JB
JX
K
KK
LLD_
LLRWORK
LLWORK
LQUERY
LRWMIN
LRWORK
LSAME
LWMIN
LWORK
M_
MB_
MINSZ
MYCOL
MYCOLB
MYROW
MYROWB
N
N_
NB
NB_
NP
NPCOL
NPCOLB
NPROW
NPROWB
NPS
NQ
NUMROC
ONE
ONEPMIN
ONEPRMIN
PJLAENV
ROWCTOP
RSRC_
RWORK
SQNPC
TTLRWMIN
TTLWMIN
UPLO
UPPER
WORK
313#510#220#521#547#508
| |