Routine: PZHETD2()  File: SRC\pzhetd2.f

 
 
  Purpose
  =======
  PZHETD2 reduces a complex Hermitian matrix sub( A ) to Hermitian
  tridiagonal form T by a unitary similarity transformation:
  Q' * sub( A ) * Q = T, where sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
  Notes
  =====
  Each global data object is described by an associated description
  vector.  This vector stores the information required to establish
  the mapping between an object element and its corresponding process
  and memory location.
  Let A be a generic term for any 2D block cyclically distributed array.
  Such a global array has an associated description vector DESCA.
  In the following comments, the character _ should be read as
  "of the global array".
  NOTATION        STORED IN      EXPLANATION
  --------------- -------------- --------------------------------------
  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
                                 DTYPE_A = 1.
  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                 the BLACS process grid A is
                                 distributed over. The context itself
                                 is global, but the handle (the
                                 integer value) may vary.
  M_A    (global) DESCA( M_ )    The number of rows in the global
                                 array A.
  N_A    (global) DESCA( N_ )    The number of columns in the global
                                 array A.
  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                 the rows of the array.
  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                 the columns of the array.
  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                 row of the array A is distributed.
  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
                                 first column of the array A is
                                 distributed.
  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
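
The descriptor layout above can be sketched outside Fortran. The following is a minimal single-process illustration in Python, not part of ScaLAPACK: `make_desc` and `get` are hypothetical helpers, the index constants are taken from the PARAMETER statement in the routine's source, and the example sizes are made up. In ScaLAPACK itself a descriptor is normally filled in by DESCINIT.

```python
# 1-based descriptor indices, as in the routine's PARAMETER statement.
DTYPE_, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_ = range(1, 10)
DLEN_ = 9
BLOCK_CYCLIC_2D = 1


def make_desc(ctxt, m, n, mb, nb, rsrc, csrc, lld):
    """Hypothetical helper: build a DESCA-like list of DLEN_ integers."""
    desc = [0] * DLEN_
    # Python lists are 0-based, so DESCA( k ) is stored at desc[k - 1].
    desc[DTYPE_ - 1] = BLOCK_CYCLIC_2D
    desc[CTXT_ - 1] = ctxt
    desc[M_ - 1] = m
    desc[N_ - 1] = n
    desc[MB_ - 1] = mb
    desc[NB_ - 1] = nb
    desc[RSRC_ - 1] = rsrc
    desc[CSRC_ - 1] = csrc
    desc[LLD_ - 1] = lld
    return desc


def get(desc, field):
    """Read DESCA( field ) using the 1-based Fortran index."""
    return desc[field - 1]


# Made-up example: a 1000 x 1000 global matrix in 64 x 64 blocks.
desca = make_desc(ctxt=0, m=1000, n=1000, mb=64, nb=64, rsrc=0, csrc=0, lld=512)
print(get(desca, MB_), get(desca, NB_), get(desca, LLD_))
```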
  Let K be the number of rows or columns of a distributed matrix,
  and assume that its process grid has dimension p x q.
  LOCr( K ) denotes the number of elements of K that a process
  would receive if K were distributed over the p processes of its
  process column.
  Similarly, LOCc( K ) denotes the number of elements of K that a
  process would receive if K were distributed over the q processes of
  its process row.
  The values of LOCr() and LOCc() may be determined via a call to the
  ScaLAPACK tool function, NUMROC:
          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
  An upper bound for these quantities may be computed by:
          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
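
To make LOCr() and LOCc() concrete, here is a small Python sketch of the usual block-cyclic counting rule. The function `numroc` below only mirrors the argument order of the ScaLAPACK tool function and is an illustration, not the library routine; `upper_bound` is a hypothetical helper that evaluates the bound quoted above.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows (or columns) of an n-long dimension owned by process
    iproc, for block size nb, starting process isrcproc, nprocs processes."""
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from source process
    nblocks = n // nb                              # number of complete blocks
    count = (nblocks // nprocs) * nb               # blocks every process receives
    extra = nblocks % nprocs                       # leftover complete blocks
    if mydist < extra:
        count += nb                                # one more complete block
    elif mydist == extra:
        count += n % nb                            # the trailing partial block
    return count


def ceil_div(a, b):
    return -(-a // b)


def upper_bound(n, nb, nprocs):
    """ceil( ceil(n/nb) / nprocs ) * nb, the bound given in the text."""
    return ceil_div(ceil_div(n, nb), nprocs) * nb


# 11 rows in blocks of 2 over 3 process rows, first block on process row 0.
counts = [numroc(11, 2, p, 0, 3) for p in range(3)]
print(counts, upper_bound(11, 2, 3))
```

The per-process counts always sum to the global dimension, and none exceeds the bound.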
  Arguments
  =========
  UPLO    (global input) CHARACTER
          Specifies whether the upper or lower triangular part of the
          Hermitian matrix sub( A ) is stored:
          = 'U':  Upper triangular
          = 'L':  Lower triangular
  N       (global input) INTEGER
          The number of rows and columns to be operated on, i.e. the
          order of the distributed submatrix sub( A ). N >= 0.
  A       (local input/local output) COMPLEX*16 pointer into the
          local memory to an array of dimension (LLD_A,LOCc(JA+N-1)).
          On entry, this array contains the local pieces of the
          Hermitian distributed matrix sub( A ).  If UPLO = 'U', the
          leading N-by-N upper triangular part of sub( A ) contains
          the upper triangular part of the matrix, and its strictly
          lower triangular part is not referenced. If UPLO = 'L', the
          leading N-by-N lower triangular part of sub( A ) contains the
          lower triangular part of the matrix, and its strictly upper
          triangular part is not referenced. On exit, if UPLO = 'U',
          the diagonal and first superdiagonal of sub( A ) are
          overwritten by the corresponding elements of the tridiagonal
          matrix T, and the elements above the first superdiagonal,
          with the array TAU, represent the unitary matrix Q as a
          product of elementary reflectors; if UPLO = 'L', the diagonal
          and first subdiagonal of sub( A ) are overwritten by the
          corresponding elements of the tridiagonal matrix T, and the
          elements below the first subdiagonal, with the array TAU,
          represent the unitary matrix Q as a product of elementary
          reflectors. See Further Details.
  IA      (global input) INTEGER
          The row index in the global array A indicating the first
          row of sub( A ).
  JA      (global input) INTEGER
          The column index in the global array A indicating the
          first column of sub( A ).
  DESCA   (global and local input) INTEGER array of dimension DLEN_.
          The array descriptor for the distributed matrix A.
  D       (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
          The diagonal elements of the tridiagonal matrix T:
          D(i) = A(i,i). D is tied to the distributed matrix A.
  E       (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
          if UPLO = 'U', LOCc(JA+N-2) otherwise. The off-diagonal
          elements of the tridiagonal matrix T: E(i) = A(i,i+1) if
          UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. E is tied to the
          distributed matrix A.
  TAU     (local output) COMPLEX*16 array, dimension
          LOCc(JA+N-1). This array contains the scalar factors TAU of
          the elementary reflectors. TAU is tied to the distributed
          matrix A.
  WORK    (local workspace/local output) COMPLEX*16 array,
                                                    dimension (LWORK)
          On exit, WORK( 1 ) returns the minimal and optimal LWORK.
  LWORK   (local or global input) INTEGER
          The dimension of the array WORK.
          LWORK is local input and must satisfy
          LWORK >= 3*N.
          If LWORK = -1, then LWORK is global input and a workspace
          query is assumed; the routine only calculates the minimum
          and optimal size for all work arrays. Each of these
          values is returned in the first entry of the corresponding
          work array, and no error message is issued by PXERBLA.
  INFO    (local output) INTEGER
          = 0:  successful exit
          < 0:  If the i-th argument is an array and the j-th entry had
                an illegal value, then INFO = -(i*100+j); if the i-th
                argument is a scalar and had an illegal value, then
                INFO = -i.
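
The INFO encoding can be unpacked mechanically. The sketch below is a hypothetical Python helper (`decode_info` is not a ScaLAPACK function); the argument names are taken from the PZHETD2 argument list, and the helper assumes fewer than 100 arguments so the two encodings cannot collide.

```python
ARGS = ["UPLO", "N", "A", "IA", "JA", "DESCA",
        "D", "E", "TAU", "WORK", "LWORK", "INFO"]


def decode_info(info):
    """Return None on success, else (argument name, entry index or None)."""
    if info >= 0:
        return None
    code = -info
    if code >= 100:
        i, j = divmod(code, 100)   # array argument i, illegal entry j
        return ARGS[i - 1], j
    return ARGS[code - 1], None    # scalar argument


print(decode_info(-606))  # illegal entry 6 (NB_) of DESCA
print(decode_info(-11))   # illegal scalar LWORK
```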
  Further Details
  ===============
  If UPLO = 'U', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(n-1) . . . H(2) H(1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a complex scalar, and v is a complex vector with
  v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
  A(ia:ia+i-2,ja+i), and tau in TAU(ja+i-1).
  If UPLO = 'L', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(1) H(2) . . . H(n-1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a complex scalar, and v is a complex vector with
  v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
  A(ia+i+1:ia+n-1,ja+i-1), and tau in TAU(ja+i-1).
  The contents of sub( A ) on exit are illustrated by the following
  examples with n = 5:
  if UPLO = 'U':                       if UPLO = 'L':
    (  d   e   v2  v3  v4 )              (  d                  )
    (      d   e   v3  v4 )              (  e   d              )
    (          d   e   v4 )              (  v1  e   d          )
    (              d   e  )              (  v1  v2  e   d      )
    (                  d  )              (  v1  v2  v3  e   d  )
  where d and e denote diagonal and off-diagonal elements of T, and vi
  denotes an element of the vector defining H(i).
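
The arithmetic behind the UPLO = 'L' case can be tried on a small dense matrix. The following is a single-process Python sketch under stated assumptions: it works on a full Hermitian matrix held as nested lists, ignores the block-cyclic distribution and the broadcast entirely, and uses hypothetical helper names (`larfg`, `hetd2_lower`). It follows the reflector convention above, H = I - tau * v * v', and the rank-2 update A := A - v*w' - w*v' used in the routine.

```python
import math


def larfg(alpha, x):
    """Return (beta, v, tau) with H' * [alpha; x] = [beta; 0],
    where H = I - tau * u * u' and u = [1; v]."""
    alpha = complex(alpha)
    xnorm = math.sqrt(sum(abs(t) ** 2 for t in x))
    if xnorm == 0.0 and alpha.imag == 0.0:
        return alpha.real, list(x), 0j              # H is the identity
    beta = -math.copysign(math.hypot(abs(alpha), xnorm), alpha.real)
    tau = complex((beta - alpha.real) / beta, -alpha.imag / beta)
    v = [t / (alpha - beta) for t in x]
    return beta, v, tau


def hetd2_lower(a):
    """Unblocked reduction of a Hermitian matrix to real tridiagonal form.
    Returns (d, e, tau); the input matrix is not modified."""
    n = len(a)
    a = [[complex(z) for z in row] for row in a]
    d, e, tau = [0.0] * n, [0.0] * (n - 1), [0j] * (n - 1)
    for j in range(n - 1):
        m = n - j - 1                                # order of trailing block
        col = [a[i][j] for i in range(j + 1, n)]
        beta, v_tail, t = larfg(col[0], col[1:])
        v = [1 + 0j] + v_tail
        if t != 0:
            # x := tau * A22 * v
            x = [t * sum(a[j + 1 + r][j + 1 + c] * v[c] for c in range(m))
                 for r in range(m)]
            # w := x - 1/2 * tau * (x' * v) * v
            xhv = sum(x[r].conjugate() * v[r] for r in range(m))
            w = [x[r] - 0.5 * t * xhv * v[r] for r in range(m)]
            # A22 := A22 - v * w' - w * v'
            for r in range(m):
                for c in range(m):
                    a[j + 1 + r][j + 1 + c] -= (v[r] * w[c].conjugate()
                                                + w[r] * v[c].conjugate())
        d[j], e[j], tau[j] = a[j][j].real, beta, t
    d[n - 1] = a[n - 1][n - 1].real
    return d, e, tau


A = [[2, 1 - 1j, 0.5j],
     [1 + 1j, 3, 2],
     [-0.5j, 2, 1]]
d, e, tau = hetd2_lower(A)
print(d, e)
```

Because the transformation is a unitary similarity, the trace and the Frobenius norm of the input carry over to T, which gives a cheap sanity check on d and e.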
  Alignment requirements
  ======================
  The distributed submatrix sub( A ) must satisfy the following
  alignment condition, i.e. this expression must be true:
  ( MB_A.EQ.NB_A .AND. IROFFA.EQ.ICOFFA ) with
  IROFFA = MOD( IA-1, MB_A ) and ICOFFA = MOD( JA-1, NB_A ).
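
The alignment condition is easy to check before calling the routine. A minimal Python sketch, with `is_aligned` as a hypothetical helper name:

```python
def is_aligned(ia, ja, mb_a, nb_a):
    """True when sub( A ) meets the alignment requirement stated above."""
    iroffa = (ia - 1) % mb_a   # row offset of IA inside its block
    icoffa = (ja - 1) % nb_a   # column offset of JA inside its block
    return mb_a == nb_a and iroffa == icoffa


print(is_aligned(3, 67, 64, 64))  # both offsets are 2
print(is_aligned(2, 1, 64, 64))   # offsets 1 and 0 differ
```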
  =====================================================================
 
      SUBROUTINE PZHETD2( UPLO, N, A, IA, JA, DESCA, D, E, TAU, WORK,
     $                    LWORK, INFO )

*  -- ScaLAPACK auxiliary routine (version 1.7) --
*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
*     and University of California, Berkeley.
*     May 1, 1997

*     .. Scalar Arguments ..
      CHARACTER          UPLO
      INTEGER            IA, INFO, JA, LWORK, N
*     ..
*     .. Array Arguments ..
*     (declarations restored from the Arguments section above)
      INTEGER            DESCA( * )
      DOUBLE PRECISION   D( * ), E( * )
      COMPLEX*16         A( * ), TAU( * ), WORK( * )
*     ..
*     .. Parameters ..
      INTEGER            BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
     $                   LLD_, MB_, M_, NB_, N_, RSRC_
      PARAMETER          ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1,
     $                     CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6,
     $                     RSRC_ = 7, CSRC_ = 8, LLD_ = 9 )
      COMPLEX*16         HALF, ONE, ZERO
      PARAMETER          ( HALF = ( 0.5D+0, 0.0D+0 ),
     $                   ONE = ( 1.0D+0, 0.0D+0 ),
     $                   ZERO = ( 0.0D+0, 0.0D+0 ) )
*     ..
*     .. Local Scalars ..
      LOGICAL            LQUERY, UPPER
      INTEGER            IACOL, IAROW, ICOFFA, ICTXT, II, IK, IROFFA, J,
     $                   JJ, JK, JN, LDA, LWMIN, MYCOL, MYROW, NPCOL,
     $                   NPROW
      COMPLEX*16         ALPHA, TAUI
*     ..
*     .. External Subroutines ..
      EXTERNAL           BLACS_ABORT, BLACS_GRIDINFO, CHK1MAT, INFOG2L,
     $                   PXERBLA, ZAXPY, ZGEBR2D, ZGEBS2D,
     $                   ZHEMV, ZHER2, ZLARFG
*     ..
*     .. External Functions ..
      LOGICAL            LSAME
      COMPLEX*16         ZDOTC
      EXTERNAL           LSAME, ZDOTC
*     ..
*     .. Intrinsic Functions ..
      INTRINSIC          DBLE, DCMPLX
*     ..
*     .. Executable Statements ..

*     Get grid parameters

      ICTXT = DESCA( CTXT_ )
      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )

*     Test the input parameters

      INFO = 0
      IF( NPROW.EQ.-1 ) THEN
         INFO = -( 600+CTXT_ )
      ELSE
         UPPER = LSAME( UPLO, 'U' )
         CALL CHK1MAT( N, 2, N, 2, IA, JA, DESCA, 6, INFO )
         LWMIN = 3*N

         WORK( 1 ) = DCMPLX( DBLE( LWMIN ) )
         LQUERY = ( LWORK.EQ.-1 )
         IF( INFO.EQ.0 ) THEN
            IROFFA = MOD( IA-1, DESCA( MB_ ) )
            ICOFFA = MOD( JA-1, DESCA( NB_ ) )
            IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN
               INFO = -1
            ELSE IF( IROFFA.NE.ICOFFA ) THEN
               INFO = -5
            ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN
               INFO = -( 600+NB_ )
            ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN
               INFO = -11
            END IF
         END IF
      END IF

      IF( INFO.NE.0 ) THEN
         CALL PXERBLA( ICTXT, 'PZHETD2', -INFO )
         CALL BLACS_ABORT( ICTXT, 1 )
         RETURN
      ELSE IF( LQUERY ) THEN
         RETURN
      END IF

*     Quick return if possible

      IF( N.LE.0 )
     $   RETURN

*     Compute local information

      LDA = DESCA( LLD_ )
      CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL, II, JJ,
     $              IAROW, IACOL )

      IF( UPPER ) THEN

*        Process(IAROW, IACOL) owns block to be reduced

         IF( MYCOL.EQ.IACOL ) THEN
            IF( MYROW.EQ.IAROW ) THEN

*              Reduce the upper triangle of sub( A )

               IK = II + N - 1 + ( JJ+N-2 )*LDA
               A( IK ) = DBLE( A( IK ) )
               DO 10 J = N - 1, 1, -1
                  IK = II + J - 1
                  JK = JJ + J - 1

*                 Generate elementary reflector H(i) = I - tau * v * v'
*                 to annihilate A(IA:IA+J-1,JA:JA+J-1)

                  ALPHA = A( IK+JK*LDA )
                  CALL ZLARFG( J, ALPHA, A( II+JK*LDA ), 1, TAUI )
                  E( JK+1 ) = DBLE( ALPHA )

                  IF( TAUI.NE.ZERO ) THEN

*                    Apply H(i) from both sides to
*                    A(IA:IA+J-1,JA:JA+J-1)

                     A( IK+JK*LDA ) = ONE

*                    Compute x := tau * A * v storing x in TAU(1:i)

                     CALL ZHEMV( UPLO, J, TAUI, A( II+( JJ-1 )*LDA ),
     $                           LDA, A( II+JK*LDA ), 1, ZERO,
     $                           TAU( JJ ), 1 )

*                    Compute w := x - 1/2 * tau * (x'*v) * v

                     ALPHA = -HALF*TAUI*ZDOTC( J, TAU( JJ ), 1,
     $                       A( II+JK*LDA ), 1 )
                     CALL ZAXPY( J, ALPHA, A( II+JK*LDA ), 1,
     $                           TAU( JJ ), 1 )

*                    Apply the transformation as a rank-2 update:
*                       A := A - v * w' - w * v'

                     CALL ZHER2( UPLO, J, -ONE, A( II+JK*LDA ), 1,
     $                           TAU( JJ ), 1, A( II+( JJ-1 )*LDA ),
     $                           LDA )
                  END IF

*                 Copy D, E, TAU to broadcast them columnwise.

                  A( IK+JK*LDA ) = DCMPLX( E( JK+1 ) )
                  D( JK+1 ) = DBLE( A( IK+1+JK*LDA ) )
                  WORK( J+1 ) = DCMPLX( D( JK+1 ) )
                  WORK( N+J+1 ) = DCMPLX( E( JK+1 ) )
                  TAU( JK+1 ) = TAUI
                  WORK( 2*N+J+1 ) = TAU( JK+1 )

   10          CONTINUE
               D( JJ ) = DBLE( A( II+( JJ-1 )*LDA ) )
               WORK( 1 ) = DCMPLX( D( JJ ) )
               WORK( N+1 ) = ZERO
               WORK( 2*N+1 ) = ZERO

               CALL ZGEBS2D( ICTXT, 'Columnwise', ' ', 1, 3*N, WORK, 1 )

            ELSE
               CALL ZGEBR2D( ICTXT, 'Columnwise', ' ', 1, 3*N, WORK, 1,
     $                       IAROW, IACOL )
               DO 20 J = 2, N
                  JN = JJ + J - 1
                  D( JN ) = DBLE( WORK( J ) )
                  E( JN ) = DBLE( WORK( N+J ) )
                  TAU( JN ) = WORK( 2*N+J )
   20          CONTINUE
               D( JJ ) = DBLE( WORK( 1 ) )
            END IF
         END IF

      ELSE

*        Process(IAROW, IACOL) owns block to be factorized

         IF( MYCOL.EQ.IACOL ) THEN
            IF( MYROW.EQ.IAROW ) THEN

*              Reduce the lower triangle of sub( A )

               A( II+( JJ-1 )*LDA ) = DBLE( A( II+( JJ-1 )*LDA ) )
               DO 30 J = 1, N - 1
                  IK = II + J - 1
                  JK = JJ + J - 1

*                 Generate elementary reflector H(i) = I - tau * v * v'
*                 to annihilate A(IA+J-JA+2:IA+N-1,JA+J-1)

                  ALPHA = A( IK+1+( JK-1 )*LDA )
                  CALL ZLARFG( N-J, ALPHA, A( IK+2+( JK-1 )*LDA ), 1,
     $                         TAUI )
                  E( JK ) = DBLE( ALPHA )

                  IF( TAUI.NE.ZERO ) THEN

*                    Apply H(i) from both sides to
*                    A(IA+J-JA+1:IA+N-1,JA+J+1:JA+N-1)

                     A( IK+1+( JK-1 )*LDA ) = ONE

*                    Compute x := tau * A * v storing x in TAU(i:n-1)

                     CALL ZHEMV( UPLO, N-J, TAUI, A( IK+1+JK*LDA ),
     $                           LDA, A( IK+1+( JK-1 )*LDA ), 1,
     $                           ZERO, TAU( JK ), 1 )

*                    Compute w := x - 1/2 * tau * (x'*v) * v

                     ALPHA = -HALF*TAUI*ZDOTC( N-J, TAU( JK ), 1,
     $                       A( IK+1+( JK-1 )*LDA ), 1 )
                     CALL ZAXPY( N-J, ALPHA, A( IK+1+( JK-1 )*LDA ),
     $                           1, TAU( JK ), 1 )

*                    Apply the transformation as a rank-2 update:
*                       A := A - v * w' - w * v'

                     CALL ZHER2( UPLO, N-J, -ONE,
     $                           A( IK+1+( JK-1 )*LDA ), 1,
     $                           TAU( JK ), 1, A( IK+1+JK*LDA ),
     $                           LDA )
                  END IF

*                 Copy D(JK), E(JK), TAU(JK) to broadcast them
*                 columnwise.

                  A( IK+1+( JK-1 )*LDA ) = DCMPLX( E( JK ) )
                  D( JK ) = DBLE( A( IK+( JK-1 )*LDA ) )
                  WORK( J ) = DCMPLX( D( JK ) )
                  WORK( N+J ) = DCMPLX( E( JK ) )
                  TAU( JK ) = TAUI
                  WORK( 2*N+J ) = TAU( JK )
   30          CONTINUE
               JN = JJ + N - 1
               D( JN ) = DBLE( A( II+N-1+( JN-1 )*LDA ) )
               WORK( N ) = DCMPLX( D( JN ) )
               TAU( JN ) = ZERO
               WORK( 2*N ) = ZERO

               CALL ZGEBS2D( ICTXT, 'Columnwise', ' ', 1, 3*N-1, WORK,
     $                       1 )

            ELSE
               CALL ZGEBR2D( ICTXT, 'Columnwise', ' ', 1, 3*N-1, WORK,
     $                       1, IAROW, IACOL )
               DO 40 J = 1, N - 1
                  JN = JJ + J - 1
                  D( JN ) = DBLE( WORK( J ) )
                  E( JN ) = DBLE( WORK( N+J ) )
                  TAU( JN ) = WORK( 2*N+J )
   40          CONTINUE
               JN = JJ + N - 1
               D( JN ) = DBLE( WORK( N ) )
               TAU( JN ) = ZERO
            END IF
         END IF
      END IF

      WORK( 1 ) = DCMPLX( DBLE( LWMIN ) )

      RETURN

*     End of PZHETD2

      END
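
In the UPLO = 'L' branch, the owning process packs D, E and TAU into WORK before the columnwise broadcast, and the other processes in the column unpack them. The Python sketch below mimics that layout with 0-based lists. It is only an illustration of the indexing: `pack_lower` and `unpack_lower` are hypothetical names, the local offset JJ is taken as 1, and no communication takes place.

```python
def pack_lower(d, e, tau):
    """d has n entries; e and tau have n-1 entries each.
    Returns the 3*n - 1 values that would be broadcast."""
    n = len(d)
    work = [0j] * (3 * n)
    for j in range(n - 1):
        work[j] = complex(d[j])          # WORK( J )     = D
        work[n + j] = complex(e[j])      # WORK( N+J )   = E
        work[2 * n + j] = tau[j]         # WORK( 2*N+J ) = TAU
    work[n - 1] = complex(d[n - 1])      # WORK( N ) = last diagonal entry
    work[2 * n - 1] = 0j                 # WORK( 2*N ) = ZERO, unused E slot
    return work[:3 * n - 1]


def unpack_lower(work, n):
    """Inverse of pack_lower on a receiving process."""
    d = [work[j].real for j in range(n)]
    e = [work[n + j].real for j in range(n - 1)]
    tau = [work[2 * n + j] for j in range(n - 1)] + [0j]  # TAU( JN ) = ZERO
    return d, e, tau


d_in, e_in, tau_in = [1.0, 2.0, 3.0], [0.5, -0.25], [1 + 1j, 0.5j]
buf = pack_lower(d_in, e_in, tau_in)
d_out, e_out, tau_out = unpack_lower(buf, 3)
print(len(buf), d_out, e_out, tau_out)
```

The single 3*N-sized buffer is why the routine requires LWORK >= 3*N.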