Routine: PZHETRD()  File: SRC\pzhetrd.f

 
 
# lines: 428
  # code: 428
  # comment: 0
  # blank:0
# Variables:52
# Callers:2
# Callings:2
# Words:173
# Keywords:94
 

 

..
     .. Array Arguments ..
     ..
  Purpose
  =======
  PZHETRD reduces a complex Hermitian matrix sub( A ) to Hermitian
  tridiagonal form T by an unitary similarity transformation:
  Q' * sub( A ) * Q = T, where sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
  Notes
  =====
  Each global data object is described by an associated description
  vector.  This vector stores the information required to establish
  the mapping between an object element and its corresponding process
  and memory location.
  Let A be a generic term for any 2D block cyclicly distributed array.
  Such a global array has an associated description vector DESCA.
  In the following comments, the character _ should be read as
  "of the global array".
  NOTATION        STORED IN      EXPLANATION
  --------------- -------------- --------------------------------------
  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
                                 DTYPE_A = 1.
  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                 the BLACS process grid A is distribu-
                                 ted over. The context itself is glo-
                                 bal, but the handle (the integer
                                 value) may vary.
  M_A    (global) DESCA( M_ )    The number of rows in the global
                                 array A.
  N_A    (global) DESCA( N_ )    The number of columns in the global
                                 array A.
  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                 the rows of the array.
  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                 the columns of the array.
  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                 row of the array A is distributed.
  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
                                 first column of the array A is
                                 distributed.
  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
  Let K be the number of rows or columns of a distributed matrix,
  and assume that its process grid has dimension p x q.
  LOCr( K ) denotes the number of elements of K that a process
  would receive if K were distributed over the p processes of its
  process column.
  Similarly, LOCc( K ) denotes the number of elements of K that a
  process would receive if K were distributed over the q processes of
  its process row.
  The values of LOCr() and LOCc() may be determined via a call to the
  ScaLAPACK tool function, NUMROC:
          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
  An upper bound for these quantities may be computed by:
          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
  Arguments
  =========
  UPLO    (global input) CHARACTER
          Specifies whether the upper or lower triangular part of the
          Hermitian matrix sub( A ) is stored:
          = 'U':  Upper triangular
          = 'L':  Lower triangular
  N       (global input) INTEGER
          The number of rows and columns to be operated on, i.e. the
          order of the distributed submatrix sub( A ). N >= 0.
  A       (local input/local output) COMPLEX*16 pointer into the
          local memory to an array of dimension (LLD_A,LOCc(JA+N-1)).
          On entry, this array contains the local pieces of the
          Hermitian distributed matrix sub( A ).  If UPLO = 'U', the
          leading N-by-N upper triangular part of sub( A ) contains
          the upper triangular part of the matrix, and its strictly
          lower triangular part is not referenced. If UPLO = 'L', the
          leading N-by-N lower triangular part of sub( A ) contains the
          lower triangular part of the matrix, and its strictly upper
          triangular part is not referenced. On exit, if UPLO = 'U',
          the diagonal and first superdiagonal of sub( A ) are over-
          written by the corresponding elements of the tridiagonal
          matrix T, and the elements above the first superdiagonal,
          with the array TAU, represent the unitary matrix Q as a
          product of elementary reflectors; if UPLO = 'L', the diagonal
          and first subdiagonal of sub( A ) are overwritten by the
          corresponding elements of the tridiagonal matrix T, and the
          elements below the first subdiagonal, with the array TAU,
          represent the unitary matrix Q as a product of elementary
          reflectors. See Further Details.
  IA      (global input) INTEGER
          The row index in the global array A indicating the first
          row of sub( A ).
  JA      (global input) INTEGER
          The column index in the global array A indicating the
          first column of sub( A ).
  DESCA   (global and local input) INTEGER array of dimension DLEN_.
          The array descriptor for the distributed matrix A.
  D       (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
          The diagonal elements of the tridiagonal matrix T:
          D(i) = A(i,i). D is tied to the distributed matrix A.
  E       (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
          if UPLO = 'U', LOCc(JA+N-2) otherwise. The off-diagonal
          elements of the tridiagonal matrix T: E(i) = A(i,i+1) if
          UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. E is tied to the
          distributed matrix A.
  TAU     (local output) COMPLEX*16, array, dimension
          LOCc(JA+N-1). This array contains the scalar factors TAU of
          the elementary reflectors. TAU is tied to the distributed
          matrix A.
  WORK    (local workspace/local output) COMPLEX*16 array,
                                                  dimension (LWORK)
          On exit, WORK( 1 ) returns the minimal and optimal LWORK.
  LWORK   (local or global input) INTEGER
          The dimension of the array WORK.
          LWORK is local input and must be at least
          LWORK >= MAX( NB * ( NP +1 ), 3 * NB )
          where NB = MB_A = NB_A,
          NP = NUMROC( N, NB, MYROW, IAROW, NPROW ),
          IAROW = INDXG2P( IA, NB, MYROW, RSRC_A, NPROW ).
          INDXG2P and NUMROC are ScaLAPACK tool functions;
          MYROW, MYCOL, NPROW and NPCOL can be determined by calling
          the subroutine BLACS_GRIDINFO.
          If LWORK = -1, then LWORK is global input and a workspace
          query is assumed; the routine only calculates the minimum
          and optimal size for all work arrays. Each of these
          values is returned in the first entry of the corresponding
          work array, and no error message is issued by PXERBLA.
  INFO    (global output) INTEGER
          = 0:  successful exit
          < 0:  If the i-th argument is an array and the j-entry had
                an illegal value, then INFO = -(i*100+j), if the i-th
                argument is a scalar and had an illegal value, then
                INFO = -i.
  Further Details
  ===============
  If UPLO = 'U', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(n-1) . . . H(2) H(1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a complex scalar, and v is a complex vector with
  v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
  A(ia:ia+i-2,ja+i), and tau in TAU(ja+i-1).
  If UPLO = 'L', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(1) H(2) . . . H(n-1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a complex scalar, and v is a complex vector with
  v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
  A(ia+i+1:ia+n-1,ja+i-1), and tau in TAU(ja+i-1).
  The contents of sub( A ) on exit are illustrated by the following
  examples with n = 5:
  if UPLO = 'U':                       if UPLO = 'L':
    (  d   e   v2  v3  v4 )              (  d                  )
    (      d   e   v3  v4 )              (  e   d              )
    (          d   e   v4 )              (  v1  e   d          )
    (              d   e  )              (  v1  v2  e   d      )
    (                  d  )              (  v1  v2  v3  e   d  )
  where d and e denote diagonal and off-diagonal elements of T, and vi
  denotes an element of the vector defining H(i).
  Alignment requirements
  ======================
  The distributed submatrix sub( A ) must verify some alignment proper-
  ties, namely the following expression should be true:
  ( MB_A.EQ.NB_A .AND. IROFFA.EQ.ICOFFA .AND. IROFFA.EQ.0 ) with
  IROFFA = MOD( IA-1, MB_A ) and ICOFFA = MOD( JA-1, NB_A ).
  =====================================================================
     .. Parameters ..

 
Display dynamic version Find AutoScroll Reload FontSize: - + Hide Comments Hide Blanks Frame FullScreen MailPrint

 
001        SUBROUTINE PZHETRD( UPLO , N , A , IA , JA , DESCA , D , E , TAU , WORK ,
002       $LWORK , INFO )
003  
004  *     -- ScaLAPACK routine(version 1.7) --
005  *     University of Tennessee , Knoxville , Oak Ridge National Laboratory ,
006  *     and University of California , Berkeley.
007  *     May 1 , 1997
008  
009  *     .. Scalar Arguments ..
010        CHARACTER UPLO
011        INTEGER IA , INFO , JA , LWORK , N
012        INTEGER BLOCK_CYCLIC_2D , CSRC_ , CTXT_ , DLEN_ , DTYPE_ ,
013       $LLD_ , MB_ , M_ , NB_ , N_ , RSRC_
014        PARAMETER( BLOCK_CYCLIC_2D = 1 , DLEN_ = 9 , DTYPE_ = 1 ,
015       $CTXT_ = 2 , M_ = 3 , N_ = 4 , MB_ = 5 , NB_ = 6 ,
016       $RSRC_ = 7 , CSRC_ = 8 , LLD_ = 9 )
017        DOUBLE PRECISION ONE
018        PARAMETER( ONE = 1.0D + 0 )
019        COMPLEX*16 CONE
020        PARAMETER( CONE =( 1.0D + 0 , 0.0D + 0 ) )
021  *     ..
022  *     .. Local Scalars ..
023        LOGICAL LQUERY , UPPER
024        CHARACTER COLCTOP , ROWCTOP
025        INTEGER I , IACOL , IAROW , ICOFFA , ICTXT , IINFO , IPW ,
026       $IROFFA , J , JB , JX , K , KK , LWMIN , MYCOL , MYROW ,
027       $NB , NP , NPCOL , NPROW , NQ
028  *     ..
029  *     .. Local Arrays ..
030        INTEGER DESCW( DLEN_ ) , IDUM1( 2 ) , IDUM2( 2 )
031  *     ..
032  *     .. External Subroutines ..
033        EXTERNAL BLACS_GRIDINFO , CHK1MAT , DESCSET , PCHK1MAT ,
034       $PB_TOPGET , PB_TOPSET , PXERBLA , PZHER2K ,
035       $PZHETD2 , PZLATRD  
036  *     ..
037  *     .. External Functions ..
038        LOGICAL LSAME
039        INTEGER INDXG2L , INDXG2P , NUMROC
040        EXTERNAL LSAME , INDXG2L , INDXG2P , NUMROC
041  *     ..
042  *     .. Intrinsic Functions ..
043        INTRINSIC DBLE , DCMPLX , ICHAR , MAX , MIN , MOD
044  *     ..
045  *     .. Executable Statements ..
046  
047  *     Get grid parameters
048  
049        ICTXT = DESCA( CTXT_ )
050        CALL BLACS_GRIDINFO( ICTXT , NPROW , NPCOL , MYROW , MYCOL )
051  
052  *     Test the input parameters
053  
054        INFO = 0
055        IF( NPROW.EQ. - 1 ) THEN
056            INFO = - (600 + CTXT_)
057        ELSE
058            CALL CHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , INFO )
059            UPPER = LSAME( UPLO , 'U' )
060            IF( INFO.EQ.0 ) THEN
061                NB = DESCA( NB_ )
062                IROFFA = MOD( IA - 1 , DESCA( MB_ ) )
063                ICOFFA = MOD( JA - 1 , DESCA( NB_ ) )
064                IAROW = INDXG2P( IA , NB , MYROW , DESCA( RSRC_ ) , NPROW )
065                IACOL = INDXG2P( JA , NB , MYCOL , DESCA( CSRC_ ) , NPCOL )
066                NP = NUMROC( N , NB , MYROW , IAROW , NPROW )
067                NQ = MAX( 1 , NUMROC( N + JA - 1 , NB , MYCOL , DESCA( CSRC_ ) ,
068       $        NPCOL ) )
069                LWMIN = MAX((NP + 1)*NB , 3*NB )
070  
071                WORK( 1 ) = DCMPLX( DBLE( LWMIN ) )
072                LQUERY =( LWORK.EQ. - 1 )
073                IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO , 'L' ) ) THEN
074                    INFO = - 1
075                ELSE IF( IROFFA.NE.ICOFFA .OR. ICOFFA.NE.0 ) THEN
076                    INFO = - 5
077                ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN
078                    INFO = - (600 + NB_)
079                ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN
080                    INFO = - 11
081                END IF
082            END IF
083            IF( UPPER ) THEN
084                IDUM1( 1 ) = ICHAR( 'U' )
085            ELSE
086                IDUM1( 1 ) = ICHAR( 'L' )
087            END IF
088            IDUM2( 1 ) = 1
089            IF( LWORK.EQ. - 1 ) THEN
090                IDUM1( 2 ) = - 1
091            ELSE
092                IDUM1( 2 ) = 1
093            END IF
094            IDUM2( 2 ) = 11
095            CALL PCHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , 2 , IDUM1 , IDUM2 ,
096       $    INFO )
097        END IF
098  
099        IF( INFO.NE.0 ) THEN
100            CALL PXERBLA( ICTXT , 'PZHETRD' , - INFO )
101            RETURN
102        ELSE IF( LQUERY ) THEN
103            RETURN
104        END IF
105  
106  *     Quick return if possible
107  
108        IF( N.EQ.0 )
109       $    RETURN
110  
111            CALL PB_TOPGET( ICTXT , 'Combine' , 'Columnwise' , COLCTOP )
112            CALL PB_TOPGET( ICTXT , 'Combine' , 'Rowwise' , ROWCTOP )
113            CALL PB_TOPSET( ICTXT , 'Combine' , 'Columnwise' , '1 - tree' )
114            CALL PB_TOPSET( ICTXT , 'Combine' , 'Rowwise' , '1 - tree' )
115  
116            IPW = NP * NB + 1
117  
118            IF( UPPER ) THEN
119  
120  *             Reduce the upper triangle of sub( A ).
121  
122                KK = MOD( JA + N - 1 , NB )
123                IF( KK.EQ.0 )
124       $            KK = NB
125                    CALL DESCSET( DESCW , N , NB , NB , NB , IAROW , INDXG2P( JA + N - KK ,
126       $            NB , MYCOL , DESCA( CSRC_ ) , NPCOL ) , ICTXT ,
127       $            MAX( 1 , NP ) )
128  
129                    DO 10 K = N - KK + 1 , NB + 1 , - NB
130                        JB = MIN( N - K + 1 , NB )
131                        I = IA + K - 1
132                        J = JA + K - 1
133  
134  *                     Reduce columns I : I + NB - 1 to tridiagonal form and form
135  *                     the matrix W which is needed to update the unreduced part of
136  *                     the matrix
137  
138                        CALL PZLATRD ( UPLO , K + JB - 1 , JB , A , IA , JA , DESCA , D , E , TAU ,
139       $                WORK , 1 , 1 , DESCW , WORK( IPW ) )
140  
141  *                     Update the unreduced submatrix A(IA : I - 1 , JA : J - 1) , using an
142  *                     update of the form :
143  *                     A(IA : I - 1 , JA : J - 1) := A(IA : I - 1 , JA : J - 1) - V*W' - W*V'
144  
145                        CALL PZHER2K( UPLO , 'No transpose' , K - 1 , JB , - CONE , A , IA ,
146       $                J , DESCA , WORK , 1 , 1 , DESCW , ONE , A , IA , JA ,
147       $                DESCA )
148  
149  *                     Copy last superdiagonal element back into sub( A )
150  
151                        JX = MIN( INDXG2L( J , NB , 0 , IACOL , NPCOL ) , NQ )
152                        CALL PZELSET( A , I - 1 , J , DESCA , DCMPLX( E( JX ) ) )
153  
154                        DESCW( CSRC_ ) = MOD( DESCW( CSRC_ ) + NPCOL - 1 , NPCOL )
155  
156     10             CONTINUE
157  
158  *                 Use unblocked code to reduce the last or only block
159  
160                    CALL PZHETD2 ( UPLO , MIN( N , NB ) , A , IA , JA , DESCA , D , E ,
161       $            TAU , WORK , LWORK , IINFO )
162  
163                ELSE
164  
165  *                 Reduce the lower triangle of sub( A )
166  
167                    KK = MOD( JA + N - 1 , NB )
168                    IF( KK.EQ.0 )
169       $                KK = NB
170                        CALL DESCSET( DESCW , N , NB , NB , NB , IAROW , IACOL , ICTXT ,
171       $                MAX( 1 , NP ) )
172  
173                        DO 20 K = 1 , N - NB , NB
174                            I = IA + K - 1
175                            J = JA + K - 1
176  
177  *                         Reduce columns I : I + NB - 1 to tridiagonal form and form
178  *                         the matrix W which is needed to update the unreduced part
179  *                         of the matrix
180  
181                            CALL PZLATRD ( UPLO , N - K + 1 , NB , A , I , J , DESCA , D , E , TAU ,
182       $                    WORK , K , 1 , DESCW , WORK( IPW ) )
183  
184  *                         Update the unreduced submatrix A(I + NB : IA + N - 1 , I + NB : IA + N - 1) ,
185  *                         using an update of the form : A(I + NB : IA + N - 1 , I + NB : IA + N - 1) :=
186  *                         A(I + NB : IA + N - 1 , I + NB : IA + N - 1) - V*W' - W*V'
187  
188                            CALL PZHER2K( UPLO , 'No transpose' , N - K - NB + 1 , NB , - CONE , A ,
189       $                    I + NB , J , DESCA , WORK , K + NB , 1 , DESCW , ONE , A ,
190       $                    I + NB , J + NB , DESCA )
191  
192  *                         Copy last subdiagonal element back into sub( A )
193  
194                            JX = MIN( INDXG2L( J + NB - 1 , NB , 0 , IACOL , NPCOL ) , NQ )
195                            CALL PZELSET( A , I + NB , J + NB - 1 , DESCA , DCMPLX( E( JX ) ) )
196  
197                            DESCW( CSRC_ ) = MOD( DESCW( CSRC_ ) + 1 , NPCOL )
198  
199     20                 CONTINUE
200  
201  *                     Use unblocked code to reduce the last or only block
202  
203                        CALL PZHETD2 ( UPLO , KK , A , IA + K - 1 , JA + K - 1 , DESCA , D , E ,
204       $                TAU , WORK , LWORK , IINFO )
205                    END IF
206  
207                    CALL PB_TOPSET( ICTXT , 'Combine' , 'Columnwise' , COLCTOP )
208                    CALL PB_TOPSET( ICTXT , 'Combine' , 'Rowwise' , ROWCTOP )
209  
210                    WORK( 1 ) = DCMPLX( DBLE( LWMIN ) )
211  
212                    RETURN
213  
214  *                 End of PZHETRD
215  
216                END