Routine: PZHETTRD()  File: SRC\pzhettrd.f

 
 
# lines: 1205
  # code: 1205
  # comment: 0
  # blank:0
# Variables:113
# Callers:1
# Callings:3
# Words:472
# Keywords:296
 

 

..
     .. Array Arguments ..
     ..
     Purpose
     =======
     PZHETTRD reduces a complex Hermitian matrix sub( A ) to Hermitian
     tridiagonal form T by an unitary similarity transformation:
     Q' * sub( A ) * Q = T, where sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
     Notes
     =====
     Each global data object is described by an associated description
     vector.  This vector stores the information required to establish
     the mapping between an object element and its corresponding
     process and memory location.
     Let A be a generic term for any 2D block cyclicly distributed
     array.
     Such a global array has an associated description vector DESCA.
     In the following comments, the character _ should be read as
     "of the global array".
     NOTATION        STORED IN      EXPLANATION
     --------------- -------------- -----------------------------------
     DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
     DTYPE_A = 1.
     CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle,
     indicating the BLACS process grid A is distribu-
     ted over. The context itself is glo-
     bal, but the handle (the integer
     value) may vary.
     M_A    (global) DESCA( M_ )    The number of rows in the global
     array A.
     N_A    (global) DESCA( N_ )    The number of columns in the global
     array A.
     MB_A   (global) DESCA( MB_ )   The blocking factor used to
     distribute the rows of the array.
     NB_A   (global) DESCA( NB_ )   The blocking factor used to
     distribute the columns of the array.
     RSRC_A (global) DESCA( RSRC_ ) The process row over which the
     first row of the array A is distributed.
     CSRC_A (global) DESCA( CSRC_ ) The process column over which the
     first column of the array A is
     distributed.
     LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
     array.  LLD_A >= MAX(1,LOCp(M_A)).
     Let K be the number of rows or columns of a distributed matrix,
     and assume that its process grid has dimension p x q.
     LOCp( K ) denotes the number of elements of K that a process
     would receive if K were distributed over the p processes of its
     process column.
     Similarly, LOCq( K ) denotes the number of elements of K that a
     process would receive if K were distributed over the q processes
     of its process row.
     The values of LOCp() and LOCq() may be determined via a call to
     the ScaLAPACK tool function, NUMROC:
     LOCp( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
     LOCq( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
     An upper bound for these quantities may be computed by:
     LOCp( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
     LOCq( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
     Arguments
     =========
     UPLO    (global input) CHARACTER
     Specifies whether the upper or lower triangular part of the
     Hermitian matrix sub( A ) is stored:
     = 'U':  Upper triangular
     = 'L':  Lower triangular
     N       (global input) INTEGER
     The number of rows and columns to be operated on, i.e. the
     order of the distributed submatrix sub( A ). N >= 0.
     A       (local input/local output) COMPLEX*16 pointer into the
     local memory to an array of dimension (LLD_A,LOCq(JA+N-1)).
     On entry, this array contains the local pieces of the
     Hermitian distributed matrix sub( A ).  If UPLO = 'U', the
     leading N-by-N upper triangular part of sub( A ) contains
     the upper triangular part of the matrix, and its strictly
     lower triangular part is not referenced. If UPLO = 'L', the
     leading N-by-N lower triangular part of sub( A ) contains the
     lower triangular part of the matrix, and its strictly upper
     triangular part is not referenced. On exit, if UPLO = 'U',
     the diagonal and first superdiagonal of sub( A ) are over-
     written by the corresponding elements of the tridiagonal
     matrix T, and the elements above the first superdiagonal,
     with the array TAU, represent the unitary matrix Q as a
     product of elementary reflectors; if UPLO = 'L', the diagonal
     and first subdiagonal of sub( A ) are overwritten by the
     corresponding elements of the tridiagonal matrix T, and the
     elements below the first subdiagonal, with the array TAU,
     represent the unitary matrix Q as a product of elementary
     reflectors. See Further Details.
     IA      (global input) INTEGER
     The row index in the global array A indicating the first
     row of sub( A ).
     JA      (global input) INTEGER
     The column index in the global array A indicating the
     first column of sub( A ).
     DESCA   (global and local input) INTEGER array of dimension DLEN_.
     The array descriptor for the distributed matrix A.
     D       (local output) DOUBLE PRECISION array, dim LOCq(JA+N-1)
     The diagonal elements of the tridiagonal matrix T:
     D(i) = A(i,i). D is tied to the distributed matrix A.
     E       (local output) DOUBLE PRECISION array, dim LOCq(JA+N-1)
     if UPLO = 'U', LOCq(JA+N-2) otherwise. The off-diagonal
     elements of the tridiagonal matrix T: E(i) = A(i,i+1) if
     UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. E is tied to the
     distributed matrix A.
     TAU     (local output) COMPLEX*16, array, dimension
     LOCq(JA+N-1). This array contains the scalar factors TAU of
     the elementary reflectors. TAU is tied to the distributed
     matrix A.
     WORK    (local workspace) COMPLEX*16 array, dimension (LWORK)
     On exit, WORK( 1 ) returns the minimal and optimal workspace
     LWORK   (local input) INTEGER
     The dimension of the array WORK.
     LWORK >= 2*( ANB+1 )*( 4*NPS+2 ) + NPS
     Where:
         NPS = MAX( NUMROC( N, 1, 0, 0, NPROW ), 2*ANB )
         ANB = PJLAENV( DESCA( CTXT_ ), 3, 'PZHETTRD', 'L', 0, 0,
           0, 0 )
         NUMROC is a ScaLAPACK tool function;
         PJLAENV is a ScaLAPACK envionmental inquiry function
         MYROW, MYCOL, NPROW and NPCOL can be determined by calling
         the subroutine BLACS_GRIDINFO.
     INFO    (global output) INTEGER
     = 0:  successful exit
     < 0:  If the i-th argument is an array and the j-entry had
     an illegal value, then INFO = -(i*100+j), if the i-th
     argument is a scalar and had an illegal value, then
     INFO = -i.
     Further Details
     ===============
     If UPLO = 'U', the matrix Q is represented as a product of
     elementary reflectors
     Q = H(n-1) . . . H(2) H(1).
     Each H(i) has the form
     H(i) = I - tau * v * v'
     where tau is a complex scalar, and v is a complex vector with
     v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
     A(ia:ia+i-2,ja+i), and tau in TAU(ja+i-1).
     If UPLO = 'L', the matrix Q is represented as a product of
     elementary reflectors
     Q = H(1) H(2) . . . H(n-1).
     Each H(i) has the form
     H(i) = I - tau * v * v'
     where tau is a complex scalar, and v is a complex vector with
     v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
     A(ia+i+1:ia+n-1,ja+i-1), and tau in TAU(ja+i-1).
     The contents of sub( A ) on exit are illustrated by the following
     examples with n = 5:
     if UPLO = 'U':                       if UPLO = 'L':
     (  d   e   v2  v3  v4 )              (  d                  )
     (      d   e   v3  v4 )              (  e   d              )
     (          d   e   v4 )              (  v1  e   d          )
     (              d   e  )              (  v1  v2  e   d      )
     (                  d  )              (  v1  v2  v3  e   d  )
     where d and e denote diagonal and off-diagonal elements of T, and
     vi denotes an element of the vector defining H(i).
     Data storage requirements
     =========================
     PZHETTRD is not intended to be called directly.  All users are
     encourage to call PZHETRD which will then call PZHETTRD if
     appropriate.  A must be in cyclic format (i.e. MB = NB = 1),
     the process grid must be square ( i.e. NPROW = NPCOL ) and
     only lower triangular storage is supported.
     Local variables
     ===============
     PZHETTRD uses five local arrays:
       WORK ( InV ) dimension ( NP, ANB+1): array V
       WORK ( InH ) dimension ( NP, ANB+1): array H
       WORK ( InVT ) dimension ( NQ, ANB+1): transpose of the array V
       WORK ( InHT ) dimension ( NQ, ANB+1): transpose of the array H
       WORK ( InVTT ) dimension ( NQ, 1): transpose of the array VT
     Arrays V and H are replicated across all processor columns.
     Arrays V^T and H^T are replicated across all processor rows.
         WORK ( InVT ), or V^T, is stored as a tall skinny
         array ( NQ x ANB-1 ) for efficiency.  Since only the lower
         triangular portion of A is updated, Av is computed as:
         tril(A) * v + v^T * tril(A,-1).  This is performed as
         two local triangular matrix-vector multiplications (both in
         MVR2) followed by a transpose and a sum across the columns.
         In the local computation, WORK( InVT ) is used to compute
         tril(A) * v and WORK( InV ) is used to compute
         v^T * tril(A,-1)
     The following variables are global indices into A:
       INDEX:  The current global row and column number.
       MAXINDEX:  The global row and column for the first row and
       column in the trailing block of A.
       LIIB, LIJB:  The first row, column in
     The following variables point into the arrays A, V, H, V^T, H^T:
       BINDEX  =INDEX-MININDEX: The column index in V, H, V^T, H^T.
       LII:  local index I:  The local row number for row INDEX
       LIJ:  local index J:  The local column number for column INDEX
       LIIP1:  local index I+1:  The local row number for row INDEX+1
       LIJP1:  local index J+1:  The local col number for col INDEX+1
       LTLI: lower triangular local index I:  The local row for the
         upper left entry in tril( A(INDEX, INDEX) )
       LTLIP1: lower triangular local index I+1:  The local row for the
         upper left entry in tril( A(INDEX+1, INDEX+1) )
         Details:  The distinction between LII and LTLI (and between
         LIIP1 and LTLIP1) is subtle.  Within the current processor
         column (i.e. MYCOL .eq. CURCOL) they are the same.  However,
         on some processors, A( LII, LIJ ) points to an element
         above the diagonal, on these processors, LTLI = LII+1.
     The following variables give the number of rows and/or columns
     in various matrices:
       NP:  The number of local rows in A( 1:N, 1:N )
       NQ:  The number of local columns in A( 1:N, 1:N )
       NPM0:  The number of local rows in A( INDEX:N, INDEX:N )
       NQM0:  The number of local columns in A( INDEX:N, INDEX:N )
       NPM1:  The number of local rows in A( INDEX+1:N, INDEX:N )
       NQM1:  The number of local columns in A( INDEX+1:N, INDEX:N )
       LTNM0:  The number of local rows & columns in
         tril( A( INDEX:N, INDEX:N ) )
       LTNM1:  The number of local rows & columns in
         tril( A( INDEX+1:N, INDEX+1:N ) )
         NOTE:  LTNM0 == LTNM1 on all processors except the diagonal
         processors, i.e. those where MYCOL == MYROW.
         Invariants:
           NP = NPM0 + LII - 1
           NQ = NQM0 + LIJ - 1
           NP = NPM1 + LIIP1 - 1
           NQ = NQM1 + LIJP1 - 1
           NP = LTLI + LTNM0 - 1
           NP = LTLIP1 + LTNM1 - 1
       Temporary variables.  The following variables are used within
       a few lines after they are set and do hold state from one loop
       iteration to the next:
     The matrix A:
       The matrix A does not hold the same values that it would
       in an unblocked code nor the values that it would hold in
       in a blocked code.
       The value of A is confusing.  It is easiest to state the
       difference between trueA and A at the point that MVR2 is called,
       so we will start there.
       Let trueA be the value that A would
       have at a given point in an unblocked code and A
       be the value that A has in this code at the same point.
       At the time of the call to MVR2,
       trueA = A + V' * H + H' * V
       where H = H( MAXINDEX:N, 1:BINDEX ) and
       V = V( MAXINDEX:N, 1:BINDEX ).
       At the bottom of the inner loop,
       trueA = A +  V' * H + H' * V + v' * h + h' * v
       where H = H( MAXINDEX:N, 1:BINDEX ) and
       V = V( MAXINDEX:N, 1:BINDEX ) and
       v = V( liip1:N, BINDEX+1 ) and
       h = H( liip1:N, BINDEX+1 )
       At the top of the loop, BINDEX gets incremented, hence:
       trueA = A +  V' * H + H' * V + v' * h + h' * v
       where H = H( MAXINDEX:N, 1:BINDEX-1 ) and
       V = V( MAXINDEX:N, 1:BINDEX-1 ) and
       v = V( liip1:N, BINDEX ) and
       h = H( liip1:N, BINDEX )
       A gets updated at the bottom of the outer loop
       After this update, trueA = A + v' * h + h' * v
       where v = V( liip1:N, BINDEX ) and
       h = H( liip1:N, BINDEX ) and BINDEX = 0
       Indeed, the previous loop invariant as stated above for the
       top of the loop still holds, but with BINDEX = 0, H and V
       are null matrices.
       After the current column of A is updated,
         trueA( INDEX, INDEX:N ) = A( INDEX, INDEX:N )
       the rest of A is untouched.
       After the current block column of A is updated,
       trueA = A + V' * H + H' * V
       where H = H( MAXINDEX:N, 1:BINDEX ) and
       V = V( MAXINDEX:N, 1:BINDEX )
       This brings us back to the point at which mvr2 is called.
     Details of the parallelization:
       We delay spreading v across to all processor columns (which
       would naturally happen at the bottom of the loop) in order to
       combine the spread of v( : , i-1 ) with the spread of h( : , i )
       In order to compute h( :, i ), we must update A( :, i )
       which means that the processor column owning A( :, i ) must
       have: c, tau, v( i, i ) and h( i, i ).
       The traditional
       way of computing v (and the one used in pzlatrd.f and
       zlatrd.f) is:
         v = tau * v
         c = v' * h
         alpha = - tau * c / 2
         v = v + alpha * h
       However, the traditional way of computing v requires that tau
       be broadcast to all processors in the current column (to compute
       v = tau * v) and then a sum-to-all is required (to
       compute v' * h ).  We use the following formula instead:
         c = v' * h
         v = tau * ( v - c * tau' * h / 2 )
       The above formula allows tau to be spread down in the
       same call to DGSUM2D which performs the sum-to-all of c.
       The computation of v, which could be performed in any processor
       column (or other procesor subsets), is performed in the
       processor column that owns A( :, i+1 ) so that A( :, i+1 )
       can be updated prior to spreading v across.
       We keep the block column of A up-to-date to minimize the
       work required in updating the current column of A.  Updating
       the block column of A is reasonably load balanced whereas
       updating the current column of A is not (only the current
       processor column is involved).
     In the following overview of the steps performed, M in the
     margin indicates message traffic and C indicates O(n^2 nb/sqrt(p))
     or more flops per processor.
     Inner loop:
       A( index:n, index ) -= ( v * ht(bindex) + h * vt( bindex) )
M      h = house( A(index:n, index) )
M      Spread v, h across
M      vt = v^T; ht = h^T
       A( index+1:n, index+1:maxindex ) -=
         ( v * ht(index+1:maxindex) + h *vt(index+1:maxindex) )
C      v = tril(A) * h; vt = ht * tril(A,-1)
MorC   v = v - H*V*h - V*H*h
M      v = v + vt^T
M      c = v' * h
       v = tau * ( v - c * tau' * h / 2 )
C    A = A - H*V - V*H
     =================================================================
     .. Parameters ..

 
Display dynamic version Find AutoScroll Reload FontSize: - + Hide Comments Hide Blanks Frame FullScreen MailPrint

 
001        SUBROUTINE PZHETTRD( UPLO , N , A , IA , JA , DESCA , D , E , TAU , WORK ,
002       $LWORK , INFO )
003  
004  *     -- ScaLAPACK routine(version 1.7) --
005  *     University of Tennessee , Knoxville , Oak Ridge National Laboratory ,
006  *     and University of California , Berkeley.
007  *     October 15 , 1999
008  
009  *     .. Scalar Arguments ..
010        CHARACTER UPLO
011        INTEGER IA , INFO , JA , LWORK , N
012        INTEGER BLOCK_CYCLIC_2D , DLEN_ , DTYPE_ , CTXT_ , M_ , N_ ,
013       $MB_ , NB_ , RSRC_ , CSRC_ , LLD_
014        PARAMETER( BLOCK_CYCLIC_2D = 1 , DLEN_ = 9 , DTYPE_ = 1 ,
015       $CTXT_ = 2 , M_ = 3 , N_ = 4 , MB_ = 5 , NB_ = 6 ,
016       $RSRC_ = 7 , CSRC_ = 8 , LLD_ = 9 )
017        DOUBLE PRECISION ONE
018        PARAMETER( ONE = 1.0D0 )
019        COMPLEX*16 Z_ONE , Z_NEGONE , Z_ZERO
020        PARAMETER( Z_ONE = 1.0D0 , Z_NEGONE = - 1.0D0 ,
021       $Z_ZERO = 0.0D0 )
022        DOUBLE PRECISION ZERO
023        PARAMETER( ZERO = 0.0D + 0 )
024  *     ..
025  
026  *     .. Local Scalars ..
027  
028        LOGICAL BALANCED , INTERLEAVE , TWOGEMMS , UPPER
029        INTEGER ANB , BINDEX , CURCOL , CURROW , I , ICTXT , INDEX ,
030       $INDEXA , INDEXINH , INDEXINV , INH , INHB , INHT ,
031       $INHTB , INTMP , INV , INVB , INVT , INVTB , J , LDA ,
032       $LDV , LDZG , LII , LIIB , LIIP1 , LIJ , LIJB , LIJP1 ,
033       $LTLIP1 , LTNM1 , LWMIN , MAXINDEX , MININDEX ,
034       $MYCOL , MYFIRSTROW , MYROW , MYSETNUM , NBZG , NP ,
035       $NPB , NPCOL , NPM0 , NPM1 , NPROW , NPS , NPSET , NQ ,
036       $NQB , NQM1 , NUMROWS , NXTCOL , NXTROW , PBMAX ,
037       $PBMIN , PBSIZE , PNB , ROWSPERPROC
038        DOUBLE PRECISION NORM , SAFMAX , SAFMIN
039        COMPLEX*16 ALPHA , BETA , C , CONJTOPH , CONJTOPV ,
040       $ONEOVERBETA , TOPH , TOPNV , TOPTAU , TOPV
041  *     ..
042  *     .. Local Arrays ..
043  
044        INTEGER IDUM1( 1 ) , IDUM2( 1 )
045        DOUBLE PRECISION DTMP( 5 )
046        COMPLEX*16 CC( 3 )
047  *     ..
048  *     .. External Subroutines ..
049        EXTERNAL BLACS_GRIDINFO , CHK1MAT , DCOMBNRM2 , DGEBR2D ,
050       $DGEBS2D , DGSUM2D , PCHK1MAT , PDTREECOMB ,
051       $PXERBLA , ZGEBR2D , ZGEBS2D , ZGEMM , ZGEMV ,
052       $ZGERV2D , ZGESD2D , ZGSUM2D , ZLACPY , ZSCAL ,
053       $ZTRMVT  
054  *     ..
055  *     .. External Functions ..
056  
057        LOGICAL LSAME
058        INTEGER ICEIL , NUMROC , PJLAENV
059        DOUBLE PRECISION DZNRM2 , PDLAMCH
060        EXTERNAL LSAME , ICEIL , NUMROC , PJLAENV , DZNRM2 , PDLAMCH
061  *     ..
062  *     .. Intrinsic Functions ..
063        INTRINSIC DBLE , DCMPLX , DCONJG , DIMAG , ICHAR , MAX , MIN ,
064       $MOD , SIGN , SQRT
065  *     ..
066  
067  *     .. Executable Statements ..
068  *     This is just to keep ftnchek and toolpack / 1 happy
069        IF( BLOCK_CYCLIC_2D*CSRC_*CTXT_*DLEN_*DTYPE_*LLD_*MB_*M_*NB_*N_*
070       $    RSRC_.LT.0 )RETURN
071  
072  *         Further details
073  *         ===============
074  
075  *         At the top of the loop , v and nh have been computed but not
076  *         spread across. Hence , A is out - of - date even after the
077  *         rank 2k update. Furthermore , we compute the next v before
078  *         nh is spread across.
079  
080  *         I claim that if we used a sum - to - all on NV , by summing CC within
081  *         each column , that we could compute NV locally and could avoid
082  *         spreading V across. Bruce claims that sum - to - all can be made
083  *         to cost no more than sum - to - one on the Paragon. If that is
084  *         true , this would be a win. But ,
085  *         the BLACS sum - to - all is just a sum - to - one followed by a broadcast ,
086  *         and hence the present scheme is better for now.
087  
088  *         Get grid parameters
089  
090            ICTXT = DESCA( CTXT_ )
091            CALL BLACS_GRIDINFO( ICTXT , NPROW , NPCOL , MYROW , MYCOL )
092  
093            SAFMAX = SQRT( PDLAMCH( ICTXT , 'O' ) ) / N
094            SAFMIN = SQRT( PDLAMCH( ICTXT , 'S' ) )
095  
096  *         Test the input parameters
097  
098            INFO = 0
099            IF( NPROW.EQ. - 1 ) THEN
100                INFO = - ( 600 + CTXT_ )
101            ELSE
102  
103  *             Here we set execution options for PZHETTRD
104  
105                PNB = PJLAENV( ICTXT , 2 , 'PZHETTRD' , 'L' , 0 , 0 , 0 , 0 )
106                ANB = PJLAENV( ICTXT , 3 , 'PZHETTRD' , 'L' , 0 , 0 , 0 , 0 )
107  
108                INTERLEAVE =( PJLAENV( ICTXT , 4 , 'PZHETTRD' , 'L' , 1 , 0 , 0 ,
109       $        0 ).EQ.1 )
110                TWOGEMMS =( PJLAENV( ICTXT , 4 , 'PZHETTRD' , 'L' , 2 , 0 , 0 ,
111       $        0 ).EQ.1 )
112                BALANCED =( PJLAENV( ICTXT , 4 , 'PZHETTRD' , 'L' , 3 , 0 , 0 ,
113       $        0 ).EQ.1 )
114  
115                CALL CHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , INFO )
116  
117                UPPER = LSAME( UPLO , 'U' )
118                IF( INFO.EQ.0 .AND. DESCA( NB_ ).NE.1 )
119       $            INFO = 600 + NB_
120                    IF( INFO.EQ.0 ) THEN
121  
122  *                     Here is the arithmetic :
123  *                     Let maxnpq = max( np , nq , 2 * ANB )
124  *                     LDV = 4 * max( np , nq ) + 2
125  *                     LWMIN = 2 * ( ANB + 1 ) * LDV + MAX( np , 2 * ANB )
126  *                     = 2 * ( ANB + 1 ) * ( 4 * NPS + 2 ) + NPS
127  
128  *                     This overestimates memory requirements when ANB > NP / 2
129  *                     Memory requirements are lower when interleave = .false.
130  *                     Hence , we could have two sets of memory requirements ,
131  *                     one for interleave and one for
132  
133                        NPS = MAX( NUMROC( N , 1 , 0 , 0 , NPROW ) , 2*ANB )
134                        LWMIN = 2*( ANB + 1 )*( 4*NPS + 2 ) + NPS
135  
136                        WORK( 1 ) = DCMPLX( LWMIN )
137                        IF( .NOT.LSAME( UPLO , 'L' ) ) THEN
138                            INFO = - 1
139                        ELSE IF( IA.NE.1 ) THEN
140                            INFO = - 4
141                        ELSE IF( JA.NE.1 ) THEN
142                            INFO = - 5
143                        ELSE IF( NPROW.NE.NPCOL ) THEN
144                            INFO = - ( 600 + CTXT_ )
145                        ELSE IF( DESCA( DTYPE_ ).NE.1 ) THEN
146                            INFO = - ( 600 + DTYPE_ )
147                        ELSE IF( DESCA( MB_ ).NE.1 ) THEN
148                            INFO = - ( 600 + MB_ )
149                        ELSE IF( DESCA( NB_ ).NE.1 ) THEN
150                            INFO = - ( 600 + NB_ )
151                        ELSE IF( DESCA( RSRC_ ).NE.0 ) THEN
152                            INFO = - ( 600 + RSRC_ )
153                        ELSE IF( DESCA( CSRC_ ).NE.0 ) THEN
154                            INFO = - ( 600 + CSRC_ )
155                        ELSE IF( LWORK.LT.LWMIN ) THEN
156                            INFO = - 11
157                        END IF
158                    END IF
159                    IF( UPPER ) THEN
160                        IDUM1( 1 ) = ICHAR( 'U' )
161                    ELSE
162                        IDUM1( 1 ) = ICHAR( 'L' )
163                    END IF
164                    IDUM2( 1 ) = 1
165  
166                    CALL PCHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , 1 , IDUM1 , IDUM2 ,
167       $            INFO )
168                END IF
169  
170                IF( INFO.NE.0 ) THEN
171                    CALL PXERBLA( ICTXT , 'PZHETTRD' , - INFO )
172                    RETURN
173                END IF
174  
175  *             Quick return if possible
176  
177                IF( N.EQ.0 )
178       $            RETURN
179  
180  *                 Reduce the lower triangle of sub( A )
181                    NP = NUMROC( N , 1 , MYROW , 0 , NPROW )
182                    NQ = NUMROC( N , 1 , MYCOL , 0 , NPCOL )
183  
184                    NXTROW = 0
185                    NXTCOL = 0
186  
187                    LIIP1 = 1
188                    LIJP1 = 1
189                    NPM1 = NP
190                    NQM1 = NQ
191  
192                    LDA = DESCA( LLD_ )
193                    ICTXT = DESCA( CTXT_ )
194  
195  *                 Miscellaneous details :
196  *                 Put tau , D and E in the right places
197  *                 Check signs
198  *                 Place all the arrays in WORK , control their placement
199  *                 in memory.
200  
201  *                 Loop invariants
202  *                 A(LIIP1 , LIJ) points to the first element of A(I + 1 , J)
203  *                 NPM1 , NQM1 = the number of rows , cols in A( LII + 1 : N , LIJ + 1 : N )
204  *                 A(LII : N , LIJ : N) is one step out of date.
205  *                 proc( CURROW , CURCOL ) owns A(LII , LIJ)
206  *                 proc( NXTROW , CURCOL ) owns A(LIIP1 , LIJ)
207  
208                    INH = 1
209  
210                    IF( INTERLEAVE ) THEN
211  
212  *                     H and V are interleaved to minimize memory movement
213  *                     LDV has to be twice as large to accomodate interleaving.
214  *                     In addition , LDV is doubled again to allow v , h and
215  *                     toptau to be spreaad across and transposed in a
216  *                     single communication operation with minimum memory
217  *                     movement.
218  
219  *                     We could reduce LDV back to 2*MAX(NPM1 , NQM1)
220  *                     by increasing the memory movement required in
221  *                     the spread and transpose of v , h and toptau.
222  *                     However , since the non - interleaved path already
223  *                     provides a mear minimum memory requirement option ,
224  *                     we did not provide this additional path.
225  
226                        LDV = 4*( MAX( NPM1 , NQM1 ) ) + 2
227  
228                        INH = 1
229  
230                        INV = INH + LDV / 2
231                        INVT = INH + ( ANB + 1 )*LDV
232  
233                        INHT = INVT + LDV / 2
234                        INTMP = INVT + LDV*( ANB + 1 )
235  
236                    ELSE
237                        LDV = MAX( NPM1 , NQM1 )
238  
239                        INHT = INH + LDV*( ANB + 1 )
240                        INV = INHT + LDV*( ANB + 1 )
241  
242  *                     The code works without this + 1 , but only because of a
243  *                     coincidence. Without the + 1 , WORK(INVT) gets trashed , but
244  *                     WORK(INVT) is only used once and when it is used , it is
245  *                     multiplied by WORK( INH ) which is zero. Hence , the fact
246  *                     that WORK(INVT) is trashed has no effect.
247  
248                        INVT = INV + LDV*( ANB + 1 ) + 1
249                        INTMP = INVT + LDV*( 2*ANB )
250  
251                    END IF
252  
253                    IF( INFO.NE.0 ) THEN
254                        CALL PXERBLA( ICTXT , 'PZHETTRD' , - INFO )
255                        WORK( 1 ) = DCMPLX( LWMIN )
256                        RETURN
257                    END IF
258  
259  *                 The satisfies the loop invariant : trueA = A - V * HT - H * VT ,
260  *                 (where V , H , VT and HT all have BINDEX + 1 rows / columns)
261  *                 the first ANB times through the loop.
262  
263  *                 Setting either( InH and InHT ) or InV to Z_ZERO
264  *                 is adequate except in the face of NaNs.
265  
266                    DO 10 I = 1 , NP
267                        WORK( INH + I - 1 ) = Z_ZERO
268                        WORK( INV + I - 1 ) = Z_ZERO
269     10             CONTINUE
270                    DO 20 I = 1 , NQ
271                        WORK( INHT + I - 1 ) = Z_ZERO
272     20             CONTINUE
273  
274                    TOPNV = Z_ZERO
275  
276                    LTLIP1 = LIJP1
277                    LTNM1 = NPM1
278                    IF( MYCOL.GT.MYROW ) THEN
279                        LTLIP1 = LTLIP1 + 1
280                        LTNM1 = LTNM1 - 1
281                    END IF
282  
283                    DO 210 MININDEX = 1 , N - 1 , ANB
284  
285                        MAXINDEX = MIN( MININDEX + ANB - 1 , N )
286                        LIJB = NUMROC( MAXINDEX , 1 , MYCOL , 0 , NPCOL ) + 1
287                        LIIB = NUMROC( MAXINDEX , 1 , MYROW , 0 , NPROW ) + 1
288  
289                        NQB = NQ - LIJB + 1
290                        NPB = NP - LIIB + 1
291                        INHTB = INHT + LIJB - 1
292                        INVTB = INVT + LIJB - 1
293                        INHB = INH + LIIB - 1
294                        INVB = INV + LIIB - 1
295  
296                        DO 160 INDEX = MININDEX , MIN( MAXINDEX , N - 1 )
297  
298                            BINDEX = INDEX - MININDEX
299  
300                            CURROW = NXTROW
301                            CURCOL = NXTCOL
302  
303                            NXTROW = MOD( CURROW + 1 , NPROW )
304                            NXTCOL = MOD( CURCOL + 1 , NPCOL )
305  
306                            LII = LIIP1
307                            LIJ = LIJP1
308                            NPM0 = NPM1
309  
310                            IF( MYROW.EQ.CURROW ) THEN
311                                NPM1 = NPM1 - 1
312                                LIIP1 = LIIP1 + 1
313                            END IF
314                            IF( MYCOL.EQ.CURCOL ) THEN
315                                NQM1 = NQM1 - 1
316                                LIJP1 = LIJP1 + 1
317                                LTLIP1 = LTLIP1 + 1
318                                LTNM1 = LTNM1 - 1
319                            END IF
320  
321  *                         V = NV , VT = NVT , H = NH , HT = NHT
322  
323  *                         Update the current column of A
324  
325                            IF( MYCOL.EQ.CURCOL ) THEN
326  
327                                INDEXA = LII + ( LIJ - 1 )*LDA
328                                INDEXINV = INV + LII - 1 + ( BINDEX - 1 )*LDV
329                                INDEXINH = INH + LII - 1 + ( BINDEX - 1 )*LDV
330                                CONJTOPH = DCONJG( WORK( INHT + LIJ - 1 + BINDEX*LDV ) )
331                                CONJTOPV = DCONJG( TOPNV )
332  
333                                IF( INDEX.GT.1 ) THEN
334                                    DO 30 I = 0 , NPM0 - 1
335  *                                     A( INDEXA + I ) = A( INDEXA + I )
336                                        A( INDEXA + I ) = A( INDEXA + I ) -
337       $                                WORK( INDEXINV + LDV + I )*CONJTOPH -
338       $                                WORK( INDEXINH + LDV + I )*CONJTOPV
339     30                             CONTINUE
340                                END IF
341  
342                            END IF
343  
344                            IF( MYCOL.EQ.CURCOL ) THEN
345  
346  *                             Compute the householder vector
347  
348                                IF( MYROW.EQ.CURROW ) THEN
349                                    DTMP( 2 ) = DBLE( A( LII + ( LIJ - 1 )*LDA ) )
350                                ELSE
351                                    DTMP( 2 ) = ZERO
352                                END IF
353                                IF( MYROW.EQ.NXTROW ) THEN
354                                    DTMP( 3 ) = DBLE( A( LIIP1 + ( LIJ - 1 )*LDA ) )
355                                    DTMP( 4 ) = DIMAG( A( LIIP1 + ( LIJ - 1 )*LDA ) )
356                                ELSE
357                                    DTMP( 3 ) = ZERO
358                                    DTMP( 4 ) = ZERO
359                                END IF
360  
361                                NORM = DZNRM2( NPM1 , A( LIIP1 + ( LIJ - 1 )*LDA ) , 1 )
362                                DTMP( 1 ) = NORM
363  
364  *                             IF DTMP(5) = 1.0 , NORM is too large and might cause
365  *                             overflow , hence PDTREECOMB must be called. IF DTMP(5)
366  *                             is zero on output , DTMP(1) can be trusted.
367  
368                                DTMP( 5 ) = ZERO
369                                IF( DTMP( 1 ).GE.SAFMAX .OR. DTMP( 1 ).LT.SAFMIN ) THEN
370                                    DTMP( 5 ) = ONE
371                                    DTMP( 1 ) = ZERO
372                                END IF
373  
374                                DTMP( 1 ) = DTMP( 1 )*DTMP( 1 )
375                                CALL DGSUM2D( ICTXT , 'C' , ' ' , 5 , 1 , DTMP , 5 , - 1 ,
376       $                        CURCOL )
377                                IF( DTMP( 5 ).EQ.ZERO ) THEN
378                                    DTMP( 1 ) = SQRT( DTMP( 1 ) )
379                                ELSE
380                                    DTMP( 1 ) = NORM
381                                    CALL PDTREECOMB( ICTXT , 'C' , 1 , DTMP , - 1 , MYCOL ,
382       $                            DCOMBNRM2 )
383                                END IF
384  
385                                NORM = DTMP( 1 )
386  
387                                D( LIJ ) = DTMP( 2 )
388                                IF( MYROW.EQ.CURROW .AND. MYCOL.EQ.CURCOL ) THEN
389                                    A( LII + ( LIJ - 1 )*LDA ) = DCMPLX( D( LIJ ) , ZERO )
390                                END IF
391  
392                                ALPHA = DCMPLX( DTMP( 3 ) , DTMP( 4 ) )
393  
394                                NORM = SIGN( NORM , DBLE( ALPHA ) )
395  
396                                IF( NORM.EQ.ZERO ) THEN
397                                    TOPTAU = ZERO
398                                ELSE
399                                    BETA = NORM + ALPHA
400                                    TOPTAU = BETA / NORM
401                                    ONEOVERBETA = 1.0D0 / BETA
402  
403                                    CALL ZSCAL( NPM1 , ONEOVERBETA ,
404       $                            A( LIIP1 + ( LIJ - 1 )*LDA ) , 1 )
405                                END IF
406  
407                                IF( MYROW.EQ.NXTROW ) THEN
408                                    A( LIIP1 + ( LIJ - 1 )*LDA ) = Z_ONE
409                                END IF
410  
411                                TAU( LIJ ) = TOPTAU
412                                E( LIJ ) = - NORM
413  
414                            END IF
415  
416  *                         Spread v , nh , toptau across
417  
418                            DO 40 I = 0 , NPM1 - 1
419                                WORK( INV + LIIP1 - 1 + BINDEX*LDV + NPM1 + I ) = A( LIIP1 + I +
420       $( LIJ - 1 )*LDA )
421     40                     CONTINUE
422  
423                            IF( MYCOL.EQ.CURCOL ) THEN
424                                WORK( INV + LIIP1 - 1 + BINDEX*LDV + NPM1 + NPM1 ) = TOPTAU
425                                CALL ZGEBS2D( ICTXT , 'R' , ' ' , NPM1 + NPM1 + 1 , 1 ,
426       $                        WORK( INV + LIIP1 - 1 + BINDEX*LDV ) ,
427       $                        NPM1 + NPM1 + 1 )
428                            ELSE
429                                CALL ZGEBR2D( ICTXT , 'R' , ' ' , NPM1 + NPM1 + 1 , 1 ,
430       $                        WORK( INV + LIIP1 - 1 + BINDEX*LDV ) ,
431       $                        NPM1 + NPM1 + 1 , MYROW , CURCOL )
432                                TOPTAU = WORK( INV + LIIP1 - 1 + BINDEX*LDV + NPM1 + NPM1 )
433                            END IF
434                            DO 50 I = 0 , NPM1 - 1
435                                WORK( INH + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) = WORK( INV + LIIP1 -
436       $                        1 + BINDEX*LDV + NPM1 + I )
437     50                     CONTINUE
438  
439                            IF( INDEX.LT.N ) THEN
440                                IF( MYROW.EQ.NXTROW .AND. MYCOL.EQ.CURCOL )
441       $                            A( LIIP1 + ( LIJ - 1 )*LDA ) = E( LIJ )
442                                END IF
443  
444  *                             Transpose v , nh
445  
446                                IF( MYROW.EQ.MYCOL ) THEN
447                                    DO 60 I = 0 , NPM1 + NPM1
448                                        WORK( INVT + LIJP1 - 1 + BINDEX*LDV + I ) = WORK( INV + LIIP1 - 1 +
449       $                                BINDEX*LDV + I )
450     60                 CONTINUE
451                    ELSE
452                        CALL ZGESD2D( ICTXT , NPM1 + NPM1 , 1 ,
453       $                WORK( INV + LIIP1 - 1 + BINDEX*LDV ) , NPM1 + NPM1 ,
454       $                MYCOL , MYROW )
455                        CALL ZGERV2D( ICTXT , NQM1 + NQM1 , 1 ,
456       $                WORK( INVT + LIJP1 - 1 + BINDEX*LDV ) , NQM1 + NQM1 ,
457       $                MYCOL , MYROW )
458                    END IF
459  
460                    DO 70 I = 0 , NQM1 - 1
461                        WORK( INHT + LIJP1 - 1 + ( BINDEX + 1 )*LDV + I ) = WORK( INVT +
462       $                LIJP1 - 1 + BINDEX*LDV + NQM1 + I )
463     70             CONTINUE
464  
465  *                 Update the current block column of A
466  
467                    IF( INDEX.GT.1 ) THEN
468                        DO 90 J = LIJP1 , LIJB - 1
469                            DO 80 I = 0 , NPM1 - 1
470  
471                                A( LIIP1 + I + ( J - 1 )*LDA ) = A( LIIP1 + I + ( J - 1 )*LDA )
472       $                        - WORK( INV + LIIP1 - 1 + BINDEX*LDV + I )*
473       $                        DCONJG( WORK( INHT + J - 1 + BINDEX*LDV ) ) -
474       $                        WORK( INH + LIIP1 - 1 + BINDEX*LDV + I )*
475       $                        DCONJG( WORK( INVT + J - 1 + BINDEX*LDV ) )
476     80                     CONTINUE
477     90                 CONTINUE
478                    END IF
479  
480  *                 Compute NV = A * NHT ; NVT = A * NH
481  
482  *                 These two lines are necessary because these elements
483  *                 are not always involved in the calls to ZTRMVT
484  *                 for two reasons :
485  *                 1) On diagonal processors , the call to TRMVT
486  *                 involves only LTNM1 - 1 elements
487  *                 2) On some processes , NQM1 < LTM1 or LIIP1 < LTLIP1
488  *                 and when the results are combined across all processes ,
489  *                 uninitialized values may be included.
490                    WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV ) = Z_ZERO
491                    WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV + NQM1 - 1 ) = Z_ZERO
492  
493                    IF( MYROW.EQ.MYCOL ) THEN
494                        IF( LTNM1.GT.1 ) THEN
495                            CALL ZTRMVT ( 'L' , LTNM1 - 1 ,
496       $                    A( LTLIP1 + 1 + ( LIJP1 - 1 )*LDA ) , LDA ,
497       $                    WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV ) , 1 ,
498       $                    WORK( INH + LTLIP1 + 1 - 1 + ( BINDEX + 1 )*LDV ) ,
499       $                    1 , WORK( INV + LTLIP1 + 1 - 1 + ( BINDEX + 1 )*
500       $                    LDV ) , 1 , WORK( INHT + LIJP1 - 1 + ( BINDEX +
501       $                    1 )*LDV ) , 1 )
502                        END IF
503                        DO 100 I = 1 , LTNM1
504                            WORK( INVT + LIJP1 + I - 1 - 1 + ( BINDEX + 1 )*LDV )
505       $                    = WORK( INVT + LIJP1 + I - 1 - 1 + ( BINDEX + 1 )*LDV ) +
506       $                    A( LTLIP1 + I - 1 + ( LIJP1 + I - 1 - 1 )*LDA )*
507       $                    WORK( INH + LTLIP1 + I - 1 - 1 + ( BINDEX + 1 )*LDV )
508    100                 CONTINUE
509                    ELSE
510                        IF( LTNM1.GT.0 )
511       $                    CALL ZTRMVT ( 'L' , LTNM1 , A( LTLIP1 + ( LIJP1 - 1 )*LDA ) ,
512       $                    LDA , WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*
513       $                    LDV ) , 1 , WORK( INH + LTLIP1 - 1 + ( BINDEX +
514       $                    1 )*LDV ) , 1 , WORK( INV + LTLIP1 - 1 +
515       $( BINDEX + 1 )*LDV ) , 1 ,
516       $                    WORK( INHT + LIJP1 - 1 + ( BINDEX + 1 )*LDV ) ,
517       $                    1 )
518  
519                        END IF
520  
521  *                     We take advantage of the fact that :
522  *                     A * sum( B ) = sum( A * B ) for matrices A , B
523  
524  *                     trueA = A + V * HT + H * VT
525  *                     hence :(trueA)v = Av' + V * HT * v + H * VT * v
526  *                     VT * v = sum_p_in_NPROW( VTp * v )
527  *                     H * VT * v = H * sum(VTp * v) = sum( H * VTp * v )
528  
529  *                     v = v + V * HT * h + H * VT * h
530  
531  *                     tmp = HT * nh1
532                        DO 110 I = 1 , 2*( BINDEX + 1 )
533                            WORK( INTMP - 1 + I ) = 0
534    110                 CONTINUE
535  
536                        IF( BALANCED ) THEN
537                            NPSET = NPROW
538                            MYSETNUM = MYROW
539                            ROWSPERPROC = ICEIL( NQB , NPSET )
540                            MYFIRSTROW = MIN( NQB + 1 , 1 + ROWSPERPROC*MYSETNUM )
541                            NUMROWS = MIN( ROWSPERPROC , NQB - MYFIRSTROW + 1 )
542  
543  *                         tmp = HT * v
544  
545                            CALL ZGEMV( 'C' , NUMROWS , BINDEX + 1 , Z_ONE ,
546       $                    WORK( INHTB + MYFIRSTROW - 1 ) , LDV ,
547       $                    WORK( INHTB + MYFIRSTROW - 1 + ( BINDEX + 1 )*LDV ) ,
548       $                    1 , Z_ZERO , WORK( INTMP ) , 1 )
549  *                         tmp2 = VT * v
550                            CALL ZGEMV( 'C' , NUMROWS , BINDEX + 1 , Z_ONE ,
551       $                    WORK( INVTB + MYFIRSTROW - 1 ) , LDV ,
552       $                    WORK( INHTB + MYFIRSTROW - 1 + ( BINDEX + 1 )*LDV ) ,
553       $                    1 , Z_ZERO , WORK( INTMP + BINDEX + 1 ) , 1 )
554  
555                            CALL ZGSUM2D( ICTXT , 'C' , ' ' , 2*( BINDEX + 1 ) , 1 ,
556       $                    WORK( INTMP ) , 2*( BINDEX + 1 ) , - 1 , - 1 )
557                        ELSE
558  *                         tmp = HT * v
559  
560                            CALL ZGEMV( 'C' , NQB , BINDEX + 1 , Z_ONE , WORK( INHTB ) ,
561       $                    LDV , WORK( INHTB + ( BINDEX + 1 )*LDV ) , 1 ,
562       $                    Z_ZERO , WORK( INTMP ) , 1 )
563  *                         tmp2 = VT * v
564                            CALL ZGEMV( 'C' , NQB , BINDEX + 1 , Z_ONE , WORK( INVTB ) ,
565       $                    LDV , WORK( INHTB + ( BINDEX + 1 )*LDV ) , 1 ,
566       $                    Z_ZERO , WORK( INTMP + BINDEX + 1 ) , 1 )
567  
568                        END IF
569  
570                        IF( BALANCED ) THEN
571                            MYSETNUM = MYCOL
572  
573                            ROWSPERPROC = ICEIL( NPB , NPSET )
574                            MYFIRSTROW = MIN( NPB + 1 , 1 + ROWSPERPROC*MYSETNUM )
575                            NUMROWS = MIN( ROWSPERPROC , NPB - MYFIRSTROW + 1 )
576  
577                            CALL ZGSUM2D( ICTXT , 'R' , ' ' , 2*( BINDEX + 1 ) , 1 ,
578       $                    WORK( INTMP ) , 2*( BINDEX + 1 ) , - 1 , - 1 )
579  
580  *                         v = v + V * tmp
581                            IF( INDEX.GT.1. ) THEN
582                                CALL ZGEMV( 'N' , NUMROWS , BINDEX + 1 , Z_NEGONE ,
583       $                        WORK( INVB + MYFIRSTROW - 1 ) , LDV ,
584       $                        WORK( INTMP ) , 1 , Z_ONE ,
585       $                        WORK( INVB + MYFIRSTROW - 1 + ( BINDEX + 1 )*
586       $                        LDV ) , 1 )
587  
588  *                             v = v + H * tmp2
589                                CALL ZGEMV( 'N' , NUMROWS , BINDEX + 1 , Z_NEGONE ,
590       $                        WORK( INHB + MYFIRSTROW - 1 ) , LDV ,
591       $                        WORK( INTMP + BINDEX + 1 ) , 1 , Z_ONE ,
592       $                        WORK( INVB + MYFIRSTROW - 1 + ( BINDEX + 1 )*
593       $                        LDV ) , 1 )
594                            END IF
595  
596                        ELSE
597  *                         v = v + V * tmp
598                            CALL ZGEMV( 'N' , NPB , BINDEX + 1 , Z_NEGONE , WORK( INVB ) ,
599       $                    LDV , WORK( INTMP ) , 1 , Z_ONE ,
600       $                    WORK( INVB + ( BINDEX + 1 )*LDV ) , 1 )
601  
602  *                         v = v + H * tmp2
603                            CALL ZGEMV( 'N' , NPB , BINDEX + 1 , Z_NEGONE , WORK( INHB ) ,
604       $                    LDV , WORK( INTMP + BINDEX + 1 ) , 1 , Z_ONE ,
605       $                    WORK( INVB + ( BINDEX + 1 )*LDV ) , 1 )
606  
607                        END IF
608  
609  *                     Transpose NV and add it back into NVT
610  
611                        IF( MYROW.EQ.MYCOL ) THEN
612                            DO 120 I = 0 , NQM1 - 1
613                                WORK( INTMP + I ) = WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV +
614       $                        I )
615    120                     CONTINUE
616                        ELSE
617                            CALL ZGESD2D( ICTXT , NQM1 , 1 ,
618       $                    WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV ) ,
619       $                    NQM1 , MYCOL , MYROW )
620                            CALL ZGERV2D( ICTXT , NPM1 , 1 , WORK( INTMP ) , NPM1 , MYCOL ,
621       $                    MYROW )
622  
623                        END IF
624                        DO 130 I = 0 , NPM1 - 1
625                            WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) = WORK( INV + LIIP1 -
626       $                    1 + ( BINDEX + 1 )*LDV + I ) + WORK( INTMP + I )
627    130                 CONTINUE
628  
629  *                     Sum - to - one NV rowwise(within a row)
630  
631                        CALL ZGSUM2D( ICTXT , 'R' , ' ' , NPM1 , 1 ,
632       $                WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV ) , NPM1 ,
633       $                MYROW , NXTCOL )
634  
635  *                     Dot product c = NV * NH
636  *                     Sum - to - all c within next processor column
637  
638                        IF( MYCOL.EQ.NXTCOL ) THEN
639                            CC( 1 ) = Z_ZERO
640                            DO 140 I = 0 , NPM1 - 1
641                                CC( 1 ) = CC( 1 ) + DCONJG( WORK( INV + LIIP1 - 1 +
642       $( BINDEX + 1 )*LDV + I ) )*
643       $                        WORK( INH + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I )
644    140                     CONTINUE
645                            IF( MYROW.EQ.NXTROW ) THEN
646                                CC( 2 ) = WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV )
647                                CC( 3 ) = WORK( INH + LIIP1 - 1 + ( BINDEX + 1 )*LDV )
648                            ELSE
649                                CC( 2 ) = Z_ZERO
650                                CC( 3 ) = Z_ZERO
651                            END IF
652                            CALL ZGSUM2D( ICTXT , 'C' , ' ' , 3 , 1 , CC , 3 , - 1 , NXTCOL )
653  
654                            TOPV = CC( 2 )
655                            C = CC( 1 )
656                            TOPH = CC( 3 )
657  
658                            TOPNV = TOPTAU*( TOPV - C*DCONJG( TOPTAU ) / 2*TOPH )
659  
660  *                         Compute V = Tau * (V - C * Tau' / 2 * H )
661  
662                            DO 150 I = 0 , NPM1 - 1
663                                WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) = TOPTAU*
664       $( WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) - C*
665       $                        DCONJG( TOPTAU ) / 2*WORK( INH + LIIP1 - 1 + ( BINDEX +
666       $                        1 )*LDV + I ) )
667    150                     CONTINUE
668  
669                        END IF
670  
671    160                 CONTINUE
672  
673  *                     Perform the rank2k update
674  
675                        IF( MAXINDEX.LT.N ) THEN
676  
677                            DO 170 I = 0 , NPM1 - 1
678                                WORK( INTMP + I ) = WORK( INH + LIIP1 - 1 + ANB*LDV + I )
679    170                     CONTINUE
680  
681                            IF( .NOT.TWOGEMMS ) THEN
682                                IF( INTERLEAVE ) THEN
683                                    LDZG = LDV / 2
684                                ELSE
685                                    CALL ZLACPY( 'A' , LTNM1 , ANB , WORK( INHT + LIJP1 - 1 ) ,
686       $                            LDV , WORK( INVT + LIJP1 - 1 + ANB*LDV ) , LDV )
687  
688                                    CALL ZLACPY( 'A' , LTNM1 , ANB , WORK( INV + LTLIP1 - 1 ) ,
689       $                            LDV , WORK( INH + LTLIP1 - 1 + ANB*LDV ) , LDV )
690                                    LDZG = LDV
691                                END IF
692                                NBZG = ANB*2
693                            ELSE
694                                LDZG = LDV
695                                NBZG = ANB
696                            END IF
697  
698                            DO 180 PBMIN = 1 , LTNM1 , PNB
699  
700                                PBSIZE = MIN( PNB , LTNM1 - PBMIN + 1 )
701                                PBMAX = MIN( LTNM1 , PBMIN + PNB - 1 )
702                                CALL ZGEMM( 'N' , 'C' , PBSIZE , PBMAX , NBZG , Z_NEGONE ,
703       $                        WORK( INH + LTLIP1 - 1 + PBMIN - 1 ) , LDZG ,
704       $                        WORK( INVT + LIJP1 - 1 ) , LDZG , Z_ONE ,
705       $                        A( LTLIP1 + PBMIN - 1 + ( LIJP1 - 1 )*LDA ) , LDA )
706                                IF( TWOGEMMS ) THEN
707                                    CALL ZGEMM( 'N' , 'C' , PBSIZE , PBMAX , ANB , Z_NEGONE ,
708       $                            WORK( INV + LTLIP1 - 1 + PBMIN - 1 ) , LDZG ,
709       $                            WORK( INHT + LIJP1 - 1 ) , LDZG , Z_ONE ,
710       $                            A( LTLIP1 + PBMIN - 1 + ( LIJP1 - 1 )*LDA ) , LDA )
711                                END IF
712    180                     CONTINUE
713  
714                            DO 190 I = 0 , NPM1 - 1
715                                WORK( INV + LIIP1 - 1 + I ) = WORK( INV + LIIP1 - 1 + ANB*LDV + I )
716                                WORK( INH + LIIP1 - 1 + I ) = WORK( INTMP + I )
717    190                     CONTINUE
718                            DO 200 I = 0 , NQM1 - 1
719                                WORK( INHT + LIJP1 - 1 + I ) = WORK( INHT + LIJP1 - 1 + ANB*LDV + I )
720    200                     CONTINUE
721  
722                        END IF
723  
724  *                     End of the update A code
725  
726    210             CONTINUE
727  
728                    IF( MYCOL.EQ.NXTCOL ) THEN
729                        IF( MYROW.EQ.NXTROW ) THEN
730  
731                            D( NQ ) = DBLE( A( NP + ( NQ - 1 )*LDA ) )
732                            A( NP + ( NQ - 1 )*LDA ) = D( NQ )
733  
734                            CALL DGEBS2D( ICTXT , 'C' , ' ' , 1 , 1 , D( NQ ) , 1 )
735                        ELSE
736                            CALL DGEBR2D( ICTXT , 'C' , ' ' , 1 , 1 , D( NQ ) , 1 , NXTROW ,
737       $                    NXTCOL )
738                        END IF
739                    END IF
740  
741                    WORK( 1 ) = DCMPLX( LWMIN )
742                    RETURN
743  
744  *                 End of PZHETTRD
745  
746                END