Routine: PDSYTTRD()  File: SRC\pdsyttrd.f

 
 
# lines: 1199
  # code: 1199
  # comment: 0
  # blank:0
# Variables:113
# Callers:1
# Callings:3
# Words:451
# Keywords:294
 

 

..
     .. Array Arguments ..
     ..
     Purpose
     =======
     PDSYTTRD reduces a complex Hermitian matrix sub( A ) to Hermitian
     tridiagonal form T by an unitary similarity transformation:
     Q' * sub( A ) * Q = T, where sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
     Notes
     =====
     Each global data object is described by an associated description
     vector.  This vector stores the information required to establish
     the mapping between an object element and its corresponding
     process and memory location.
     Let A be a generic term for any 2D block cyclicly distributed
     array.
     Such a global array has an associated description vector DESCA.
     In the following comments, the character _ should be read as
     "of the global array".
     NOTATION        STORED IN      EXPLANATION
     --------------- -------------- -----------------------------------
     DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
     DTYPE_A = 1.
     CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle,
     indicating the BLACS process grid A is distribu-
     ted over. The context itself is glo-
     bal, but the handle (the integer
     value) may vary.
     M_A    (global) DESCA( M_ )    The number of rows in the global
     array A.
     N_A    (global) DESCA( N_ )    The number of columns in the global
     array A.
     MB_A   (global) DESCA( MB_ )   The blocking factor used to
     distribute the rows of the array.
     NB_A   (global) DESCA( NB_ )   The blocking factor used to
     distribute the columns of the array.
     RSRC_A (global) DESCA( RSRC_ ) The process row over which the
     first row of the array A is distributed.
     CSRC_A (global) DESCA( CSRC_ ) The process column over which the
     first column of the array A is
     distributed.
     LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
     array.  LLD_A >= MAX(1,LOCp(M_A)).
     Let K be the number of rows or columns of a distributed matrix,
     and assume that its process grid has dimension p x q.
     LOCp( K ) denotes the number of elements of K that a process
     would receive if K were distributed over the p processes of its
     process column.
     Similarly, LOCq( K ) denotes the number of elements of K that a
     process would receive if K were distributed over the q processes
     of its process row.
     The values of LOCp() and LOCq() may be determined via a call to
     the ScaLAPACK tool function, NUMROC:
     LOCp( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
     LOCq( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
     An upper bound for these quantities may be computed by:
     LOCp( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
     LOCq( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
     Arguments
     =========
     UPLO    (global input) CHARACTER
     Specifies whether the upper or lower triangular part of the
     Hermitian matrix sub( A ) is stored:
     = 'U':  Upper triangular
     = 'L':  Lower triangular
     N       (global input) INTEGER
     The number of rows and columns to be operated on, i.e. the
     order of the distributed submatrix sub( A ). N >= 0.
     A  (local input/local output) DOUBLE PRECISION pointer into the
     local memory to an array of dimension (LLD_A,LOCq(JA+N-1)).
     On entry, this array contains the local pieces of the
     Hermitian distributed matrix sub( A ).  If UPLO = 'U', the
     leading N-by-N upper triangular part of sub( A ) contains
     the upper triangular part of the matrix, and its strictly
     lower triangular part is not referenced. If UPLO = 'L', the
     leading N-by-N lower triangular part of sub( A ) contains the
     lower triangular part of the matrix, and its strictly upper
     triangular part is not referenced. On exit, if UPLO = 'U',
     the diagonal and first superdiagonal of sub( A ) are over-
     written by the corresponding elements of the tridiagonal
     matrix T, and the elements above the first superdiagonal,
     with the array TAU, represent the unitary matrix Q as a
     product of elementary reflectors; if UPLO = 'L', the diagonal
     and first subdiagonal of sub( A ) are overwritten by the
     corresponding elements of the tridiagonal matrix T, and the
     elements below the first subdiagonal, with the array TAU,
     represent the unitary matrix Q as a product of elementary
     reflectors. See Further Details.
     IA      (global input) INTEGER
     The row index in the global array A indicating the first
     row of sub( A ).
     JA      (global input) INTEGER
     The column index in the global array A indicating the
     first column of sub( A ).
     DESCA   (global and local input) INTEGER array of dimension DLEN_.
     The array descriptor for the distributed matrix A.
     D       (local output) DOUBLE PRECISION array, dim LOCq(JA+N-1)
     The diagonal elements of the tridiagonal matrix T:
     D(i) = A(i,i). D is tied to the distributed matrix A.
     E       (local output) DOUBLE PRECISION array, dim LOCq(JA+N-1)
     if UPLO = 'U', LOCq(JA+N-2) otherwise. The off-diagonal
     elements of the tridiagonal matrix T: E(i) = A(i,i+1) if
     UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. E is tied to the
     distributed matrix A.
     TAU     (local output) DOUBLE PRECISION array, dimension
     LOCq(JA+N-1). This array contains the scalar factors TAU of
     the elementary reflectors. TAU is tied to the distributed
     matrix A.
     WORK  (local workspace) DOUBLE PRECISION array, dimension (LWORK)
     On exit, WORK( 1 ) returns the minimal and optimal workspace
     LWORK   (local input) INTEGER
     The dimension of the array WORK.
     LWORK >= 2*( ANB+1 )*( 4*NPS+2 ) + NPS
     Where:
         NPS = MAX( NUMROC( N, 1, 0, 0, NPROW ), 2*ANB )
         ANB = PJLAENV( DESCA( CTXT_ ), 3, 'PDSYTTRD', 'L', 0, 0,
           0, 0 )
         NUMROC is a ScaLAPACK tool function;
         PJLAENV is a ScaLAPACK envionmental inquiry function
         MYROW, MYCOL, NPROW and NPCOL can be determined by calling
         the subroutine BLACS_GRIDINFO.
     INFO    (global output) INTEGER
     = 0:  successful exit
     < 0:  If the i-th argument is an array and the j-entry had
     an illegal value, then INFO = -(i*100+j), if the i-th
     argument is a scalar and had an illegal value, then
     INFO = -i.
     Further Details
     ===============
     If UPLO = 'U', the matrix Q is represented as a product of
     elementary reflectors
     Q = H(n-1) . . . H(2) H(1).
     Each H(i) has the form
     H(i) = I - tau * v * v'
     where tau is a complex scalar, and v is a complex vector with
     v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
     A(ia:ia+i-2,ja+i), and tau in TAU(ja+i-1).
     If UPLO = 'L', the matrix Q is represented as a product of
     elementary reflectors
     Q = H(1) H(2) . . . H(n-1).
     Each H(i) has the form
     H(i) = I - tau * v * v'
     where tau is a complex scalar, and v is a complex vector with
     v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
     A(ia+i+1:ia+n-1,ja+i-1), and tau in TAU(ja+i-1).
     The contents of sub( A ) on exit are illustrated by the following
     examples with n = 5:
     if UPLO = 'U':                       if UPLO = 'L':
     (  d   e   v2  v3  v4 )              (  d                  )
     (      d   e   v3  v4 )              (  e   d              )
     (          d   e   v4 )              (  v1  e   d          )
     (              d   e  )              (  v1  v2  e   d      )
     (                  d  )              (  v1  v2  v3  e   d  )
     where d and e denote diagonal and off-diagonal elements of T, and
     vi denotes an element of the vector defining H(i).
     Data storage requirements
     =========================
     PDSYTTRD is not intended to be called directly.  All users are
     encourage to call PDSYTRD which will then call PDHETTRD if
     appropriate.  A must be in cyclic format (i.e. MB = NB = 1),
     the process grid must be square ( i.e. NPROW = NPCOL ) and
     only lower triangular storage is supported.
     Local variables
     ===============
     PDSYTTRD uses five local arrays:
       WORK ( InV ) dimension ( NP, ANB+1): array V
       WORK ( InH ) dimension ( NP, ANB+1): array H
       WORK ( InVT ) dimension ( NQ, ANB+1): transpose of the array V
       WORK ( InHT ) dimension ( NQ, ANB+1): transpose of the array H
       WORK ( InVTT ) dimension ( NQ, 1): transpose of the array VT
     Arrays V and H are replicated across all processor columns.
     Arrays V^T and H^T are replicated across all processor rows.
         WORK ( InVT ), or V^T, is stored as a tall skinny
         array ( NQ x ANB-1 ) for efficiency.  Since only the lower
         triangular portion of A is updated, Av is computed as:
         tril(A) * v + v^T * tril(A,-1).  This is performed as
         two local triangular matrix-vector multiplications (both in
         MVR2) followed by a transpose and a sum across the columns.
         In the local computation, WORK( InVT ) is used to compute
         tril(A) * v and WORK( InV ) is used to compute
         v^T * tril(A,-1)
     The following variables are global indices into A:
       INDEX:  The current global row and column number.
       MAXINDEX:  The global row and column for the first row and
       column in the trailing block of A.
       LIIB, LIJB:  The first row, column in
     The following variables point into the arrays A, V, H, V^T, H^T:
       BINDEX  =INDEX-MININDEX: The column index in V, H, V^T, H^T.
       LII:  local index I:  The local row number for row INDEX
       LIJ:  local index J:  The local column number for column INDEX
       LIIP1:  local index I+1:  The local row number for row INDEX+1
       LIJP1:  local index J+1:  The local col number for col INDEX+1
       LTLI: lower triangular local index I:  The local row for the
         upper left entry in tril( A(INDEX, INDEX) )
       LTLIP1: lower triangular local index I+1:  The local row for the
         upper left entry in tril( A(INDEX+1, INDEX+1) )
         Details:  The distinction between LII and LTLI (and between
         LIIP1 and LTLIP1) is subtle.  Within the current processor
         column (i.e. MYCOL .eq. CURCOL) they are the same.  However,
         on some processors, A( LII, LIJ ) points to an element
         above the diagonal, on these processors, LTLI = LII+1.
     The following variables give the number of rows and/or columns
     in various matrices:
       NP:  The number of local rows in A( 1:N, 1:N )
       NQ:  The number of local columns in A( 1:N, 1:N )
       NPM0:  The number of local rows in A( INDEX:N, INDEX:N )
       NQM0:  The number of local columns in A( INDEX:N, INDEX:N )
       NPM1:  The number of local rows in A( INDEX+1:N, INDEX:N )
       NQM1:  The number of local columns in A( INDEX+1:N, INDEX:N )
       LTNM0:  The number of local rows & columns in
         tril( A( INDEX:N, INDEX:N ) )
       LTNM1:  The number of local rows & columns in
         tril( A( INDEX+1:N, INDEX+1:N ) )
         NOTE:  LTNM0 == LTNM1 on all processors except the diagonal
         processors, i.e. those where MYCOL == MYROW.
         Invariants:
           NP = NPM0 + LII - 1
           NQ = NQM0 + LIJ - 1
           NP = NPM1 + LIIP1 - 1
           NQ = NQM1 + LIJP1 - 1
           NP = LTLI + LTNM0 - 1
           NP = LTLIP1 + LTNM1 - 1
       Temporary variables.  The following variables are used within
       a few lines after they are set and do hold state from one loop
       iteration to the next:
     The matrix A:
       The matrix A does not hold the same values that it would
       in an unblocked code nor the values that it would hold in
       in a blocked code.
       The value of A is confusing.  It is easiest to state the
       difference between trueA and A at the point that MVR2 is called,
       so we will start there.
       Let trueA be the value that A would
       have at a given point in an unblocked code and A
       be the value that A has in this code at the same point.
       At the time of the call to MVR2,
       trueA = A + V' * H + H' * V
       where H = H( MAXINDEX:N, 1:BINDEX ) and
       V = V( MAXINDEX:N, 1:BINDEX ).
       At the bottom of the inner loop,
       trueA = A +  V' * H + H' * V + v' * h + h' * v
       where H = H( MAXINDEX:N, 1:BINDEX ) and
       V = V( MAXINDEX:N, 1:BINDEX ) and
       v = V( liip1:N, BINDEX+1 ) and
       h = H( liip1:N, BINDEX+1 )
       At the top of the loop, BINDEX gets incremented, hence:
       trueA = A +  V' * H + H' * V + v' * h + h' * v
       where H = H( MAXINDEX:N, 1:BINDEX-1 ) and
       V = V( MAXINDEX:N, 1:BINDEX-1 ) and
       v = V( liip1:N, BINDEX ) and
       h = H( liip1:N, BINDEX )
       A gets updated at the bottom of the outer loop
       After this update, trueA = A + v' * h + h' * v
       where v = V( liip1:N, BINDEX ) and
       h = H( liip1:N, BINDEX ) and BINDEX = 0
       Indeed, the previous loop invariant as stated above for the
       top of the loop still holds, but with BINDEX = 0, H and V
       are null matrices.
       After the current column of A is updated,
         trueA( INDEX, INDEX:N ) = A( INDEX, INDEX:N )
       the rest of A is untouched.
       After the current block column of A is updated,
       trueA = A + V' * H + H' * V
       where H = H( MAXINDEX:N, 1:BINDEX ) and
       V = V( MAXINDEX:N, 1:BINDEX )
       This brings us back to the point at which mvr2 is called.
     Details of the parallelization:
       We delay spreading v across to all processor columns (which
       would naturally happen at the bottom of the loop) in order to
       combine the spread of v( : , i-1 ) with the spread of h( : , i )
       In order to compute h( :, i ), we must update A( :, i )
       which means that the processor column owning A( :, i ) must
       have: c, tau, v( i, i ) and h( i, i ).
       The traditional
       way of computing v (and the one used in pzlatrd.f and
       zlatrd.f) is:
         v = tau * v
         c = v' * h
         alpha = - tau * c / 2
         v = v + alpha * h
       However, the traditional way of computing v requires that tau
       be broadcast to all processors in the current column (to compute
       v = tau * v) and then a sum-to-all is required (to
       compute v' * h ).  We use the following formula instead:
         c = v' * h
         v = tau * ( v - c * tau' * h / 2 )
       The above formula allows tau to be spread down in the
       same call to DGSUM2D which performs the sum-to-all of c.
       The computation of v, which could be performed in any processor
       column (or other procesor subsets), is performed in the
       processor column that owns A( :, i+1 ) so that A( :, i+1 )
       can be updated prior to spreading v across.
       We keep the block column of A up-to-date to minimize the
       work required in updating the current column of A.  Updating
       the block column of A is reasonably load balanced whereas
       updating the current column of A is not (only the current
       processor column is involved).
     In the following overview of the steps performed, M in the
     margin indicates message traffic and C indicates O(n^2 nb/sqrt(p))
     or more flops per processor.
     Inner loop:
       A( index:n, index ) -= ( v * ht(bindex) + h * vt( bindex) )
M      h = house( A(index:n, index) )
M      Spread v, h across
M      vt = v^T; ht = h^T
       A( index+1:n, index+1:maxindex ) -=
         ( v * ht(index+1:maxindex) + h *vt(index+1:maxindex) )
C      v = tril(A) * h; vt = ht * tril(A,-1)
MorC   v = v - H*V*h - V*H*h
M      v = v + vt^T
M      c = v' * h
       v = tau * ( v - c * tau' * h / 2 )
C    A = A - H*V - V*H
     =================================================================
     .. Parameters ..

 
Display dynamic version Find AutoScroll Reload FontSize: - + Hide Comments Hide Blanks Frame FullScreen MailPrint

 
001        SUBROUTINE PDSYTTRD( UPLO , N , A , IA , JA , DESCA , D , E , TAU , WORK ,
002       $LWORK , INFO )
003  
004  *     -- ScaLAPACK routine(version 1.7) --
005  *     University of Tennessee , Knoxville , Oak Ridge National Laboratory ,
006  *     and University of California , Berkeley.
007  *     May 25 , 2001
008  
009  *     .. Scalar Arguments ..
010        CHARACTER UPLO
011        INTEGER IA , INFO , JA , LWORK , N
012        INTEGER BLOCK_CYCLIC_2D , DLEN_ , DTYPE_ , CTXT_ , M_ , N_ ,
013       $MB_ , NB_ , RSRC_ , CSRC_ , LLD_
014        PARAMETER( BLOCK_CYCLIC_2D = 1 , DLEN_ = 9 , DTYPE_ = 1 ,
015       $CTXT_ = 2 , M_ = 3 , N_ = 4 , MB_ = 5 , NB_ = 6 ,
016       $RSRC_ = 7 , CSRC_ = 8 , LLD_ = 9 )
017        DOUBLE PRECISION ONE
018        PARAMETER( ONE = 1.0D0 )
019        DOUBLE PRECISION Z_ONE , Z_NEGONE , Z_ZERO
020        PARAMETER( Z_ONE = 1.0D0 , Z_NEGONE = - 1.0D0 ,
021       $Z_ZERO = 0.0D0 )
022        DOUBLE PRECISION ZERO
023        PARAMETER( ZERO = 0.0D + 0 )
024  *     ..
025  
026  *     .. Local Scalars ..
027  
028        LOGICAL BALANCED , INTERLEAVE , TWOGEMMS , UPPER
029        INTEGER ANB , BINDEX , CURCOL , CURROW , I , ICTXT , INDEX ,
030       $INDEXA , INDEXINH , INDEXINV , INH , INHB , INHT ,
031       $INHTB , INTMP , INV , INVB , INVT , INVTB , J , LDA ,
032       $LDV , LDZG , LII , LIIB , LIIP1 , LIJ , LIJB , LIJP1 ,
033       $LTLIP1 , LTNM1 , LWMIN , MAXINDEX , MININDEX ,
034       $MYCOL , MYFIRSTROW , MYROW , MYSETNUM , NBZG , NP ,
035       $NPB , NPCOL , NPM0 , NPM1 , NPROW , NPS , NPSET , NQ ,
036       $NQB , NQM1 , NUMROWS , NXTCOL , NXTROW , PBMAX ,
037       $PBMIN , PBSIZE , PNB , ROWSPERPROC
038        DOUBLE PRECISION ALPHA , BETA , C , CONJTOPH , CONJTOPV , NORM ,
039       $ONEOVERBETA , SAFMAX , SAFMIN , TOPH , TOPNV ,
040       $TOPTAU , TOPV
041  *     ..
042  *     .. Local Arrays ..
043  
044        INTEGER IDUM1( 1 ) , IDUM2( 1 )
045        DOUBLE PRECISION CC( 3 ) , DTMP( 5 )
046  *     ..
047  *     .. External Subroutines ..
048        EXTERNAL BLACS_GRIDINFO , CHK1MAT , DCOMBNRM2 , DGEBR2D ,
049       $DGEBS2D , DGEMM , DGEMV , DGERV2D , DGESD2D ,
050       $DGSUM2D , DLACPY , DSCAL , DTRMVT , PCHK1MAT ,
051       $PDTREECOMB , PXERBLA
052  *     ..
053  *     .. External Functions ..
054  
055        LOGICAL LSAME
056        INTEGER ICEIL , NUMROC , PJLAENV
057        DOUBLE PRECISION DNRM2 , PDLAMCH
058        EXTERNAL LSAME , ICEIL , NUMROC , PJLAENV , DNRM2 , PDLAMCH
059  *     ..
060  *     .. Intrinsic Functions ..
061        INTRINSIC DBLE , ICHAR , MAX , MIN , MOD , SIGN , SQRT
062  *     ..
063  
064  *     .. Executable Statements ..
065  *     This is just to keep ftnchek and toolpack / 1 happy
066        IF( BLOCK_CYCLIC_2D*CSRC_*CTXT_*DLEN_*DTYPE_*LLD_*MB_*M_*NB_*N_*
067       $    RSRC_.LT.0 )RETURN
068  
069  *         Further details
070  *         ===============
071  
072  *         At the top of the loop , v and nh have been computed but not
073  *         spread across. Hence , A is out - of - date even after the
074  *         rank 2k update. Furthermore , we compute the next v before
075  *         nh is spread across.
076  
077  *         I claim that if we used a sum - to - all on NV , by summing CC within
078  *         each column , that we could compute NV locally and could avoid
079  *         spreading V across. Bruce claims that sum - to - all can be made
080  *         to cost no more than sum - to - one on the Paragon. If that is
081  *         true , this would be a win. But ,
082  *         the BLACS sum - to - all is just a sum - to - one followed by a broadcast ,
083  *         and hence the present scheme is better for now.
084  
085  *         Get grid parameters
086  
087            ICTXT = DESCA( CTXT_ )
088            CALL BLACS_GRIDINFO( ICTXT , NPROW , NPCOL , MYROW , MYCOL )
089  
090            SAFMAX = SQRT( PDLAMCH( ICTXT , 'O' ) ) / N
091            SAFMIN = SQRT( PDLAMCH( ICTXT , 'S' ) )
092  
093  *         Test the input parameters
094  
095            INFO = 0
096            IF( NPROW.EQ. - 1 ) THEN
097                INFO = - ( 600 + CTXT_ )
098            ELSE
099  
100  *             Here we set execution options for PDSYTTRD
101  
102                PNB = PJLAENV( ICTXT , 2 , 'PDSYTTRD' , 'L' , 0 , 0 , 0 , 0 )
103                ANB = PJLAENV( ICTXT , 3 , 'PDSYTTRD' , 'L' , 0 , 0 , 0 , 0 )
104  
105                INTERLEAVE =( PJLAENV( ICTXT , 4 , 'PDSYTTRD' , 'L' , 1 , 0 , 0 ,
106       $        0 ).EQ.1 )
107                TWOGEMMS =( PJLAENV( ICTXT , 4 , 'PDSYTTRD' , 'L' , 2 , 0 , 0 ,
108       $        0 ).EQ.1 )
109                BALANCED =( PJLAENV( ICTXT , 4 , 'PDSYTTRD' , 'L' , 3 , 0 , 0 ,
110       $        0 ).EQ.1 )
111  
112                CALL CHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , INFO )
113  
114                UPPER = LSAME( UPLO , 'U' )
115                IF( INFO.EQ.0 .AND. DESCA( NB_ ).NE.1 )
116       $            INFO = 600 + NB_
117                    IF( INFO.EQ.0 ) THEN
118  
119  *                     Here is the arithmetic :
120  *                     Let maxnpq = max( np , nq , 2 * ANB )
121  *                     LDV = 4 * max( np , nq ) + 2
122  *                     LWMIN = 2 * ( ANB + 1 ) * LDV + MAX( np , 2 * ANB )
123  *                     = 2 * ( ANB + 1 ) * ( 4 * NPS + 2 ) + NPS
124  
125  *                     This overestimates memory requirements when ANB > NP / 2
126  *                     Memory requirements are lower when interleave = .false.
127  *                     Hence , we could have two sets of memory requirements ,
128  *                     one for interleave and one for
129  
130                        NPS = MAX( NUMROC( N , 1 , 0 , 0 , NPROW ) , 2*ANB )
131                        LWMIN = 2*( ANB + 1 )*( 4*NPS + 2 ) + NPS
132  
133                        WORK( 1 ) = DBLE( LWMIN )
134                        IF( .NOT.LSAME( UPLO , 'L' ) ) THEN
135                            INFO = - 1
136                        ELSE IF( IA.NE.1 ) THEN
137                            INFO = - 4
138                        ELSE IF( JA.NE.1 ) THEN
139                            INFO = - 5
140                        ELSE IF( NPROW.NE.NPCOL ) THEN
141                            INFO = - ( 600 + CTXT_ )
142                        ELSE IF( DESCA( DTYPE_ ).NE.1 ) THEN
143                            INFO = - ( 600 + DTYPE_ )
144                        ELSE IF( DESCA( MB_ ).NE.1 ) THEN
145                            INFO = - ( 600 + MB_ )
146                        ELSE IF( DESCA( NB_ ).NE.1 ) THEN
147                            INFO = - ( 600 + NB_ )
148                        ELSE IF( DESCA( RSRC_ ).NE.0 ) THEN
149                            INFO = - ( 600 + RSRC_ )
150                        ELSE IF( DESCA( CSRC_ ).NE.0 ) THEN
151                            INFO = - ( 600 + CSRC_ )
152                        ELSE IF( LWORK.LT.LWMIN ) THEN
153                            INFO = - 11
154                        END IF
155                    END IF
156                    IF( UPPER ) THEN
157                        IDUM1( 1 ) = ICHAR( 'U' )
158                    ELSE
159                        IDUM1( 1 ) = ICHAR( 'L' )
160                    END IF
161                    IDUM2( 1 ) = 1
162  
163                    CALL PCHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , 1 , IDUM1 , IDUM2 ,
164       $            INFO )
165                END IF
166  
167                IF( INFO.NE.0 ) THEN
168                    CALL PXERBLA( ICTXT , 'PDSYTTRD' , - INFO )
169                    RETURN
170                END IF
171  
172  *             Quick return if possible
173  
174                IF( N.EQ.0 )
175       $            RETURN
176  
177  *                 Reduce the lower triangle of sub( A )
178                    NP = NUMROC( N , 1 , MYROW , 0 , NPROW )
179                    NQ = NUMROC( N , 1 , MYCOL , 0 , NPCOL )
180  
181                    NXTROW = 0
182                    NXTCOL = 0
183  
184                    LIIP1 = 1
185                    LIJP1 = 1
186                    NPM1 = NP
187                    NQM1 = NQ
188  
189                    LDA = DESCA( LLD_ )
190                    ICTXT = DESCA( CTXT_ )
191  
192  *                 Miscellaneous details :
193  *                 Put tau , D and E in the right places
194  *                 Check signs
195  *                 Place all the arrays in WORK , control their placement
196  *                 in memory.
197  
198  *                 Loop invariants
199  *                 A(LIIP1 , LIJ) points to the first element of A(I + 1 , J)
200  *                 NPM1 , NQM1 = the number of rows , cols in A( LII + 1 : N , LIJ + 1 : N )
201  *                 A(LII : N , LIJ : N) is one step out of date.
202  *                 proc( CURROW , CURCOL ) owns A(LII , LIJ)
203  *                 proc( NXTROW , CURCOL ) owns A(LIIP1 , LIJ)
204  
205                    INH = 1
206  
207                    IF( INTERLEAVE ) THEN
208  
209  *                     H and V are interleaved to minimize memory movement
210  *                     LDV has to be twice as large to accomodate interleaving.
211  *                     In addition , LDV is doubled again to allow v , h and
212  *                     toptau to be spreaad across and transposed in a
213  *                     single communication operation with minimum memory
214  *                     movement.
215  
216  *                     We could reduce LDV back to 2*MAX(NPM1 , NQM1)
217  *                     by increasing the memory movement required in
218  *                     the spread and transpose of v , h and toptau.
219  *                     However , since the non - interleaved path already
220  *                     provides a mear minimum memory requirement option ,
221  *                     we did not provide this additional path.
222  
223                        LDV = 4*( MAX( NPM1 , NQM1 ) ) + 2
224  
225                        INH = 1
226  
227                        INV = INH + LDV / 2
228                        INVT = INH + ( ANB + 1 )*LDV
229  
230                        INHT = INVT + LDV / 2
231                        INTMP = INVT + LDV*( ANB + 1 )
232  
233                    ELSE
234                        LDV = MAX( NPM1 , NQM1 )
235  
236                        INHT = INH + LDV*( ANB + 1 )
237                        INV = INHT + LDV*( ANB + 1 )
238  
239  *                     The code works without this + 1 , but only because of a
240  *                     coincidence. Without the + 1 , WORK(INVT) gets trashed , but
241  *                     WORK(INVT) is only used once and when it is used , it is
242  *                     multiplied by WORK( INH ) which is zero. Hence , the fact
243  *                     that WORK(INVT) is trashed has no effect.
244  
245                        INVT = INV + LDV*( ANB + 1 ) + 1
246                        INTMP = INVT + LDV*( 2*ANB )
247  
248                    END IF
249  
250                    IF( INFO.NE.0 ) THEN
251                        CALL PXERBLA( ICTXT , 'PDSYTTRD' , - INFO )
252                        WORK( 1 ) = DBLE( LWMIN )
253                        RETURN
254                    END IF
255  
256  *                 The satisfies the loop invariant : trueA = A - V * HT - H * VT ,
257  *                 (where V , H , VT and HT all have BINDEX + 1 rows / columns)
258  *                 the first ANB times through the loop.
259  
260  *                 Setting either( InH and InHT ) or InV to Z_ZERO
261  *                 is adequate except in the face of NaNs.
262  
263                    DO 10 I = 1 , NP
264                        WORK( INH + I - 1 ) = Z_ZERO
265                        WORK( INV + I - 1 ) = Z_ZERO
266     10             CONTINUE
267                    DO 20 I = 1 , NQ
268                        WORK( INHT + I - 1 ) = Z_ZERO
269     20             CONTINUE
270  
271                    TOPNV = Z_ZERO
272  
273                    LTLIP1 = LIJP1
274                    LTNM1 = NPM1
275                    IF( MYCOL.GT.MYROW ) THEN
276                        LTLIP1 = LTLIP1 + 1
277                        LTNM1 = LTNM1 - 1
278                    END IF
279  
280                    DO 210 MININDEX = 1 , N - 1 , ANB
281  
282                        MAXINDEX = MIN( MININDEX + ANB - 1 , N )
283                        LIJB = NUMROC( MAXINDEX , 1 , MYCOL , 0 , NPCOL ) + 1
284                        LIIB = NUMROC( MAXINDEX , 1 , MYROW , 0 , NPROW ) + 1
285  
286                        NQB = NQ - LIJB + 1
287                        NPB = NP - LIIB + 1
288                        INHTB = INHT + LIJB - 1
289                        INVTB = INVT + LIJB - 1
290                        INHB = INH + LIIB - 1
291                        INVB = INV + LIIB - 1
292  
293                        DO 160 INDEX = MININDEX , MIN( MAXINDEX , N - 1 )
294  
295                            BINDEX = INDEX - MININDEX
296  
297                            CURROW = NXTROW
298                            CURCOL = NXTCOL
299  
300                            NXTROW = MOD( CURROW + 1 , NPROW )
301                            NXTCOL = MOD( CURCOL + 1 , NPCOL )
302  
303                            LII = LIIP1
304                            LIJ = LIJP1
305                            NPM0 = NPM1
306  
307                            IF( MYROW.EQ.CURROW ) THEN
308                                NPM1 = NPM1 - 1
309                                LIIP1 = LIIP1 + 1
310                            END IF
311                            IF( MYCOL.EQ.CURCOL ) THEN
312                                NQM1 = NQM1 - 1
313                                LIJP1 = LIJP1 + 1
314                                LTLIP1 = LTLIP1 + 1
315                                LTNM1 = LTNM1 - 1
316                            END IF
317  
318  *                         V = NV , VT = NVT , H = NH , HT = NHT
319  
320  *                         Update the current column of A
321  
322                            IF( MYCOL.EQ.CURCOL ) THEN
323  
324                                INDEXA = LII + ( LIJ - 1 )*LDA
325                                INDEXINV = INV + LII - 1 + ( BINDEX - 1 )*LDV
326                                INDEXINH = INH + LII - 1 + ( BINDEX - 1 )*LDV
327                                CONJTOPH = WORK( INHT + LIJ - 1 + BINDEX*LDV )
328                                CONJTOPV = TOPNV
329  
330                                IF( INDEX.GT.1 ) THEN
331                                    DO 30 I = 0 , NPM0 - 1
332  *                                     A( INDEXA + I ) = A( INDEXA + I )
333                                        A( INDEXA + I ) = A( INDEXA + I ) -
334       $                                WORK( INDEXINV + LDV + I )*CONJTOPH -
335       $                                WORK( INDEXINH + LDV + I )*CONJTOPV
336     30                             CONTINUE
337                                END IF
338  
339                            END IF
340  
341                            IF( MYCOL.EQ.CURCOL ) THEN
342  
343  *                             Compute the householder vector
344  
345                                IF( MYROW.EQ.CURROW ) THEN
346                                    DTMP( 2 ) = A( LII + ( LIJ - 1 )*LDA )
347                                ELSE
348                                    DTMP( 2 ) = ZERO
349                                END IF
350                                IF( MYROW.EQ.NXTROW ) THEN
351                                    DTMP( 3 ) = A( LIIP1 + ( LIJ - 1 )*LDA )
352                                    DTMP( 4 ) = ZERO
353                                ELSE
354                                    DTMP( 3 ) = ZERO
355                                    DTMP( 4 ) = ZERO
356                                END IF
357  
358                                NORM = DNRM2( NPM1 , A( LIIP1 + ( LIJ - 1 )*LDA ) , 1 )
359                                DTMP( 1 ) = NORM
360  
361  *                             IF DTMP(5) = 1.0 , NORM is too large and might cause
362  *                             overflow , hence PDTREECOMB must be called. IF DTMP(5)
363  *                             is zero on output , DTMP(1) can be trusted.
364  
365                                DTMP( 5 ) = ZERO
366                                IF( DTMP( 1 ).GE.SAFMAX .OR. DTMP( 1 ).LT.SAFMIN ) THEN
367                                    DTMP( 5 ) = ONE
368                                    DTMP( 1 ) = ZERO
369                                END IF
370  
371                                DTMP( 1 ) = DTMP( 1 )*DTMP( 1 )
372                                CALL DGSUM2D( ICTXT , 'C' , ' ' , 5 , 1 , DTMP , 5 , - 1 ,
373       $                        CURCOL )
374                                IF( DTMP( 5 ).EQ.ZERO ) THEN
375                                    DTMP( 1 ) = SQRT( DTMP( 1 ) )
376                                ELSE
377                                    DTMP( 1 ) = NORM
378                                    CALL PDTREECOMB( ICTXT , 'C' , 1 , DTMP , - 1 , MYCOL ,
379       $                            DCOMBNRM2 )
380                                END IF
381  
382                                NORM = DTMP( 1 )
383  
384                                D( LIJ ) = DTMP( 2 )
385                                IF( MYROW.EQ.CURROW .AND. MYCOL.EQ.CURCOL ) THEN
386                                    A( LII + ( LIJ - 1 )*LDA ) = D( LIJ )
387                                END IF
388  
389                                ALPHA = DTMP( 3 )
390  
391                                NORM = SIGN( NORM , ALPHA )
392  
393                                IF( NORM.EQ.ZERO ) THEN
394                                    TOPTAU = ZERO
395                                ELSE
396                                    BETA = NORM + ALPHA
397                                    TOPTAU = BETA / NORM
398                                    ONEOVERBETA = 1.0D0 / BETA
399  
400                                    CALL DSCAL( NPM1 , ONEOVERBETA ,
401       $                            A( LIIP1 + ( LIJ - 1 )*LDA ) , 1 )
402                                END IF
403  
404                                IF( MYROW.EQ.NXTROW ) THEN
405                                    A( LIIP1 + ( LIJ - 1 )*LDA ) = Z_ONE
406                                END IF
407  
408                                TAU( LIJ ) = TOPTAU
409                                E( LIJ ) = - NORM
410  
411                            END IF
412  
413  *                         Spread v , nh , toptau across
414  
415                            DO 40 I = 0 , NPM1 - 1
416                                WORK( INV + LIIP1 - 1 + BINDEX*LDV + NPM1 + I ) = A( LIIP1 + I +
417       $( LIJ - 1 )*LDA )
418     40                     CONTINUE
419  
420                            IF( MYCOL.EQ.CURCOL ) THEN
421                                WORK( INV + LIIP1 - 1 + BINDEX*LDV + NPM1 + NPM1 ) = TOPTAU
422                                CALL DGEBS2D( ICTXT , 'R' , ' ' , NPM1 + NPM1 + 1 , 1 ,
423       $                        WORK( INV + LIIP1 - 1 + BINDEX*LDV ) ,
424       $                        NPM1 + NPM1 + 1 )
425                            ELSE
426                                CALL DGEBR2D( ICTXT , 'R' , ' ' , NPM1 + NPM1 + 1 , 1 ,
427       $                        WORK( INV + LIIP1 - 1 + BINDEX*LDV ) ,
428       $                        NPM1 + NPM1 + 1 , MYROW , CURCOL )
429                                TOPTAU = WORK( INV + LIIP1 - 1 + BINDEX*LDV + NPM1 + NPM1 )
430                            END IF
431                            DO 50 I = 0 , NPM1 - 1
432                                WORK( INH + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) = WORK( INV + LIIP1 -
433       $                        1 + BINDEX*LDV + NPM1 + I )
434     50                     CONTINUE
435  
436                            IF( INDEX.LT.N ) THEN
437                                IF( MYROW.EQ.NXTROW .AND. MYCOL.EQ.CURCOL )
438       $                            A( LIIP1 + ( LIJ - 1 )*LDA ) = E( LIJ )
439                                END IF
440  
441  *                             Transpose v , nh
442  
443                                IF( MYROW.EQ.MYCOL ) THEN
444                                    DO 60 I = 0 , NPM1 + NPM1
445                                        WORK( INVT + LIJP1 - 1 + BINDEX*LDV + I ) = WORK( INV + LIIP1 - 1 +
446       $                                BINDEX*LDV + I )
447     60                 CONTINUE
448                    ELSE
449                        CALL DGESD2D( ICTXT , NPM1 + NPM1 , 1 ,
450       $                WORK( INV + LIIP1 - 1 + BINDEX*LDV ) , NPM1 + NPM1 ,
451       $                MYCOL , MYROW )
452                        CALL DGERV2D( ICTXT , NQM1 + NQM1 , 1 ,
453       $                WORK( INVT + LIJP1 - 1 + BINDEX*LDV ) , NQM1 + NQM1 ,
454       $                MYCOL , MYROW )
455                    END IF
456  
457                    DO 70 I = 0 , NQM1 - 1
458                        WORK( INHT + LIJP1 - 1 + ( BINDEX + 1 )*LDV + I ) = WORK( INVT +
459       $                LIJP1 - 1 + BINDEX*LDV + NQM1 + I )
460     70             CONTINUE
461  
462  *                 Update the current block column of A
463  
464                    IF( INDEX.GT.1 ) THEN
465                        DO 90 J = LIJP1 , LIJB - 1
466                            DO 80 I = 0 , NPM1 - 1
467  
468                                A( LIIP1 + I + ( J - 1 )*LDA ) = A( LIIP1 + I + ( J - 1 )*LDA )
469       $                        - WORK( INV + LIIP1 - 1 + BINDEX*LDV + I )*
470       $                        WORK( INHT + J - 1 + BINDEX*LDV ) -
471       $                        WORK( INH + LIIP1 - 1 + BINDEX*LDV + I )*
472       $                        WORK( INVT + J - 1 + BINDEX*LDV )
473     80                     CONTINUE
474     90                 CONTINUE
475                    END IF
476  
477  *                 Compute NV = A * NHT ; NVT = A * NH
478  
479  *                 These two lines are necessary because these elements
480  *                 are not always involved in the calls to DTRMVT
481  *                 for two reasons :
482  *                 1) On diagonal processors , the call to TRMVT
483  *                 involves only LTNM1 - 1 elements
484  *                 2) On some processes , NQM1 < LTM1 or LIIP1 < LTLIP1
485  *                 and when the results are combined across all processes ,
486  *                 uninitialized values may be included.
487                    WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV ) = Z_ZERO
488                    WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV + NQM1 - 1 ) = Z_ZERO
489  
490                    IF( MYROW.EQ.MYCOL ) THEN
491                        IF( LTNM1.GT.1 ) THEN
492                            CALL DTRMVT ( 'L' , LTNM1 - 1 ,
493       $                    A( LTLIP1 + 1 + ( LIJP1 - 1 )*LDA ) , LDA ,
494       $                    WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV ) , 1 ,
495       $                    WORK( INH + LTLIP1 + 1 - 1 + ( BINDEX + 1 )*LDV ) ,
496       $                    1 , WORK( INV + LTLIP1 + 1 - 1 + ( BINDEX + 1 )*
497       $                    LDV ) , 1 , WORK( INHT + LIJP1 - 1 + ( BINDEX +
498       $                    1 )*LDV ) , 1 )
499                        END IF
500                        DO 100 I = 1 , LTNM1
501                            WORK( INVT + LIJP1 + I - 1 - 1 + ( BINDEX + 1 )*LDV )
502       $                    = WORK( INVT + LIJP1 + I - 1 - 1 + ( BINDEX + 1 )*LDV ) +
503       $                    A( LTLIP1 + I - 1 + ( LIJP1 + I - 1 - 1 )*LDA )*
504       $                    WORK( INH + LTLIP1 + I - 1 - 1 + ( BINDEX + 1 )*LDV )
505    100                 CONTINUE
506                    ELSE
507                        IF( LTNM1.GT.0 )
508       $                    CALL DTRMVT ( 'L' , LTNM1 , A( LTLIP1 + ( LIJP1 - 1 )*LDA ) ,
509       $                    LDA , WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*
510       $                    LDV ) , 1 , WORK( INH + LTLIP1 - 1 + ( BINDEX +
511       $                    1 )*LDV ) , 1 , WORK( INV + LTLIP1 - 1 +
512       $( BINDEX + 1 )*LDV ) , 1 ,
513       $                    WORK( INHT + LIJP1 - 1 + ( BINDEX + 1 )*LDV ) ,
514       $                    1 )
515  
516                        END IF
517  
518  *                     We take advantage of the fact that :
519  *                     A * sum( B ) = sum( A * B ) for matrices A , B
520  
521  *                     trueA = A + V * HT + H * VT
522  *                     hence :(trueA)v = Av' + V * HT * v + H * VT * v
523  *                     VT * v = sum_p_in_NPROW( VTp * v )
524  *                     H * VT * v = H * sum(VTp * v) = sum( H * VTp * v )
525  
526  *                     v = v + V * HT * h + H * VT * h
527  
528  *                     tmp = HT * nh1
529                        DO 110 I = 1 , 2*( BINDEX + 1 )
530                            WORK( INTMP - 1 + I ) = 0
531    110                 CONTINUE
532  
533                        IF( BALANCED ) THEN
534                            NPSET = NPROW
535                            MYSETNUM = MYROW
536                            ROWSPERPROC = ICEIL( NQB , NPSET )
537                            MYFIRSTROW = MIN( NQB + 1 , 1 + ROWSPERPROC*MYSETNUM )
538                            NUMROWS = MIN( ROWSPERPROC , NQB - MYFIRSTROW + 1 )
539  
540  *                         tmp = HT * v
541  
542                            CALL DGEMV( 'C' , NUMROWS , BINDEX + 1 , Z_ONE ,
543       $                    WORK( INHTB + MYFIRSTROW - 1 ) , LDV ,
544       $                    WORK( INHTB + MYFIRSTROW - 1 + ( BINDEX + 1 )*LDV ) ,
545       $                    1 , Z_ZERO , WORK( INTMP ) , 1 )
546  *                         tmp2 = VT * v
547                            CALL DGEMV( 'C' , NUMROWS , BINDEX + 1 , Z_ONE ,
548       $                    WORK( INVTB + MYFIRSTROW - 1 ) , LDV ,
549       $                    WORK( INHTB + MYFIRSTROW - 1 + ( BINDEX + 1 )*LDV ) ,
550       $                    1 , Z_ZERO , WORK( INTMP + BINDEX + 1 ) , 1 )
551  
552                            CALL DGSUM2D( ICTXT , 'C' , ' ' , 2*( BINDEX + 1 ) , 1 ,
553       $                    WORK( INTMP ) , 2*( BINDEX + 1 ) , - 1 , - 1 )
554                        ELSE
555  *                         tmp = HT * v
556  
557                            CALL DGEMV( 'C' , NQB , BINDEX + 1 , Z_ONE , WORK( INHTB ) ,
558       $                    LDV , WORK( INHTB + ( BINDEX + 1 )*LDV ) , 1 ,
559       $                    Z_ZERO , WORK( INTMP ) , 1 )
560  *                         tmp2 = VT * v
561                            CALL DGEMV( 'C' , NQB , BINDEX + 1 , Z_ONE , WORK( INVTB ) ,
562       $                    LDV , WORK( INHTB + ( BINDEX + 1 )*LDV ) , 1 ,
563       $                    Z_ZERO , WORK( INTMP + BINDEX + 1 ) , 1 )
564  
565                        END IF
566  
567                        IF( BALANCED ) THEN
568                            MYSETNUM = MYCOL
569  
570                            ROWSPERPROC = ICEIL( NPB , NPSET )
571                            MYFIRSTROW = MIN( NPB + 1 , 1 + ROWSPERPROC*MYSETNUM )
572                            NUMROWS = MIN( ROWSPERPROC , NPB - MYFIRSTROW + 1 )
573  
574                            CALL DGSUM2D( ICTXT , 'R' , ' ' , 2*( BINDEX + 1 ) , 1 ,
575       $                    WORK( INTMP ) , 2*( BINDEX + 1 ) , - 1 , - 1 )
576  
577  *                         v = v + V * tmp
578                            IF( INDEX.GT.1. ) THEN
579                                CALL DGEMV( 'N' , NUMROWS , BINDEX + 1 , Z_NEGONE ,
580       $                        WORK( INVB + MYFIRSTROW - 1 ) , LDV ,
581       $                        WORK( INTMP ) , 1 , Z_ONE ,
582       $                        WORK( INVB + MYFIRSTROW - 1 + ( BINDEX + 1 )*
583       $                        LDV ) , 1 )
584  
585  *                             v = v + H * tmp2
586                                CALL DGEMV( 'N' , NUMROWS , BINDEX + 1 , Z_NEGONE ,
587       $                        WORK( INHB + MYFIRSTROW - 1 ) , LDV ,
588       $                        WORK( INTMP + BINDEX + 1 ) , 1 , Z_ONE ,
589       $                        WORK( INVB + MYFIRSTROW - 1 + ( BINDEX + 1 )*
590       $                        LDV ) , 1 )
591                            END IF
592  
593                        ELSE
594  *                         v = v + V * tmp
595                            CALL DGEMV( 'N' , NPB , BINDEX + 1 , Z_NEGONE , WORK( INVB ) ,
596       $                    LDV , WORK( INTMP ) , 1 , Z_ONE ,
597       $                    WORK( INVB + ( BINDEX + 1 )*LDV ) , 1 )
598  
599  *                         v = v + H * tmp2
600                            CALL DGEMV( 'N' , NPB , BINDEX + 1 , Z_NEGONE , WORK( INHB ) ,
601       $                    LDV , WORK( INTMP + BINDEX + 1 ) , 1 , Z_ONE ,
602       $                    WORK( INVB + ( BINDEX + 1 )*LDV ) , 1 )
603  
604                        END IF
605  
606  *                     Transpose NV and add it back into NVT
607  
608                        IF( MYROW.EQ.MYCOL ) THEN
609                            DO 120 I = 0 , NQM1 - 1
610                                WORK( INTMP + I ) = WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV +
611       $                        I )
612    120                     CONTINUE
613                        ELSE
614                            CALL DGESD2D( ICTXT , NQM1 , 1 ,
615       $                    WORK( INVT + LIJP1 - 1 + ( BINDEX + 1 )*LDV ) ,
616       $                    NQM1 , MYCOL , MYROW )
617                            CALL DGERV2D( ICTXT , NPM1 , 1 , WORK( INTMP ) , NPM1 , MYCOL ,
618       $                    MYROW )
619  
620                        END IF
621                        DO 130 I = 0 , NPM1 - 1
622                            WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) = WORK( INV + LIIP1 -
623       $                    1 + ( BINDEX + 1 )*LDV + I ) + WORK( INTMP + I )
624    130                 CONTINUE
625  
626  *                     Sum - to - one NV rowwise(within a row)
627  
628                        CALL DGSUM2D( ICTXT , 'R' , ' ' , NPM1 , 1 ,
629       $                WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV ) , NPM1 ,
630       $                MYROW , NXTCOL )
631  
632  *                     Dot product c = NV * NH
633  *                     Sum - to - all c within next processor column
634  
635                        IF( MYCOL.EQ.NXTCOL ) THEN
636                            CC( 1 ) = Z_ZERO
637                            DO 140 I = 0 , NPM1 - 1
638                                CC( 1 ) = CC( 1 ) + WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*
639       $                        LDV + I )*WORK( INH + LIIP1 - 1 + ( BINDEX + 1 )*LDV +
640       $                        I )
641    140                     CONTINUE
642                            IF( MYROW.EQ.NXTROW ) THEN
643                                CC( 2 ) = WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV )
644                                CC( 3 ) = WORK( INH + LIIP1 - 1 + ( BINDEX + 1 )*LDV )
645                            ELSE
646                                CC( 2 ) = Z_ZERO
647                                CC( 3 ) = Z_ZERO
648                            END IF
649                            CALL DGSUM2D( ICTXT , 'C' , ' ' , 3 , 1 , CC , 3 , - 1 , NXTCOL )
650  
651                            TOPV = CC( 2 )
652                            C = CC( 1 )
653                            TOPH = CC( 3 )
654  
655                            TOPNV = TOPTAU*( TOPV - C*TOPTAU / 2*TOPH )
656  
657  *                         Compute V = Tau * (V - C * Tau' / 2 * H )
658  
659                            DO 150 I = 0 , NPM1 - 1
660                                WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) = TOPTAU*
661       $( WORK( INV + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) - C*TOPTAU /
662       $                        2*WORK( INH + LIIP1 - 1 + ( BINDEX + 1 )*LDV + I ) )
663    150                     CONTINUE
664  
665                        END IF
666  
667    160                 CONTINUE
668  
669  *                     Perform the rank2k update
670  
671                        IF( MAXINDEX.LT.N ) THEN
672  
673                            DO 170 I = 0 , NPM1 - 1
674                                WORK( INTMP + I ) = WORK( INH + LIIP1 - 1 + ANB*LDV + I )
675    170                     CONTINUE
676  
677                            IF( .NOT.TWOGEMMS ) THEN
678                                IF( INTERLEAVE ) THEN
679                                    LDZG = LDV / 2
680                                ELSE
681                                    CALL DLACPY( 'A' , LTNM1 , ANB , WORK( INHT + LIJP1 - 1 ) ,
682       $                            LDV , WORK( INVT + LIJP1 - 1 + ANB*LDV ) , LDV )
683  
684                                    CALL DLACPY( 'A' , LTNM1 , ANB , WORK( INV + LTLIP1 - 1 ) ,
685       $                            LDV , WORK( INH + LTLIP1 - 1 + ANB*LDV ) , LDV )
686                                    LDZG = LDV
687                                END IF
688                                NBZG = ANB*2
689                            ELSE
690                                LDZG = LDV
691                                NBZG = ANB
692                            END IF
693  
694                            DO 180 PBMIN = 1 , LTNM1 , PNB
695  
696                                PBSIZE = MIN( PNB , LTNM1 - PBMIN + 1 )
697                                PBMAX = MIN( LTNM1 , PBMIN + PNB - 1 )
698                                CALL DGEMM( 'N' , 'C' , PBSIZE , PBMAX , NBZG , Z_NEGONE ,
699       $                        WORK( INH + LTLIP1 - 1 + PBMIN - 1 ) , LDZG ,
700       $                        WORK( INVT + LIJP1 - 1 ) , LDZG , Z_ONE ,
701       $                        A( LTLIP1 + PBMIN - 1 + ( LIJP1 - 1 )*LDA ) , LDA )
702                                IF( TWOGEMMS ) THEN
703                                    CALL DGEMM( 'N' , 'C' , PBSIZE , PBMAX , ANB , Z_NEGONE ,
704       $                            WORK( INV + LTLIP1 - 1 + PBMIN - 1 ) , LDZG ,
705       $                            WORK( INHT + LIJP1 - 1 ) , LDZG , Z_ONE ,
706       $                            A( LTLIP1 + PBMIN - 1 + ( LIJP1 - 1 )*LDA ) , LDA )
707                                END IF
708    180                     CONTINUE
709  
710                            DO 190 I = 0 , NPM1 - 1
711                                WORK( INV + LIIP1 - 1 + I ) = WORK( INV + LIIP1 - 1 + ANB*LDV + I )
712                                WORK( INH + LIIP1 - 1 + I ) = WORK( INTMP + I )
713    190                     CONTINUE
714                            DO 200 I = 0 , NQM1 - 1
715                                WORK( INHT + LIJP1 - 1 + I ) = WORK( INHT + LIJP1 - 1 + ANB*LDV + I )
716    200                     CONTINUE
717  
718                        END IF
719  
720  *                     End of the update A code
721  
722    210             CONTINUE
723  
724                    IF( MYCOL.EQ.NXTCOL ) THEN
725                        IF( MYROW.EQ.NXTROW ) THEN
726  
727                            D( NQ ) = A( NP + ( NQ - 1 )*LDA )
728  
729                            CALL DGEBS2D( ICTXT , 'C' , ' ' , 1 , 1 , D( NQ ) , 1 )
730                        ELSE
731                            CALL DGEBR2D( ICTXT , 'C' , ' ' , 1 , 1 , D( NQ ) , 1 , NXTROW ,
732       $                    NXTCOL )
733                        END IF
734                    END IF
735  
736                    WORK( 1 ) = DBLE( LWMIN )
737                    RETURN
738  
739  *                 End of PDSYTTRD
740  
741                END